[linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel panics

Postby aroberts » Thu Aug 31, 2017 9:36 am

The last two aarch64 RPI3 kernels have been causing kernel panics for me.
The latest kernel, 4.12.10-1, caused the issue again this morning, and worse than the previous kernel did.
I'm running the machine headless (no keyboard or HDMI) over ssh. I installed the kernel, rebooted, and
once it was up I tried to log in. I got the prompt for the password/key, hit return, and got no further.

This happened with the previous kernel as well. With that kernel I simply unplugged the device and tried again, and got logged in.
This time multiple power cycles didn't help. I attached a keyboard and mouse and was able to log in OK, and to connect via ssh.
HOWEVER, I suspect headless is not a factor, as I've been able to reproduce the problem multiple times with HDMI and a keyboard connected.

Basically, after the system boots, the last thing it does BEFORE the console login prompt is report that the link is up (I'm using DHCP, which is quite slow). If everything works you then get the login prompt, and you can also log in via ssh. MOST boots, however, show a kernel panic instead of the login prompt. It scrolls by too fast to read; I tried videoing it but still can't read it. The logs are corrupted, so no help there.

The last part of the kernel dump (manually transcribed) is below. (Note: n_tty_receive_buf... may be m_tty_receive_buf; I can't tell from the video.)
$this->bbcode_second_pass_code('', '
[<ffff0000086f1520>] n_tty_receive_buf_common+0x60/0xa20
[<ffff0000086f1f18>] n_tty_receive_buf2+0x40/0x50
[<ffff0000086f4d14>] tty_ldisc_receive_buf+0x44/0x90
[<ffff0000086f58d4>] tty_port_default_receive_buf+0x54/0xa0
[<ffff0000086f4f94>] flush_to_ldisc+0xa4/0xc0
[<ffff0000080ee63c>] process_one_work+0x19c/0x3f0
[<ffff0000080ee8dc>] worker_thread+0x4c/0x420
[<ffff0000080f5100>] kthread+0x138/0x140
[<ffff0000080833b0>] ret_from_fork+0x10/0x20
Code: 912f62e0 f9004ba0 d2044c01 8b010260 (c0dffc03)
')
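Since the trace above had to be transcribed by hand, it can help to reduce each frame to its address, symbol, offset and size so that traces from different boots can be compared side by side. A minimal sketch, assuming the frames are available one per line in the `[<address>] symbol+0xoffset/0xsize` form shown above; the function name `trace_frames` is made up for illustration:

```shell
# Hypothetical helper: read kernel call-trace lines on stdin and print
# "address symbol offset size" for every line that contains a frame.
# Lines that are not frames (register dumps, timestamps only) are skipped.
trace_frames() {
    sed -n 's|^.*\[<\([0-9a-f]*\)>] \([A-Za-z0-9_.]*\)+\(0x[0-9a-f]*\)/\(0x[0-9a-f]*\).*$|\1 \2 \3 \4|p'
}
```

If the journal survives the crash, something like `journalctl -k -b -1 | trace_frames` should list the frames from the previous boot's kernel messages.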

The log files also contain lots of lines thus:
$this->bbcode_second_pass_code('', '
Aug 31 09:04:00 alarm systemd[1]: Started User Manager for UID 1000.
Aug 31 09:04:06 alarm kernel: i2c-bcm2835 3f805000.i2c: i2c transfer timed out
Aug 31 09:04:17 alarm kernel: i2c-bcm2835 3f805000.i2c: i2c transfer timed out
Aug 31 09:04:39 alarm kernel: i2c-bcm2835 3f805000.i2c: i2c transfer timed out
Aug 31 09:05:01 alarm kernel: i2c-bcm2835 3f805000.i2c: i2c transfer timed out
Aug 31 09:05:03 alarm sshd[389]: Accepted publickey for alarm from 192.168.1.45
')
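To put a number on "lots", the repeated lines can be counted per message. A rough sketch, assuming the default short journal format shown above (month, day, time, host, then the message); `count_messages` is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: read journal lines on stdin, drop the first four
# fields (month, day, time, host) and print "count message", most frequent
# message first.
count_messages() {
    awk '{
        $1 = $2 = $3 = $4 = ""   # blank out the timestamp and host fields
        sub(/^ +/, "")           # strip the leading spaces left behind
        n[$0]++
    }
    END { for (m in n) print n[m], m }' | sort -rn
}
```

For the current boot this could be run as `journalctl -b | count_messages | head`.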

Also other strange things in logs:
$this->bbcode_second_pass_code('', '
journalctl -b0
...
Aug 16 13:19:33 alarm kernel: Hierarchical RCU implementation.
Aug 16 13:19:33 alarm kernel: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
Aug 16 13:19:33 alarm kernel: RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
Aug 16 13:19:33 alarm kernel: NR_IRQS:64 nr_irqs:64 0
Aug 16 13:19:33 alarm kernel: arch_timer: WARNING: Invalid trigger for IRQ2, assuming level low
Aug 16 13:19:33 alarm kernel: arch_timer: WARNING: Please fix your firmware
Aug 16 13:19:33 alarm kernel: arch_timer: cp15 timer(s) running at 19.20MHz (phys).
...
')

Is the firmware/devicetree in sync with the kernel version?

Andrew
aroberts
 
Posts: 49
Joined: Tue Mar 15, 2016 4:32 am

Re: [linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel pa

Postby aroberts » Fri Sep 01, 2017 12:14 pm

Some more info: the RPI3 has no additional hardware attached (HATs, USB, HDMI, etc.), just wired ethernet in DHCP mode.

Once logged in, the system stays up, but the journal is showing errors:
$this->bbcode_second_pass_code('', '
Sep 01 06:04:30 alarm kernel: INFO: task kworker/1:1:25990 blocked for more than 120 seconds.
Sep 01 06:04:42 alarm kernel: Tainted: G W 4.12.10-1-ARCH #1
Sep 01 06:04:42 alarm kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 01 06:04:42 alarm kernel: kworker/1:1 D 0 25990 2 0x00000000
Sep 01 06:04:42 alarm kernel: Workqueue: events_freezable mmc_rescan
Sep 01 06:04:42 alarm kernel: Call trace:
Sep 01 06:04:42 alarm kernel: [<ffff0000080857f4>] __switch_to+0x6c/0x78
Sep 01 06:04:42 alarm kernel: [<ffff000008c0db38>] __schedule+0x210/0x860
Sep 01 06:04:42 alarm kernel: [<ffff000008c0e1b4>] schedule+0x2c/0x88
Sep 01 06:04:42 alarm kernel: [<ffff0000089926b4>] __mmc_claim_host+0x84/0x1c0
Sep 01 06:04:42 alarm kernel: [<ffff000008992820>] mmc_get_card+0x30/0x40
Sep 01 06:04:42 alarm kernel: [<ffff00000899c840>] mmc_sd_detect+0x20/0x78
Sep 01 06:04:42 alarm kernel: [<ffff000008996070>] mmc_rescan+0xc0/0x3a8
Sep 01 06:04:42 alarm kernel: [<ffff0000080ee63c>] process_one_work+0x19c/0x3f0
Sep 01 06:04:42 alarm kernel: [<ffff0000080ee8dc>] worker_thread+0x4c/0x428
Sep 01 06:04:42 alarm kernel: [<ffff0000080f5188>] kthread+0x138/0x140
Sep 01 06:04:42 alarm kernel: [<ffff0000080833b0>] ret_from_fork+0x10/0x20
Sep 01 06:06:30 alarm kernel: net_ratelimit: 844459 callbacks suppressed
Sep 01 06:06:34 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 01 06:08:36 alarm kernel: INFO: task kworker/1:1:25990 blocked for more than 120 seconds.
Sep 01 06:08:36 alarm kernel: Tainted: G W 4.12.10-1-ARCH #1
Sep 01 06:08:36 alarm kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 01 06:08:36 alarm kernel: kworker/1:1 D 0 25990 2 0x00000000
Sep 01 06:08:36 alarm kernel: Workqueue: events_freezable mmc_rescan
Sep 01 06:08:36 alarm kernel: Call trace:
Sep 01 06:08:36 alarm kernel: [<ffff0000080857f4>] __switch_to+0x6c/0x78
Sep 01 06:08:36 alarm kernel: [<ffff000008c0db38>] __schedule+0x210/0x860
Sep 01 06:08:36 alarm kernel: [<ffff000008c0e1b4>] schedule+0x2c/0x88
Sep 01 06:08:36 alarm kernel: [<ffff0000089926b4>] __mmc_claim_host+0x84/0x1c0
Sep 01 06:08:36 alarm kernel: [<ffff000008992820>] mmc_get_card+0x30/0x40
Sep 01 06:08:36 alarm kernel: [<ffff00000899c840>] mmc_sd_detect+0x20/0x78
Sep 01 06:08:36 alarm kernel: [<ffff000008996070>] mmc_rescan+0xc0/0x3a8
Sep 01 06:08:36 alarm kernel: [<ffff0000080ee63c>] process_one_work+0x19c/0x3f0
Sep 01 06:08:36 alarm kernel: [<ffff0000080ee8dc>] worker_thread+0x4c/0x428
Sep 01 06:08:36 alarm kernel: [<ffff0000080f5188>] kthread+0x138/0x140
Sep 01 06:08:36 alarm kernel: [<ffff0000080833b0>] ret_from_fork+0x10/0x20
')

The system seems very slow and sluggish when heavily loaded (more so than it used to be).

Re: [linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel pa

Postby aroberts » Tue Sep 05, 2017 3:46 am

The issue is still there in the new kernel, 4.13.0-1.
It boots, and after the MAC becomes ready but before the login prompt you get a kworker crash dump, as in the original post.
Hitting return on a USB keyboard does not give a login prompt, and remote ssh login also fails.

After powering off a few times I eventually get a login prompt at the console and can also log in via ssh.

Re: [linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel pa

Postby aroberts » Tue Sep 05, 2017 4:09 am

I noticed from the log files that systemd was trying to start the graphical target, so I ran:

$this->bbcode_second_pass_code('', '
systemctl set-default -f multi-user.target
')

I tried a few reboots, but the issues persist, although I got a different crash dump:
$this->bbcode_second_pass_code('', '
Sep 05 04:52:14 alarm systemd[361]: Reached target Default.
Sep 05 04:52:14 alarm systemd[361]: Startup finished in 153ms.
Sep 05 04:52:14 alarm systemd[1]: Started User Manager for UID 1000.
Sep 05 04:52:17 alarm su[370]: (to root) alarm on pts/0
Sep 05 04:52:17 alarm su[370]: pam_unix(su:session): session opened for user root by alarm(uid=1000)
Sep 05 04:52:19 alarm kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:66:crtc-2] flip_done timed out
Sep 05 04:52:19 alarm kernel: [CRTC:66:crtc-2] vblank wait timed out
Sep 05 04:52:19 alarm kernel: ------------[ cut here ]------------
Sep 05 04:52:19 alarm kernel: WARNING: CPU: 1 PID: 326 at drivers/gpu/drm/drm_atomic_helper.c:1236 drm_atomic_helper_wait_for_vblanks.part.7+0x250/0x270 [drm_kms_helper]
Sep 05 04:52:19 alarm kernel: Modules linked in: brcmfmac brcmutil cfg80211 vc4 rfkill smsc95xx drm_kms_helper usbnet mii joydev drm syscopyarea sysfillrect sysimgblt fb_sys_fops pwm_bcm2835 i2c_bcm2835 bcm2835_wdt
Sep 05 04:52:19 alarm kernel: CPU: 1 PID: 326 Comm: (agetty) Tainted: G W 4.13.0-1-ARCH #1
Sep 05 04:52:19 alarm kernel: Hardware name: Raspberry Pi 3 Model B (DT)
Sep 05 04:52:19 alarm kernel: task: ffff800034631c00 task.stack: ffff800033e60000
Sep 05 04:52:19 alarm kernel: PC is at drm_atomic_helper_wait_for_vblanks.part.7+0x250/0x270 [drm_kms_helper]
Sep 05 04:52:19 alarm kernel: LR is at drm_atomic_helper_wait_for_vblanks.part.7+0x250/0x270 [drm_kms_helper]
Sep 05 04:52:19 alarm kernel: pc : [<ffff000000d5d238>] lr : [<ffff000000d5d238>] pstate: 20000145
Sep 05 04:52:19 alarm kernel: sp : ffff800033e637d0
Sep 05 04:52:19 alarm kernel: x29: ffff800033e637d0 x28: 000000000000000b
Sep 05 04:52:19 alarm kernel: x27: 0000000000000070 x26: 0000000000000000
Sep 05 04:52:19 alarm kernel: x25: 0000000000000001 x24: ffff800033de9000
Sep 05 04:52:19 alarm kernel: x23: 0000000000000004 x22: 0000000000000038
Sep 05 04:52:19 alarm kernel: x21: ffff8000314c3480 x20: ffff8000362dc028
Sep 05 04:52:19 alarm kernel: x19: 0000000000000002 x18: 0000000000000001
Sep 05 04:52:19 alarm kernel: x17: 0000ffffad66f588 x16: ffff0000080d9f70
Sep 05 04:52:19 alarm kernel: x15: ffffffffffffffff x14: ffff000009541f50
Sep 05 04:52:19 alarm kernel: x13: ffff000009541b78 x12: ffff000009331000
Sep 05 04:52:19 alarm kernel: x11: 0000000000000000 x10: ffff000009541000
Sep 05 04:52:19 alarm kernel: x9 : 0000000000000000 x8 : ffff00000954ac00
Sep 05 04:52:19 alarm kernel: x7 : 0000000000000000 x6 : 0000000008742d76
Sep 05 04:52:19 alarm kernel: x5 : 00ffffffffffffff x4 : 0000000000000000
Sep 05 04:52:19 alarm kernel: x3 : 0000000000000000 x2 : ffffffffffffffff
Sep 05 04:52:19 alarm kernel: x1 : 0000800032ca3000 x0 : 0000000000000026
Sep 05 04:52:19 alarm kernel: Call trace:
Sep 05 04:52:19 alarm kernel: Exception stack(0xffff800033e63600 to 0xffff800033e63730)
Sep 05 04:52:19 alarm kernel: 3600: 0000000000000002 0001000000000000 ffff800033e637d0 ffff000000d5d238
Sep 05 04:52:19 alarm kernel: 3620: ffff800033e637d0 ffff800033e637d0 ffff800033e63790 00000000ffffffc8
Sep 05 04:52:19 alarm kernel: 3640: ffff800033e63680 ffff00000813542c ffff800033e63750 ffff8000362dc028
Sep 05 04:52:19 alarm kernel: 3660: ffff800033e637d0 ffff800033e637d0 ffff800033e63790 00000000ffffffc8
Sep 05 04:52:19 alarm kernel: 3680: ffff800033e63730 ffff000008134ecc ffff000000d67558 ffff8000361ce200
Sep 05 04:52:19 alarm kernel: 36a0: 0000000000000026 0000800032ca3000 ffffffffffffffff 0000000000000000
Sep 05 04:52:19 alarm kernel: 36c0: 0000000000000000 00ffffffffffffff 0000000008742d76 0000000000000000
Sep 05 04:52:19 alarm kernel: 36e0: ffff00000954ac00 0000000000000000 ffff000009541000 0000000000000000
Sep 05 04:52:19 alarm kernel: 3700: ffff000009331000 ffff000009541b78 ffff000009541f50 ffffffffffffffff
Sep 05 04:52:19 alarm kernel: 3720: ffff0000080d9f70 0000ffffad66f588
Sep 05 04:52:19 alarm kernel: [<ffff000000d5d238>] drm_atomic_helper_wait_for_vblanks.part.7+0x250/0x270 [drm_kms_helper]
Sep 05 04:52:19 alarm kernel: [<ffff000000d5d288>] drm_atomic_helper_wait_for_vblanks+0x30/0x40 [drm_kms_helper]
Sep 05 04:52:19 alarm kernel: [<ffff000000dc9920>] vc4_atomic_complete_commit+0x88/0xe0 [vc4]
Sep 05 04:52:19 alarm kernel: [<ffff000000dc9ad4>] vc4_atomic_commit+0x15c/0x1d8 [vc4]
Sep 05 04:52:19 alarm kernel: [<ffff000000cb59ac>] drm_atomic_commit+0x54/0x70 [drm]
Sep 05 04:52:19 alarm kernel: [<ffff000000d615e4>] drm_fb_helper_pan_display+0x194/0x2b8 [drm_kms_helper]
Sep 05 04:52:19 alarm kernel: [<ffff00000863d744>] fb_pan_display+0x9c/0x120
Sep 05 04:52:19 alarm kernel: [<ffff0000086380d8>] bit_update_start+0x28/0x50
Sep 05 04:52:19 alarm kernel: [<ffff000008635298>] fbcon_switch+0x338/0x548
Sep 05 04:52:19 alarm kernel: [<ffff0000087175f0>] redraw_screen+0x148/0x248
Sep 05 04:52:19 alarm kernel: [<ffff00000871784c>] csi_J+0x15c/0x160
Sep 05 04:52:19 alarm kernel: [<ffff00000871b368>] do_con_trol+0x1258/0x1480
Sep 05 04:52:19 alarm kernel: [<ffff00000871b734>] do_con_write.part.15+0x1a4/0x948
Sep 05 04:52:19 alarm kernel: [<ffff00000871bfe0>] con_write+0x90/0x98
Sep 05 04:52:19 alarm kernel: [<ffff000008703574>] n_tty_write+0x184/0x410
Sep 05 04:52:19 alarm kernel: [<ffff0000086fedc8>] tty_write+0x128/0x2d8
Sep 05 04:52:19 alarm kernel: [<ffff00000826e478>] __vfs_write+0x48/0x130
Sep 05 04:52:19 alarm kernel: [<ffff00000826f99c>] vfs_write+0xa4/0x1b0
Sep 05 04:52:19 alarm kernel: [<ffff000008270b14>] SyS_write+0x54/0xb0
Sep 05 04:52:19 alarm kernel: [<ffff0000080833f0>] el0_svc_naked+0x24/0x28
Sep 05 04:52:19 alarm kernel: ---[ end trace d966b32a5fb42892 ]---
Sep 05 04:52:19 alarm kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:66:crtc-2] flip_done timed out
')

After another reboot I got the original tty crash again, then lost the display and got this crash. Login only fails with the tty crash.
$this->bbcode_second_pass_code('', '
[<ffff00000871f114>] uart_write_room+0x1c/0x188
[<ffff0000087058fc>] tty_write_room+0x2c/0x40
[<ffff000008702394>] __process_echos+0x34/0x278
[<ffff000008704924>] n_tty_receive_buf_common+0x394/0xa20
...
')

Again, this crash is manually transcribed, as I can't log in and the logs are trashed after pulling the plug.

Re: [linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel pa

Postby aroberts » Sat Sep 09, 2017 7:33 am

I've done a fresh install, and also updated to the latest kernel.
I'm still seeing issues.

The problems seem worse under heavy load.

I've had issues with an odroid xu4, which turned out to be caused by the CPU governor being set to performance by default. Switching to ondemand solved that issue. The RPI3 under aarch64 doesn't seem to be populating the CPU governor or temperature files in /sys:
$this->bbcode_second_pass_code('', '
/sys/class/thermal/thermal_zone0
and
/sys/devices/system/cpu/cpufreq/policy0
')

So I can't see if thermal/power issues are contributing to this.
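For reference, a small sketch of the check described above. It assumes the usual attribute names `temp` and `scaling_governor` under those two directories; the function names and the optional root argument are hypothetical and exist only for illustration and testing:

```shell
# Print the value of one sysfs attribute, or a note if it is missing.
# $1: full path to the attribute file
show_sysfs_value() {
    if [ -r "$1" ]; then
        printf '%s: %s\n' "$1" "$(cat "$1")"
    else
        printf '%s: not present\n' "$1"
    fi
}

# Report the temperature and governor attributes.
# $1: sysfs root, /sys by default (only overridden for testing)
check_thermal_and_governor() {
    root=${1:-/sys}
    show_sysfs_value "$root/class/thermal/thermal_zone0/temp"
    show_sysfs_value "$root/devices/system/cpu/cpufreq/policy0/scaling_governor"
}
```

Running `check_thermal_and_governor` with no argument reads the real /sys. On kernels where the files are populated, `temp` is reported in millidegrees Celsius.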

Re: [linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel pa

Postby aroberts » Wed Sep 13, 2017 10:54 am

With the latest 4.13.1-1-ARCH kernel, things now look OK. It seems stable even while building gcc.

There remain a couple of things which look odd in the journal during boot, though.

$this->bbcode_second_pass_code('', '
Aug 16 13:19:32 alarm kernel: Booting Linux on physical CPU 0x0
Aug 16 13:19:32 alarm kernel: Linux version 4.13.1-1-ARCH (builduser@leming) (gcc version 7.2.0 (GCC)) #1 SMP Sun Sep 10 14:34:41 MDT 2017
Aug 16 13:19:32 alarm kernel: Boot CPU: AArch64 Processor [410fd034]
Aug 16 13:19:32 alarm kernel: Machine model: Raspberry Pi 3 Model B
...
Aug 16 13:19:32 alarm kernel: NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Invalid trigger for IRQ2, assuming level low
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Please fix your firmware
Aug 16 13:19:32 alarm kernel: arch_timer: cp15 timer(s) running at 19.20MHz (phys).
...
Aug 16 13:19:32 alarm kernel: smp: Bringing up secondary CPUs ...
Aug 16 13:19:32 alarm kernel: Detected VIPT I-cache on CPU1
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Invalid trigger for IRQ2, assuming level low
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Please fix your firmware
Aug 16 13:19:32 alarm kernel: CPU1: Booted secondary processor [410fd034]
Aug 16 13:19:32 alarm kernel: Detected VIPT I-cache on CPU2
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Invalid trigger for IRQ2, assuming level low
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Please fix your firmware
Aug 16 13:19:32 alarm kernel: CPU2: Booted secondary processor [410fd034]
Aug 16 13:19:32 alarm kernel: Detected VIPT I-cache on CPU3
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Invalid trigger for IRQ2, assuming level low
Aug 16 13:19:32 alarm kernel: arch_timer: WARNING: Please fix your firmware
Aug 16 13:19:32 alarm kernel: CPU3: Booted secondary processor [410fd034]
Aug 16 13:19:32 alarm kernel: smp: Brought up 1 node, 4 CPUs
')

Then later on in boot:
$this->bbcode_second_pass_code('', '
Aug 16 13:19:32 alarm kernel: EDAC MC: Ver: 3.0.0
Aug 16 13:19:32 alarm kernel: dmi: Firmware registration failed.
Aug 16 13:19:32 alarm kernel: Advanced Linux Sound Architecture Driver Initialized.
')

Then later on some module fails to load. Is it built for arm rather than aarch64?
$this->bbcode_second_pass_code('', '
Aug 16 13:19:32 alarm kernel: random: fast init done
Aug 16 13:19:32 alarm kernel: modprobe[65]: undefined instruction: pc=0000ffff8caf46dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[66]: undefined instruction: pc=0000ffff921b36dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[70]: undefined instruction: pc=0000ffff9c42b6dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[71]: undefined instruction: pc=0000ffff9e6006dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[75]: undefined instruction: pc=0000ffffa185e6dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[76]: undefined instruction: pc=0000ffff9bd3a6dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[80]: undefined instruction: pc=0000ffffa08e76dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[81]: undefined instruction: pc=0000ffff986da6dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[98]: undefined instruction: pc=0000ffff879636dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: modprobe[99]: undefined instruction: pc=0000ffffb40496dc
Aug 16 13:19:32 alarm kernel: Code: d503201f 8a180320 92750001 365ffc20 (d5380001)
Aug 16 13:19:32 alarm kernel: workingset: timestamp_bits=46 max_order=18 bucket_order=0
Aug 16 13:19:32 alarm kernel: zbud: loaded
')
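One way to look at the question above is to decode the word in parentheses on the `Code:` lines, which is the instruction at the faulting pc. A minimal sketch, assuming the A64 system-register read (MRS) layout from the Arm architecture manual; `decode_mrs` is a made-up helper name and only handles that one instruction form:

```shell
# Hypothetical helper: decode an A64 MRS instruction word.
# Assumed layout: bits 31..20 = 0xd53, bit 19 = low bit of op0,
# op1 = bits 18..16, CRn = 15..12, CRm = 11..8, op2 = 7..5, Rt = 4..0.
# $1: 32-bit instruction word in hex, without the 0x prefix
decode_mrs() {
    insn=$((0x$1))
    if [ $((insn >> 20)) -ne $((0xd53)) ]; then
        echo "not an MRS instruction: $1" >&2
        return 1
    fi
    op0=$(( (insn >> 19) & 3 ))
    op1=$(( (insn >> 16) & 7 ))
    crn=$(( (insn >> 12) & 15 ))
    crm=$(( (insn >> 8) & 15 ))
    op2=$(( (insn >> 5) & 7 ))
    rt=$(( insn & 31 ))          # register 31 would be xzr; printed as x31 here
    printf 'mrs x%d, S%d_%d_C%d_C%d_%d\n' "$rt" "$op0" "$op1" "$crn" "$crm" "$op2"
}
```

`decode_mrs d5380001` prints `mrs x1, S3_0_C0_C0_0`, which as far as I can tell is the encoding of MIDR_EL1, a CPU ID register. That would suggest valid AArch64 code in modprobe trying to read a CPU ID register from userspace rather than a 32-bit arm binary, but this is an interpretation of the dump, not something the log itself confirms.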

Re: [linux-aarch64-rpi3] kernel 4.12.10-1 multiple kernel pa

Postby aroberts » Thu Sep 14, 2017 3:01 am

OK, I jumped the gun here; I'm still seeing problems.

After the last kernel update I built gcc twice, and got no errors from the kernel.
BUT I built it using make -j1, just using a single core.

I rebuilt again last night with -j2, and the problems came straight back. I doubt this is a thermal issue, with only 2 cores involved.
The errors I was getting are:

Initially about 2 million of these, between 20:35 and 20:45:
$this->bbcode_second_pass_code('', '
Sep 13 20:35:53 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 13 20:36:23 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
...
Sep 13 20:43:23 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 13 20:44:44 alarm kernel: net_ratelimit: 1154844 callbacks suppressed
Sep 13 20:44:55 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
...
Sep 13 20:45:00 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
Sep 13 20:45:00 alarm kernel: smsc95xx 1-1.1:1.0 eth0: kevent 2 may have been dropped
')
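Because the kernel rate-limits this message, most occurrences only show up as `net_ratelimit: N callbacks suppressed` lines, so the total has to be added up from those. A rough sketch, assuming the journal text is fed in on stdin; `sum_suppressed` is a hypothetical helper name:

```shell
# Hypothetical helper: add up the counts from all
# "net_ratelimit: N callbacks suppressed" lines read on stdin.
sum_suppressed() {
    awk '/net_ratelimit: [0-9]+ callbacks suppressed/ {
        # the count is the field right after the "net_ratelimit:" tag
        for (i = 1; i < NF; i++)
            if ($i == "net_ratelimit:") total += $(i + 1)
    }
    END { print total + 0 }'
}
```

The result only counts suppressed messages; the lines that did get through would need to be counted separately and added on.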

Then at 20:50:
$this->bbcode_second_pass_code('', '
Sep 13 20:50:11 alarm kernel: Not tainted 4.13.1-1-ARCH #1
Sep 13 20:50:11 alarm kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 13 20:50:11 alarm kernel: kworker/1:0 D 0 9088 2 0x00000000
Sep 13 20:50:11 alarm kernel: Workqueue: events_freezable mmc_rescan
Sep 13 20:50:11 alarm kernel: Call trace:
Sep 13 20:50:11 alarm kernel: [<ffff0000080856e4>] __switch_to+0x6c/0x78
Sep 13 20:50:11 alarm kernel: [<ffff000008c4b6d0>] __schedule+0x210/0x860
Sep 13 20:50:11 alarm kernel: [<ffff000008c4bd4c>] schedule+0x2c/0x88
Sep 13 20:50:11 alarm kernel: [<ffff0000089b632c>] __mmc_claim_host+0x84/0x1c0
Sep 13 20:50:11 alarm kernel: [<ffff0000089b6498>] mmc_get_card+0x30/0x40
Sep 13 20:50:11 alarm kernel: [<ffff0000089c02d8>] mmc_sd_detect+0x20/0x78
Sep 13 20:50:11 alarm kernel: [<ffff0000089b9728>] mmc_rescan+0xc0/0x3b0
Sep 13 20:50:11 alarm kernel: [<ffff0000080ef39c>] process_one_work+0x19c/0x3f0
Sep 13 20:50:11 alarm kernel: [<ffff0000080ef63c>] worker_thread+0x4c/0x428
Sep 13 20:50:11 alarm kernel: [<ffff0000080f5f10>] kthread+0x138/0x140
Sep 13 20:50:11 alarm kernel: [<ffff0000080833b0>] ret_from_fork+0x10/0x20
Sep 13 20:50:11 alarm kernel: INFO: task kworker/1:0:9088 blocked for more than 120 seconds.
Sep 13 20:50:11 alarm kernel: Not tainted 4.13.1-1-ARCH #1
Sep 13 20:50:11 alarm kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 13 20:50:11 alarm kernel: kworker/1:0 D 0 9088 2 0x00000000
Sep 13 20:50:11 alarm kernel: Workqueue: events_freezable mmc_rescan
Sep 13 20:50:11 alarm kernel: Call trace:
Sep 13 20:50:11 alarm kernel: [<ffff0000080856e4>] __switch_to+0x6c/0x78
Sep 13 20:50:11 alarm kernel: [<ffff000008c4b6d0>] __schedule+0x210/0x860
Sep 13 20:50:11 alarm kernel: [<ffff000008c4bd4c>] schedule+0x2c/0x88
Sep 13 20:50:11 alarm kernel: [<ffff0000089b632c>] __mmc_claim_host+0x84/0x1c0
Sep 13 20:50:11 alarm kernel: [<ffff0000089b6498>] mmc_get_card+0x30/0x40
Sep 13 20:50:11 alarm kernel: [<ffff0000089c02d8>] mmc_sd_detect+0x20/0x78
Sep 13 20:50:11 alarm kernel: [<ffff0000089b9728>] mmc_rescan+0xc0/0x3b0
Sep 13 20:50:11 alarm kernel: [<ffff0000080ef39c>] process_one_work+0x19c/0x3f0
Sep 13 20:50:11 alarm kernel: [<ffff0000080ef63c>] worker_thread+0x4c/0x428
Sep 13 20:50:11 alarm kernel: [<ffff0000080f5f10>] kthread+0x138/0x140
Sep 13 20:50:11 alarm kernel: [<ffff0000080833b0>] ret_from_fork+0x10/0x20
Sep 13 20:50:11 alarm systemd[1]: systemd-journald.service: Main process exited, code=dumped, status=6/ABRT
Sep 13 20:50:11 alarm systemd[1]: systemd-journald.service: Unit entered failed state.
Sep 13 20:50:11 alarm systemd[1]: systemd-journald.service: Failed with result 'watchdog'.
Sep 13 20:50:11 alarm systemd[1]: systemd-journald.service: Service has no hold-off time, scheduling restart.
Sep 13 20:50:11 alarm systemd[1]: Stopped Flush Journal to Persistent Storage.
Sep 13 20:50:11 alarm systemd[1]: Stopping Flush Journal to Persistent Storage...
Sep 13 20:50:11 alarm systemd-coredump[9096]: MESSAGE=Process 242 (systemd-journal) of user 0 dumped core.
Sep 13 20:50:11 alarm systemd-coredump[9096]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.5e29610cc71341b9995f4d652279b992.242.1505332199000000.lz4
Sep 13 20:50:11 alarm systemd-coredump[9096]: Stack trace of thread 242:
Sep 13 20:50:11 alarm systemd-coredump[9096]: #0 0x0000ffffbb3a401c pthread_join (libpthread.so.0)
Sep 13 20:50:11 alarm systemd-coredump[9096]: #1 0x0000ffffbb4b2ec4 n/a (libsystemd-shared-234.so)
Sep 13 20:50:11 alarm systemd-coredump[9096]: #2 0x0000ffffbb4b2ec4 n/a (libsystemd-shared-234.so)
Sep 13 20:50:11 alarm systemd-coredump[9096]: #3 0x0000ffffbb4b2fa8 n/a (libsystemd-shared-234.so)
Sep 13 20:50:11 alarm systemd-coredump[9096]: #4 0x0000ffffbb4b3e60 journal_file_append_object (libsystemd-shared-234.so)
Sep 13 20:50:11 alarm systemd-coredump[9096]: #5 0x0000ffffbb4b4d58 n/a (libsystemd-shared-234.so)
Sep 13 20:50:11 alarm systemd-coredump[9096]: #6 0x0000ffffbb4b70fc journal_file_append_entry (libsystemd-shared-234.so)
Sep 13 20:50:11 alarm kernel: systemd-coredum: 9 output lines suppressed due to ratelimiting
Sep 13 20:50:11 alarm systemd[1]: Stopped Journal Service.
Sep 13 20:50:11 alarm systemd[1]: Starting Journal Service...
Sep 13 20:50:11 alarm systemd-journald[9100]: Journal started
Sep 13 20:50:11 alarm systemd-journald[9100]: System journal (/var/log/journal/46b12ab949134c208562ecd0d1b27bbc) is 112.0M, max 2.3G, 2.2G free.
Sep 13 20:49:27 alarm systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
Sep 13 20:49:27 alarm systemd[1]: systemd-journald.service: Killing process 242 (systemd-journal) with signal SIGABRT.
Sep 13 20:50:11 alarm systemd[1]: Started Journal Service.
Sep 13 20:50:11 alarm systemd[1]: Starting Flush Journal to Persistent Storage...
Sep 13 20:50:11 alarm systemd[1]: Started Flush Journal to Persistent Storage.
')

When this first started I was building gcc with make -j4, but I backed off due to excessive memory use and swapping.
So the problem seems to happen only when using multiple cores.

I'll rebuild using -j4, and see what issues that creates.