Clearfog Pro - New Oops: 37 on 5.16.2-2

This forum is for supported devices using an ARMv7 Marvell SoC.

Post by Mettacrawler » Mon Jan 24, 2022 12:16 am

[code]
[ 0.000000] Linux version 5.16.2-2-ARCH (builduser@leming) (armv7l-unknown-linux-gnueabihf-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 SMP PREEMPT Sat Jan 22 03:08:40 UTC 2022
[ 0.000000] CPU: ARMv7 Processor [414fc091] revision 1 (ARMv7), cr=10c5387d
[ 0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[ 0.000000] OF: fdt: Machine model: SolidRun Clearfog Pro A1

...skip ahead...

[ 11.417197] Unable to handle kernel paging request at virtual address fffffff4
[ 11.425720] [fffffff4] *pgd=2fffd861, *pte=00000000, *ppte=00000000
[ 11.432146] Internal error: Oops: 37 [#1] PREEMPT SMP ARM
[/code]

U-Boot's default kernel and DTB load addresses may need to be changed.
Mettacrawler
 
Posts: 56
Joined: Sun Mar 18, 2018 7:19 pm

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by moonman » Mon Jan 24, 2022 7:24 am

Hmm, I can't reproduce on my board.
[code]
[oleg@ClearFog ~]$ uname -a
Linux ClearFog 5.16.2-2-ARCH #1 SMP PREEMPT Sat Jan 22 03:08:40 UTC 2022 armv7l GNU/Linux
[/code]

Have you added anything to /boot/uEnv.txt by chance?
Pogoplug V4 | GoFlex Home | Raspberry Pi 4 4GB | CuBox-i4 Pro | ClearFog | BeagleBone Black | Odroid U2 | Odroid C1 | Odroid XU4
-----------------------------------------------------------------------------------------------------------------------
[armv5] Updated U-Boot | [armv5] NAND Rescue System
moonman
Developer
 
Posts: 3387
Joined: Sat Jan 15, 2011 3:36 am

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by Mettacrawler » Mon Jan 24, 2022 9:51 pm

No, it's the way it came: every line is commented out.
[code]
$ cat /boot/uEnv.txt
## Tell kernel where rootfs is located
# root=/dev/mmcblk0p1

## Optional kernel arguments
# optargs=elevator=cfq

## Name of a different dtb file. Must be in /boot/dtbs/
# fdtfile=armada-388-clearfog-base.dtb
[/code]

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by Mettacrawler » Mon Jan 24, 2022 10:31 pm

The Clearfog Pro continues to run despite the Oops: 37. It's not a fatal error.

lan5 was the only port on the switch with a device plugged in when this happened. Ignore the "degraded" states; they are a systemd issue.

[code]
$ networkctl -a
IDX LINK TYPE OPERATIONAL SETUP
1 lo loopback carrier unmanaged
2 br0 bridge degraded-carrier configured
3 eth0 ether routable configured
4 eth1 ether degraded configured
5 eth2 ether off unmanaged
6 wlan0 wlan enslaved unmanaged
7 wlan1 wlan enslaved unmanaged
8 lan5 dsa enslaved configured
9 lan4 dsa no-carrier configured
10 lan3 dsa no-carrier configured
11 lan2 dsa no-carrier configured
12 lan1 dsa no-carrier configured
13 lan6 dsa off unmanaged

13 links listed.
[/code]
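As a rough illustration of reading that table, here is a minimal Python sketch that picks out the DSA ports with a link. The helper names `parse_networkctl` and `dsa_ports_with_link` are made up for this example, and treating the `off` and `no-carrier` states as "no link" is an assumption, not something networkctl defines.

```python
# Sample copied from the `networkctl -a` output quoted in this thread.
SAMPLE = """\
IDX LINK TYPE OPERATIONAL SETUP
1 lo loopback carrier unmanaged
2 br0 bridge degraded-carrier configured
3 eth0 ether routable configured
4 eth1 ether degraded configured
5 eth2 ether off unmanaged
6 wlan0 wlan enslaved unmanaged
7 wlan1 wlan enslaved unmanaged
8 lan5 dsa enslaved configured
9 lan4 dsa no-carrier configured
10 lan3 dsa no-carrier configured
11 lan2 dsa no-carrier configured
12 lan1 dsa no-carrier configured
13 lan6 dsa off unmanaged

13 links listed.
"""


def parse_networkctl(text):
    """Parse whitespace-separated networkctl rows into dictionaries.

    Hypothetical helper: only rows with exactly five fields and a numeric
    index are kept, so the header and the summary line are skipped.
    """
    links = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0].isdigit():
            idx, name, kind, operational, setup = fields
            links.append({
                "idx": int(idx),
                "link": name,
                "type": kind,
                "operational": operational,
                "setup": setup,
            })
    return links


def dsa_ports_with_link(links):
    """Return names of DSA ports whose state is neither 'off' nor 'no-carrier'.

    Assumption: those two states are taken to mean "nothing plugged in".
    """
    down_states = {"off", "no-carrier"}
    return [
        entry["link"]
        for entry in links
        if entry["type"] == "dsa" and entry["operational"] not in down_states
    ]


if __name__ == "__main__":
    print(dsa_ports_with_link(parse_networkctl(SAMPLE)))
```

On the sample above, only lan5 passes the filter, which matches the description in the post.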

The Oops happens while the DSA switch is being configured.

[code]
Jan 23 18:40:57 example.com systemd-networkd[232]: eth1: Gained carrier
Jan 23 18:40:57 example.com kernel: mv88e6085 f1072004.mdio-mii:04 lan5 (uninitialized): PHY [mv88e6xxx-7:00] driver [Marvell 88E1540] (irq=72)
Jan 23 18:40:57 example.com systemd[1]: Starting Hostapd IEEE 802.11 AP, IEEE 802.1X/WPA/WPA2/EAP/RADIUS Authenticator...
Jan 23 18:40:57 example.com kernel: br0: port 1(lan5) entered blocking state
Jan 23 18:40:57 example.com kernel: br0: port 1(lan5) entered disabled state
Jan 23 18:40:57 example.com kernel: device eth1 entered promiscuous mode
Jan 23 18:40:57 example.com kernel: 8<--- cut here ---
Jan 23 18:40:57 example.com audit: ANOM_PROMISCUOUS dev=eth1 prom=256 old_prom=0 auid=4294967295 uid=0 gid=0 ses=4294967295
Jan 23 18:40:57 example.com kernel: Unable to handle kernel paging request at virtual address fffffff4
Jan 23 18:40:57 example.com kernel: [fffffff4] *pgd=2fffd861, *pte=00000000, *ppte=00000000
Jan 23 18:40:58 example.com kernel: Internal error: Oops: 37 [#1] PREEMPT SMP ARM
Jan 23 18:40:58 example.com kernel: Modules linked in: rt2800usb rt2x00usb rt2800lib rt2x00lib mac80211 libarc4 tag_dsa marvell_cesa mvneta mcp3021 mvneta_bm orion_wdt phy_armada38x_comphy evdev uio_pdrv_genirq uio sfp mdio_i2c cfg80211 rfkill sch_fq_codel mv88e6xxx dsa_core hsr bridge stp llc phylink fuse ip_tables x_tables
Jan 23 18:40:58 example.com kernel: device lan5 entered promiscuous mode
Jan 23 18:40:58 example.com kernel: CPU: 0 PID: 62 Comm: kworker/u4:2 Not tainted 5.16.2-2-ARCH #1
Jan 23 18:40:58 example.com kernel: Hardware name: Marvell Armada 380/385 (Device Tree)
Jan 23 18:40:58 example.com kernel: Workqueue: dsa_ordered dsa_slave_switchdev_event_work [dsa_core]
Jan 23 18:40:58 example.com kernel: mv88e6085 f1072004.mdio-mii:04 lan4 (uninitialized): PHY [mv88e6xxx-7:01] driver [Marvell 88E1540] (irq=73)
Jan 23 18:40:58 example.com kernel: PC is at dsa_port_do_fdb_add+0x50/0x1a0 [dsa_core]
Jan 23 18:40:58 example.com kernel: LR is at dsa_port_do_fdb_add+0x38/0x1a0 [dsa_core]
Jan 23 18:40:58 example.com kernel: pc : [<bf0b10b0>] lr : [<bf0b1098>] psr: a0070013
Jan 23 18:40:58 example.com kernel: sp : c242fe40 ip : c4697264 fp : c25a6900
Jan 23 18:40:58 example.com kernel: r10: c447c060 r9 : 00000005 r8 : c3290f38
Jan 23 18:40:58 example.com kernel: r7 : c41664c0 r6 : 00000001 r5 : c3290e00 r4 : c3290f4c
Jan 23 18:40:58 example.com kernel: r3 : c25a6900 r2 : fffffff4 r1 : 0000fb18 r0 : f3edea4e
Jan 23 18:40:58 example.com kernel: Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Jan 23 18:40:58 example.com kernel: Control: 10c5387d Table: 03af804a DAC: 00000051
Jan 23 18:40:58 example.com kernel: Register r0 information: non-paged memory
Jan 23 18:40:58 example.com kernel: Register r1 information: non-paged memory
Jan 23 18:40:58 example.com kernel: Register r2 information: non-paged memory
Jan 23 18:40:58 example.com kernel: Register r3 information: slab task_struct start c25a6900 pointer offset 0
Jan 23 18:40:58 example.com kernel: Register r4 information: slab kmalloc-512 start c3290e00 pointer offset 332 size 512
Jan 23 18:40:58 example.com kernel: Register r5 information: slab kmalloc-512 start c3290e00 pointer offset 0 size 512
Jan 23 18:40:58 example.com kernel: Register r6 information: non-paged memory
Jan 23 18:40:58 example.com kernel: Register r7 information: slab kmalloc-192 start c4166480 pointer offset 64 size 192
Jan 23 18:40:58 example.com kernel: Register r8 information: slab kmalloc-512 start c3290e00 pointer offset 312 size 512
Jan 23 18:40:58 example.com kernel: mv88e6085 f1072004.mdio-mii:04 lan5: configuring for phy/gmii link mode
[/code]

The stack trace shows the fault was in dsa_port_do_fdb_add():

[code]
Jan 23 18:40:58 example.com kernel: [<bf0b10b0>] (dsa_port_do_fdb_add [dsa_core]) from [<bf0b1ffc>] (dsa_switch_event+0xb5c/0xe28 [dsa_core])
Jan 23 18:40:58 example.com kernel: [<bf0b1ffc>] (dsa_switch_event [dsa_core]) from [<c03678d4>] (raw_notifier_call_chain+0x34/0x68)
Jan 23 18:40:58 example.com kernel: [<c03678d4>] (raw_notifier_call_chain) from [<bf0aa7d8>] (dsa_tree_notify+0xc/0x20 [dsa_core])
Jan 23 18:40:58 example.com kernel: [<bf0aa7d8>] (dsa_tree_notify [dsa_core]) from [<bf0ac930>] (dsa_port_host_fdb_add+0x64/0x88 [dsa_core])
Jan 23 18:40:58 example.com kernel: [<bf0ac930>] (dsa_port_host_fdb_add [dsa_core]) from [<bf0b03a8>] (dsa_slave_switchdev_event_work+0x1e4/0x25c [dsa_core])
Jan 23 18:40:58 example.com kernel: [<bf0b03a8>] (dsa_slave_switchdev_event_work [dsa_core]) from [<c035e44c>] (process_one_work+0x1f4/0x434)
Jan 23 18:40:58 example.com kernel: [<c035e44c>] (process_one_work) from [<c035e6dc>] (worker_thread+0x50/0x53c)
Jan 23 18:40:58 example.com kernel: [<c035e6dc>] (worker_thread) from [<c036627c>] (kthread+0x150/0x184)
Jan 23 18:40:58 example.com kernel: [<c036627c>] (kthread) from [<c03001a8>] (ret_from_fork+0x14/0x2c)
Jan 23 18:40:58 example.com kernel: Exception stack(0xc242ffb0 to 0xc242fff8)
[/code]
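To pull the faulting function and the call chain out of a log like the one above without reading it by eye, a small Python sketch along these lines may help. The regular expressions are written only against the lines quoted in this thread, so other oops formats may not match, and the helper names `faulting_function` and `call_chain` are hypothetical.

```python
import re

# Matches e.g. "PC is at dsa_port_do_fdb_add+0x50/0x1a0 [dsa_core]".
# The trailing "[module]" part is optional (built-in code has none).
PC_RE = re.compile(
    r"PC is at (?P<func>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)

# Matches the first "[<address>] (function" pair on a backtrace line,
# i.e. the callee side of "... (callee) from [<addr>] (caller+0x...)".
FRAME_RE = re.compile(r"\[<[0-9a-f]{8}>\] \((?P<func>[\w.]+)")

# Three lines copied from the logs quoted in this thread.
sample = """\
kernel: PC is at dsa_port_do_fdb_add+0x50/0x1a0 [dsa_core]
kernel: [<bf0b10b0>] (dsa_port_do_fdb_add [dsa_core]) from [<bf0b1ffc>] (dsa_switch_event+0xb5c/0xe28 [dsa_core])
kernel: [<bf0b1ffc>] (dsa_switch_event [dsa_core]) from [<c03678d4>] (raw_notifier_call_chain+0x34/0x68)
"""


def faulting_function(log_text):
    """Return details of the 'PC is at' line, or None if there is none."""
    match = PC_RE.search(log_text)
    if match is None:
        return None
    return {
        "function": match["func"],
        "offset": int(match["off"], 16),   # byte offset into the function
        "size": int(match["size"], 16),    # total size of the function
        "module": match["module"],         # None for built-in code
    }


def call_chain(log_text):
    """Return callee names from backtrace lines, innermost frame first."""
    chain = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            chain.append(match["func"])
    return chain


if __name__ == "__main__":
    print(faulting_function(sample))
    print(call_chain(sample))
```

Feeding it the saved output of `journalctl -k` or `dmesg` should work the same way, provided the lines keep the format shown in this thread.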

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by moonman » Fri Jan 28, 2022 8:25 pm

I still do not see the oops, but I don't configure the DSA switch in any way right now.

5.16.3 has been released with some fixes for DSA, and there is also the linux-armv7-rc kernel that you can try. See if either kernel fixes the issue for you. If neither works, you may want to roll back the kernel and wait for a fix upstream.

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by Mettacrawler » Wed Feb 23, 2022 7:31 pm

Thanks. The DSA oops is still present in Linux 5.16.10-1, but it doesn't seem to have any consequences: DSA and networking still work.

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by Mettacrawler » Fri Mar 04, 2022 9:50 pm

I didn't see an Oops on 5.16.12-1-ARCH. Guess they fixed it upstream.

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by moonman » Sat Mar 05, 2022 4:52 am

Cheers, thanks for the feedback.

Re: Clearfog Pro - New Oops: 37 on 5.16.2-2

Post by tron » Tue Mar 08, 2022 5:59 am

Greetings,

First, thank you for helping with maintenance. Unfortunately, the oops still seems to be there on 5.16.12-1.

[code]
[ 11.206363] 8<--- cut here ---
[ 11.209698] Unable to handle kernel paging request at virtual address fffffff4
[ 11.223470] [fffffff4] *pgd=2fffd861, *pte=00000000, *ppte=00000000
[ 11.229867] Internal error: Oops: 37 [#1] PREEMPT SMP ARM
[ 11.235286] Modules linked in: ath10k_pci ath10k_core ath mac80211 btusb btrtl btbcm btintel libarc4 bluetooth tag_dsa ecdh_generic ecc mvneta marvell_cesa mvneta_bm mcp3021 orion_wdt phy_armada38x_comphy evdev sfp uio_pdrv_genirq uio mdio_i2c cfg80211 rfkill sch_fq_codel mv88e6xxx dsa_core hsr bridge stp llc phylink fuse ip_tables x_tables
[ 11.265582] CPU: 0 PID: 69 Comm: kworker/u4:3 Not tainted 5.16.12-1-ARCH #1
[ 11.272564] Hardware name: Marvell Armada 380/385 (Device Tree)
[ 11.278497] Workqueue: dsa_ordered dsa_slave_switchdev_event_work [dsa_core]
[ 11.285619] PC is at dsa_port_do_fdb_add+0x50/0x1a0 [dsa_core]
[ 11.291496] LR is at dsa_port_do_fdb_add+0x38/0x1a0 [dsa_core]
[ 11.297371] pc : [<bf0b30dc>] lr : [<bf0b30c4>] psr: a0080113
[ 11.303652] sp : c26a5e40 ip : 00000000 fp : c26d0000
[ 11.308888] r10: c5055be0 r9 : 00000005 r8 : c21b7b38
[ 11.314124] r7 : c3e1fa00 r6 : 00000001 r5 : c21b7a00 r4 : c21b7b4c
[ 11.320666] r3 : c26d0000 r2 : 00006ec6 r1 : 8f030582 r0 : fffffff4
[ 11.327208] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 11.334362] Control: 10c5387d Table: 03ec004a DAC: 00000051
[ 11.340119] Register r0 information: non-paged memory
[ 11.345185] Register r1 information: non-paged memory
[ 11.350248] Register r2 information: non-paged memory
[ 11.355311] Register r3 information: slab task_struct start c26d0000 pointer offset 0
[ 11.363166] Register r4 information: slab kmalloc-512 start c21b7a00 pointer offset 332 size 512
[ 11.371980] Register r5 information: slab kmalloc-512 start c21b7a00 pointer offset 0 size 512
[ 11.380618] Register r6 information: non-paged memory
[ 11.385681] Register r7 information: slab kmalloc-192 start c3e1f9c0 pointer offset 64 size 192
[ 11.394407] Register r8 information: slab kmalloc-512 start c21b7a00 pointer offset 312 size 512
[ 11.403220] Register r9 information: non-paged memory
[ 11.408283] Register r10 information: slab kmalloc-64 start c5055bc0 pointer offset 32 size 64
[ 11.416922] Register r11 information: slab task_struct start c26d0000 pointer offset 0
[ 11.424862] Register r12 information: NULL pointer
[ 11.429663] Process kworker/u4:3 (pid: 69, stack limit = 0x15b8dcb3)
[ 11.436034] Stack: (0xc26a5e40 to 0xc26a6000)
[ 11.440402] 5e40: c3db91c0 c3e1fa00 c3e1fa10 c26a5ed4 c21b7a00 c5055be0 00000080 c31ed305
[ 11.448600] 5e60: c26d0000 bf0b3fc8 86e56e2c 00000001 00000000 eedd0880 c26a5f44 c037b028
[ 11.456797] 5e80: 00000001 c20176c0 c1c284c0 c32dae00 c1a11530 c0d751c5 00000000 ffffffff
[ 11.464994] 5ea0: 00000005 c26a5ed4 c5055be0 c036861c c3d96400 c3e1fa00 c5055bc0 c5055bcc
[ 11.473192] 5ec0: c5055be0 bf0ac7f4 c3d96400 bf0ae96c ffffffff 00000000 00000000 c5055be0
[ 11.481389] 5ee0: c32d0001 c0d751c5 c3d96400 bf0b23d0 c26a5f44 c11a43d0 00000000 c055f410
[ 11.489586] 5f00: c26a5f00 c055f4e4 c11a4b40 c0d751c5 c5055bcc c2529600 c2007400 c31ed300
[ 11.497783] 5f20: 00000000 c035f378 c2007400 c2007400 c2007418 c2529600 c2007400 c2529618
[ 11.505981] 5f40: c2007418 c1a04d00 00000088 c26d0000 c2007400 c035f608 c1bd42d0 c1bc9096
[ 11.514178] 5f60: c25d63e0 c26d0000 c25d68c0 c25d63c0 00000000 c035f5b8 c2529600 c2697ec4
[ 11.522375] 5f80: c25d63e0 c0366fc0 00000000 c25d68c0 c0366e70 00000000 00000000 00000000
[ 11.530572] 5fa0: 00000000 00000000 00000000 c03001a8 00000000 00000000 00000000 00000000
[ 11.538769] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 11.546966] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 11.555167] [<bf0b30dc>] (dsa_port_do_fdb_add [dsa_core]) from [<bf0b3fc8>] (dsa_switch_event+0xb14/0xdec [dsa_core])
[ 11.565869] [<bf0b3fc8>] (dsa_switch_event [dsa_core]) from [<c036861c>] (raw_notifier_call_chain+0x34/0x68)
[ 11.575763] [<c036861c>] (raw_notifier_call_chain) from [<bf0ac7f4>] (dsa_tree_notify+0xc/0x20 [dsa_core])
[ 11.585476] [<bf0ac7f4>] (dsa_tree_notify [dsa_core]) from [<bf0ae96c>] (dsa_port_host_fdb_add+0x6c/0x94 [dsa_core])
[ 11.596081] [<bf0ae96c>] (dsa_port_host_fdb_add [dsa_core]) from [<bf0b23d0>] (dsa_slave_switchdev_event_work+0x1e0/0x25c [dsa_core])
[ 11.608164] [<bf0b23d0>] (dsa_slave_switchdev_event_work [dsa_core]) from [<c035f378>] (process_one_work+0x1f4/0x434)
[ 11.618835] [<c035f378>] (process_one_work) from [<c035f608>] (worker_thread+0x50/0x53c)
[ 11.626949] [<c035f608>] (worker_thread) from [<c0366fc0>] (kthread+0x150/0x184)
[ 11.634365] [<c0366fc0>] (kthread) from [<c03001a8>] (ret_from_fork+0x14/0x2c)
[ 11.641609] Exception stack(0xc26a5fb0 to 0xc26a5ff8)
[ 11.646673] 5fa0: 00000000 00000000 00000000 00000000
[ 11.654870] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 11.663066] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 11.669699] Code: e240000c 0a00000e e59a1000 e1da20b4 (e5903000)
[ 11.676043] ---[ end trace 516d77152b6b925f ]---
[/code]

I am unable to check upstream for potential bugs right now, but I will do so later this week. In my case the system is neither mission-critical nor exposed to the internet, but I cannot say for sure that it is stable without more testing.

Cheers.

EDIT ON 2022/04/05: Kernel 5.17.1 from the repo does not exhibit this behavior. I believe the cause was, or was at least related to, [1]. Please feel free to flag this thread as solved if there are no more issues.

[1] https://git.kernel.org/pub/scm/linux/ke ... 3ea44b2b39
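As a side note on the numbers in the dump above, the sketch below does the arithmetic behind the fault address `fffffff4` and decodes the first word of the `Code:` line. The interpretation is only one possible reading and is not confirmed anywhere in this thread: on a 32-bit system the address is 12 bytes below zero, which could arise either from an error value of -12 used as a pointer or from subtracting 12 from a NULL pointer. The helper names `as_signed32` and `decode_arm_sub_imm` are made up for this example.

```python
import errno


def as_signed32(value):
    """Interpret a 32-bit unsigned value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def decode_arm_sub_imm(word):
    """Decode an A32 'SUB Rd, Rn, #imm' instruction word.

    Returns (rd, rn, imm), or None if the word is not a data-processing
    instruction with an immediate operand and the SUB opcode. The S bit is
    ignored, so SUBS decodes the same way.
    """
    if (word >> 25) & 0x7 != 0b001:   # bits 27..25: immediate data-processing
        return None
    if (word >> 21) & 0xF != 0b0010:  # bits 24..21: opcode SUB
        return None
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    shift = 2 * ((word >> 8) & 0xF)   # immediate is imm8 rotated right by 2*rotate
    imm8 = word & 0xFF
    imm = ((imm8 >> shift) | (imm8 << (32 - shift))) & 0xFFFFFFFF
    return rd, rn, imm


fault_addr = 0xFFFFFFF4               # from "virtual address fffffff4"
offset = as_signed32(fault_addr)      # distance from zero, as a signed value

# First word of "Code: e240000c 0a00000e e59a1000 e1da20b4 (e5903000)".
first_insn = decode_arm_sub_imm(0xE240000C)

if __name__ == "__main__":
    print("fault address as signed 32-bit:", offset)
    print("same magnitude as ENOMEM:", -offset == errno.ENOMEM)
    print("first Code word as (rd, rn, imm):", first_insn)
    print("NULL minus 12, wrapped to 32 bits:", hex((0 - 12) & 0xFFFFFFFF))
```

The first `Code:` word subtracts 12 from r0, and in the second dump r0 holds `fffffff4` at the faulting load, so a NULL pointer adjusted by 12 bytes is at least consistent with the trace. The instructions shown are not necessarily executed in a straight line, though, so treat this as a hint for further digging and not as a diagnosis.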
tron
 
Posts: 1
Joined: Tue Mar 08, 2022 5:50 am