Kernel panic trying to forward ethernet connection

Postby pepedog » Mon Oct 29, 2012 2:38 pm

OK, I can confirm: with that kernel, a carl9170 stick in bridge mode on a Dockstar crashes within one minute when an iPhone browses Google through it.
pepedog
Developer
 
Posts: 2431
Joined: Mon Jun 07, 2010 3:30 pm
Location: London UK

Postby Kurlon » Mon Oct 29, 2012 2:47 pm

Awesome. Now to nail down WHY it crashes.

In the meantime, I just posted an update to linux-kirkwood bumping it to 3.6.4. Unfortunately I didn't see anything in the 3.6.4 changelog that seems related to the crash you guys are observing.
Kurlon
 
Posts: 132
Joined: Fri Jan 06, 2012 10:05 pm

Postby Kurlon » Mon Oct 29, 2012 2:51 pm

If it makes you feel better, you're not alone in spotting this:
https://bugzilla.redhat.com/show_bug.cg ... &id=870515

Postby pepedog » Mon Oct 29, 2012 2:53 pm

I helped my son through the glibc thing, updated his U-Boot, systemd, etc. He's got a USB NIC, and now this explains why he turned his back on linux-kirkwood (said it was unstable).

Postby Kurlon » Mon Oct 29, 2012 2:56 pm

Heh, I wish more people would tell me when they find issues with it. :D

It's been rock solid on my GFN with a USB Wi-Fi adapter, USB VGA and the onboard Ethernet dedicated to distcc work, but then I haven't played with IPv6 or bridging.

Postby ebbix » Mon Oct 29, 2012 2:59 pm

Just posting using my iPhone over the GoFlex Wi-Fi bridge.
Self-compiled kernel 3.4.15, no crash yet. I will of course test for longer.
But it seems the bug was introduced in 3.5.x, since 3.5.4 is broken.
Maybe it would help to do some further testing in the 3.5 branch, but that would mean compiling 3.5.0 through 3.5.3 first :(

EDIT: from reading the Red Hat Bugzilla entry, it seems this panic only occurs when IPv6 is involved. I think I'll just upgrade to a more recent kernel, turn IPv6 off and try again.
EDIT²: Currently trying 3.5.4 with IPv6 disabled; it seems to work fine for now.
Code:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Since I don't see any reason why I should need IPv6 on my local network, I think I'll just disable it and go to the most recent kernel (provided it works as expected...)
If I ever decide that I need IPv6, there's still the option of downgrading to the 3.4 or 3.2 long-term support kernels...
However, I am willing to try other options.
EDIT³: Nope, 3.5.4 without IPv6 doesn't work :lol:
Code:
[ 2551.894715] ------------[ cut here ]------------                 
[ 2551.899364] kernel BUG at mm/slub.c:3474!                                   
[ 2551.903391] Internal error: Oops - BUG: 0 [#1] PREEMPT ARM                   
[ 2551.908901] Modules linked in: ath9k_htc ath9k_common ath9k_hw ath mac80211 4
[ 2551.924704] CPU: 0    Not tainted  (3.5.4-0-ARCH #1)                         
[ 2551.929696] PC is at kfree+0xb8/0x18c                                       
[ 2551.933372] LR is at __kfree_skb+0x14/0xc0                                   
[ 2551.937488] pc : [<c00c0dc8>]    lr : [<c039e36c>]    psr: 40000013         
[ 2551.937488] sp : c5363cb8  ip : 00000020  fp : c5362000                     
[ 2551.949012] r10: c6bcb8cc  r9 : 00000004  r8 : c6bcb9b8                     
[ 2551.954254] r7 : 00000001  r6 : c039e36c  r5 : c7bf03c0  r4 : c7a38000       
[ 2551.960803] r3 : 00000000  r2 : 00007a38  r1 : c07ef700  r0 : c7a38000       
[ 2551.967362] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kerl
[ 2551.974703] Control: 0005397f  Table: 06a4c000  DAC: 00000017               
[ 2551.980467] Process nfsd (pid: 286, stack limit = 0xc5362270)               
[ 2551.986240] Stack: (0xc5363cb8 to 0xc5364000)                               
[ 2551.990618] 3ca0:                                                       c7bf0
[ 2551.998836] 3cc0: 00000000 00000001 c6bcb9b8 c039e36c c6bcb5a0 c03e66cc 00004
[ 2552.007051] 3ce0: 00000004 00000000 00000000 c6bcb5a0 c7bf03c0 00000020 c7a3c
[ 2552.015265] 3d00: 00000004 c6bcb8cc c5362000 c03e97cc c6bcb5a0 00bf03c0 c7a30
[ 2552.023480] 3d20: c6bcb5a0 c5362000 c062bb8c 00000000 c6bcb8cc c03eff34 c5364
[ 2552.031693] 3d40: 68900afa 12c19689 00000000 00000002 00000000 c6bcb5a0 c7bf0
[ 2552.039908] 3d60: c062bb8c c0399538 c6bcb5a0 00000004 c7b76900 00000000 00004
[ 2552.048121] 3d80: c061ac40 00000001 00000001 00000000 00000000 00000000 c5360
[ 2552.056335] 3da0: 00000001 00000000 9b64c2b0 00000000 00000001 c5363efc c0620
[ 2552.064551] 3dc0: c5363efc 00000040 c51caa00 c40aa6e0 c5362000 c03fe218 00000
[ 2552.072765] 3de0: c5363dec 00000003 00000000 00000000 00000000 c5363e08 00004
[ 2552.080979] 3e00: 00000040 0002f792 00000000 b53e7887 00000040 00000004 c40a0
[ 2552.089193] 3e20: 00000000 c5363efc 00000000 00000000 00000000 c5319530 00000
[ 2552.097407] 3e40: c5319530 c0042fa0 c0042f18 00000000 00000000 00000001 ffff0
[ 2552.105621] 3e60: 00000000 00000000 00000000 00000000 c5319500 c05ea01c 00000
[ 2552.113837] 3e80: c5319698 00000000 c5363e08 c5319698 c5363f24 c0438b7c c7818
[ 2552.122050] 3ea0: c5362000 c07cc2b4 c07fd3d4 c04378e8 00000000 c07fd3d8 00000
[ 2552.130264] 3ec0: f6ffffe0 c781b000 c40aa6e0 00000000 00000001 c5363f48 c68c4
[ 2552.138479] 3ee0: c781be14 c0396aac c781b000 bf06dafc 00000004 00000040 c5360
[ 2552.146692] 3f00: 00000000 c5363f48 00000001 00000000 00000000 00000040 c68c0
[ 2552.154907] 3f20: c68c0000 00000004 00000000 bf06e63c 00200200 0008e718 c06a0
[ 2552.163122] 3f40: c5319500 ffffffff c68c01e0 00000000 00000000 c781b000 c51c0
[ 2552.171335] 3f60: 00000000 00000000 00000000 c781be14 c5362000 bf079e1c c07c0
[ 2552.179549] 3f80: 00200200 00000000 c5319500 c003fcd4 00100100 00200200 00008
[ 2552.187764] 3fa0: c781b000 00000000 00000013 00000000 00000000 00000000 0000c
[ 2552.195978] 3fc0: c5193e8c c781b000 bf15e000 c00347a8 00000000 00000000 c7810
[ 2552.204193] 3fe0: c5363fe0 c5363fe0 c5193e8c c0034728 c0009588 c0009588 838e8
[ 2552.212413] [<c00c0dc8>] (kfree+0xb8/0x18c) from [<c039e36c>] (__kfree_skb+0)
[ 2552.220284] [<c039e36c>] (__kfree_skb+0x14/0xc0) from [<c03e66cc>] (tcp_data)
[ 2552.229029] [<c03e66cc>] (tcp_data_queue+0x4b0/0xb1c) from [<c03e97cc>] (tcp)
[ 2552.238642] [<c03e97cc>] (tcp_rcv_established+0x4fc/0x5c0) from [<c03eff34>])
[ 2552.248089] [<c03eff34>] (tcp_v4_do_rcv+0x28/0x1e0) from [<c0399538>] (relea)
[ 2552.256836] [<c0399538>] (release_sock+0xa4/0x11c) from [<c03e1b04>] (tcp_re)
[ 2552.265489] [<c03e1b04>] (tcp_recvmsg+0x6c0/0x834) from [<c03fe218>] (inet_r)
[ 2552.274060] [<c03fe218>] (inet_recvmsg+0x40/0x54) from [<c0396a44>] (sock_re)
[ 2552.282544] [<c0396a44>] (sock_recvmsg+0xb8/0xe0) from [<c0396aac>] (kernel_)
[ 2552.291312] [<c0396aac>] (kernel_recvmsg+0x40/0x74) from [<bf06dafc>] (svc_r)
[ 2552.300886] [<bf06dafc>] (svc_recvfrom+0x58/0x60 [sunrpc]) from [<bf06e63c>])
[ 2552.311527] [<bf06e63c>] (svc_tcp_recvfrom+0x60/0x4cc [sunrpc]) from [<bf079)
[ 2552.322024] [<bf079e1c>] (svc_recv+0x66c/0x764 [sunrpc]) from [<bf15e09c>] ()
[ 2552.331177] [<bf15e09c>] (nfsd+0x9c/0x16c [nfsd]) from [<c00347a8>] (kthread)
[ 2552.339227] [<c00347a8>] (kthread+0x80/0x90) from [<c0009588>] (kernel_threa)
[ 2552.347620] Code: 1a000006 e5913000 e3130903 1a000000 (e7f001f2)             
[ 2552.353929] ---[ end trace b9d948a4d11cfeed ]---                             
[ 2552.358568] Kernel panic - not syncing: Fatal exception in interrupt


EDIT^4: Kernel 3.5 introduced TCP early retransmit. New hypothesis: maybe this causes the errors? Currently on kernel 3.6.4-1 with
Code:
sysctl net.ipv4.tcp_early_retrans=0
15 minutes without error so far :) I hope this is the cause of the crash...
ebbix
 
Posts: 48
Joined: Fri Aug 10, 2012 1:55 pm

Postby ebbix » Wed Oct 31, 2012 7:24 am

Big update (don't want to do an edit^5 :twisted: )
Code:
sysctl net.ipv4.tcp_early_retrans=0
really seems to do the trick. The access point has been running stable for about 18 hours so far. Looks like TCP early retransmit really causes that panic.
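
A value set with sysctl on the command line is lost at reboot. A minimal sketch for making it persistent, assuming the system reads drop-in files from /etc/sysctl.d (as systemd-based installs do); the helper name and the file name 99-tcp-early-retrans.conf are made up for illustration:

```shell
# Hypothetical helper: write a sysctl drop-in that disables TCP early retransmit.
# $1 is the drop-in directory; it defaults to /etc/sysctl.d, which needs root.
persist_early_retrans_off() {
    dir="${1:-/etc/sysctl.d}"
    # The file name is an assumption; pick any *.conf name you like.
    printf 'net.ipv4.tcp_early_retrans = 0\n' > "$dir/99-tcp-early-retrans.conf"
}
```

Run `persist_early_retrans_off` as root, then load the file with `sysctl -p /etc/sysctl.d/99-tcp-early-retrans.conf` or simply reboot.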

Postby pepedog » Wed Oct 31, 2012 3:14 pm

I read that:
"Early retransmit is enabled with the tcp_early_retrans sysctl, found at /proc/sys/net/ipv4/tcp_early_retrans. It accepts three values: "0" (disables early retransmit), "1" (enables it), and "2", the default one, which enables early retransmit but delays fast recovery and fast retransmit by a fourth of the RTT (this mitigates connection falsely recovers when network has a small degree of reordering)."

0 and 1 seem OK to me, though I haven't put in the hours you have. Obviously 2 is not OK.
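
To see which mode a box is currently in, something like the following could be used. It is only a sketch: the helper name is invented, and the mapping covers just the three values described in the quote above, so anything else is reported as unknown.

```shell
# Hypothetical helper: describe a tcp_early_retrans value (0, 1 or 2).
describe_early_retrans() {
    case "$1" in
        0) echo "early retransmit disabled" ;;
        1) echo "early retransmit enabled" ;;
        2) echo "early retransmit enabled, fast recovery/retransmit delayed by RTT/4" ;;
        *) echo "unknown value: $1" ;;
    esac
}
```

Call it with the live setting: `describe_early_retrans "$(cat /proc/sys/net/ipv4/tcp_early_retrans)"`.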

Postby ebbix » Sat Nov 03, 2012 5:30 pm

Yeah, 4 days without a panic on 3.6.4. I think tcp_early_retrans really is the problem. Glad I don't need to stay on an LTS kernel, I like to be bleeding edge :lol: