[solved] SSH broken pipe problems


Postby fhteagle » Thu Nov 09, 2023 4:42 pm

Running Arch Linux ARM (aarch64) on a Raspberry Pi 3B. I am getting intermittent but recurring "client_loop: send disconnect: Broken pipe" type errors whenever data is sent to this machine from any remote source and the packets arrive via the onboard ethernet NIC.

At its worst, just a few keystrokes can trigger this error. But for hours to days after a reboot, the behavior improves enough that it will receive maybe 20-200MB worth of data before issuing the error. I can replicate the error with 100% failure rate using SCP of a large regular file, btrbk (over ssh), and commands like [user@remotemachine] # cat /dev/random | ssh <badmachine> 'tee /dev/null' .
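
For anyone trying to reproduce this, here is a minimal sketch of a repeatable test. It pushes a fixed amount of random data through a command and compares SHA-256 checksums on both ends. The function name send_and_verify and the host name badmachine are placeholders, and it assumes sha256sum is installed on both machines.

```shell
#!/bin/sh
# Hypothetical helper: send SIZE_MIB of random data on stdin to a command
# that prints a SHA-256 of what it received, then compare with the local sum.
send_and_verify() {
    size_mib=$1; shift
    tmp=$(mktemp) || return 2
    # /dev/urandom is used so the test data is generated without blocking
    head -c "$((size_mib * 1024 * 1024))" /dev/urandom > "$tmp"
    local_sum=$(sha256sum < "$tmp" | cut -d' ' -f1)
    # "$@" is the transport, e.g.: ssh badmachine sha256sum
    remote_sum=$("$@" < "$tmp" | cut -d' ' -f1)
    rm -f "$tmp"
    if [ -n "$remote_sum" ] && [ "$local_sum" = "$remote_sum" ]; then
        echo "OK"
    else
        echo "FAIL"
        return 1
    fi
}

# Usage (badmachine is a placeholder host name):
#   send_and_verify 200 ssh badmachine sha256sum
```

A dropped connection shows up as FAIL because the remote command never prints a checksum.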

However, the badmachine can send large amounts of data over SSH/SCP reliably, and I have 100% success sending a 1GB test text file of randomness to multiple other machines. It also seems to receive correctly over the USB Wifi NIC, which means the root of the problem is likely to be in the internal ethernet NIC, its drivers, etc.

The exact error message varies significantly depending on what command and verbosity level I use. Examples:

" ssh_dispatch_run_fatal: Connection to <othermachine> port 22: message authentication code incorrect"
" ERROR: ... SSH command failed (exitcode=255) ... ERROR: ... client_loop: send disconnect: Broken pipe"
" scp: debug3: In write loop, ack for 620 261120 bytes at 160588800
debug2: channel 0: rcvd adjust 163840
debug2: channel 0: rcvd adjust 65536
debug2: channel 0: rcvd adjust 65536
debug2: channel 0: rcvd adjust 32768
debug2: channel 0: rcvd adjust 65536
scp: debug3: Sent message SSH2_FXP_WRITE I:685 O:177561600 S:261120
scp: debug3: SSH2_FXP_STATUS 0
scp: debug3: In write loop, ack for 621 261120 bytes at 160849920
debug3: send packet: type 1
client_loop: send disconnect: Broken pipe
lost connection "

I ran scp -o MACs=<someMACoption> <remotefile> <localfile> with every MAC option that was on the list of available options, but all MAC options errored out. It is the same list on all of the local and remote machines I have compared.
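
The MAC sweep described above could be scripted roughly like this. It is only a sketch: try_macs and the SCP override variable are made-up names, and the file paths are placeholders. One caveat that may matter here: when the negotiated cipher is an AEAD one (chacha20-poly1305@openssh.com or the AES-GCM ciphers), the MACs setting is not used at all, so the sketch also forces aes128-ctr. That cipher choice is my assumption, not something from the original test.

```shell
#!/bin/sh
# Hypothetical helper: read MAC names on stdin (one per line) and try an
# scp transfer with each one. Set SCP to override the scp binary.
try_macs() {
    src=$1
    dst=$2
    while IFS= read -r mac; do
        # </dev/null keeps scp from swallowing the rest of the MAC list
        if ${SCP:-scp} -q -o Ciphers=aes128-ctr -o "MACs=$mac" "$src" "$dst" </dev/null; then
            echo "$mac ok"
        else
            echo "$mac failed"
        fi
    done
}

# Usage (paths are placeholders); "ssh -Q mac" lists the MACs the client supports:
#   ssh -Q mac | try_macs user@remotemachine:/path/to/bigfile /tmp/bigfile
```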

# cat /proc/device-tree/model
Raspberry Pi 3 Model B

# uname -a
Linux 2691ap9 6.2.10-1-aarch64-ARCH #1 SMP PREEMPT_DYNAMIC Fri Apr 7 10:32:52 MDT 2023 aarch64 GNU/Linux

Packets from my LAN would all be arriving via the built-in ethernet port, which is part of a bridge with the built-in wireless NIC and one USB Wifi NIC (to make wifi AP functionality).

# ethtool iLAN (the built in ethernet NIC)
Settings for iLAN:
    Supported ports: [ TP MII ]
    Supported link modes:   10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
    Supported pause frame use: Symmetric Receive-only
    Supports auto-negotiation: Yes
    Supported FEC modes: Not reported
    Advertised link modes:  10baseT/Half 10baseT/Full
                            100baseT/Half 100baseT/Full
    Advertised pause frame use: No
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                         100baseT/Half 100baseT/Full
    Link partner advertised pause frame use: Symmetric Receive-only
    Link partner advertised auto-negotiation: Yes
    Link partner advertised FEC modes: Not reported
    Speed: 100Mb/s
    Duplex: Full
    Auto-negotiation: on
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    MDI-X: Unknown
netlink error: Operation not permitted
    Current message level: 0x00000007 (7)
                           drv probe link
    Link detected: yes

iperf3 tests to all other machines on my LAN proceed normally, give 90+/- 8 Mbps speed, and retries in the 20-200 count range for a 10 second test in both directions.
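
Alongside iperf3, it may help to snapshot the interface error counters before and after a failing transfer. The sketch below reads them from the standard sysfs statistics directory. nic_errors and the SYSFS_NET override are hypothetical names, and iLAN is just the interface name used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print selected error counters for a network device.
# SYSFS_NET can point at another directory (useful for testing the function).
nic_errors() {
    dev=$1
    base=${SYSFS_NET:-/sys/class/net}/$dev/statistics
    for f in rx_errors rx_crc_errors rx_dropped tx_errors; do
        if [ -r "$base/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$base/$f")"
        fi
    done
}

# Usage: run before and after a transfer and compare the numbers:
#   nic_errors iLAN
```

Counters that stay flat while SSH still fails would point away from link-level errors, though that is an inference rather than a certainty.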

I have tried quite a few random suggestions without improvement, including adding these to /etc/ssh/sshd_config and restarting sshd.service:
TCPKeepAlive no
IPQoS 0x00
IPQoS none
Compression no
ClientAliveInterval 60
ClientAliveCountMax 5
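
One way to confirm which of these settings sshd actually uses is its extended test mode, sshd -T, which prints the effective configuration with lowercase keywords. This matters when a keyword appears more than once, because sshd uses the first value it obtains for each keyword. The filter below is a small sketch, and relevant_sshd_settings is a made-up name.

```shell
#!/bin/sh
# Hypothetical filter: keep only the options from the list above out of
# the output of "sshd -T" read on stdin.
relevant_sshd_settings() {
    grep -iE '^(tcpkeepalive|ipqos|compression|clientaliveinterval|clientalivecountmax) '
}

# Usage (sshd -T generally needs root to read the host keys):
#   sudo sshd -T | relevant_sshd_settings
```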


If anyone else has any good ideas of what else to try to troubleshoot this, please let me know! Thanks!
Last edited by fhteagle on Mon Nov 27, 2023 4:27 pm, edited 3 times in total.
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

Re: SSH broken pipe problems

Postby fhteagle » Sun Nov 12, 2023 12:58 am

This seems to have solved it...

$ sudo /usr/bin/ethtool --offload <ethdev> rx off

where <ethdev> is the onboard ethernet device name.
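
If you try the same thing, it is worth checking that the setting took effect, since ethtool changes are runtime-only and do not survive a reboot unless something reapplies them. A minimal sketch follows. rx_csum_state is a hypothetical name, and it assumes the usual "rx-checksumming: on/off" line in the ethtool --show-offload output.

```shell
#!/bin/sh
# Hypothetical filter: read "ethtool --show-offload <dev>" output on stdin
# and print the state of RX checksum offload (e.g. "on" or "off").
rx_csum_state() {
    awk '$1 == "rx-checksumming:" { print $2; exit }'
}

# Usage (<ethdev> is the onboard ethernet device name):
#   ethtool --show-offload <ethdev> | rx_csum_state
```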
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

Re: SSH broken pipe problems

Postby fhteagle » Sun Nov 12, 2023 4:07 pm

Looks like I spoke too soon. The behavior is back even with the ethtool mod I listed above.

Computers..............
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

Re: SSH broken pipe problems

Postby fhteagle » Wed Nov 15, 2023 2:31 pm

Update:

Forcing the sshd server to use ssh_host_rsa_key in sshd_config seems to have improved things. Will monitor a while longer before I call it fixed again though.
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

Re: SSH broken pipe problems

Postby fhteagle » Mon Nov 20, 2023 2:25 pm

Still not fixed. Forcing RSA keys on both ends has cut the broken pipe error rate significantly, but not to zero.

It's still only this Pi, and only connections coming in over the ethernet port, that have this issue. Every other combination of machines on my LAN that I have tested works fine with ping, iperf3, ssh'ing large files of randomness, and btrbk.

The other weird thing is that this Pi behaved just fine on RaspiOS before I switched to Arch Linux ARM. So it's either newly defective hardware that coincidentally started acting up right at the OS changeover, or there's something goofy in software.
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

Re: SSH broken pipe problems

Postby lategoodbye » Mon Nov 20, 2023 10:09 pm

Hi, I suspect that you hit the performance regression in the driver for the hardware random number generator on the Raspberry Pi.

You are using an outdated mainline kernel, which isn't maintained anymore.
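
To see which hardware RNG the kernel is using, the hw_random sysfs entries can be inspected. This is only a sketch: rng_info and the HWRNG_SYSFS override are hypothetical names, and the sysfs location is the standard one for the kernel hw_random framework.

```shell
#!/bin/sh
# Hypothetical helper: show the active and available hardware RNG drivers.
# HWRNG_SYSFS can point at another directory (useful for testing the function).
rng_info() {
    base=${HWRNG_SYSFS:-/sys/class/misc/hw_random}
    for f in rng_current rng_available; do
        if [ -r "$base/$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$base/$f")"
        else
            printf '%s: (not readable)\n' "$f"
        fi
    done
}

# Usage:
#   rng_info
# A rough read-speed check of the device itself (needs root):
#   dd if=/dev/hwrng of=/dev/null bs=1k count=64
```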

So you have 2 options:
1) update to a recent mainline kernel which contains the fix (e.g. 6.5.12 or 6.6.2)
2) switch to the vendor kernel linux-rpi
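
A quick way to compare the running kernel against the versions named in option 1 is sketched below. version_ge and kernel_release_version are made-up helper names, and the comparison relies on GNU sort -V. Note that the check is per branch: a 6.6.x kernel should be compared against 6.6.2, not against 6.5.12.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is >= version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: strip the package suffix from a "uname -r" string,
# e.g. 6.2.10-1-aarch64-ARCH becomes 6.2.10.
kernel_release_version() {
    printf '%s\n' "${1%%-*}"
}

# Usage:
#   v=$(kernel_release_version "$(uname -r)")
#   if version_ge "$v" 6.5.12; then echo "at or past 6.5.12"; else echo "older than 6.5.12"; fi
```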
lategoodbye
 
Posts: 116
Joined: Sat Dec 29, 2018 1:24 am

Re: SSH broken pipe problems

Postby fhteagle » Thu Nov 23, 2023 2:07 pm

I think you were right about the RNG being the root cause.

Reverted to linux-rpi 6.1.63 and so far so good. Will keep monitoring for a while though.

I am confused by your comment about an outdated mainline kernel. linux-aarch64 6.2.10 was and still is the newest version of the package at https://archlinuxarm.org/packages . Are you saying that Arch Linux ARM is no longer packaging linux-rpi or linux-aarch64? Is there a different source you'd recommend for recent aarch64 kernel packages? Yes, I could roll my own, but I would rather not waste the computing watts if it is already done and released elsewhere....
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

Re: SSH broken pipe problems

Postby lategoodbye » Sun Nov 26, 2023 9:46 pm

The mainline kernel and linux-rpi are not really comparable. The mainline kernel is released and maintained for all Linux devices. Roughly every 2 months there is a new major release with tons of new features (according to kernel.org, Linux 6.6.2 is the latest stable). At the end of each year the last stable release is declared LTS, which means it is maintained for a few years. If you look at kernel.org you won't see the Linux 6.2.x branch, because it's outdated.

linux-rpi usually uses the latest LTS release, and the Raspberry Pi developers place a lot of their optimization patches on top. So linux-rpi 6.1.63 is up to date.
lategoodbye
 
Posts: 116
Joined: Sat Dec 29, 2018 1:24 am

Re: SSH broken pipe problems

Postby graysky » Mon Nov 27, 2023 12:31 pm

linux-rpi is not just optimizations, there are drivers and fixes unique to the platform.

Also, they have a "stable" kernel which tracks the latest LTS, but they also rebase their stuff on top of mainline. I happen to package this kernel in my unofficial user repo. See: https://archlinuxarm.org/forum/viewtopic.php?f=3&t=16144
graysky
Developer
 
Posts: 1731
Joined: Sun Jun 26, 2011 6:56 am
Location: /run/user/1000

Re: SSH broken pipe problems

Postby fhteagle » Mon Nov 27, 2023 4:27 pm

@lategoodbye -

Yeah I noticed linux-rpi was on LTS, just checking that there was not another issue.

Not seeing any more ssh issues in like 4 days. So linux-rpi kernel was the fix.

Thanks for the help, marking as solved for real this time hopefully....
fhteagle
 
Posts: 10
Joined: Tue Nov 07, 2023 3:05 pm

