device drops offline, after 2-48 hours, remains on

Post by psutokth » Sun Nov 03, 2024 7:10 pm

Hello, I'm troubleshooting a Pinebook Pro that I use as a headless server (the screen broke off ;) ). It works well, serving a few apps, but after some uptime, usually a day or two but sometimes only a couple of hours, the apps' webpages and any open SSH sessions go down, the device disappears from the router's list, and I cannot ping it. Its power LED stays green the whole time.

When I look at the logs for that boot, e.g.
[code]journalctl -b -1[/code]
the entries switch from normal operation to errors and timeouts as the apps try to connect to the other devices the server collects data from. There are no errors or warnings from networkd or other system services. The apps keep logging that they can't reach their devices until I notice and force a shutdown. The journal records that power key press and the power-down, so the system is not crashing.

dmesg might be more helpful than the journal, but how do I save a copy of it before the shutdown? I saw old forum references to doing something like this with lines in rc.local, but I don't think that exists anymore.
EDIT: journalctl's -k option shows the kernel messages (the same as dmesg), but there is nothing there within 15 minutes of the last time this happened.
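
Since the kernel log alone showed nothing, one option is to snapshot more state at the moment connectivity is lost, before any restart or shutdown. The following is only a rough sketch: the function name capture_diag, the output directory, and the choice of commands are assumptions for illustration, and it assumes journalctl, ip, networkctl, and lsusb are installed.

```shell
#!/bin/sh
# Hypothetical helper: write a timestamped snapshot of kernel and network
# state into a directory. Function name and layout are illustrative only.

capture_diag() {
    out_dir=$1      # directory to write into (assumed writable)
    iface=$2        # interface to inspect, e.g. enu1u1
    stamp=$(date +%Y%m%d-%H%M%S)
    out_file="$out_dir/netdiag-$stamp.log"
    mkdir -p "$out_dir" || return 1
    {
        echo "== snapshot $stamp, interface $iface =="
        echo "-- kernel log, last 200 lines --"
        # Fall back to dmesg if journalctl is unavailable
        journalctl -k -n 200 --no-pager 2>&1 || dmesg 2>&1 | tail -n 200 || true
        echo "-- link state and counters --"
        ip -s link show "$iface" 2>&1 || true
        echo "-- networkd view of the link --"
        networkctl status "$iface" --no-pager 2>&1 || true
        echo "-- USB devices (is the adapter still enumerated?) --"
        lsusb 2>&1 || true
    } > "$out_file"
    # Print the path so the caller can log where the snapshot went
    echo "$out_file"
}
```

Calling something like capture_diag /var/log/netdiag enu1u1 from a check script, right after a ping failure, would leave a file that survives the forced shutdown and can be compared against a snapshot taken while the network is healthy.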

[list][*]The device is connected to Ethernet through a USB adapter. (The Pinebook Pro has other problems with its Wi-Fi chip.)
[*]I think I'm using systemd's network manager (systemd-networkd), not NetworkManager, but the former is new to me. (I was a long-time Gentoo user, so systemd is fairly new to me, though I've used it on servers and laptops for the past few years.)
[*]/etc/systemd/networkd.conf is all defaults.
[*]I have the same static IP assigned in /etc/systemd/network/20-en.network and /etc/systemd/network/20-eth.network because the logs show eth0 being renamed to enu1u1.[/list]

[code]$ networkctl
IDX LINK   TYPE     OPERATIONAL SETUP
  1 lo     loopback carrier     unmanaged
  2 enu1u1 ether    routable    configured[/code]
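
With two .network files carrying the same address, it can help to confirm which one networkd actually applied to enu1u1. A minimal sketch, assuming the output of networkctl status contains a "Network File:" line (recent systemd versions print one); the function name matched_network_file is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: extract the .network file that networkd reports
# for a link. Reads `networkctl status` text on stdin.

matched_network_file() {
    # Keep only the path that follows the "Network File:" label
    sed -n 's/^[[:space:]]*Network File:[[:space:]]*//p'
}
```

Usage would be along the lines of: networkctl status enu1u1 --no-pager | matched_network_file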

Re: device drops offline, after 2-48 hours, remains on

Post by psutokth » Sun Nov 24, 2024 12:48 am

I posted the question above to try to find the root cause of this issue. Failing that, since the system itself stays up, I tried writing a script that checks for connectivity and then restarts networking. I landed on a systemd service that uses ExecCondition to check for connectivity and print a message, and ExecStart to restart networkd and/or resolved. I experimented until it correctly reported whether the network was down and restarted one or both of those services when it was. After those restarts, however, connectivity is still down. Reboots fix it 100% of the time.

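[code][Unit]
Description=Network check and restart script
# network-online is a target, not a service
After=network-online.target

[Service]
Type=oneshot
# ExecCondition: exit 0 lets the ExecStart lines run, exit 1 skips them.
# ping exits non-zero when no reply arrives, so its output need not be parsed
# (this also avoids the bash-only [[ =~ ]] test under sh).
ExecCondition=/usr/bin/sh -c "if ping -c1 -W2 [router IP] >/dev/null 2>&1; then echo connection ok; exit 1; else echo restart networkd and or resolved; exit 0; fi"
ExecStart=/bin/systemctl restart systemd-networkd
ExecStart=/bin/systemctl restart systemd-resolved

[Install]
# Enabled this way, the unit is started once per boot; periodic checks need
# a timer or similar trigger.
WantedBy=multi-user.target
[/code]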
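
The check itself could also live in a small script rather than inline in the unit, which makes it easier to require several failed probes before acting. This is only a sketch: the function name network_is_down, the PING_CMD override, and the default of three tries are assumptions, not anything from the unit above. The return values follow the ExecCondition convention, where exit status 0 means "go ahead" and 1 means "skip".

```shell
#!/bin/sh
# Hypothetical connectivity check. Returns 0 when every probe fails
# (network treated as down), 1 as soon as one probe gets a reply.

# PING_CMD can be overridden for testing; defaults to the real ping.
PING_CMD=${PING_CMD:-ping}

network_is_down() {
    target=$1          # address to probe, e.g. the router
    tries=${2:-3}      # number of probes before declaring the link down
    i=0
    while [ "$i" -lt "$tries" ]; do
        # -c1: send one probe, -W2: wait at most 2 seconds for the reply
        if "$PING_CMD" -c1 -W2 "$target" >/dev/null 2>&1; then
            return 1   # got a reply: network is up
        fi
        i=$((i + 1))
    done
    return 0           # no replies at all: treat the network as down
}
```

A wrapper used as ExecCondition would call network_is_down with the router's address and exit with its return value, so a single dropped ping does not trigger the restarts.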

The other thing I tried was uncommenting the line below in /etc/systemd/networkd.conf. It made no apparent difference.

[code][DHCPServer]
#PersistLeases=yes
[/code]