Network issues investigation

A forum for discussion on the software for the WMT River Control System
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Post by hamishmb »

Starting a new thread for tracking progress on this issue.

Recently we've encountered a number of network issues. It seems the pis at the G4 and G6 sites go offline after an unknown period of time, and cannot be accessed, even to reboot them. Penri has encountered issues where the graphs on his phone won't update, but the reboot command works, so this could be a bug in the app, or the network being too slow.
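To pin down the "unknown period of time", one option would be to poll each Pi from a machine that stays up and record when it first stops answering. Below is a minimal sketch, not what we currently run: it assumes the Pis accept TCP connections on port 22 (SSH), and the function names are made up for illustration.

```python
import socket
from datetime import datetime, timedelta


def is_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 22 (SSH) is an assumption; use whatever port the Pi listens on.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def uptime_before_first_failure(samples):
    """Return how long a host stayed reachable before it first failed.

    samples is a time-ordered list of (datetime, reachable) pairs.
    Returns a timedelta measured from the first sample to the first
    failed sample, or None if there are no samples or no failures.
    """
    if not samples:
        return None
    start = samples[0][0]
    for when, reachable in samples:
        if not reachable:
            return when - start
    return None
```

Calling is_reachable() every minute or so and appending (datetime.now(), result) to a list would give the samples that uptime_before_first_failure() expects.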

This afternoon, the Wbutts Pi and Stage Pi wouldn't communicate until I power cycled the devices. We also encountered an issue due to a software misconfiguration.

We're thinking of doing the following to attempt to resolve these issues:
  • Update Stage Pi and Sump Pi to Raspbian 10 "Buster".
  • Comment out the configuration for pis that aren't present in the software until we install them, to avoid the configuration error in future.
  • Try the spare network adaptors at the G4 and G6 sites in case that fixes things.
  • Check the indicator lights on the network adaptors when the network isn't working to see if the link is active at all.
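As a complement to looking at the lights, the Pi itself could log what the kernel thinks the link state is, so we can read the log after the next failure. This is only a sketch under assumptions: the interface name eth0 is a guess (the USB adaptors may show up under another name), and the function names are hypothetical. It reads the standard Linux sysfs files carrier and operstate.

```python
from datetime import datetime
from pathlib import Path


def read_link_state(interface="eth0", sysfs_root="/sys/class/net"):
    """Return (carrier, operstate) for a network interface.

    carrier is True when the kernel reports a physical link.
    Returns (None, None) if the interface does not exist.
    """
    base = Path(sysfs_root) / interface
    try:
        operstate = (base / "operstate").read_text().strip()
    except OSError:
        return None, None
    try:
        carrier = (base / "carrier").read_text().strip() == "1"
    except OSError:
        # Reading carrier fails while the interface is administratively down.
        carrier = False
    return carrier, operstate


def format_log_line(interface, carrier, operstate, now=None):
    """Build one timestamped log line describing the link state."""
    now = now or datetime.now()
    stamp = now.isoformat(timespec="seconds")
    return f"{stamp} {interface} carrier={carrier} operstate={operstate}"
```

Appending format_log_line() output to a file on the SD card every minute would show whether the link had dropped at the point the Pi became unreachable, or whether the link stayed up and the fault is elsewhere.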
Terry, can we connect your keyboard and monitor without rebooting the pis or disconnecting the network adaptor? If possible, this could be another way to diagnose the system when it's not working.
Hamish
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Re: Network issues investigation

Post by hamishmb »

Quick thought:

This is almost exactly the same as an issue I have with a powerline adaptor and mini wireless AP at home. From time to time, the AP has no access to my home network, even though the link light is on, and it usually needs a reboot and/or the network cable to be reseated. I think it was the cable connectors, but after replacing them it still has the same issue, albeit less often. Cables that I didn't make don't have the same issues. Sometimes it goes for weeks with no issue, and other times it fails multiple times in one week.

Do we have any manufactured cables to try? I think maybe the RJ45 connectors on the market are just not very good.
Hamish
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: Network issues investigation

Post by TerryJC »

hamishmb wrote: 14/10/2019, 19:33
Terry, can we connect your keyboard and monitor without rebooting the pis or disconnecting the network adaptor? If possible, this could be another way to diagnose the system when it's not working.

Sorry, I missed this thread when you started it.

It isn't possible to connect the keyboard to the installed system, because the network adaptor is occupying the USB socket. If a USB Hub were to be connected semi-permanently, this could be overcome, but I have seen instances where disconnecting the keyboard has caused the Pi to reboot.

Similarly, we should be able to plug the monitor into the running system, but I've seen instances when the screen remains blank until the Pi is rebooted. If we started the Pi with the monitor attached and then unplugged it, it should be possible to plug it in again and regain the display, but how this would behave when the Pi isn't responding, I don't know.

The only thing is, we may be able to see what is going on during a normal boot where the system doesn't start properly (as we've seen a few times recently), but that's a bit hit and miss.
Terry
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: Network issues investigation

Post by TerryJC »

hamishmb wrote: 16/10/2019, 22:25
This is almost exactly the same as an issue I have with a powerline adaptor and mini wireless AP at home. From time to time, the AP has no access to my home network, even though the link light is on, and it usually needs a reboot and/or the network cable to be reseated. I think it was the cable connectors, but after replacing them it still has the same issue, albeit less often. Cables that I didn't make don't have the same issues. Sometimes it goes for weeks with no issue, and other times it fails multiple times in one week.

Do we have any manufactured cables to try? I think maybe the RJ45 connectors on the market are just not very good.

We could buy some from RS, who tend not to use cheap Chinese imports. For example, https://uk.rs-online.com/web/p/rj45-connectors/4086777/ seems to be a reasonable compromise. It would only cost £5.70 (minimum order quantity is 10) and the manufacturer is in Pennsylvania.

Penri, do you have any thoughts on this? We've certainly suffered in the past when we bought cheap Chinese copies, so perhaps we should steer clear of them in the future.
Terry
Penri
Posts: 1284
Joined: 18/05/2017, 21:28

Re: Network issues investigation

Post by Penri »

I’m not convinced but if it eliminates a nagging doubt then let’s get the new connectors.
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: Network issues investigation

Post by TerryJC »

Order placed.
Terry
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Re: Network issues investigation

Post by hamishmb »

Sounds good. I thought we were already using RS ones, but I guess not.
Hamish
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: Network issues investigation

Post by TerryJC »

We are, but they are their cheapest type. They should be OK, but the ones I've ordered are better. Unless we are somehow making the connections badly, that will completely exonerate the Ethernet connectivity.

Really, I'm with Penri on this one. I cannot see how rebooting/restarting a device can solve a physical network problem, but let's make sure that the connections are good, once and for all.

If we still have the problem, then the only physical device that could be causing this is the Ethernet adaptor; maybe the internal electronics get swamped with traffic or something. Again, I can't really see it because Edimax is a reliable manufacturer, but we can swap adaptors around to see if the fault moves.
Terry
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Re: Network issues investigation

Post by hamishmb »

Agreed that it doesn't make a lot of sense. I figure we might as well just try each possibility in turn, and we'll eventually figure out what's going on.

My AP is a TP-Link one, but I don't know what kind of Ethernet adaptor it uses. Perhaps it could also be one of those USB controller/driver issues that occasionally pop up with Raspberry Pis? I think they've fixed most of them, but I don't know.
Hamish