(August 19th 2019 onwards) River System Status

Sub-forum to hold river system status topics
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: (August 19th 2019 onwards) River System Status

Post by TerryJC »

I was also unable to access the V4 Gate Pi over the WiFi with my laptop, so I did a power reset at approx 10:15 this morning. I was then able to gain access to the Results and Logs and download them.

We aren't sure why the Pi stopped communicating, but Penri did a reset on Saturday and it was OK then. According to its log file, the Pi was still active until 10:22 this morning, which implies that either the clock was wrong or my estimate of the time that I downloaded the data was out. Looking at the Sump Pi log, it would appear that everything was fine until approx 09:06 on the morning of the 12th, when communication with the Wendy Gate Valve was lost. It stayed lost until around 09:30 on the 14th, when (presumably) Penri reset it again. At approx 05:42 on the 16th, comms were lost again and the Gate Valve stopped sending data.

So we don't know why this is happening and there seems to be no regular pattern to help us diagnose it. After Christmas, if we've come up with no ideas before then, I will instigate a general reset of all Pis at midnight every night. I'd rather work out why the problem is occurring, but at the same time, we need a reliable system, so needs must.
Attachments
2019-12-17_Results&Logs.zip
(7.58 MiB) Downloaded 91 times
Terry
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Re: (August 19th 2019 onwards) River System Status

Post by hamishmb »

After some analysis of these files, it quickly became apparent to me that the system time doesn't seem to be syncing properly between Sump Pi and Gate Valve V4 Pi, making correlating the readings and logs quite difficult.

AFAICT, Sump Pi eventually thinks the connection was closed cleanly, but V4 Pi thinks the connection is still open. As such, Sump Pi is unable to reconnect, and no data can be sent between the two Pis. This latest crash, at least, seems to be network and software-related.

I also observe that the RAM and CPU monitor doesn't seem to be running on V4 Pi. I don't know if this was intentional, but it seems like a good idea to run it there as well.

Here are potential ideas I have to further diagnose this issue:
  • Run the software in testing mode at WMT (disables hardware access)
    • This idea arises because I've been unable to cause a crash at home using the VMs. Perhaps it has something to do with hardware access?
  • Reset all the network connections every 24 hours, in case some TCP-related maximum connection time is being reached.
  • Disable some elements of the software.
    • The gate valve seems to crash most often as far as I can see from the forum posts. I wonder what would happen if I disabled the gate valve control thread, and whether it would still crash.
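One further option along the same lines as the 24-hour reset, sketched here rather than taken from the river system code: TCP keepalive probes let the operating system notice when the other end of a connection has silently gone away, which is the "one side thinks it's open, the other thinks it's closed" state described above. The function name `enable_keepalive` and the timing values are assumptions for illustration only, and the fine-grained options are Linux-specific (which is what the Pis run).

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for an existing socket (hypothetical helper).

    idle:     seconds of silence before the first probe is sent
    interval: seconds between unanswered probes
    count:    unanswered probes before the OS declares the connection dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # These three options exist on Linux; skip any the platform lacks.
    for name, value in (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    ):
        if hasattr(socket, name):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, name), value)

    return sock
```

With settings like these, a dead peer would show up as an error on the next send or receive after roughly `idle + interval * count` seconds, at which point the software could close the socket and reconnect instead of waiting indefinitely.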
Hamish
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: (August 19th 2019 onwards) River System Status

Post by TerryJC »

At WMT this morning, I downloaded all of the Result and Log files (see attached) and cleared the directories so that the new files wouldn't fill up the SD Card. All four Pis were communicating fine over the WMT-Guest Wi-Fi connection, and the Gate Valve seems to have been working fine since the 17th of December, from the brief scan of the logs that I've just done.

Note: The server wouldn't accept the full set of logfiles, because they were each 45 MB and the compressed size was 36 MB. This upload therefore only contains the most recent logfiles.

I then shut down the V4 Gate Valve and changed the SD Card to bring the Gate Valve software into line with the software on the other three Pis. I had tried to do that towards the end of last year, but the valve refused to work and I swapped it back to the old one again. I remade the SD Card but then it rained, and rained and rained, and then I forgot all about it :(

Anyway, the card that I've just installed is the replacement that I prepared and it seems to be working fine. We will monitor the situation before we try any of the other suggestions above.
Attachments
2020-01-08_Results&Logs.zip
(8.28 MiB) Downloaded 87 times
Terry
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: (August 19th 2019 onwards) River System Status

Post by TerryJC »

Forgot - I brought Penri's completed Hall-Effect Probes (four off) back with me for testing.
Terry
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Re: (August 19th 2019 onwards) River System Status

Post by hamishmb »

Ah, good to hear. Another idea is to send acknowledgements upon receipt of data - if one isn't received, then we can be fairly sure the connection died one way or another. It might be a good idea in the long run regardless of whether the issue occurs again, just as a preventative measure.
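As a rough illustration of the acknowledgement idea (a sketch only, not the existing river system code): the sender writes one message, then waits a limited time for a short reply, and treats silence as a dead connection. The names `send_with_ack`, `receive_and_ack`, the `ACK` marker and the newline framing are all assumptions made for this example; messages must not contain a newline.

```python
import socket

ACK = b"ACK\n"  # hypothetical acknowledgement marker


def _read_line(sock):
    """Read bytes up to and including a newline (or until the peer closes)."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(1)
        if not chunk:
            break  # peer closed the connection
        buf += chunk
    return buf


def send_with_ack(sock, payload, timeout=5.0):
    """Send one message; return True only if an ACK arrives within timeout."""
    sock.settimeout(timeout)
    try:
        sock.sendall(payload + b"\n")
        reply = _read_line(sock)
    except OSError:  # includes socket.timeout
        return False
    return reply == ACK


def receive_and_ack(sock):
    """Receive one message and acknowledge it; return the message body."""
    line = _read_line(sock)
    if line.endswith(b"\n"):
        sock.sendall(ACK)
    return line.rstrip(b"\n")
```

If `send_with_ack` returns False, the caller could close the socket and reconnect, so neither side is left believing a dead connection is still open.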
Hamish
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: (August 19th 2019 onwards) River System Status

Post by TerryJC »

This morning I took all of the readings and logs (attached).

I then shut down Stage Butts Pi so that the probes could be removed in readiness for the upcoming replacement of the Butts platform (viewtopic.php?f=15&p=2979#p2979)
Attachments
2020-01-14_Results&Logs.zip
(1.06 MiB) Downloaded 85 times
Terry
Penri
Posts: 1284
Joined: 18/05/2017, 21:28

Re: (August 19th 2019 onwards) River System Status

Post by Penri »

Having taken a look at the results files I'd say that everything is working pretty normally.
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Re: (August 19th 2019 onwards) River System Status

Post by TerryJC »

I took a new set of readings this morning (see attached).

No results for the Stage Pi of course, since the Pi is disconnected while the Butts platform is replaced.
Attachments
2020-01-20_Results&Logs.zip
(1.72 MiB) Downloaded 87 times
Terry
Penri
Posts: 1284
Joined: 18/05/2017, 21:28

Re: (August 19th 2019 onwards) River System Status

Post by Penri »

Interesting drop of level in the Wendy Butts this morning (20Jan20) around 03:30, reported by the results files.

I'm about to take a look at this morning's logs to see if there are any clues on what may have happened.

Gate valve appears to be doing exactly what's expected of it.
Penri
Posts: 1284
Joined: 18/05/2017, 21:28

Re: (August 19th 2019 onwards) River System Status

Post by Penri »

Just looked at today's log for the Wendy Butts. The first reading recorded is at just before 03:30; it says 375mm, which tallies with the results file.

The gate valve was opening and closing during the period from midnight to 03:30 so water was being supplied from the Butts to the Sump. But why the sudden drop?

Float frozen onto sensor? But if so, why drop at 03:30? We've not seen this behaviour before, I don't think.