General Software Improvements (3rd June 2020 onwards)

Sub-forum for general software improvements threads
hamishmb
Posts: 1891
Joined: 16/05/2017, 16:41

Re: General Software Improvements (3rd June 2020 onwards)

Post by hamishmb »

More improvements:
  • Fix unit tests broken by previous changes.
  • General code cleanup in preparation for merge with master.
  • Log when database errors are encountered more effectively.
  • Add shutdown, update, and reboot events to the log.
  • Remove some old unneeded notes.
  • Reset system tick to zero when it reaches the database limit (2^31, as it's a signed 32-bit integer).
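As a rough illustration of the tick reset (not the project's actual code), a minimal shell sketch might look like this. It assumes the limit is 2147483647, the largest value a signed 32-bit integer can hold; the names TICK_LIMIT and next_tick are made up for the example.

```shell
#!/bin/sh
# Assumed limit: the largest value a signed 32-bit integer can hold.
TICK_LIMIT=2147483647

# next_tick CURRENT: print the tick that follows CURRENT, wrapping
# back to 0 once the limit has been reached (hypothetical helper).
next_tick() {
    current=$1
    if [ "$current" -ge "$TICK_LIMIT" ]; then
        echo 0
    else
        echo $((current + 1))
    fi
}
```

Comparing before incrementing means the counter never has to hold a value above the limit.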
Testing:
  • All unit tests have been run multiple times.
  • A system test with the NAS box and 5 Pi VMs has been conducted.
    • Aside from one permission issue with a database user, no issues were found.
    • Pi VMs used: SUMP, G3, G4, G6, VALVE4.
    • Free memory: around 70MB (though 2.5 GB of swap space is available if we need it).
    • Load avg on the NAS box was around 0.30, with peaks of 0.50 and troughs of 0.20.
    • This should be fine. We will have more pis in the system when deployed, but I am comfortable with the amount of headroom this gives us (I estimate at least 20% CPU free, being conservative).
    • Please note I had two SSH sessions open (which seem to hog the CPU) and the engineer GUI open to monitor during this test, so actual CPU load in production will be lower.
I now consider this to be ready to deploy (after I've merged it with master, which shouldn't be particularly arduous). We will also have a little bit of CPU headroom, which means we should be able to do things like:
  • Run scheduled tasks like database backups on low priority.
  • Cope with a flood of database queries to add readings if we encounter a temporary network issue.
  • Run maintenance tasks to eg fill in missing tick values if the above happens.
  • Generate simple graphs on the NAS box (preferably ahead of time because then we can use a low process priority).
One thing we do need to do is sort out database backups, but that will probably be a separate piece of code running as a cron job - no need to delay deploying. I have some spare low-capacity memory sticks that can be borrowed for this task, but I imagine we'll want something more robust than that in the long term.
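A very rough sketch of what a low-priority backup job could look like is below. It is only an illustration: the thread does not name the database software, so the dump command is passed in as arguments rather than guessed, and the function name, file naming scheme and paths are all assumptions.

```shell
#!/bin/sh
# backup_db DEST_DIR DUMP_COMMAND...
# Runs DUMP_COMMAND at the lowest CPU priority and stores whatever it
# prints in a dated file under DEST_DIR (all names are hypothetical).
backup_db() {
    dest_dir=$1
    shift
    stamp=$(date +%Y-%m-%d_%H%M%S)
    out="$dest_dir/db-backup-$stamp.sql"
    tmp="$out.part"

    # Write to a temporary name first, so a failed dump never leaves
    # a truncated file that looks like a good backup.
    if nice -n 19 "$@" > "$tmp"; then
        mv "$tmp" "$out"
        echo "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

A script that calls backup_db could then be run from cron, for example with a crontab line such as "0 3 * * * /path/to/backup.sh" (path and time are placeholders).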
Hamish

Post by hamishmb »

I have now merged the use-database branch into master, and the unit tests and some other quick checks seem to be fine, so I'll now do some of the post-merge tasks.
Hamish

Post by hamishmb »

Latest changes:
  • Documentation has been updated to match the current system state, with missing parts added and existing parts amended.
  • <various smaller post-merge tasks>
    • The rest will need to be done after we've deployed the NAS box/after merging Patrick's code - I don't really want to make potentially breaking changes this close to deployment.
Engineer GUI:
  • Sort System Status table by system ID for improved readability.
  • Display Engineer GUI version number at bottom of pages.
  • Add manual valve control to my to-do list for after deployment and testing on-site.
The NAS box shall be ready for deployment by the end of the week - just doing finishing touches now.
Hamish

Post by hamishmb »

Latest changes:
  • Enable circulation pump on Sump Pi prior to connecting to NAS box to avoid water overflow.
  • Bump version to 0.11.0 and update release date.
NAS box done:
  • Backed up and updated, and stored on fileserver:
    • rivercontrolsystem.tar.xz (initial version of software to deploy).
    • engineer-gui.tar.xz (initial version of engineer gui to deploy).
    • nas-other-root-files.tar.xz (Misc files used during boot).
    • User setup queries for the database.
    • Database structure.
    • NAS system settings (via D-Link web interface).
NOTE: To scp data to the NAS box, instead of doing this:

Code:

scp /path/to/file/on/local/machine [email protected]:/path/on/NAS/box
We have to do this:

Code:

ssh [email protected] (log in to NAS box)
scp localusername@localip:/path/to/file /path/on/NAS/box
This is due to a slight misconfiguration of the updated SSH software I built for the NAS box, but I'm not sure exactly what the issue is. It has not been a problem so far, and I don't wish to delay deployment over it; I'm happy to fix it once deployed if needed. In the meantime, to transfer files (eg logs) from the NAS box, we should be able to do this:

Code:

ssh [email protected] (log in to NAS box)
ln -s /path/to/file /var/www
And then download the file using (on your machine):

Code:

wget http://192.168.0.25/filename
Alternatively, you could use a web browser.

Then remove the link:

Code:

rm /var/www/filename
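The link, download and clean-up steps above could also be wrapped in two small shell functions on the NAS box. This is just a sketch: the function names are made up, and WEB_ROOT defaults to the /var/www path used in the commands above.

```shell
#!/bin/sh
# Web root taken from the commands above; override it for testing.
WEB_ROOT=${WEB_ROOT:-/var/www}

# publish_file PATH: expose PATH through the web server via a symlink.
# PATH should be absolute, otherwise the link will not resolve.
publish_file() {
    ln -s "$1" "$WEB_ROOT/$(basename "$1")"
}

# unpublish_file PATH: remove the symlink again, and nothing else.
unpublish_file() {
    link="$WEB_ROOT/$(basename "$1")"
    # Only delete symlinks, so a real file in the web root is never lost.
    if [ -L "$link" ]; then
        rm "$link"
    fi
}
```

Because only a link is created, nothing is copied into the RAM filesystem, and removing the link leaves the original file untouched.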
Terry, does the VPN allow pis/systems at WMT to SSH to our machines when we're connected? If so, no wget command needed. Otherwise the work-around I've supplied should suffice.
Last edited by hamishmb on 19/08/2020, 11:42, edited 2 times in total.
Hamish
TerryJC
Posts: 2616
Joined: 16/05/2017, 17:17

Post by TerryJC »

hamishmb wrote: 18/08/2020, 20:30
Terry, does the VPN allow pis/systems at WMT to SSH to our machines when we're connected? If so, no wget command needed. Otherwise the work-around I've supplied should suffice.
I think it is unlikely. The whole idea with VPN is that the client logs into the server, which then routes traffic to and from the hosts. As far as I know that traffic has to be initiated by the client.

We could try it but I doubt it will work.
Terry

Post by TerryJC »

TerryJC wrote: 18/08/2020, 20:48
I think it is unlikely. The whole idea with VPN is that the client logs into the server, which then routes traffic to and from the hosts. As far as I know that traffic has to be initiated by the client.

We could try it but I doubt it will work.
I've been thinking about this overnight. If someone logged in to the NAS Box could get to your machine, then so could someone logged into any of the machines. In a corporate network, that would make a user whose machine is logged into the company network from home open to any of the hundreds of other people on the network.
Terry

Post by hamishmb »

Yeah, I guess that's why it doesn't work then.

NB: I tested my instructions using the webserver to transfer data from the NAS box to my laptop, and it worked fine. I don't think we'll need to transfer files directly to the NAS box - seeing as it has internet access, I can download and update the code directly from the repository on GitLab. If this does turn out to be a problem, I'll come up with something anyway.

NB 2: I have updated the instructions to create a symbolic link instead of copying to /var/www - these are all RAM filesystems apart from /mnt/HD/HD_a2 (the RAID HDD storage), so there's no room to copy files.
Hamish

Post by hamishmb »

This morning, when integrating the NAS box, I fixed the following issues:
  • Gate valve startup failed because GPIO was configured incorrectly (not sure how this crept in but it was an easy fix).
  • Sump Pi was using the wrong readings for G4.
These issues could probably have been caught by testing the new software on Terry's test rig. Perhaps next time there are big changes we should do that. However, configuring the NAS box VM can be quite a pain.

We have the following unresolved issue:
  • System tick is not being restored when the NAS box is rebooted. This will cause problems if the database isn't cleared before a reboot.
I have not yet been able to reproduce the tick restoring issue at home - using a database dump from the NAS box and the NAS box VM, it works fine. I have also observed it to work fine when I tested it with the real NAS box at home. Something weird is going on here :lol:

Notes:
  • The NAS box has been behaving fine in the week prior to integration - all seems fine.
  • We can no longer collect all the readings from Sump Pi.
  • ^ Once the NAS box is fully integrated, we will be able to do it from there, but this is not yet the case.
  • As Sump Pi changes its reading interval depending on system state, there won't be a reading for every system tick. This is normal and not a cause for concern.
  • Please don't reboot the NAS box until we have figured out why the system tick isn't being restored reliably - it will cause problems with the river system.
Hamish

Post by hamishmb »

Bigger update coming later, but for now:

A new repository has been created on GitLab for Terry's Fetch_ReadingsandLog script and my adaptation for the NAS box. This is available at: https://gitlab.com/wmtprojectsteam/fetc ... ree/master.
Hamish

Post by hamishmb »

Next set of changes, to be deployed on Tuesday/whenever Penri is next in:
  • Improved Sump Pi error handling if the database goes offline (the cause of one of the crashes last week).
  • Don't flood the event log with unneeded events:
    • Don't update status and log event if the status is the same as last time.
    • Don't attempt to control a device again when we already have control and all is as we need it.
    • ^ Also saves network bandwidth and CPU cycles on the NAS box.
  • Restore system tick from the system tick table instead of analysing latest readings.
    • Faster, more reliable, lower CPU load on NAS box during startup.
    • Also fixes one of the issues we saw the other day where incorrect tick numbers were used in some situations.
Should be a good set of changes, and all tested with the WMT VMs so should go nice and smoothly.
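To illustrate the "don't log if the status is the same as last time" idea, here is a minimal shell sketch. It is not the project's code: it uses plain files in place of the database, and the function name and file arguments are made up for the example.

```shell
#!/bin/sh
# log_status_if_changed STATE_FILE LOG_FILE STATUS
# Appends STATUS to LOG_FILE only when it differs from the status
# recorded last time (all names here are hypothetical).
log_status_if_changed() {
    state_file=$1
    log_file=$2
    status=$3

    last=""
    if [ -f "$state_file" ]; then
        last=$(cat "$state_file")
    fi

    if [ "$status" = "$last" ]; then
        return 0    # unchanged: skip the log write entirely
    fi

    # Remember the new status, then add a timestamped log entry.
    printf '%s\n' "$status" > "$state_file"
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >> "$log_file"
}
```

Calling it three times with "Pump on", "Pump on" and "Pump off" writes two log entries, because the repeated status is skipped.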
Hamish