The NIFE103 comes with a chip (the NCT7904D) that, besides monitoring other critical system sensor parameters, provides a hardware-based watch dog.
A (very rough) description of the NIFE103's watch dog functionality can be found in [1], and the documentation of the hardware monitoring chip itself in [2].
In short, the watch dog needs to be configured and kicked via SMBus ioctls. An attempt to do so directly from a Python script was unsuccessful, so the actual calls were implemented and compiled in C.
The binaries provided by this package must run on Nexcom NIFE103 hardware with a built-in NCT7904D chip that answers on the i2c-7 bus at address 0x2d.
Do not attempt to run these binaries on any other hardware, as they may permanently damage your system.
The nct7904 kernel module needs to be blacklisted in the modprobe configuration.
The i2c_i801 kernel module must NOT be blacklisted in the modprobe configuration.
If not already provided by your system's installed OS image, extract the files from the archive into your ${SITECONTROLLER_HOME}/scripts folder. This is usually located at /opt/azeti/SiteController/scripts:
    mkdir -p /opt/azeti/SiteController/scripts
    tar -xvzf NIFE103-control.tar.gz -C /opt/azeti/SiteController/scripts
Make sure the binaries have the correct ownership and permissions:
    chown root:root /opt/azeti/SiteController/scripts/NIFE103-wdt* \
        /opt/azeti/SiteController/scripts/watchdog.sh
    chmod 0700 /opt/azeti/SiteController/scripts/NIFE103-wdt* \
        /opt/azeti/SiteController/scripts/watchdog.sh
Double check that nct7904 is blacklisted:
    grep "nct7904" /etc/modprobe.d/blacklist*
    /etc/modprobe.d/blacklist.conf:blacklist nct7904
If it is not blacklisted, add a blacklist nct7904 entry to /etc/modprobe.d/blacklist.conf, or remove the # at the beginning of the line if that entry was just commented out.
Double check that i2c_i801 is NOT blacklisted (note the #):
    grep "i2c_i801" /etc/modprobe.d/blacklist*
    /etc/modprobe.d/blacklist.conf:#blacklist i2c_i801
If it is blacklisted, add a # at the beginning of that line.
If you changed anything in /etc/modprobe.d/, you need to reboot the machine for the changes to take effect.
Modify the SiteController.cfg file (a.k.a. the Site configuration) so that the following entries are present in the [remote_exec_calls] section:
    [remote_exec_calls]
    watchdog_start = /opt/azeti/SiteController/scripts/watchdog.sh start
    watchdog_stop = /opt/azeti/SiteController/scripts/watchdog.sh stop
    watchdog_kick = /opt/azeti/SiteController/scripts/watchdog.sh kick
Upload the watchdog-NIFE103.template.xml file as a new component template to the cloud (if it is not already available there).
Add this component template to your Site template in the usual way.
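
The two blacklist checks from the installation steps can be combined into one small helper. The following is only a sketch: the function names `is_blacklisted` and `check_modules` and the optional directory argument are illustrative assumptions and are not part of this package. It treats a module as blacklisted only if an uncommented `blacklist <module>` line exists in one of the `blacklist*` files.

```shell
#!/bin/sh
# Sketch: report whether a kernel module has an active (uncommented)
# blacklist entry. Function names and the directory argument are
# illustrative assumptions, not part of this package.
is_blacklisted() {
    module="$1"
    dir="${2:-/etc/modprobe.d}"
    # Match "blacklist <module>" at the start of a line (leading whitespace
    # allowed). Commented-out lines start with '#' and do not match.
    grep -Eqs "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$dir"/blacklist*
}

# Check both requirements: nct7904 blacklisted, i2c_i801 not blacklisted.
# Returns 0 if both hold, 1 otherwise.
check_modules() {
    dir="${1:-/etc/modprobe.d}"
    status=0
    if is_blacklisted nct7904 "$dir"; then
        echo "OK: nct7904 is blacklisted"
    else
        echo "FIX: add 'blacklist nct7904'"
        status=1
    fi
    if is_blacklisted i2c_i801 "$dir"; then
        echo "FIX: comment out 'blacklist i2c_i801'"
        status=1
    else
        echo "OK: i2c_i801 is not blacklisted"
    fi
    return "$status"
}
```

Calling `check_modules` without arguments inspects /etc/modprobe.d; passing another directory makes it easy to try the check against a copy of the configuration first.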
| File | Description |
| --- | --- |
| NIFE103-wdt-init | Initializes the watch dog timer (must be run once before using the watch dog). It accepts a single parameter that sets the watch dog timeout in minutes. If omitted, the parameter defaults to 10 minutes. |
| NIFE103-wdt-start | Starts the timer. |
| NIFE103-wdt-stop | Stops the timer. After this command the watch dog no longer guards the system. |
| NIFE103-wdt-reset_timer | Resets the timer and thus 'kicks' the watch dog. This binary also requires the timeout value as a command line parameter. As with NIFE103-wdt-init, the value is specified in minutes and defaults to 10 if the parameter is not provided. |
| README.md | The file you are reading right now. |
| watchdog.sh | Interface shell script to be used with the SiteController. You may want to edit the TIMEOUT_MINUTES variable in this script. |
| watchdog-NIFE103.template.xml | A component template that can be used to run the watch dog timer with the SiteController. If you changed TIMEOUT_MINUTES in watchdog.sh, you may also want to adapt the timer parameter at xpath('/component_template/ac_rules/rule[1]/timers/timer[1]/@delay') in watchdog-NIFE103.template.xml. |
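
To illustrate how a wrapper such as watchdog.sh might map the start, stop and kick actions onto the binaries listed above, here is a hypothetical sketch. The watchdog.sh shipped in this package may be implemented differently; the function name `watchdog_dispatch` and the `SCRIPTS_DIR` and `RUN` variables are assumptions made for this example. Only `TIMEOUT_MINUTES` and the binary names come from the package description.

```shell
#!/bin/sh
# Hypothetical sketch of a start/stop/kick wrapper. The watchdog.sh shipped
# in this package may differ; SCRIPTS_DIR, RUN and watchdog_dispatch are
# assumptions made for illustration.
SCRIPTS_DIR="${SCRIPTS_DIR:-/opt/azeti/SiteController/scripts}"
TIMEOUT_MINUTES="${TIMEOUT_MINUTES:-10}"
# Set RUN=echo to print the commands instead of executing them (dry run).
RUN="${RUN:-}"

watchdog_dispatch() {
    case "$1" in
        start)
            # Initialize with the timeout first, then start the timer.
            $RUN "$SCRIPTS_DIR/NIFE103-wdt-init" "$TIMEOUT_MINUTES" &&
            $RUN "$SCRIPTS_DIR/NIFE103-wdt-start"
            ;;
        stop)
            $RUN "$SCRIPTS_DIR/NIFE103-wdt-stop"
            ;;
        kick)
            # Resetting the timer needs the same timeout value as the init.
            $RUN "$SCRIPTS_DIR/NIFE103-wdt-reset_timer" "$TIMEOUT_MINUTES"
            ;;
        *)
            echo "usage: watchdog.sh {start|stop|kick}" >&2
            return 2
            ;;
    esac
}
```

A real script would end with a call like `watchdog_dispatch "$@"`. Setting `RUN=echo` before calling the function only prints the commands, which is a safe way to try the sketch on hardware other than the NIFE103.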
The SiteController is now set up to monitor the system state. All that remains is to restart the SiteController. Once restarted, the hardware-based watch dog is initialized and started, and the timer in the automation rule set triggers the watchdog_kick action every 120 seconds, which in turn issues the necessary ioctls to reset the timer.
Should the watch dog run into a time-out after 10 minutes without any of these ioctl calls, the system is rebooted.
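
The relation between the kick interval and the timeout can be checked with a little arithmetic. The sketch below uses two hypothetical helper functions that are not part of this package. It assumes that each successful kick restarts the full timeout, and it counts the kicks that fall strictly before the timer expires.

```shell
#!/bin/sh
# Sketch: relate the kick interval to the watch dog timeout.
# Both function names are illustrative assumptions.

# Succeeds if the kick interval (seconds) is shorter than the timeout
# (minutes), i.e. at least one kick arrives before the timer expires.
kick_interval_is_safe() {
    timeout_minutes="$1"
    kick_interval_seconds="$2"
    [ "$kick_interval_seconds" -lt $((timeout_minutes * 60)) ]
}

# Prints how many consecutive kicks may be missed before the timer expires.
# After the last successful kick at t=0, kicks are due at i, 2i, 3i, ...
# Only kicks due strictly before the timeout T can be skipped safely,
# which gives (T - 1) / i with integer division.
missed_kicks_tolerated() {
    timeout_seconds=$(( $1 * 60 ))
    kick_interval_seconds="$2"
    echo $(( (timeout_seconds - 1) / kick_interval_seconds ))
}
```

With the defaults described above (10 minutes timeout, one kick every 120 seconds), `missed_kicks_tolerated 10 120` prints 4: the kicks due at 120, 240, 360 and 480 seconds may fail, but the one due at 600 seconds would coincide with the reboot.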
Do not use the lm-sensors package to monitor the NIFE103 hardware sensors, because otherwise the kernel module locks access to the SMBus.
watchdogd
[Wikipedia]: https://en.wikipedia.org/wiki/Watchdog_timer (various authors, quoted content from 2019-11-13)
[1]: http://files.nexcom.com/Driver/NIFE103/User_Manual_NIFE103_170928.pdf (local copy)
[2]: https://www.nuvoton.com/resource-files/NCT7904D_Datasheet_V1.44.pdf (local copy)