Use Case
Setting up a high availability (HA) monitoring environment between two SONARPLEX devices.
Example Setup
The instructions are based on the example setup described below.
SONARPLEX HA Master
This is the SONARPLEX HA master device, which contains the production host and service checks.
IP-Address | 172.16.0.100 |
---|---|
azeti Agent Port | 4192 |
azeti Agent Password | SamplePW |
SONARPLEX HA Slave
This is the SONARPLEX HA slave device, which is completely empty and contains no production data.
IP-Address | 172.16.0.200 |
---|---|
The slave runs a single HA service check to maintain HA capabilities. This check should run at short intervals (3 to 5 minutes) to reduce outage times and gaps in logging and graphs.
The general procedure is as follows:
- Configure the SONARPLEX HA master as the production device. No further configuration is necessary.
- Configure the SONARPLEX HA slave with just the host of the HA master. Add a service check to that host with the template "check_azeti_ha".
- Let the SONARPLEX HA slave run this service check to start synchronization.
Step-by-step guide
SONARPLEX HA Master
- Open the Administration Web Interface > Configuration > Network > Agent Configuration
- Set Agent Password to "SamplePW" (without quotes)
- Click to save the configuration.
SONARPLEX HA Slave
- Open the Administration Web Interface > Configuration > Setup > Hosts
- Create a new host with the IP of the SONARPLEX HA master, in this case 172.16.0.100.
- Create a new service with the plugin "check_azeti_ha" and add the newly created host to the list of hosts to check.
- Change the normal check interval to a value between 3 and 5 minutes to reduce gaps due to failover transitions.
- Set Password (Optional) to the given SONARPLEX HA master Agent password "SamplePW" (without quotes).
- Click to save the configuration.
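Before relying on the HA check, it can help to confirm that the slave's network can reach the agent port on the master. The following is a minimal Python sketch, not an azeti tool. It assumes that the azeti Agent accepts plain TCP connections on port 4192 (the setup above gives only the port number, not the protocol), and the function name `agent_port_reachable` is made up for this example. It only opens and closes a connection; it does not authenticate or speak the agent protocol.

```python
import socket


def agent_port_reachable(host: str, port: int = 4192, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Assumption: the azeti Agent listens on TCP. The password is not checked here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    master_ip = "172.16.0.100"  # HA master from the example setup
    state = "reachable" if agent_port_reachable(master_ip) else "NOT reachable"
    print(f"Agent port 4192 on {master_ip} is {state}")
```

Run it from a machine in the same network segment as the HA slave. A negative result usually points to a firewall or routing problem rather than to the HA configuration itself.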
Verify the setup
To verify the functionality of the high availability setup on your SONARPLEX devices, follow these steps:
On SONARPLEX HA Slave
- Open User Web Interface > Monitoring > Services
- Check that the newly created HA service check reports "OK" with a note on the synchronized files, for example:
Failover behavior: automatic, HA mode 0: Monitor process is up, last complete syncronization: 2014-08-20 08:13:28 (302 files)
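If you want to evaluate this status line in a script, for example to alert when the last complete synchronization is too old, the fields can be extracted with a regular expression. The sketch below is written against the single sample line above only; the exact output format of "check_azeti_ha" in other states is an assumption, and the names `HaStatus` and `parse_ha_status` are illustrative. The pattern deliberately keeps the spelling used in the sample output.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class HaStatus(NamedTuple):
    failover_behavior: str
    ha_mode: int
    mode_text: str
    last_sync: Optional[datetime]
    synced_files: Optional[int]


# Mirrors the sample line, including its "syncronization" spelling.
# Assumption: the synchronization suffix may be absent, so it is optional here.
_STATUS_RE = re.compile(
    r"Failover behavior: (?P<behavior>\w+), "
    r"HA mode (?P<mode>\d+): (?P<text>[^,]+)"
    r"(?:, last complete syncronization: "
    r"(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<files>\d+) files\))?"
)


def parse_ha_status(line: str) -> HaStatus:
    """Split one status line of the HA service check into its fields."""
    match = _STATUS_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized HA status line: {line!r}")
    stamp = match["stamp"]
    files = match["files"]
    return HaStatus(
        failover_behavior=match["behavior"],
        ha_mode=int(match["mode"]),
        mode_text=match["text"].strip(),
        last_sync=datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S") if stamp else None,
        synced_files=int(files) if files else None,
    )


if __name__ == "__main__":
    sample = (
        "Failover behavior: automatic, HA mode 0: Monitor process is up, "
        "last complete syncronization: 2014-08-20 08:13:28 (302 files)"
    )
    print(parse_ha_status(sample))
```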
On SONARPLEX HA Master
- Once the synchronization is complete, shut down the SONARPLEX HA master device to test HA functionality.
Final result
Let the SONARPLEX HA slave's service check "check_azeti_ha" run through the following HA modes:
Check result | HA mode |
---|---|
OK | HA mode 0: Monitor process is up |
WARNING | HA mode 1: Machine seems to be down |
CRITICAL | HA mode 2: Machine seems to be down |
2nd CRITICAL | Machine reboots with HA master configuration |
The SONARPLEX HA slave then reboots with all configurations, log files and graphs from the last complete synchronization.
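The table implies that the master has to fail several consecutive checks before the slave takes over, so the check interval directly determines how long a failover takes. The rough estimate below assumes that every run of the service check advances exactly one row in the table; this is not stated explicitly here and should be confirmed on your devices. Retry intervals and the reboot time of the slave are ignored, and the function name is illustrative.

```python
from typing import Tuple

# Check results reported by the slave after the master stops responding,
# in the order given by the table of HA modes.
FAILOVER_SEQUENCE = [
    ("WARNING", "HA mode 1: Machine seems to be down"),
    ("CRITICAL", "HA mode 2: Machine seems to be down"),
    ("2nd CRITICAL", "Machine reboots with HA master configuration"),
]


def failover_delay_minutes(check_interval_min: float) -> Tuple[float, float]:
    """Estimate (best case, worst case) minutes from master outage to slave reboot.

    Assumption: each check run advances exactly one step in FAILOVER_SEQUENCE.
    """
    steps = len(FAILOVER_SEQUENCE)
    # Best case: the master fails just before a check runs.
    best = (steps - 1) * check_interval_min
    # Worst case: the master fails just after a check has run.
    worst = steps * check_interval_min
    return best, worst


if __name__ == "__main__":
    for interval in (3, 5):
        best, worst = failover_delay_minutes(interval)
        print(f"{interval} min interval: reboot after roughly {best} to {worst} min")
```

Under these assumptions, a shorter check interval shortens the monitoring gap proportionally, which is why an interval of 3 to 5 minutes is suggested for the HA service check.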
How to trigger a failover manually
There are three ways to trigger a manual failover for a functionality test or scheduled maintenance. All of them result in the slave SONARPLEX becoming the master SONARPLEX.
Shut down the HA master SONARPLEX
The first method is to shut down the HA master SONARPLEX completely. To do this, follow these steps:
- Open the Administration Web Interface > Status > Summary
- Click on "Click here to shut your appliance down"
- Wait for the HA slave SONARPLEX to take over
Stop the HA master SONARPLEX monitoring process
The next way to trigger a manual failover is to stop the monitoring process on the HA master SONARPLEX:
- Open the Administration Web Interface > Status > Monitor
- Click on "Stop the monitor process"
- Wait for the HA slave SONARPLEX to take over
Disconnect from the network (not recommended)
The last way is to disconnect the HA master SONARPLEX from the network so that the HA slave SONARPLEX can no longer reach it. This is not recommended, because the monitoring process on the former HA master SONARPLEX keeps running and you can end up with inconsistent graphs and logs once the connection has been re-established.
- Disconnect the Ethernet connection from the HA master SONARPLEX
- Wait for the HA slave SONARPLEX to take over
How to switch back after failover
If a failover has taken place and all problems have been solved, you may want to switch back to the original state of the setup (the former slave acting as slave and the former master acting as master again). To do so, start both SONARPLEX appliances and connect them to your network so that they can communicate with each other. The former HA slave SONARPLEX detects that the HA master SONARPLEX is back online and automatically synchronizes the collected graphs and log files. After completion, it restores its former configuration, in which the only check is the HA master check.
Only log files and graphs are synchronized back to the HA master SONARPLEX! Configuration changes made on the HA slave SONARPLEX while it was running as HA master are not transferred back and will be lost! If you changed the configuration, create a backup and save it before switching back.
How to proceed in case of hardware failures/RMA
If one of your HA members suffers a hardware failure and needs to be replaced, contact support@azeti.net for assistance in creating an RMA. The following article describes the procedure after receiving your RMA replacement device.
TODO