Diagnostics

Last updated: September 17, 2018


1. Logs

Despite careful testing, unexpected errors cannot be ruled out entirely, and such errors are difficult to diagnose without looking at the operating system.

One option is to forward the log entries generated on the system to a syslog server. However, the log entries of the individual monitoring sites are not processed via syslog, so they are not forwarded and can only be viewed on the device.
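
To check that forwarded entries actually arrive, a small test receiver can help. The following is a minimal sketch, not part of the appliance: it assumes the device is configured to forward via UDP, and the port 5514 and the names decode_pri and listen are chosen here for illustration only. The split of the leading <PRI> value into facility and severity follows the syslog protocol (RFC 3164 / RFC 5424).

```python
import socket


def decode_pri(message):
    """Split the leading <PRI> value into (facility, severity).

    Returns None if the message does not start with a valid <PRI> field.
    """
    if not message.startswith("<"):
        return None
    end = message.find(">")
    if end == -1 or not message[1:end].isdigit():
        return None
    pri = int(message[1:end])
    # PRI = facility * 8 + severity
    return pri // 8, pri % 8


def listen(host="0.0.0.0", port=5514):
    """Print every UDP syslog datagram received (port 5514 is an assumption)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, addr = sock.recvfrom(8192)
            text = data.decode("utf-8", errors="replace")
            print(addr[0], decode_pri(text), text.rstrip())
```

Calling listen() on a test machine and pointing the device's syslog forwarding at that machine should print one line per received entry. A production syslog server remains the better choice for keeping the entries permanently.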

In order to make diagnostics on the device easier, there is a view that displays the device’s various log files. You can go to this view by clicking on the menu item Log files in the main menu of the web interface.

You can select the logs of the device here and view their current content.

The system log is reinitialised each time the device is started up. If you would like to keep the log entries, you must send them to a syslog server.

You can also view the system log on the local console. The last entries of the system log are displayed on the second terminal. You can get to this terminal via the key combination CTRL + ALT + F2. All kernel messages are displayed on the third terminal. In the case of hardware problems, you will find the relevant messages here. This terminal can be accessed via the key combination CTRL + ALT + F3. The key combination CTRL + ALT + F1 will take you back to the status screen.

2. Available Memory

The system memory of the device is available to your monitoring sites, minus the amount of memory needed by the system processes of the Check_MK Appliance.

To provide a stable system platform, a fixed amount of memory is reserved for the mandatory system processes. The exact amount of reserved memory depends on your device configuration:

  • Standalone device (no cluster configuration): 100 MB
  • Clustered: 300 MB
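
The reservation described above amounts to a simple subtraction. As a rough sketch, the memory left for the monitoring sites can be estimated as follows. The function name available_memory_mb and the mode labels are hypothetical, and the calculation ignores anything beyond the fixed reservation stated in the list.

```python
# Reserved memory per device configuration, in MB (values from the list above).
RESERVED_MB = {"standalone": 100, "clustered": 300}


def available_memory_mb(total_mb, mode):
    """Estimate the memory (in MB) left for monitoring sites.

    total_mb: total system memory of the device in MB
    mode: "standalone" or "clustered" (labels chosen for this sketch)
    """
    if mode not in RESERVED_MB:
        raise ValueError("unknown device configuration: %r" % mode)
    # Never report a negative amount on very small systems.
    return max(total_mb - RESERVED_MB[mode], 0)
```

For example, a device with 4096 MB of system memory would leave about 3996 MB for the monitoring sites as a standalone device and about 3796 MB in a cluster.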

If you want to know exactly how much memory is available to your monitoring sites and how much is currently used, you can monitor your device using Check_MK. After service discovery, the host automatically includes a service named User_Memory, which shows you the current and historical values.

If your monitoring sites try to consume more memory than is available, one of the processes of the monitoring sites is automatically killed. This is done by standard mechanisms of the Linux kernel.
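
When the kernel kills a process for this reason, it normally writes an "Out of memory" line to the kernel log. The sketch below scans log lines for such entries. It is an illustration only: the exact wording of the message differs between kernel versions (the pattern assumes the common "Kill process" and "Killed process" forms), and the function name find_oom_kills is hypothetical.

```python
import re

# Matches e.g. "Out of memory: Kill process 4321 (name) ..." and
# "Out of memory: Killed process 4321 (name) ...". The wording is an
# assumption and may differ on the kernel version in use.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(lines):
    """Return a list of (pid, process_name) for each OOM kill found."""
    kills = []
    for line in lines:
        match = OOM_PATTERN.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

The function can be fed with lines copied from the kernel messages or from a syslog server that receives the forwarded system log.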