Last updated: September 17, 2018

To preserve your monitoring data in the event of a hardware failure or similar loss, a backup of your data can be configured via your appliance’s web user interface.

To be certain the data really is backed up, it must be saved to another device, e.g. a file server. To do this, first use mount management to configure the network file share to be used for the backup; this share will later be defined as the target when configuring the data backup. Once that is done, a backup job can be created that saves a backup of your system to the network share at predefined intervals.

The full backup includes all of the configuration defined on the system, installed files, and your monitoring instances.

The backup is executed online, during active operation. However, this is only fully supported when all monitoring instances on the appliance use Check_MK 1.2.8p6, 1.4.0i1, or a daily build from 22.07.2016 or newer. Active instances running older versions will be stopped before the backup and restarted afterwards.

1. Automatic backup

To set up an automatic data backup, configure one or more backup jobs. For each backup job, one backup data set is kept on the target system. The previous backup is only deleted once each new backup has completed – meaning that double the storage space will temporarily be required on the target system.

The backup does not manage multiple generations. If you need to retain more copies over an extended time frame, you will have to create them yourself.
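Since only the latest backup is kept on the target, extra generations have to be copied aside by your own means, for example with a small cron script on the file server. The following is a hedged sketch of such a rotation; the paths, the directory name patterns, and the retention count are made-up examples, not appliance defaults.

```shell
# Rotation sketch: copy the most recently completed backup directory to
# an archive location under a timestamped name, then prune old copies.
# All paths and the KEEP count are illustrative assumptions.
rotate_backups() {
  target=$1    # the configured backup target, e.g. /mnt/auto/backup
  archive=$2   # where extra generations are kept
  keep=$3      # number of generations to retain

  # Newest finished backup (directories end in "-complete", see section 3).
  latest=$(ls -dt "$target"/*-complete 2>/dev/null | head -n 1)
  [ -n "$latest" ] || return 0   # nothing finished yet

  # Keep a dated copy of the finished backup.
  cp -a "$latest" "$archive/$(basename "$latest").$(date +%Y%m%d%H%M%S)"

  # Drop the oldest copies beyond the retention count.
  ls -dt "$archive"/* | tail -n +$((keep + 1)) | xargs -r rm -rf
}
```

Run from cron on the file server, this keeps the last few generations without the appliance itself having to manage them.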

2. Configuring the backup

Using mount management, first configure your network file share. In our example a network share is mounted at the path /mnt/auto/backup.

Next, select the Appliance Backup item in the web interface’s main menu, and on the following page open Backup targets. Then create a New backup target. The ID and the title can be chosen freely. Under Target directory for backup, enter the path of the mounted file share – in this case /mnt/auto/backup. The Is a Mount-Point option must be active if you are backing up to a network share – it causes the backup to verify that the share really is mounted before writing to it.
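Conceptually, the Is a Mount-Point check amounts to testing whether the target path is an active mount point before writing – this sketch shows the idea with the standard `mountpoint` utility; it is an illustration of the check, not the appliance’s actual code, and the path is the example from above.

```shell
# Illustration of the "Is a Mount-Point" safeguard: refuse to back up
# into the local directory if the network share is not mounted there
# (otherwise the backup would silently fill the local disk).
is_mounted() {
  mountpoint -q "$1"
}

if is_mounted /mnt/auto/backup; then
  echo "share is mounted - safe to write the backup"
else
  echo "share is NOT mounted - aborting would be the safe choice" >&2
fi
```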

Once the backup target has been created, return to the Appliance backup page and from there select New job. Here again you can choose an ID and a title. Next, select the newly-created backup target and define the desired periods for running the backup.

After saving, you will see an entry for your new backup job on the Appliance backup page, with the scheduled time of the next execution shown at the end of the line. As soon as the job has started or completed, its status appears in this view. You can also start jobs manually here, or interrupt running backups if needed.

To test your newly-created job, click the Play icon. The table will show that the job is currently running. By clicking the Log icon you can follow the job’s progress in the form of log output.

As soon as the backup has completed this will also be shown in the table.

3. Backup format

Every backup job creates a directory on the backup target. This directory’s name conforms to the following schema:

  • Appliance backups: Check_MK_Appliance-[HOSTNAME]-[LOCAL_JOB_ID]-[STATE]
  • Instance backups: Check_MK-[HOSTNAME]-[SITE]-[LOCAL_JOB_ID]-[STATE]

In the placeholder fields, any - (minus) characters are replaced by + so that they cannot be confused with the field separators.
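The naming schema and the minus substitution can be sketched in shell as follows; the hostname and job ID are made-up examples.

```shell
# Build an appliance backup directory name from its fields, replacing
# "-" inside the field values with "+" so the "-" separators between
# fields stay unambiguous. Example values only.
escape_field() {
  printf '%s' "$1" | tr '-' '+'
}

backup_dir_name() {
  host=$(escape_field "$1")
  job=$(escape_field "$2")
  state=$3
  printf 'Check_MK_Appliance-%s-%s-%s' "$host" "$job" "$state"
}

backup_dir_name "my-appliance" "nightly-job" "complete"
# -> Check_MK_Appliance-my+appliance-nightly+job-complete
```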

During the backup the directory will be saved with the suffix: -incomplete. Once completed the directory is renamed and the suffix changed to: -complete.
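The suffix switch works like a plain directory rename; in shell terms (with an example directory name):

```shell
# While the job runs, the target directory carries the -incomplete
# suffix; once the job finishes it is renamed to -complete. A consumer
# (e.g. a rotation script) should only ever touch -complete directories.
work=$(mktemp -d) && cd "$work"
dir=Check_MK_Appliance-myhost-mybackup

mkdir "${dir}-incomplete"                 # created when the backup starts
# ... archives are written here while the job runs ...
mv "${dir}-incomplete" "${dir}-complete"  # renamed once the job finishes
```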

A file containing the metadata pertaining to the backup is saved in the directory. Alongside this file, a number of archives are saved to the directory.

The archive named system contains the appliance’s configuration, and system-data contains the data file system’s contents, excluding that of the monitoring instances. The monitoring instances are saved in separate archives that follow the site-[SITENAME] naming schema.

Depending on the backup’s mode, these data sets are saved with the .tar file extension for uncompressed and unencrypted, .tar.gz for compressed but unencrypted, and .tar.gz.enc for compressed and encrypted archives.
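The three modes correspond to familiar command-line tooling; the following sketch produces archives with the same three extensions. Note that the appliance’s actual archiving pipeline and cipher are not documented here – `openssl` with aes-256-cbc is purely an illustrative assumption.

```shell
# Conceptual sketch of the three archive modes. The tools and the
# cipher are assumptions for illustration, not the appliance's own.
work=$(mktemp -d) && cd "$work"
mkdir -p demo/site && echo "data" > demo/site/file

tar -cf  site-demo.tar    -C demo site   # uncompressed, unencrypted
tar -czf site-demo.tar.gz -C demo site   # compressed, unencrypted
tar -czf - -C demo site \
  | openssl enc -aes-256-cbc -pbkdf2 -pass pass:secret \
  > site-demo.tar.gz.enc                 # compressed and encrypted
```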

4. Encryption

If you want to encrypt your backup, you can configure this directly in the web user interface. Your backup data will then be completely encrypted before being transferred to the backup target. The encryption uses a previously-created encryption key, which is protected by a password defined when the key is created. Retain both the key and the password securely, as only with them can the backed-up data be restored.

To this end, open the Backup page and from there select the Encryption key page. Here you can create a new encryption key. When entering the password be sure to use a sufficiently complex character string – the longer and more complex your password, the harder it is for an attacker to decrypt your key and thus your backup.

Once you have created your key, download it and retain it in a secure location.

An encrypted backup can only be restored with the encryption key and its corresponding password. Now, on the Appliance backup page, edit the backup job that is to create the encrypted backups, activate the Encryption option there, and select the freshly-created encryption key.

Once you have confirmed the dialog, the next backup will be automatically encrypted.

5. Compression

It is possible to compress the data during the copy procedure. This can be useful if you need to save bandwidth or if space on the target system is limited.

Please be aware, however, that compression requires noticeably more CPU time, so the backup procedure will take longer. As a rule it is advisable not to activate compression.

Uncompressed backups are only supported from Check_MK version 1.2.8p5 onwards. If you run monitoring instances with older versions, you must activate compression for the complete backup.

6. Recovery

The web user interface’s built-in functions only allow a complete restore; restoring individual files via the web interface is not supported. It is nevertheless possible via the command line, by manually unpacking the backup.
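A manual single-file restore can be sketched as follows. Since the real archive layout on your backup target may differ, this example builds a stand-in archive first; in practice you would point `tar` at the instance archive from section 3 and always list its contents before extracting.

```shell
# Hedged sketch: restoring one file by hand from an instance archive.
# The archive here is a stand-in built on the spot; with a real backup,
# first list the archive (tar -tzf) to find the path you need.
work=$(mktemp -d) && cd "$work"
mkdir -p site/etc && echo "retain me" > site/etc/needed.conf
tar -czf site-mysite.tar.gz site          # stand-in for the real archive

# List contents to find the path of the file you need:
tar -tzf site-mysite.tar.gz

# Extract only that file into a scratch directory:
mkdir restore
tar -xzf site-mysite.tar.gz -C restore site/etc/needed.conf
cat restore/site/etc/needed.conf          # -> retain me
```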

If you wish to restore a complete backup on a currently running appliance, select the Restore item on the Appliance backup page and on the next page select the backup target from where you want to source the backed up data. Once the backup target has been selected a list of all of its available backups will be shown.

Next, click the arrow beside the backup you wish to restore; after confirming a security prompt, the restore will start.

While the restore is running, you can follow its progress on the automatically-displayed Restore page by refreshing it.

At the end of the restore the appliance will restart automatically; after this restart the restore is complete.

6.1. Disaster recovery

If you need to completely restore an appliance, disaster recovery consists of the following steps:

  • Start from an appliance with the factory default configuration (a new, identical appliance, or one that has been reset to the factory defaults).
  • Ensure that the firmware version matches that of the backup.
  • Configure the following minimum settings on the console:
      • the network settings
      • access to the web interface
  • In the web interface, configure:
      • the backup target from which you wish to restore
      • for an encrypted backup, upload the encryption key
  • Now start the restore as described in the preceding chapter.

7. Monitoring

From Check_MK version 1.4.0i1, the service discovery on the appliance finds a new service for every configured backup job: Backup [JOB-ID]. This service reports potential problems with the backup and displays useful values such as its size and duration.

8. Special features with clusters

The complete backup configuration, including the encryption keys, is synchronised between the cluster nodes. The cluster nodes run their backups separately, and likewise save their backups in separate directories on the backup target.

The active cluster node backs up the complete appliance, including the data from the data file system and the monitoring site. The inactive cluster node saves only its local appliance configuration.

Consequently, when restoring a backup, only a backup made by the active cluster node can restore the monitoring instances.