Monitoring VMware ESXi

Last updated: November 18, 2016


1. Introduction

With Check_MK you can monitor ESXi hosts and their VMs. On the host it is possible, for example, to query disk I/O, datastore performance, the status of the physical network interfaces, various hardware sensors, and much more. Check_MK likewise offers a series of check plug-ins for the VMs. A comprehensive list of these can be found in the Catalog of check plug-ins, in the "VMWare ESX" section.

Using the piggyback technique, a VM's data is displayed directly on the host that represents this VM in Check_MK. The VM-related data is thus found right where it is actually needed, and where it can be compared with the values reported by the VM's own OS.

This data is accessed via the HTTP-based vSphere API, not via the normal agent or SNMP. This means that no agent or other software needs to be installed on the ESXi hosts, and that the access is very simple to set up. Older systems, from version 4.1 onwards, are also supported.

2. Setting up

2.1. Setting up via the ESXi host system

The initial setup for monitoring an ESXi server is very simple and can be completed in less than five minutes. Before you can set up the access, however, the following prerequisites must be satisfied:

  • You must have defined a user on the ESXi server. Read access is sufficient for this user.
  • You must have defined the ESXi server as a host in Check_MK and set its agent type to Check_MK Agent. Tip: choose the same host name as the one known to the server itself.

Once the prerequisites have been satisfied, you can create a rule in the Check state of VMWare ESX via vSphere rule set. Assign this rule to the defined host, so that the special agent for VMware monitoring is used to retrieve the data instead of the standard agent.

Enter the user name and password as they have been defined on the ESXi server. The rule's condition must be set to the host defined in Check_MK. This completes the initial setup, and Check_MK can retrieve the data from the server.

Finally, go back to the host configuration, execute a service discovery, then activate the changes as usual. If no services are identified, you can search for errors in the configuration using the diagnostic options described in section 3.

2.2. Setting up using vCenter

If a vCenter is available, you can also retrieve much of your ESXi environment's data through it. This method has both advantages and disadvantages:

Advantages:

  • Simple application in situations where VMs are assigned dynamically using vMotion.
  • Monitoring of a cluster's total RAM usage is possible.
  • Monitoring of the licenses' overall status is possible.

Disadvantages:

  • No monitoring if vCenter is unavailable.
  • No monitoring of hardware-specific data in the cluster's nodes.

Both methods can also be combined, which gives you the advantages of each.

Configuring the vCenter

Similar prerequisites apply to this configuration as to the configuration via a single ESXi server:

  • A user with read access exists on the vCenter.
  • The vCenter has been defined as a host in Check_MK, with its agent type set to Check_MK Agent.
  • If the ESXi servers have already been configured in Check_MK and you wish to combine both monitoring methods, their names in vCenter must be the same as their host names in Check_MK.

As described earlier, create a rule for the VMware monitoring's special agent, select the vCenter under Type of Query, and set the condition to the appropriate host as defined in Check_MK.

This completes the configuration. As described above, execute a service discovery for the vCenter host.

Retrieving from ESXi-Hosts and vCenter

In order to avoid retrieving the same data twice when combining both configuration methods, the rule for the vCenter can be configured to retrieve only specific data. One possibility is to retrieve the Datastores and the Virtual Machines via the vCenter, and the other data directly from the ESXi hosts. The license usage can be retrieved in both configurations, since the vCenter reports an overall status.

If you have already configured the ESXi hosts, adapt their rules accordingly. Here it makes sense to retrieve only the Host Systems and Performance Counters, since these always belong to one particular ESXi server. The license status then applies only to the queried ESXi server.
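
The division described above can be illustrated with a small Python sketch. The labels are merely illustrative names for the data types mentioned in the text, not Check_MK configuration keys, and the helper function is hypothetical. It reports which data types both rules would retrieve, leaving aside the license status, which may be retrieved in both configurations.

```python
# Illustrative labels only; these are not Check_MK configuration keys.
VCENTER_RULE = {"datastores", "virtual machines", "licenses"}
ESXI_RULE = {"host systems", "performance counters", "licenses"}


def duplicated_retrieval(vcenter, esxi, allowed=frozenset({"licenses"})):
    """Return the data types both rules retrieve, minus the allowed overlap."""
    return sorted((vcenter & esxi) - allowed)


# An empty list means that no data type is retrieved twice unintentionally.
print(duplicated_retrieval(VCENTER_RULE, ESXI_RULE))
```

If the vCenter rule additionally retrieved the host systems, the function would list that data type as duplicated.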

2.3. Monitoring the VMs

By default, only the status of each VM is created as a service, and these services are assigned to the ESXi host or to the vCenter respectively. There is, however, more information available about these VMs, for example on RAM or snapshots. This data is retrieved via the ESXi host or the vCenter and stored as piggyback data.

In order to make this data visible, the VM must be defined as a host in Check_MK. You can of course also install the Check_MK agent on the VM and take full advantage of its functions. The piggyback data is then simply added to the data already available.

Renaming the piggyback data

The host name in Check_MK must be identical to the VM's name so that the data is assigned correctly. The piggyback data is named after the VM as it is known on the ESXi server. If these names do not match, there are several options in Check_MK for making the piggyback names conform. In the configuration rule itself the following are available:

  • From version 1.4.0i1 it is possible to use the host name of the OS on the VM, if this can be accessed via the vSphere API.
  • If the VM's name contains blanks, the name is cut off at the first blank. Alternatively, the blanks can be replaced with underscores.
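
The two ways of handling blanks can be illustrated with a short Python sketch. This is not Check_MK's own implementation; the function and the mode names "cut" and "underscore" are chosen here purely for illustration.

```python
def piggyback_name(vm_name, spaces="cut"):
    """Derive a piggyback host name from a VM name that may contain blanks.

    spaces="cut":        keep only the part before the first blank
    spaces="underscore": replace every blank with an underscore
    """
    if spaces == "cut":
        return vm_name.split(" ", 1)[0]
    if spaces == "underscore":
        return vm_name.replace(" ", "_")
    raise ValueError("unknown mode: %r" % spaces)


# Hypothetical VM name containing blanks.
print(piggyback_name("myVM 123 test"))
print(piggyback_name("myVM 123 test", spaces="underscore"))
```

With cutting, different VMs such as "myVM 123" and "myVM 124" would end up with the same piggyback name, which is one reason to prefer underscores in such environments.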

If the host's name in Check_MK is entirely different, an explicit assignment can be made with the help of the Hostname translation for piggybacked hosts rule.

Once the host is configured in Check_MK and the names match, you can activate the Display VM power state on check box in the configuration rule. This option determines whether and where the data is made available. Select The Virtual Machine here.

With a service discovery on the host(s), the new services will now be identified and can be activated. Be aware that the values reported by different services may differ from one another: the ESXi server, for example, sees a virtual machine's RAM usage differently from how the machine's own OS reports it.

3. Diagnostic options

When searching for the source of an error there are a number of places to check. Since the data comes from the ESXi or vCenter server, this is the logical place to start. After that, what matters is that the data reaches the Check_MK server and can be correctly processed and displayed there.

For problems with an ESXi-/vCenter-Server configuration:

With the curl command you can verify whether the server is accessible to the monitoring:

OMD[mysite]:~$ curl -Ik https://myESXhost.my-domain.net
HTTP/1.1 200 OK
Date: Fri, 4 Nov 2016 14:29:31 GMT
Connection: Keep-Alive
Content-Type: text/html
X-Frame-Options: DENY
Content-Length: 5426
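
The same reachability test can be sketched in Python, for instance where curl is not at hand. This is only a minimal sketch using the standard library; the function names are made up for illustration, and certificate verification is switched off in the same way as with curl's -k option.

```python
import ssl
import urllib.request


def build_head_request(hostname):
    """Build a HEAD request for the server's HTTPS start page."""
    return urllib.request.Request("https://%s/" % hostname, method="HEAD")


def probe(hostname, timeout=10):
    """Return the HTTP status code, ignoring certificate errors (like curl -k).

    urlopen raises urllib.error.HTTPError for 4xx/5xx answers and
    urllib.error.URLError if the server cannot be reached at all.
    """
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    request = build_head_request(hostname)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        return response.status
```

For a server that answers as in the curl example, probe("myESXhost.my-domain.net") would be expected to return 200.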

Whether the access data has been entered correctly, and whether Check_MK can access the host, can be tested on the console with the special agent. Use the --help or -h option to get a complete list of the available options. In the example, grep limits the output to a specific section header and the four lines following it. You can omit this part to get the complete output, or filter for another section:

OMD[mysite]:~$ share/check_mk/agents/special/agent_vsphere --debug --user myesxuser --secret myesxpassword -D myESXhost | grep -A4 esx_vsphere_objects
<<<esx_vsphere_objects:sep(9)>>>
hostsystem      myESXhost           poweredOn
hostsystem      myESXhost2          poweredOn
virtualmachine  myVM123             myESXhost   poweredOn
virtualmachine  myVM126             myESXhost   poweredOn

Whether Check_MK itself can retrieve the data from the host can likewise be verified on the console, using cmk -d. Here too the output is limited to five lines:

OMD[mysite]:~$ cmk -d myESXhost | grep -A4 esx_vsphere_objects
<<<esx_vsphere_objects:sep(9)>>>
hostsystem      myESXhost           poweredOn
hostsystem      myESXhost2          poweredOn
virtualmachine  myVM123             myESXhost   poweredOn
virtualmachine  myVM126             myESXhost   poweredOn

Alternatively, this test can also be performed in WATO.

If everything works up to this point, the output should have been saved to a temporary directory. Whether such a file has been created, and whether its content is correct, can be checked as follows:

OMD[mysite]:~$ ll tmp/check_mk/cache/myESXhost
-rw-r--r-- 1 mysite mysite 17703 Nov  4 15:42 myESXhost
OMD[mysite]:~$ head -n5 tmp/check_mk/cache/myESXhost
<<<esx_systeminfo>>>
Version: 6.0
AgentOS: VMware ESXi
<<<esx_systeminfo>>>
vendor VMware, Inc.
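
To get a quick overview of a larger cache file, you could count the data lines per section. The following Python sketch does this for agent output passed in as a string. The function name is hypothetical, and the sample merely extends the lines shown above with illustrative data, including a placeholder piggyback block.

```python
# Sample modelled on the cache file excerpt above, extended for illustration.
SAMPLE = (
    "<<<esx_systeminfo>>>\n"
    "Version: 6.0\n"
    "AgentOS: VMware ESXi\n"
    "<<<esx_systeminfo>>>\n"
    "vendor VMware, Inc.\n"
    "<<<esx_vsphere_objects:sep(9)>>>\n"
    "hostsystem\tmyESXhost\tpoweredOn\n"
    "<<<<myVM123>>>>\n"
    "<<<esx_vsphere_vm>>>\n"
    "placeholder line\n"
    "<<<<>>>>\n"
)


def section_line_counts(agent_output):
    """Count the data lines below each <<<section>>> header of the host itself."""
    counts = {}
    current = None
    in_piggyback = False
    for line in agent_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("<<<<") and stripped.endswith(">>>>"):
            # "<<<<name>>>>" opens a piggyback block, "<<<<>>>>" closes it.
            in_piggyback = stripped != "<<<<>>>>"
            current = None
            continue
        if in_piggyback:
            continue  # data destined for other hosts is not counted here
        if stripped.startswith("<<<") and stripped.endswith(">>>"):
            # Drop the brackets and any options such as ":sep(9)".
            current = stripped[3:-3].split(":", 1)[0]
            counts.setdefault(current, 0)
        elif current is not None and stripped:
            counts[current] += 1
    return counts


print(section_line_counts(SAMPLE))
```

For a real file, the content could be read with open("tmp/check_mk/cache/myESXhost").read() and passed to the function.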

Problems with piggyback data:

Check_MK creates a directory for each host that receives piggyback data. Each directory contains a text file with the data that is to be assigned to this host.

OMD[mysite]:~$ ll tmp/check_mk/piggyback/
total 0
drwxr-xr-x 2 mysite mysite 60 Nov  4 15:51 myVM123/
drwxr-xr-x 2 mysite mysite 60 Nov  4 15:51 myVM124/
drwxr-xr-x 2 mysite mysite 60 Nov  4 15:51 myVM126/
drwxr-xr-x 2 mysite mysite 60 Nov  4 15:51 myESXhost2/
OMD[mysite]:~$ ll tmp/check_mk/piggyback/myVM123/
-rw-r--r-- 1 mysite mysite 1050 Nov  4 15:51 myESXhost

If these directories or files are absent, they have not been created by the special agent. In that case, look in the configuration rule for the ESXi or vCenter host to see whether the data retrieval has been activated. You can check as follows whether the VM's data is included in the agent's output:

OMD[mysite]:~$ grep "<<<<myVM123>>>>" tmp/check_mk/cache/myESXhost
<<<<myVM123>>>>
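
In agent output, a line of the form <<<<name>>>> opens a block of piggyback data for the named host, and <<<<>>>> closes it. Instead of searching for one VM at a time with grep, all piggybacked hosts can be listed at once. The following Python sketch shows one way of doing this; the function name and the sample content are illustrative only.

```python
import re

# Matches "<<<<name>>>>" as well as the closing marker "<<<<>>>>".
PIGGYBACK_HEADER = re.compile(r"^<<<<(.*)>>>>$")

# Illustrative sample; the data line is a placeholder, not real agent output.
SAMPLE = (
    "<<<esx_systeminfo>>>\n"
    "vendor VMware, Inc.\n"
    "<<<<myVM123>>>>\n"
    "<<<esx_vsphere_vm>>>\n"
    "example_key example_value\n"
    "<<<<>>>>\n"
    "<<<<myVM126>>>>\n"
    "<<<esx_vsphere_vm>>>\n"
    "<<<<>>>>\n"
)


def piggybacked_hosts(agent_output):
    """Map each piggybacked host name to the lines inside its block."""
    blocks = {}
    current = None
    for line in agent_output.splitlines():
        match = PIGGYBACK_HEADER.match(line.strip())
        if match:
            # An empty name closes the current block.
            current = match.group(1) or None
            if current is not None:
                blocks.setdefault(current, [])
            continue
        if current is not None:
            blocks[current].append(line)
    return blocks


print(sorted(piggybacked_hosts(SAMPLE)))
```

A VM that is missing from the resulting names is not delivered by the special agent at all, which points to the configuration rule rather than to the piggyback directories.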

If there is a very large number of such piggyback directories, it can be very difficult to find those that are not assigned to a host. A script is provided with which unassigned piggyback hosts can easily be found:

OMD[mysite]:~$ share/doc/check_mk/treasures/find_piggy_orphans
myESXhost2

From the script's output it can be seen that Check_MK cannot find a host with the same name to which it can assign the data. As described above, the piggyback names can however be altered in a number of ways.
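
The idea behind such a check can be sketched in a few lines of Python. This is not the code of the script shipped with Check_MK, merely an illustration under the assumption that the names of all configured hosts are available as a list. Both function names are hypothetical.

```python
import os


def list_piggyback_names(piggyback_dir="tmp/check_mk/piggyback"):
    """Return the names of all subdirectories, one per piggybacked host."""
    return sorted(
        entry
        for entry in os.listdir(piggyback_dir)
        if os.path.isdir(os.path.join(piggyback_dir, entry))
    )


def find_orphans(piggyback_names, configured_hosts):
    """Return the piggyback names for which no host of the same name exists."""
    return sorted(set(piggyback_names) - set(configured_hosts))


# Names taken from the directory listing above; the host list is assumed.
piggyback_names = ["myVM123", "myVM124", "myVM126", "myESXhost2"]
configured_hosts = ["myESXhost", "myVM123", "myVM124", "myVM126"]
print(find_orphans(piggyback_names, configured_hosts))
```

With the assumed host list, only myESXhost2 remains, which matches the script output shown above. Within a site, list_piggyback_names() could supply the first argument.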

4. Files and directories

Path                                             Function
tmp/check_mk/piggyback/                          Check_MK saves the piggyback data here. For each host a subfolder is created with the host's name. This subfolder contains a text file with the host's data. The filename is the name of the host providing the data.
tmp/check_mk/cache/                              The latest agent output from each host is temporarily saved here. The content of a host's file is identical to the output of the cmk -d myhost command.
share/check_mk/agents/special/agent_vsphere      The special agent for querying ESXi and vCenter servers. This script can also be executed manually for testing purposes.
share/doc/check_mk/treasures/find_piggy_orphans  A script for finding piggyback data that is not assigned to a host.