5. Heartbeat resources details

This chapter provides detailed information on heartbeat resources.

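This chapter covers:

  • What are heartbeat resources?

  • Understanding LAN heartbeat resources

  • Understanding kernel mode LAN heartbeat resources

  • Understanding disk heartbeat resources

  • Understanding Witness heartbeat resources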

5.1. What are heartbeat resources?

Each server in a cluster monitors whether the other servers are active. Heartbeat resources are used for this monitoring.

  1. LAN heartbeat/kernel mode LAN heartbeat (primary interconnect)

    Two servers with a shared disk connected

    Fig. 5.1 LAN heartbeat/kernel mode LAN heartbeat (primary interconnect)

  2. LAN heartbeat/kernel mode LAN heartbeat (secondary interconnect)

    Two servers with a shared disk connected

    Fig. 5.2 LAN heartbeat/kernel mode LAN heartbeat (secondary interconnect)

  3. Disk heartbeat

    Two servers with a shared disk connected

    Fig. 5.3 Disk heartbeat

  4. Witness heartbeat

    Two servers with a shared disk connected

    Fig. 5.4 Witness heartbeat

Heartbeat resource name                    Abbreviation  Functional overview
-----------------------------------------  ------------  -----------------------------------------------
LAN heartbeat resource (1)(2)              lanhb         Uses a LAN to monitor whether servers are active.
                                                         Also used for communication within the cluster.
Kernel mode LAN heartbeat resource (1)(2)  lankhb        A kernel mode module uses a LAN to monitor
                                                         whether servers are active.
Disk heartbeat resource (3)                diskhb        Uses a dedicated partition on the shared disk
                                                         to monitor whether servers are active.
Witness heartbeat resource (4)             witnesshb     A module uses the Witness server to monitor
                                                         whether servers are active.

  • For the interconnect with the highest priority, configure a LAN heartbeat resource or kernel mode LAN heartbeat resource that can be exchanged among all servers.

  • Configuring at least two kernel mode LAN heartbeat resources is recommended unless it is difficult to add a network to an environment such as the cloud or a remote cluster.

  • It is recommended to register both an interconnect-dedicated LAN and a public LAN as LAN heartbeat resources.
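The recommendations above can be checked mechanically when planning a configuration. The following is a minimal sketch only: the abbreviations come from the table above, but the `Heartbeat` class, the `priority` field, and the `check_heartbeats` function are hypothetical names introduced for illustration and are not part of EXPRESSCLUSTER.

```python
from dataclasses import dataclass

# Abbreviations taken from the table above; everything else is hypothetical.
HEARTBEAT_TYPES = {"lanhb", "lankhb", "diskhb", "witnesshb"}
LAN_TYPES = {"lanhb", "lankhb"}


@dataclass(frozen=True)
class Heartbeat:
    kind: str       # one of HEARTBEAT_TYPES
    priority: int   # 1 = highest-priority interconnect (illustrative convention)


def check_heartbeats(heartbeats):
    """Return a list of warnings for a planned heartbeat layout."""
    warnings = []

    unknown = sorted({hb.kind for hb in heartbeats if hb.kind not in HEARTBEAT_TYPES})
    if unknown:
        warnings.append(f"unknown heartbeat type(s): {', '.join(unknown)}")

    # The highest-priority interconnect should carry a LAN or kernel mode LAN heartbeat.
    top = min((hb.priority for hb in heartbeats), default=None)
    if not any(hb.kind in LAN_TYPES and hb.priority == top for hb in heartbeats):
        warnings.append("highest-priority interconnect has no lanhb/lankhb")

    # Two or more kernel mode LAN heartbeats are recommended where feasible.
    if sum(hb.kind == "lankhb" for hb in heartbeats) < 2:
        warnings.append("fewer than two lankhb resources (two or more recommended)")

    return warnings
```

For example, a layout of two `lankhb` resources plus one `diskhb` resource produces no warnings, while a layout whose highest-priority entry is a `diskhb` resource produces two.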

5.2. Understanding LAN heartbeat resources

5.2.1. LAN heartbeat resources

  • You need to set at least one LAN heartbeat resource or kernel mode LAN heartbeat resource. It is recommended to have two or more LAN heartbeat resources: one dedicated to the interconnect and one shared between the interconnect and the public LAN.

  • Communication data for alert synchronization is transmitted on an interface that is registered with the interconnect. Take this network traffic into account when you configure the settings.

  • A LAN heartbeat resource/kernel mode LAN heartbeat resource monitors not only the other server but also the local server.
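As an illustration of what monitoring "not only the other server but also the local server" means, the sketch below tracks the last heartbeat received from every server, including the local one, and reports those that have exceeded a timeout. This is a generic timeout-based model, not the product's implementation; the class name, server names, and timeout value are all assumptions.

```python
class HeartbeatTracker:
    """Generic timeout-based heartbeat bookkeeping (illustrative only)."""

    def __init__(self, servers, timeout):
        self.timeout = timeout
        # The local server is tracked in the same way as the other servers.
        self.last_seen = {name: None for name in servers}

    def record(self, server, now):
        """Note that a heartbeat from `server` arrived at time `now`."""
        self.last_seen[server] = now

    def down_servers(self, now):
        """Return the servers whose heartbeat is missing or older than the timeout."""
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if seen is None or now - seen > self.timeout
        )
```

Timestamps are passed in explicitly so that the behavior is easy to follow: if `server1` and `server2` both report at time 100, and only `server1` reports again at time 125, then with a timeout of 30 `down_servers(now=140)` returns `["server2"]`.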

5.3. Understanding kernel mode LAN heartbeat resources

5.3.1. Environment where the kernel mode LAN heartbeat resources work

Note

This function depends on the distribution and kernel version. Refer to "Supported distributions and kernel versions" in "Software" in "Installation requirements for EXPRESSCLUSTER" in the "Getting Started Guide" before you configure the settings.

5.3.2. The settings of the kernel mode LAN heartbeat resources

Using the kernel mode driver module, kernel mode LAN heartbeat resources offer functions similar to those of LAN heartbeat resources. The kernel mode LAN heartbeat resources have the following features.

  • A kernel mode LAN heartbeat resource is less likely to be affected by the load on the OS because it uses the kernel mode driver. This reduces the chance that a disconnection of the interconnect is detected by mistake.

  • When used together with the keepalive setting of the user-mode monitor resource, the kernel mode LAN heartbeat resource allows a reset to be recorded on the other servers when a user-mode stall is detected.

5.3.3. Kernel mode LAN heartbeat resources

  • It is recommended to specify two or more kernel mode LAN heartbeat resources: one dedicated to the interconnect and one shared between the interconnect and the public LAN.

  • A LAN heartbeat resource/kernel mode LAN heartbeat resource monitors not only the other server but also the local server.

5.4. Understanding disk heartbeat resources

5.4.1. Setting the disk heartbeat resources

To use a disk heartbeat resource, the following settings are required.

  • Allocate a dedicated partition on the shared disk. (You do not need to create any file system.)

  • Configure settings that allow all servers to access the dedicated partition on the shared disk by the same device name.

When a disk heartbeat resource is used, each server can check whether the other servers are active even if the network is disconnected.

  1. The figure below shows two servers connected to a shared disk.
    One of the partitions on the shared disk is used for the disk heartbeat.
    Two servers and a shared disk

    Fig. 5.5 Disk heartbeat resource (1)

  2. One of the network connections between the two servers is disconnected.

    Two servers and a shared disk

    Fig. 5.6 Disk heartbeat resource (2)

  3. Even if all the network connections between the two servers are disconnected, the disk heartbeat resource prevents the file system on the shared disk from being corrupted by a split brain syndrome.

    Two servers and a shared disk

    Fig. 5.7 Disk heartbeat resource (3)
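The idea of exchanging heartbeats through a dedicated partition that has no file system can be sketched as follows. This is a conceptual model only and does not reflect the actual on-disk format used by the product: the slot size, the counter layout, and all function names are assumptions. An ordinary file can stand in for the partition when trying it out.

```python
import os
import struct

SLOT_SIZE = 512      # hypothetical: one sector-sized slot per server
SLOT_FORMAT = "<Q"   # hypothetical layout: a single 64-bit little-endian counter


def write_heartbeat(path, slot, counter):
    """Write this server's counter into its own slot on the shared device."""
    with open(path, "r+b") as dev:
        dev.seek(slot * SLOT_SIZE)
        dev.write(struct.pack(SLOT_FORMAT, counter))
        dev.flush()
        os.fsync(dev.fileno())  # push the update out to the device


def read_heartbeat(path, slot):
    """Read the counter that another server last wrote into its slot."""
    with open(path, "rb") as dev:
        dev.seek(slot * SLOT_SIZE)
        data = dev.read(struct.calcsize(SLOT_FORMAT))
    return struct.unpack(SLOT_FORMAT, data)[0]


def peer_is_alive(previous, current):
    """A peer counts as alive if its counter changed since the last check."""
    return current != previous
```

Because the counters travel through the shared disk rather than the LAN, a server in this model can still see that its peer is updating its slot after every network connection has failed, which is the situation shown in Fig. 5.7.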

If the cluster consists of three or more servers, you can use a disk heartbeat resource in a configuration such as the one below. You can configure the settings so that the disk heartbeat resource is used only among the servers in the cluster that use the shared disk.

For details, see "Interconnect tab" in "Cluster properties" in "2. Parameter details" in this guide.

Three servers and a shared disk

Fig. 5.8 Configuration with a disk heartbeat resource (three servers)

5.4.2. Disk heartbeat resources

  • It is recommended to use both a LAN heartbeat resource and a disk heartbeat resource when you use a shared disk.

  • In each LUN, allocate a partition dedicated to a disk heartbeat. LUNs that do not use a disk heartbeat should also have a dummy partition, because the file system can be damaged if device names shift due to a disk failure or other causes.
    Partitions dedicated to a disk heartbeat should have the same partition number across all the LUNs.
    The figure below shows two storages, each of which contains four LUNs.
    A partition dedicated to a disk heartbeat is allocated to every LUN in the storages. However, only the disk heartbeat partitions on LUN 1-1 and LUN 2-1 are actually used.
    The partitions on the other LUNs are dummy partitions that are not actually used. They protect the file system from damage if a device name changes unintentionally.
    Two storage chassis with LUNs

    Fig. 5.9 Partition dedicated to a disk heartbeat

  • Do not register the partition for a disk heartbeat in a storage pool.
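The rule that the disk heartbeat partition (or its dummy) should have the same partition number on every LUN can be checked with a small script. The sketch below is illustrative: the function names and the device names in the example are hypothetical, and the partition number is assumed to be the trailing digits of the device name.

```python
import re


def partition_number(device):
    """Extract the trailing partition number from a name such as /dev/sdb1."""
    match = re.search(r"(\d+)$", device)
    if match is None:
        raise ValueError(f"no partition number in {device!r}")
    return int(match.group(1))


def same_partition_number(heartbeat_partitions):
    """Return True if every LUN's heartbeat (or dummy) partition uses the same number.

    `heartbeat_partitions` maps a LUN label to the device name of its
    heartbeat or dummy partition.
    """
    numbers = {partition_number(dev) for dev in heartbeat_partitions.values()}
    return len(numbers) <= 1
```

For instance, `{"LUN 1-1": "/dev/sdb1", "LUN 1-2": "/dev/sdc1"}` passes the check, whereas mixing `/dev/sdb1` with `/dev/sdc2` does not.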

5.5. Understanding Witness heartbeat resources

5.5.1. Settings of the Witness heartbeat resources

To use the Witness heartbeat resources, the following settings are required.

  • Communication must be available between every server that uses Witness heartbeat resources and the server on which the Witness server service runs (the Witness server). For the Witness server, refer to "Witness server service" in "8. Information on other settings".

Witness heartbeat resources regularly check the server alive information that the Witness server retains. In addition, when the HTTP network partition resolution resource is also used, a communication disconnection between the local server and the Witness server can be distinguished from a communication disconnection between other servers and the Witness server while the Witness heartbeat resources are operating.
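The two cases that are distinguished can be pictured with the following sketch. It only models the distinction described above and is not the algorithm used by the product: the `WitnessVerdict` enumeration, the `classify` function, and its parameters are hypothetical, and the timeout value is whatever the caller chooses.

```python
from enum import Enum


class WitnessVerdict(Enum):
    PEER_ALIVE = "peer alive"
    LOCAL_DISCONNECTED = "local server cannot reach the Witness server"
    PEER_DISCONNECTED = "other server cannot reach the Witness server"


def classify(local_reached_witness, peer_last_update, now, timeout):
    """Classify the situation from the local server's point of view.

    local_reached_witness: whether the local request to the Witness server succeeded.
    peer_last_update: time of the peer's latest alive information, or None if absent.
    """
    if not local_reached_witness:
        # Nothing can be said about the peer; the local link is the problem.
        return WitnessVerdict.LOCAL_DISCONNECTED
    if peer_last_update is None or now - peer_last_update > timeout:
        # The Witness server is reachable, but the peer's information is stale.
        return WitnessVerdict.PEER_DISCONNECTED
    return WitnessVerdict.PEER_ALIVE
```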

5.5.2. Notes on the Witness heartbeat resources

  • If spaces are included in cluster names, Witness heartbeat resources do not work correctly. Do not use spaces for cluster names.

  • For communication with the Witness server, the NIC and the source address are selected according to the OS settings.

  • If Use Proxy is enabled, enabling Use SSL as well is recommended. We have confirmed that, when a server communicates with the Witness server through a Squid proxy server, a port on the proxy server enters the TIME_WAIT state on each HTTP request, depending on the behavior of Squid. This does not occur with HTTPS.
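Two of the notes above lend themselves to a simple pre-flight check. The sketch below is an assumption-laden illustration: `check_witness_settings` and its parameters are hypothetical names that mirror the Use Proxy and Use SSL settings, and the function does not read any real configuration file.

```python
def check_witness_settings(cluster_name, use_proxy, use_ssl):
    """Return a list of problems found in planned Witness heartbeat settings."""
    problems = []
    # Witness heartbeat resources do not work correctly with spaces in the cluster name.
    if " " in cluster_name:
        problems.append("cluster name contains a space")
    # Enabling Use SSL is recommended whenever Use Proxy is enabled.
    if use_proxy and not use_ssl:
        problems.append("Use Proxy is enabled without Use SSL")
    return problems
```

Calling `check_witness_settings("cluster1", use_proxy=True, use_ssl=True)` returns an empty list, while a name such as `"my cluster"` combined with a proxy and no SSL reports both problems.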