1. Preface

1.1. Who Should Use This Guide

The Installation and Configuration Guide is intended for system engineers and administrators who want to build, operate, and maintain a cluster system. Instructions for designing, installing, and configuring a cluster system with EXPRESSCLUSTER are covered in this guide.

1.2. How This Guide is Organized

1.3. EXPRESSCLUSTER X Documentation Set

The EXPRESSCLUSTER X manuals consist of the following six guides. The title and purpose of each guide are described below:

Getting Started Guide

This guide is intended for all users. The guide covers topics such as product overview, system requirements, and known problems.

Installation and Configuration Guide

This guide is intended for system engineers and administrators who want to build, operate, and maintain a cluster system. Instructions for designing, installing, and configuring a cluster system with EXPRESSCLUSTER are covered in this guide.

Reference Guide

This guide is intended for system administrators. The guide covers topics such as how to operate EXPRESSCLUSTER, the function of each module, and troubleshooting. The guide is a supplement to the Installation and Configuration Guide.

Maintenance Guide

This guide is intended for administrators and for system administrators who want to build, operate, and maintain EXPRESSCLUSTER-based cluster systems. The guide describes maintenance-related topics for EXPRESSCLUSTER.

Hardware Feature Guide

This guide is intended for administrators and for system engineers who want to build EXPRESSCLUSTER-based cluster systems. The guide describes features to work with specific hardware, serving as a supplement to the Installation and Configuration Guide.

Legacy Feature Guide

This guide is intended for administrators and for system engineers who want to build EXPRESSCLUSTER-based cluster systems. The guide describes EXPRESSCLUSTER X 4.0 WebManager, Builder, and EXPRESSCLUSTER Ver 8.0 compatible commands.

1.4. Conventions

In this guide, Note, Important, and See also are used as follows:

Note

Used when the information given is important, but not related to data loss or damage to the system and machine.

Important

Used when the information given is necessary to avoid data loss or damage to the system and machine.

See also

Used to describe the location of the information given at the reference destination.

The following conventions are used in this guide.

Convention

Usage

Example

Bold

Indicates graphical objects, such as fields, list boxes, menu selections, buttons, labels, icons, etc.

In User Name, type your name.
On the File menu, click Open Database.

Square brackets within the command line

Indicates that the value specified inside of the square brackets can be omitted.

clpstat -s [-h host_name]

Monospace (courier)

Indicates path names, commands, system output (message, prompt, etc.), directory names, file names, functions and parameters.

c:\Program files\EXPRESSCLUSTER

Monospace bold (courier)

Indicates the value that a user actually enters from a command line.

Enter the following:
clpcl -s -a

Monospace italic (courier)

Indicates that users should replace italicized part with values that they are actually working with.

clpstat -s [-h host_name]

1.5. Contacting NEC

For the latest product information, visit our website below:

https://www.nec.com/global/prod/expresscluster/

2. Determining a system configuration

This chapter provides instructions for determining the cluster system configuration that uses EXPRESSCLUSTER.

2.1. Steps from configuring a cluster system to installing EXPRESSCLUSTER

Before you set up a cluster system that uses EXPRESSCLUSTER, you should carefully plan the cluster system with due consideration for factors such as hardware requirements, software to be used, and the way the system is used. When you have built the cluster, check to see if the cluster system is successfully set up before you start its operation.

This guide explains how to create a cluster system with EXPRESSCLUSTER through step-by-step instructions. Read each chapter while actually carrying out the procedures to install the cluster system. The following are the steps you take from designing the cluster system to operating EXPRESSCLUSTER:

See also

Refer to the "Reference Guide" as needed when operating EXPRESSCLUSTER by following the procedures introduced in this guide. See the "Getting Started Guide" for the latest information including system requirements and release information.

Before installing EXPRESSCLUSTER, create the hardware configuration, the cluster system configuration and the information on the cluster system configuration.

  • 2. Determining a system configuration

    Review the overview of EXPRESSCLUSTER and determine the configurations of the hardware, network and software of the cluster system.

  • 3. Configuring a cluster system

    Plan a failover group that is to be the unit of a failover, and determine the information required to install the cluster system.

Install EXPRESSCLUSTER and apply the license registration and the cluster configuration data to it.

  • 4. Installing EXPRESSCLUSTER

    Install EXPRESSCLUSTER on the servers that constitute a cluster.

  • 5. Registering the license

    Register the license required to operate EXPRESSCLUSTER.

  • 6. Creating the cluster configuration data

    Based on the failover group information determined in "3. Configuring a cluster system", create the cluster configuration data by using the Cluster WebUI, and then configure a cluster.

  • 7. Verifying a cluster system

    Check if the cluster system has been created successfully.

Before operating the cluster system, conduct a dummy-failure test, parameter tuning and an operational simulation. The procedures to uninstall and reinstall EXPRESSCLUSTER are also explained.

  • 8. Verifying operation

    Check the operation and perform parameter tuning by using a dummy failure.

  • 9. Preparing to operate a cluster system

    Check the task simulation, backup and/or restoration and the procedure to handle an error, which are required to operate a cluster system.

  • 10. Uninstalling and reinstalling EXPRESSCLUSTER

    This chapter explains how to uninstall and reinstall EXPRESSCLUSTER.

2.2. What is EXPRESSCLUSTER?

EXPRESSCLUSTER is software that enhances the availability and expandability of systems through a redundant (clustered) system configuration. The application services running on the active server are automatically taken over by the standby server when an error occurs on the active server.

The following can be achieved by installing a cluster system that uses EXPRESSCLUSTER.

  • High availability
    Downtime is minimized by automatically failing over the applications and services to a "healthy" server when one of the servers constituting the cluster stops.
  • High expandability
    Both Windows and Linux support large scale cluster configurations having up to 32 servers.

See also

For details on EXPRESSCLUSTER, refer to "Using EXPRESSCLUSTER" in the "Getting Started Guide".

2.2.1. EXPRESSCLUSTER modules

EXPRESSCLUSTER X consists of the following two modules:

  • EXPRESSCLUSTER Server
    This is the main module of EXPRESSCLUSTER and has all high availability functions of the server. Install this module on each server constituting the cluster.
  • Cluster WebUI
    This is a tool to create the configuration data of EXPRESSCLUSTER and to manage EXPRESSCLUSTER operations. The Cluster WebUI is installed in EXPRESSCLUSTER Server, but it is distinguished from the EXPRESSCLUSTER Server because the Cluster WebUI is operated through a Web browser on the management PC.

2.3. Planning system configuration

You need to determine an appropriate hardware configuration to install a cluster system that uses EXPRESSCLUSTER. The configuration examples of EXPRESSCLUSTER are shown below.

See also

For the latest information on system requirements, refer to "Installation requirements for EXPRESSCLUSTER" and "Latest version information" in the "Getting Started Guide".

2.3.1. Shared disk type and mirror disk type

There are three types of system configurations: shared disk type, mirror disk type and hybrid disk type.

  • Shared disk type
    When the shared disk type configuration is used, application data is stored on a shared disk that is physically connected to servers, by which access to the same data after failover is ensured.
    You can make settings that block the other servers from accessing the shared disk while one server is using a specific area of the shared disk.
    The shared disk type is used in a system such as a database server where a large volume of data is written, because the performance in writing data does not decrease.
  • Mirror disk type
    When the mirror disk type configuration is used, application data is mirrored between disks of two servers, by which access to the same data after failover is ensured.
    When data is written on the active server, the data also needs to be written on the standby server. As a result, the writing performance will decrease.
    However, the cost of the system can be reduced because no external disk such as a shared disk is necessary, and the cluster can be achieved only by disks on servers.
    When configuring a remote cluster by placing the standby server in a remote site for disaster control, a shared disk cannot be used. Thus the mirror disk type is used.
  • Hybrid type
    This configuration is a combination of the shared disk type and the mirror disk type. By mirroring the data on the shared disk, the data is also placed on the third server, which prevents the shared disk from being a single point of failure.
    Data writing performance, operational topology and precautions of the mirror disk type apply to the hybrid type.
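The trade-offs described above can be summarized as a small decision helper. The following is only a rough sketch: the function name, its parameters, and the order in which the criteria are checked are assumptions made for illustration, and an actual design needs to weigh cost, performance, and topology together.

```python
def suggest_disk_type(remote_standby: bool,
                      avoid_shared_disk_spof: bool,
                      write_heavy: bool) -> str:
    """Suggest a configuration type from the criteria listed in this section.

    All three parameters are hypothetical inputs used only in this sketch.
    """
    if remote_standby:
        # A shared disk cannot be used when the standby server is at a remote site.
        return "mirror disk type"
    if avoid_shared_disk_spof:
        # Mirroring the shared disk keeps it from being a single point of failure.
        return "hybrid disk type"
    if write_heavy:
        # Writing to a shared disk does not incur the mirroring overhead.
        return "shared disk type"
    # No external disk is needed, which keeps the system cost down.
    return "mirror disk type"


if __name__ == "__main__":
    print(suggest_disk_type(remote_standby=False,
                            avoid_shared_disk_spof=False,
                            write_heavy=True))
```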

The following show configuration examples of the shared disk type, the mirror disk type and the hybrid type. Use these examples to design and set up your system.

2.3.2. Example 1: Configuration using a shared disk with 2 nodes

This is the most commonly used system configuration:

  • Different models can be used for servers. However, the shared disk should have the same drive letter on both servers.

  • Use cables for interconnection. A dedicated HUB can also be used for the connection, in the same way as in the 3-node configuration.

  • Connect COM (RS-232C) ports using a cross cable.

2.3.3. Example 2: Configuration using mirror disks with 2 nodes

  • Different models can be used for servers. However, the mirror disks should have the same drive letter on both servers.

  • It is recommended to connect the servers directly to each other with a cable for interconnection. A HUB can also be used.

2.3.4. Example 3: Configuration using mirror partitions on the disks for OS with 2 nodes

  • A mirroring partition can be created on the disk used for the OS.

See also

For mirror partition settings, refer to "Group resource details" and "Understanding mirror disk resources" in the "Reference Guide".

2.3.5. Example 4: Configuring a remote cluster by using asynchronous mirror disks with 2 nodes

  • Configuring a cluster between servers in remote sites by using WAN, as shown below, is a solution for disaster control.

  • Using asynchronous mirror disks can curb a decrease in disk performance due to the network delay. There is still a chance that the information updated immediately before a failover gets lost.

  • It is necessary to secure enough communication bandwidth for the traffic amount of updated information on mirror disks. Insufficient bandwidth can cause delay of communication with a business operation client or interruption of mirroring.

  • Use Dynamic DNS resource or Virtual IP resource to switch the connected server.

See also

For information on resolving network partition and the VIP settings, see "Understanding virtual IP resources" in "Group resource details" and "Details on network partition resolution resources" in the "Reference Guide".

2.3.6. Example 5: Configuration using a shared disk with 3 nodes

  • In the same way as in the 2-node configuration, connect the servers to a shared disk. The shared disk should have the same drive letter on all servers.

  • Install a dedicated HUB for interconnection.

  • It is not necessary to connect the servers to each other through COM (RS-232C) ports.

2.3.7. Example 6: Configuration using both mirror disks and a shared disk with 3 nodes

  • It is possible to use both mirror disks and a shared disk in one cluster. In this example, the system is configured with three nodes: one for a shared disk type, one for a mirror disk type, and one for standby.

  • It is not necessary to connect a shared disk to a server where business applications using the shared disk do not run. However, the shared disk needs to have the same drive letter on all the servers connected to it.

  • Install a dedicated HUB for interconnection.

  • It is not necessary to connect the servers to each other through COM (RS-232C) ports.

2.3.8. Example 7: Configuration using the hybrid type with 3 nodes

This is a configuration with three nodes that consists of two nodes connected to the shared disk and one node having a disk to be mirrored.

  • The servers do not need to be the same model.

  • Install a dedicated HUB for interconnection and for the LAN of mirror disk connect.

  • Use a HUB that is as fast as possible.

2.4. Checking system requirements for each EXPRESSCLUSTER module

EXPRESSCLUSTER X consists of two modules: EXPRESSCLUSTER Server (main module) and Cluster WebUI. Check configuration and operation requirements of each machine where these modules will be used. For details about the operating environments, see "Installation requirements for EXPRESSCLUSTER" in the "Getting Started Guide".

2.5. Determining a hardware configuration

Determine a hardware configuration considering an application to be duplicated on a cluster system and how a cluster system is configured. Read "3. Configuring a cluster system" before you determine a hardware configuration.

See also

Refer to "3. Configuring a cluster system."

2.6. Settings after configuring hardware

After you have determined the hardware configuration and installed the hardware, verify the following:

2.6.1. Shared disk settings (Required for shared disk)

Set up the shared disk by following the steps below:

Important

When you continue using the data on the shared disk (in the cases such as reinstalling the server), do not create partitions or a file system. If you create partitions or a file system, data on the shared disks will be deleted.

Note

The partition to be allocated as described below cannot be used by mounting it on an NTFS folder.

  1. Allocate a partition for disk heartbeat.
    Allocate a partition on the shared disk to be used by the DISK network partition resolution resource in EXPRESSCLUSTER. On one of the servers connected to the shared disk, create the partition through the "Disk Management" function of the OS in the same way as an ordinary partition, set a drive letter, and leave it as a RAW partition without formatting. Then set the same drive letter on the other servers that use the same shared disk. Because the partition has already been created, do not create or format it on those servers; set only the drive letter from the OS disk management.

    Note

    A disk heartbeat partition should be 17 MB (17,825,792 bytes) or larger. Leave the disk heartbeat partition as a RAW partition without formatting.

  2. Allocate a cluster partition if you are using the hybrid disk type.
    Create a partition to be used for controlling the status of the hybrid disk on the shared disk to be mirrored with the hybrid disk resource. The procedure for making the cluster partition is the same as the one for a disk heartbeat partition.

    Important

    A cluster partition should be 1GB (1,073,741,824 bytes) or larger. Leave the cluster partition as a RAW partition without formatting.

  3. Allocate a switchable partition for disk resources or a data partition for hybrid disk resources on the shared disk.
    Create the partition on one of the servers in the cluster that uses the shared disk, through the "Disk Management" function of the OS. Set a drive letter, and format the partition with NTFS.
    Configure the same drive letter on the other servers connected to the shared disk. Because the partition has already been created, you do not need to create a partition or format it.
    Because access control for the shared disk starts only after the cluster setup has completed, do not start multiple servers connected to the shared disk until the setup has completed. Otherwise, files or folders stored on the shared disk may be corrupted. In particular, after a partition for disk resources has been formatted, do not start multiple servers connected to the shared disk at the same time until the server with EXPRESSCLUSTER installed has been rebooted.

    Important

    Do not start multiple servers connected with the shared disk simultaneously. The data on the shared disk may be corrupted.
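The minimum sizes given in the notes above can be checked with a few lines of code before the partitions are registered. This is a minimal sketch: the `kind` labels and the function name are hypothetical, and how the partition size is obtained from the OS is left out.

```python
# Minimum sizes stated in this section.
MIB = 1024 * 1024
MIN_DISK_HEARTBEAT_BYTES = 17 * MIB        # 17,825,792 bytes
MIN_CLUSTER_PARTITION_BYTES = 1024 * MIB   # 1,073,741,824 bytes


def is_large_enough(kind: str, size_bytes: int) -> bool:
    """Return True if a RAW partition of the given kind meets the minimum size.

    `kind` is a hypothetical label used only in this sketch:
    "disk_heartbeat" or "cluster".
    """
    minimums = {
        "disk_heartbeat": MIN_DISK_HEARTBEAT_BYTES,
        "cluster": MIN_CLUSTER_PARTITION_BYTES,
    }
    return size_bytes >= minimums[kind]


if __name__ == "__main__":
    print(is_large_enough("disk_heartbeat", 17_825_792))
    print(is_large_enough("cluster", 1_000_000_000))
```

Note that a partition created as "1 GB" in decimal units (1,000,000,000 bytes) is smaller than the 1,073,741,824 bytes required for a cluster partition.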

2.6.2. Mirror partition settings (Required for mirror disks)

Set up partitions for mirror disk resources by following the steps below. This is required for a local disk (a disk connected to only one of the servers) to be mirrored with the shared disk in the hybrid configuration.

Important

When you cluster a single server and continue using data on the existing partitions, do not re-create the partitions. If you re-create the partitions, the data on them will be deleted.

Note

The partition to be allocated as described below cannot be used by mounting it on an NTFS folder.

  1. Allocate cluster partitions.
    Create partitions to be used by the mirror disk resources/hybrid disk resources. These partitions are used for managing the status of the mirror disk resources/hybrid disk resources. Create the partition on every server in the cluster that uses mirror resources. Create the partitions by using the "Disk Management" function of the OS, and leave them as RAW partitions without formatting. Configure a drive letter for them.

    Note

    The cluster partition should be 1GB (1,073,741,824 bytes) or larger. Leave the cluster partition as a RAW partition without formatting.

  2. Allocate data partitions.
    Create the data partitions for mirroring by mirror disk resources/hybrid disk resources. For mirror disk resources, create the data partitions on the two servers on which disk mirroring is performed.
    Format partitions with NTFS from "Disk Management" function of OS and configure a drive letter.

    Note

    When partitions (drives) to be mirrored already exist (in cases such as reinstalling EXPRESSCLUSTER), you do not need to create the partitions again. If data to be mirrored already exists on the partitions, re-creating or formatting the partitions deletes the data.
    A drive that holds the system and/or a page file, and the drive where EXPRESSCLUSTER is installed, cannot be used as partitions for mirror disk resources.
    The data partitions on both servers must be precisely the same size in bytes. If the disk geometries differ between the servers, it might not be possible to create partitions of precisely the same size. Check the partition sizes with the clpvolsz command and adjust them.
    The same drive letter must be configured on the partitions on both servers.
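The two rules for a mirrored pair of data partitions (identical size in bytes, identical drive letter) can be expressed as a simple comparison. The sketch below is illustrative only: the `DataPartition` type and its field names are hypothetical, and the sizes are assumed to have been collected beforehand, for example with the clpvolsz command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPartition:
    server: str        # hypothetical server name
    drive_letter: str  # for example "X"
    size_bytes: int    # partition size in bytes


def mirror_pair_problems(a: DataPartition, b: DataPartition) -> list[str]:
    """List the rule violations for a pair of data partitions to be mirrored."""
    problems = []
    if a.drive_letter.upper() != b.drive_letter.upper():
        problems.append("drive letters differ")
    if a.size_bytes != b.size_bytes:
        difference = abs(a.size_bytes - b.size_bytes)
        problems.append(f"sizes differ by {difference} bytes")
    return problems


if __name__ == "__main__":
    first = DataPartition("server1", "X", 10_737_418_240)
    second = DataPartition("server2", "X", 10_737_414_144)
    print(mirror_pair_problems(first, second))
```

An empty list means that the pair satisfies both rules; even a difference of a few kilobytes is reported, because the sizes must match exactly.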

2.6.3. Adjustment of the operating system startup time (Required)

It is necessary to configure the time from power-on of each node in the cluster to the server operating system startup to be longer than the following:

  • The time from power-on of the shared disk to the point it becomes available.

  • Heartbeat timeout time (30 seconds by default.)

Adjustment of the startup time is necessary to prevent the following problems:

  • If the cluster system is started by powering on the shared disk and the servers at the same time, startup of the shared disk is not completed before the OS starts. The OS then starts in a state where the shared disk is not recognized, and activation of disk resources fails.

  • A failover fails if a server whose groups you want to fail over by rebooting it finishes rebooting within the heartbeat timeout. This is because the remote server assumes that the heartbeat has continued.

Consider the durations above and adjust the operating system startup time by using the bcdedit command of Windows.

Note

If only one OS is installed in the system, the wait time that you configured may be ignored. In that case, add a copy of the operating system entry by using the bcdedit command with the /copy option specified.
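The rule above amounts to taking the longer of the two durations and adding some headroom. The following sketch computes such a wait time and prints bcdedit command lines that could apply it. The margin value, the function names, and the choice of the /timeout option for the boot menu wait are assumptions of this sketch; only the /copy option is named in this guide, so confirm the exact options against the bcdedit documentation for your Windows version.

```python
def required_startup_wait(disk_ready_seconds: int,
                          heartbeat_timeout_seconds: int = 30,
                          margin_seconds: int = 10) -> int:
    """Return a startup wait longer than both durations listed above.

    `margin_seconds` is a hypothetical safety margin, not a documented value.
    """
    return max(disk_ready_seconds, heartbeat_timeout_seconds) + margin_seconds


def bcdedit_timeout_command(wait_seconds: int) -> str:
    # Assumption: the boot menu timeout is used to delay the OS startup.
    return f"bcdedit /timeout {wait_seconds}"


def bcdedit_copy_command(description: str) -> str:
    # Adds a second boot entry so that the boot menu (and its timeout) is shown.
    return f'bcdedit /copy {{current}} /d "{description}"'


if __name__ == "__main__":
    wait = required_startup_wait(disk_ready_seconds=45)
    print(bcdedit_timeout_command(wait))
    print(bcdedit_copy_command("Copy of current entry"))
```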

2.6.4. Verification of the network settings (Required)

On all servers in the cluster, verify the status of the following network resources using the ipconfig or ping command.

  • Public LAN (used for communication with all the other machines)

  • LAN dedicated to interconnect (used for communication between EXPRESSCLUSTER Servers)

  • Host name

Note

It is not necessary to specify, in the operating system, the IP addresses of the floating IP resources and virtual IP resources used in the cluster.
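When there are several servers and addresses to check, the ping step can be scripted. The sketch below is an assumption-laden illustration: the labels and addresses are placeholders, and the result relies on the exit code of the ping command, which on some systems can be 0 even when only an intermediate device replied, so the command output should still be read.

```python
import platform
import socket
import subprocess


def build_ping_command(target: str, count: int = 1) -> list[str]:
    # Windows ping takes -n for the number of echo requests; most others take -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", count_flag, str(count), target]


def is_reachable(target: str) -> bool:
    result = subprocess.run(build_ping_command(target), capture_output=True)
    return result.returncode == 0


def check_targets(targets: dict[str, str]) -> dict[str, bool]:
    """Ping each labeled address and report whether it answered."""
    return {label: is_reachable(address) for label, address in targets.items()}


if __name__ == "__main__":
    print("Host name:", socket.gethostname())
    # Placeholder addresses; replace them with the addresses of the other server.
    print(check_targets({
        "public LAN": "192.0.2.12",
        "interconnect LAN": "198.51.100.12",
    }))
```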

2.6.5. Verification of the firewall settings (Required)

EXPRESSCLUSTER uses several port numbers for communication between the modules. For details about the port numbers to be used, see "Before installing EXPRESSCLUSTER" of "Notes and Restrictions" in the "Getting Started Guide".

2.6.7. Power saving function - OFF (Required)

In EXPRESSCLUSTER, power saving function (for example, standby or hibernation) with OnNow, ACPI, and/or APM functions cannot be used. Make sure to turn off the power saving function.

2.6.8. Setup of SNMP service (Required if ESMPRO Server is to be used in cooperation with EXPRESSCLUSTER)

SNMP service is required if ESMPRO Server is to be used in cooperation with EXPRESSCLUSTER. Set up SNMP service before installing EXPRESSCLUSTER.

2.6.9. Setup of BMC and ipmiutil (Required for using the forced stop function of a physical machine and chassis ID lamp association)

For using the forced stop function of a physical machine and Chassis ID lamp association, configure the Baseboard Management Controller (BMC) of the servers to enable the communication between IP addresses of LAN ports for managing BMC and IP addresses used by the OS. These functions are not available when BMC is not installed on the server or when the network for managing BMC is disabled. For information on how to configure the BMC, refer to the manuals of your server.

These functions control the BMC firmware in the servers by using IPMI Management Utilities (ipmiutil), which is provided as open source under the BSD license. ipmiutil must be installed on the servers to use these functions.
As of January 2018, ipmiutil can be obtained from the Website below.

Use ipmiutil version 2.0.0 to 3.0.8.

EXPRESSCLUSTER uses the hwreset or ireset command, and the alarms or ialarms command, of ipmiutil. To execute these commands without specifying a path, include the path of the ipmiutil executable file in the system environment variable PATH, or copy the executable file to a folder that is already included in PATH (for example, the bin folder in the folder where EXPRESSCLUSTER is installed).
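Both conditions, the supported version range and the PATH requirement, can be checked with a short script. This is a sketch under stated assumptions: the version string is supplied by hand (how ipmiutil reports its version is not covered here), and the executable name passed to the lookup is an assumption.

```python
import shutil
from typing import Optional


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as "3.0.8" into a comparable tuple."""
    return tuple(int(part) for part in text.strip().split("."))


def is_supported_ipmiutil(version: str) -> bool:
    # Range stated in this section: versions 2.0.0 to 3.0.8.
    return (2, 0, 0) <= parse_version(version) <= (3, 0, 8)


def find_on_path(name: str = "ipmiutil") -> Optional[str]:
    """Return the full path if the command can be run without a path, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    print(is_supported_ipmiutil("3.0.8"))
    print(find_on_path("ipmiutil"))
```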

Because EXPRESSCLUSTER does not use the function that requires the IPMI driver, it is not necessary to install the IPMI driver.

To control BMC via LAN by the above commands, an IPMI account with Administrator privilege is required in the BMC of each server. When you use an NEC Express5800/100 series server, use User ID 4 or later to add or change the account, because User IDs 3 and earlier are reserved by other tools. Use a tool complying with the IPMI standards, such as IPMITool, for checking and changing the account configuration.

2.6.10. Setup of a function equivalent to rsh provided by the network warning light vendor (Required)

For using the network warning light, set up a command equivalent to rsh supported by the warning light vendor.

3. Configuring a cluster system

This chapter provides information required to configure a cluster including requirements of applications to be duplicated, cluster topology, and explanation on resources constituting a cluster.

3.1. Configuring a cluster system

This chapter provides information necessary to configure a cluster system, including the following topics:

  1. Determining a cluster system topology

  2. Determining applications to be duplicated

  3. Creating the cluster configuration data

The following is a typical example of cluster environment with 2 nodes where standby is uni-directional.

3.2. Determining a cluster topology

EXPRESSCLUSTER supports multiple cluster topologies: a uni-directional standby cluster system, in which one server is the active server and the other is the standby server, and a multi-directional standby cluster system, in which both servers act as active and standby servers for different operations.

  • Uni-directional standby cluster system
    In this operation, only one application runs on an entire cluster system. There is no performance deterioration even when a failover occurs. However, resources in a standby server will be wasted.
  • The same application multi-directional standby cluster system
    In this operation, the same application runs on more than one server simultaneously in a cluster system. Applications used in this system must support multi-directional standby operations.
  • Different applications multi-directional standby cluster system
    In this operation, different applications run on different servers and stand by for each other. Resources are not wasted during normal operation; however, two applications run on one server after a failover, and system performance deteriorates.

3.2.1. Failover in uni-directional standby cluster

On a uni-directional standby cluster system, the number of groups for an operation service is limited to one as described in the diagrams below:

3.2.2. Failover in multi-directional standby cluster

On a multi-directional standby cluster system, different applications run on the servers. If a failover occurs on one server, multiple applications start to run on the other server. As a result, the failover destination server is more heavily loaded than during normal operation, and performance decreases.
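The difference between the two topologies after a failover can be seen in a small model. The server and application names below are hypothetical, and the model tracks only where each application runs, not the actual load.

```python
def fail_over(placement: dict[str, list[str]],
              failed: str, target: str) -> dict[str, list[str]]:
    """Return a new placement after moving everything from `failed` to `target`."""
    result = {server: list(apps) for server, apps in placement.items()}
    result[target].extend(result[failed])
    result[failed] = []
    return result


# Hypothetical two-node placements for the two topologies.
uni_directional = {"server1": ["app_a"], "server2": []}
multi_directional = {"server1": ["app_a"], "server2": ["app_b"]}

if __name__ == "__main__":
    print(fail_over(uni_directional, "server1", "server2"))
    print(fail_over(multi_directional, "server1", "server2"))
```

In the uni-directional case the standby server runs one application after the failover, the same number as the active server did before. In the multi-directional case the destination server ends up running both applications, which is why its performance decreases.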

3.3. Determining applications to be duplicated

When you determine applications to be duplicated, study candidate applications taking what is described below into account to see whether or not they should be clustered in your EXPRESSCLUSTER cluster system.

3.3.1. Server applications

3.3.1.1. Note 1: Data recovery after an error

If an application was updating a file when an error occurred, the file update may not have been completed when the standby server accesses that file after the failover.

The same problem can happen on a non-clustered server (single server) if it goes down and is then rebooted. In principle, applications should be ready to handle this kind of error. A cluster system should allow recovery from this kind of error without human intervention (from a script).
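One common way for an application to limit the damage of an interrupted update is to write the new contents to a temporary file and then rename it over the original. This technique is not prescribed by this guide; it is shown here only as an example of application-side preparation, and whether it is sufficient depends on the application and on the file system in use.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Replace `path` with `data` so that readers see the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created in the same directory (same volume),
    # so the final rename does not have to copy data across volumes.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to the disk before renaming
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the server goes down before the rename, the original file is still intact, and a leftover temporary file can be deleted by a recovery step in the start script.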

3.3.1.2. Note 2: Application termination

When EXPRESSCLUSTER stops or transfers (performs online failback of) a group for an application, it unmounts the file system used by the application group. Therefore, you have to issue an exit command for applications so that access to all files on the shared disk or mirror disk is stopped.

Typically, you give an exit command to applications in their stop scripts; however, you have to pay attention if an exit command completes asynchronously with termination of the application.
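When the exit command returns before the application has actually terminated, the stop procedure needs to wait for the termination itself rather than for the command. The polling helper below is a generic sketch; the `condition` callable (for example, a check that the process no longer exists) is something the script author has to supply, and the timeout values are placeholders.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout_seconds: float,
               interval_seconds: float = 0.5) -> bool:
    """Poll `condition` until it returns True or the timeout expires.

    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)
```

A stop script would issue the exit command first and then call `wait_until` with a check for the application process, treating a False result as a failure to stop.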

3.3.1.3. Note 3: Location to store the data

EXPRESSCLUSTER can pass the following types of data between servers:

  • Data in the switchable partition on the disk resource, or data in the data partition on the mirror disk resource/hybrid disk resource.

  • The value of a registry key synchronized by a registry synchronization resource.

Application data should be divided into the data to be shared among servers and the data specific to the server, and these two types of data should be saved separately.

Data type

Example

Where to store

Data to be shared among servers

User data, etc.

Switchable partition of the disk resource or data partition of the mirror disk resource/hybrid disk resource

Data specific to a server

Programs, configuration data

On server's local disks

3.3.1.4. Note 4: Multiple application service groups

When you run the same application service in the multi-directional standby operation, you have to assume (in case of degeneration due to a failure) that multiple application groups are run by the same application on one server.
Applications should be able to take over the passed resources by one of the following methods, so that a single server can run multiple application groups.
The methods apply equally to shared disks and mirror disks.
  • Starting up multiple instances
    This method invokes a new process.
    More than one application should co-exist and run.
  • Restarting the application
    This method stops the application which was originally running.
    Added resources become available by restarting it.
  • Adding dynamically
    This method adds resources in running applications automatically or by instructions from script.

3.3.1.5. Note 5: Mutual interference and compatibility with applications

Sometimes mutual interference between applications and EXPRESSCLUSTER functions or the operating system functions required to use EXPRESSCLUSTER functions prevents applications or EXPRESSCLUSTER from working properly.

  • Access control of a shared disk and mirror disk
    Access to switchable partitions managed by a disk resource or the data partitions mirrored by a mirror disk resource/hybrid disk resource is restricted while the resource is inactive; the partitions can be neither read nor written. If an inactive shared disk or mirror disk (in other words, one that is not accessible from users or applications) is accessed, an I/O error occurs.

Generally, you can assume when an application that is started up by EXPRESSCLUSTER is started, the switchable partition or data partition to which it should access is already accessible.

  • Multi-home environment and transfer of IP addresses
    In general, one server has multiple IP addresses in a cluster system. The IP address configuration of each server changes dynamically because floating IP addresses and virtual IP addresses move between servers. If an application used in the system does not support such a multi-home environment, the system can malfunction. For example, an attempt to acquire the IP address of the local server may return the LAN address for interconnection, which is different from the address used for communicating with clients. For applications that need to be aware of the IP address on a server, the IP address to be used should be specified explicitly.
  • Access to shared disks or mirror disks from applications
    The stopping of application groups is not notified to other applications that coexist with the application. Therefore, if such an application is accessing a switchable partition or data partition used by an application group at the time when the application group stops, disk isolation will fail.
    Some applications like those responsible for system monitoring service periodically access all disk partitions. To use such applications in your cluster environment, they need a function that allows you to specify monitoring partitions.
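For the multi-home point above, specifying the address explicitly usually means binding to a configured address instead of looking up the address of the local host name. The sketch below illustrates this with a TCP listener; the function name is hypothetical, and the loopback address is used only so that the example runs anywhere. In a cluster, the configured address would typically be the floating IP address that clients connect to.

```python
import socket


def open_listener(bind_address: str, port: int = 0) -> socket.socket:
    """Open a TCP listener bound to one explicitly specified local address."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Binding to a specific address avoids depending on which of the server's
    # addresses a host name lookup happens to return.
    listener.bind((bind_address, port))
    listener.listen()
    return listener


if __name__ == "__main__":
    # Placeholder address for demonstration; port 0 lets the OS pick a free port.
    with open_listener("127.0.0.1") as listener:
        print(listener.getsockname())
```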

3.3.2. Configuration relevant to the notes

What you need to consider differs depending on which standby cluster system is selected for an application. The following are the notes for each cluster system. The numbers correspond to the numbers of the notes (1 through 5) described below:

  • Note for uni-directional standby [Active-Standby]: 1, 2, 3, and 5

  • Note for multi-directional standby [Active-Active]: 1, 2, 3, 4, and 5

  • Note for co-existing behaviors: 5
    (Applications co-exist and run. The cluster system does not fail over the applications.)

3.3.3. Solutions to the problems relevant to the notes

Problem: When an error occurs while updating a data file, the application does not work properly on the standby server.
Solution: Modify the program, or add/modify script source to run a process that recovers the data being updated during failover.
Note to refer: Note 1: Data recovery after an error

Problem: The application keeps accessing the shared disk or mirror disk for a certain period of time even after it is stopped.
Solution: Execute the sleep command during stop script execution.
Note to refer: Note 2: Application termination

Problem: The same application cannot be started more than once on one server.
Solution: In multi-directional operation, reboot the application at failover and pass the shared data.
Note to refer: Note 3: Location to store the data

3.3.4. How to determine a cluster topology

Carefully read this chapter and determine the cluster topology that suits your needs:

  • When to start which application

  • Actions that are required at startup and failover

  • Data to be placed in switchable partitions or data partitions

3.4. Planning a failover group

A failover group (hereafter referred to as a group) is a set of resources required to perform an independent operation service in a cluster system. Failover takes place in units of groups. A group has its own group name and the attribute of the group resources.

Resources in each group are handled in units of the group. If a failover occurs in group1, which has Disk resource1 and Floating IP resource1, Disk resource1 and Floating IP resource1 fail over together. (Disk resource1 never fails over alone.) Likewise, a resource is never included in more than one group.
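
The two rules above (resources fail over together with their group, and a resource belongs to only one group) can be sketched as a small data model. This is an illustrative Python sketch only; the class, the function names, and the placement dictionary are assumptions made for the example and are not part of EXPRESSCLUSTER.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A failover group: a name plus the resources it owns (hypothetical model)."""
    name: str
    resources: list[str] = field(default_factory=list)


def find_shared_resources(groups: list[Group]) -> dict[str, list[str]]:
    """Return resources listed in more than one group (expected to be empty)."""
    owners: dict[str, list[str]] = {}
    for group in groups:
        for resource in group.resources:
            owners.setdefault(resource, []).append(group.name)
    return {res: names for res, names in owners.items() if len(names) > 1}


def fail_over(group: Group, placement: dict[str, str], target: str) -> None:
    """Move every resource of the group to the target server as one unit."""
    for resource in group.resources:
        placement[resource] = target
```

With group1 holding "Disk resource1" and "Floating IP resource1", calling fail_over(group1, placement, "server2") updates both entries of placement at once, which mirrors the statement that Disk resource1 never fails over alone.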

3.5. Considering group resources

For a failover to occur in a cluster system, a group that works as a unit of failover must be created. A group consists of group resources. To create an optimal cluster, you must understand which group resources to add to the group you create, and have a clear vision of your operation.

See also

For details on each resource, refer to "Group resource details" in the "Reference Guide".

The following are currently supported group resources:

Group Resource Name                 Abbreviation
Application resource                appli
CIFS resource                       cifs
Dynamic DNS resource                ddns
Floating IP resource                fip
Hybrid disk resource                hd
Mirror disk resource                md
NAS resource                        nas
Registry synchronization resource   regsync
Script resource                     script
Disk resource                       sd
Service resource                    service
Print spooler resource              spool
Virtual computer name resource      vcom
Virtual IP resource                 vip
VM resource                         vm
AWS Elastic IP resource             awseip
AWS Virtual IP resource             awsvip
AWS DNS resource                    awsdns
Azure probe port resource           azurepp
Azure DNS resource                  azuredns
Google Cloud virtual IP resource    gcvip
Oracle Cloud virtual IP resource    ocvip

3.6. Understanding monitor resources

Monitor resources monitor specified targets. If an error is detected in a target, a monitor resource restarts and/or fails over the group resources.
There are two monitoring timings for monitor resources: always monitors and monitors while activated.

Always monitors

Monitoring is performed from when the cluster is started up until it is shut down.

Monitors while activated

Monitoring is performed from when a group is activated until it is deactivated.
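
The difference between the two monitoring timings can be sketched as follows. This Python fragment is only an illustration of the definitions above; the enum and the function are hypothetical names, not EXPRESSCLUSTER interfaces.

```python
from enum import Enum


class MonitorTiming(Enum):
    ALWAYS = "always monitors"
    WHILE_ACTIVATED = "monitors while activated"


def should_monitor(timing: MonitorTiming, cluster_running: bool, group_active: bool) -> bool:
    """Return True if a monitor resource with this timing is monitoring now."""
    if not cluster_running:
        # Neither timing monitors while the cluster is shut down.
        return False
    if timing is MonitorTiming.ALWAYS:
        # From cluster startup until cluster shutdown.
        return True
    # From group activation until group deactivation.
    return group_active
```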

See also

For the details of each resource, see "Monitor resource details" in the "Reference Guide".

The following are currently supported monitor resources:

Monitor Resource Name                        Abbreviation   Always monitors   Monitors while activated
Application monitor resource                 appliw
CIFS monitor resource                        cifsw
DB2 monitor resource                         db2w
Dynamic DNS monitor resource                 ddnsw
Disk RW monitor resource                     diskw
Floating IP monitor resource                 fipw
FTP monitor resource                         ftpw
Custom monitor resource                      genw
Hybrid disk monitor resource                 hdw
Hybrid disk TUR monitor resource             hdtw
HTTP monitor resource                        httpw
IMAP4 monitor resource                       imap4w
IP monitor resource                          ipw
Mirror disk monitor resource                 mdw
Mirror connect monitor resource              mdnw
NIC Link UP/Down monitor resource            miiw
Multi target monitor resource                mtw
NAS monitor resource                         nasw
ODBC monitor resource                        odbcw
Oracle monitor resource                      oraclew
WebOTX monitor resource                      otxw
POP3 monitor resource                        pop3w
PostgreSQL monitor resource                  psqlw
Registry synchronization monitor resource    regsyncw
Disk TUR monitor resource                    sdw
Service monitor resource                     servicew
SMTP monitor resource                        smtpw
Print spooler monitor resource               spoolw
SQL Server monitor resource                  sqlserverw
Tuxedo monitor resource                      tuxw
Virtual computer name monitor resource       vcomw
Virtual IP monitor resource                  vipw
Websphere monitor resource                   wasw
Weblogic monitor resource                    wlsw
VM monitor resource                          vmw
Message receive monitor resource             mrw
JVM monitor resource                         jraw
System monitor resource                      sraw
Process resource monitor resource            psrw
Process name monitor resource                psw
User mode monitor resource                   userw
AWS Elastic IP monitor resource              awseipw
AWS Virtual IP monitor resource              awsvipw
AWS AZ monitor resource                      awsazw
AWS DNS monitor resource                     awsdnsw
Azure probe port monitor resource            azureppw
Azure load balance monitor resource          azurelbw
Azure DNS monitor resource                   azurednsw
Google Cloud virtual IP monitor resource     gcvipw
Google Cloud load balance monitor resource   gclbw
Oracle Cloud virtual IP monitor resource     ocvipw
Oracle Cloud load balance monitor resource   oclbw

3.7. Understanding heartbeat resources

Servers in a cluster system monitor whether or not other servers in the cluster are active.

Type of Heartbeat Resource                    Abbreviation   Functional Overview
Kernel mode LAN heartbeat resource (1), (2)   lankhb         A kernel mode module uses a LAN to monitor whether or not servers are active.
BMC heartbeat (3)                             bmchb          A module uses BMC to monitor whether or not servers are active.
Witness heartbeat resource (4)                witnesshb      A module uses the Witness server to monitor whether or not servers are active.

  • At least one kernel mode LAN heartbeat resource needs to be set. Setting up two or more is recommended.

  • Set up one or more kernel mode LAN heartbeat resources to be used among all the servers.

3.8. Understanding network partition resolution resources

Network partitioning refers to the status where all communication channels have problems and the network between servers is partitioned.

In a cluster system that has no means of resolving network partitions, a failure on a communication channel cannot be distinguished from a failure of a server. This can cause data corruption because multiple servers may access the same resource. EXPRESSCLUSTER, on the other hand, distinguishes a server failure from network partitioning when the heartbeat from a server is lost. If the loss of heartbeat is determined to be caused by a server failure, the system performs a failover by activating each resource and restarting applications on a server that is running normally. If the loss of heartbeat is determined to be caused by network partitioning, emergency shutdown is executed because protecting data has higher priority than continuity of operation. Network partitions can be resolved by the following methods:

  • COM method

    • Available in a two-node cluster.

    • Cross cables are needed.

    • The COM channel is used to check if the other server is active and then to determine whether or not the problem is caused by network partitioning.

    • If a server failure occurs while there is a failure in the COM channel (such as the COM port or serial cross cable), the network partition cannot be resolved, so a failover does not take place. Emergency shutdown takes place on the servers, including the server that is running normally.

    • If a failure occurs on all network channels while the COM channel is working properly, it is regarded as a network partition. In this case, emergency shutdown takes place on all servers except the master server.

    • If a failure occurs on all network channels while there is a problem in the COM channel (such as the COM port or serial cross cable), emergency shutdown takes place on all servers except the master server.

    • If failures occur simultaneously on all network channels between cluster servers and on the COM channel, both the active and standby servers fail over. This can cause data corruption due to access to the same resource from multiple servers.

  • PING method

    • A device that is always active to receive and respond to the ping command (hereafter described as ping device) is required.

    • More than one ping device can be specified.

    • When the heartbeat from the other server is lost but the ping device is responding to the ping command, it is determined that the server without heartbeat has failed, and a failover takes place. If there is no response to the ping command, it is determined that the local server is isolated from the network due to network partitioning, and emergency shutdown takes place. This allows a server that can communicate with clients to continue operation even if network partitioning occurs.

    • If no server receives a response to the ping command for a continued period before the heartbeat is lost, because of a failure in the ping device, network partitions cannot be resolved. If the heartbeat is lost in this state, a failover takes place on all servers. Because of this, using this method in a cluster with a shared disk can cause data corruption due to access to a resource from multiple servers.

  • HTTP method

    • A Web server that is always active is required.

    • When the heartbeat from the other server is lost, but there is a response to an HTTP HEAD request, it is determined that the server without heartbeat has failed and a failover takes place. If there is no response to an HTTP HEAD request, it is determined that the local server is isolated from the network due to network partitioning, and an emergency shutdown takes place. This will allow a server that can communicate with clients to continue operation even if network partitioning occurs.

    • If HTTP HEAD requests have remained unanswered since before the heartbeat was lost, because of a failure in the Web server, network partitions cannot be resolved. If the heartbeat is lost in this state, emergency shutdown takes place on all servers.

  • DISK method

    • Available to a cluster that uses a shared disk.

    • A dedicated disk partition (disk heartbeat partition) is required on the shared disk.

    • Network partitioning is determined by periodically writing data to the shared disk and calculating the time at which the other server was last alive.

    • If the heartbeat from the other server is lost while there is a failure in the shared disk or in the channel to the shared disk (such as a SCSI bus), the network partition cannot be resolved, which means that failover does not take place. In this case, emergency shutdown takes place on the servers that are working properly.

    • If failures occur on all network channels while the shared disk is working properly, a network partition is detected. Failover then takes place on the master server and on servers that can communicate with the master server. Emergency shutdown takes place on the remaining servers.

    • Compared to the other methods, the DISK method needs more time to resolve network partitions because the delay of disk I/O must be taken into account. The time is about twice as long as the heartbeat timeout and disk I/O wait time.

    • If the I/O time to the shared disk is longer than the disk I/O wait time, network partition resolution may time out, and failover may not take place.

    Note

    The DISK method cannot be used if VERITAS Storage Foundation is used.

  • COM + DISK method

    • This is a method that combines the COM method and the DISK method. This method is available in a cluster that uses a shared disk with two nodes.

    • This method requires serial cross cables. A dedicated disk partition (disk heartbeat partition) must be allocated on the shared disk.

    • When the COM channel (such as the COM port and serial cross cable) is working properly, this method works in the same way as the COM method. When an error occurs on the COM channel, this method switches to the DISK method. This mechanism offers higher availability than the COM method, and it resolves network partitions faster than the DISK method.

    • Even if failures occur simultaneously on all network channels between cluster servers and on the COM channel, emergency shutdown takes place on at least one of the servers. This prevents data corruption.

  • PING + DISK method

    • This method combines the PING method and the DISK method.

    • This method requires a device (a ping device) that can always receive the ping command and return a response. You can specify more than one ping device. This method also requires a dedicated disk partition (disk heartbeat partition) on the shared disk.

    • This method usually works in the same way as the PING method. However, if no server receives a response to the ping command for a continued period before the heartbeat is lost, because of a failure in the ping device, the method switches to the DISK method. If the servers that use the NP resolution resources of the PING method and those that use the NP resolution resources of the DISK method do not match (for example, when the PING method resources are used by all servers but the DISK method resources are used only by the servers connected to a shared disk), the two types of resources work independently. In that case, the DISK method works as well, regardless of the state of the ping device.

    • If the heartbeat from the other server is lost while there is a failure in the shared disk and/or the path to the shared disk, emergency shutdown takes place even if there is a response to the ping command.

  • Majority method

    • This method can be used in a cluster with three or more nodes.

    • This method prevents data corruption caused by the Split Brain syndrome by shutting down a server that can no longer communicate with the majority of the servers in the entire cluster because of network failure. When communication with exactly half of the servers in the entire cluster is failing, emergency shutdown takes place in a server that cannot communicate with the master server.

    • When more than half of the servers are down, the remaining servers that are running properly also go down.

    • If all servers are isolated due to a hub error, all servers go down.

  • Not solving the network partition

    • This method can be selected in a cluster that does not use any disk resource (a shared disk).

    • If a failure occurs on all network channels between servers in a cluster, all servers fail over.
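
The decisions described for the PING method and the Majority method can be sketched as simple functions. This is a minimal Python illustration of the rules stated above, not the actual EXPRESSCLUSTER implementation; the function names, parameters, and returned strings are hypothetical, and the Majority sketch assumes that the count of reachable servers includes the local server and that the master server always counts itself as reachable.

```python
def ping_method_action(heartbeat_lost: bool,
                       ping_device_responds: bool,
                       ping_failing_before_loss: bool = False) -> str:
    """Action taken by the local server under the PING method (sketch)."""
    if not heartbeat_lost:
        return "continue"
    if ping_failing_before_loss:
        # The ping device was already down, so the partition cannot be
        # resolved and every server fails over.
        return "failover"
    if ping_device_responds:
        # The other server is judged to have failed.
        return "failover"
    # The local server is judged to be isolated from the network.
    return "emergency shutdown"


def majority_method_action(total_servers: int,
                           reachable_servers: int,
                           master_reachable: bool) -> str:
    """Action taken by the local server under the Majority method (sketch).

    reachable_servers includes the local server (assumption of this sketch).
    """
    if reachable_servers * 2 > total_servers:
        return "continue"
    if reachable_servers * 2 == total_servers and master_reachable:
        # Exactly half: only servers that can reach the master survive.
        return "continue"
    return "emergency shutdown"
```

For example, in a four-node cluster split into two halves, majority_method_action(4, 2, True) keeps the half that contains the master server running, while the other half shuts down.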

The following are the recommended methods to resolve the network partition:

  • The PING + DISK method is recommended for a cluster with three or more nodes that uses a shared disk. When using the hybrid type, use the PING + DISK method for the servers connected to the shared disk, and use only the PING method for the servers not connected to the shared disk.

  • The PING method is recommended for a cluster with three or more nodes but without a shared disk.

  • The COM + DISK method or the PING + DISK method is recommended for a cluster that uses a shared disk with two nodes.

  • The COM method or the PING method is recommended for a cluster with two nodes but without a shared disk.

  • The HTTP method is recommended for a cluster that uses the Witness heartbeat resource but does not use a shared disk.
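
The recommendations above can be summarized as a lookup. The following Python sketch is illustrative only; the function name and parameters are assumptions, and the hybrid-type case (different methods for servers with and without a shared disk connection) is not modeled.

```python
def recommended_np_methods(nodes: int, shared_disk: bool,
                           witness_heartbeat: bool = False) -> list[str]:
    """Return the NP resolution methods recommended for a configuration."""
    methods: list[str] = []
    if nodes >= 3:
        methods.append("PING + DISK" if shared_disk else "PING")
    elif nodes == 2:
        if shared_disk:
            methods.extend(["COM + DISK", "PING + DISK"])
        else:
            methods.extend(["COM", "PING"])
    if witness_heartbeat and not shared_disk:
        # Clusters that use the Witness heartbeat resource without a shared disk.
        methods.append("HTTP")
    return methods
```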

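Method to resolve a network partition: COM
  Number of nodes: 2
  Required hardware: Serial cable
  Circumstance where failover cannot be performed: COM error
  When all network channels are disconnected: The master server survives
  Circumstance where both servers fail over: COM error and network disconnection occur simultaneously
  Time required to resolve network partition: 0

Method to resolve a network partition: DISK
  Number of nodes: No limit
  Required hardware: Shared disk
  Circumstance where failover cannot be performed: Disk error
  When all network channels are disconnected: The master server survives
  Circumstance where both servers fail over: None
  Time required to resolve network partition: Time calculated by the heartbeat timeout and disk I/O wait time is needed

Method to resolve a network partition: PING
  Number of nodes: No limit
  Required hardware: Device to receive the ping command and return a response
  Circumstance where failover cannot be performed: None
  When all network channels are disconnected: Server that responds to the ping command survives
  Circumstance where both servers fail over: All networks are disconnected after the ping command times out the specified number of times consecutively
  Time required to resolve network partition: 0

Method to resolve a network partition: HTTP
  Number of nodes: No limit
  Required hardware: Web server
  Circumstance where failover cannot be performed: Web server failure
  When all network channels are disconnected: A server that can communicate with the Web server survives
  Circumstance where both servers fail over: None
  Time required to resolve network partition: 0

Method to resolve a network partition: COM + DISK
  Number of nodes: 2
  Required hardware: Serial cable, shared disk
  Circumstance where failover cannot be performed: COM error and disk error
  When all network channels are disconnected: The master server survives
  Circumstance where both servers fail over: None
  Time required to resolve network partition: 0

Method to resolve a network partition: PING + DISK
  Number of nodes: No limit
  Required hardware: Device to receive the ping command and return a response, shared disk
  Circumstance where failover cannot be performed: None
  When all network channels are disconnected: Server responding to the ping command survives
  Circumstance where both servers fail over: None
  Time required to resolve network partition: 0

Method to resolve a network partition: Majority
  Number of nodes: 3 or more
  Required hardware: None
  Circumstance where failover cannot be performed: Majority of servers go down
  When all network channels are disconnected: A server that can communicate with the majority of servers survives
  Circumstance where both servers fail over: None
  Time required to resolve network partition: 0

Method to resolve a network partition: None
  Number of nodes: No limit
  Required hardware: None
  Circumstance where failover cannot be performed: None
  When all network channels are disconnected: All servers fail over
  Circumstance where both servers fail over: All networks are disconnected
  Time required to resolve network partition: 0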

4. Installing EXPRESSCLUSTER

This chapter provides instructions for installing EXPRESSCLUSTER.

This chapter covers:

4.1. Steps from Installing EXPRESSCLUSTER to creating a cluster

The following describes the steps from installing EXPRESSCLUSTER, license registration, cluster system creation, to verifying the cluster system status.

Before proceeding to the following steps, make sure to read "2. Determining a system configuration" and "3. Configuring a cluster system" and check system requirements and the configuration of a cluster.

  1. Install the EXPRESSCLUSTER Server

    Install the EXPRESSCLUSTER Server, which is the core EXPRESSCLUSTER module, on each server that constitutes the cluster. License registration is also performed when the Server is installed. (See "4. Installing EXPRESSCLUSTER.")
    Reboot the server.
  2. Create the cluster configuration data using Cluster WebUI

    Create the cluster configuration data by using the Cluster WebUI. (See "6. Creating the cluster configuration data.")

  3. Create a cluster

    Create a cluster by applying the cluster configuration data created with the Cluster WebUI. (See "6. Creating the cluster configuration data.")

  4. Verify the cluster status using the Cluster WebUI

    Verify the status of a cluster that you have created using the Cluster WebUI. (See "7. Verifying a cluster system.")

See also

Refer to the "Reference Guide" as needed while performing the steps described in this guide. For the latest information on the system requirements and release information, refer to "Installation requirements for EXPRESSCLUSTER" and "Latest version information" in the "Getting Started Guide".

4.2. Installing the EXPRESSCLUSTER Server

Install the EXPRESSCLUSTER Server, which is an EXPRESSCLUSTER module, on each server machine constituting a cluster system.
License registration is required when installing the Server. Make sure that you have the required license file or license sheet.

The EXPRESSCLUSTER Server consists of the following system services:

Service Display Name                   Service Name   Description                                     Startup Type   Service Status (usual)
EXPRESSCLUSTER                         clpstartup     EXPRESSCLUSTER                                  Automatic      Running
EXPRESSCLUSTER API                     clprstd        Control of the EXPRESSCLUSTER RESTful API       Automatic      Stopped
EXPRESSCLUSTER Disk Agent              clpdiskagent   Shared disk, mirror disk, hybrid disk control   Manual         Running
EXPRESSCLUSTER Event                   clpevent       Event log output                                Automatic      Running
EXPRESSCLUSTER Information Base        clpibsv        Cluster information management                  Automatic      Running
EXPRESSCLUSTER Java Resource Agent     clpjra         Java Resource Agent                             Manual         Stopped
EXPRESSCLUSTER Manager                 clpwebmgr      WebManager Server                               Automatic      Running
EXPRESSCLUSTER Old API Support         clpoldapi      Compatible API process                          Automatic      Running
EXPRESSCLUSTER Server                  clppm          EXPRESSCLUSTER Server                           Automatic      Running
EXPRESSCLUSTER System Resource Agent   clpsra         System Resource Agent                           Manual         Stopped
EXPRESSCLUSTER Transaction             clptrnsv       Communication process                           Automatic      Running
EXPRESSCLUSTER Web Alert               clpwebalt      Alert synchronization                           Automatic      Running

Note

The status of EXPRESSCLUSTER Java Resource Agent will be "Running" when a JVM monitor resource is set.

Note

The status of EXPRESSCLUSTER System Resource Agent will be "Running" when a system monitor resource or a process resource monitor resource is set, or when Collect the System Resource Information is checked on the Monitor tab in Cluster Properties.
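
As a rough way to compare the services on a server with the usual statuses listed above, the following Python sketch parses the output of the standard Windows sc query command. The helper names are hypothetical, the parser assumes the English-language output format of sc query (a line such as "STATE : 4 RUNNING"), and the Java Resource Agent and System Resource Agent may legitimately be running as described in the notes above.

```python
import re
import subprocess

# Usual service status taken from the table above, keyed by service name.
USUAL_STATUS = {
    "clpstartup": "RUNNING", "clprstd": "STOPPED", "clpdiskagent": "RUNNING",
    "clpevent": "RUNNING", "clpibsv": "RUNNING", "clpjra": "STOPPED",
    "clpwebmgr": "RUNNING", "clpoldapi": "RUNNING", "clppm": "RUNNING",
    "clpsra": "STOPPED", "clptrnsv": "RUNNING", "clpwebalt": "RUNNING",
}


def parse_sc_state(sc_output: str) -> str:
    """Extract the state name (for example RUNNING) from 'sc query' output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else "UNKNOWN"


def query_service_state(service_name: str) -> str:
    """Run 'sc query <service>' on the local Windows server and parse it."""
    result = subprocess.run(["sc", "query", service_name],
                            capture_output=True, text=True)
    return parse_sc_state(result.stdout)


def unexpected_states(states: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map service name to (usual, actual) where the two differ."""
    return {name: (USUAL_STATUS[name], state)
            for name, state in states.items()
            if name in USUAL_STATUS and state != USUAL_STATUS[name]}
```

On a cluster server, unexpected_states({name: query_service_state(name) for name in USUAL_STATUS}) would list the services whose current state differs from the table.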

4.2.1. Installing the EXPRESSCLUSTER Server for the first time

Install EXPRESSCLUSTER X on all servers that constitute the cluster by following the procedure below.

Important

When a shared disk is used, make sure that the OS is not started on more than one of the servers connected to the shared disk before EXPRESSCLUSTER is installed. Otherwise, data on the shared disk may be corrupted.

Note

Install the EXPRESSCLUSTER Server using an Administrator account.

Note

Installing the EXPRESSCLUSTER Server disables the Windows media sense function, which deactivates an IP address when a link-down occurs because of cable disconnection.

Note

If the Windows SNMP Service has already been installed, the SNMP linkage function will be automatically set up when the EXPRESSCLUSTER Server is installed. If, however, the Windows SNMP Service has not yet been installed, the SNMP linkage function will not be set up.
When setting up the SNMP linkage function after installing the EXPRESSCLUSTER Server, refer to "4.2.4. Setting up the SNMP linkage function manually".
  1. Insert the installation CD-ROM into the CD-ROM drive.

  2. After the menu window is displayed, select EXPRESSCLUSTER for Windows.

    Note

    If the menu window does not open automatically, double-click the menu.exe in the root folder of the CD-ROM.

  3. Select EXPRESSCLUSTER X 4.2 for Windows.

  4. The NEC EXPRESSCLUSTER Setup window is displayed. Click Next.

  5. The Choose Destination Location dialog box is displayed. When changing the install destination, click Browse to select a directory.

  6. In the Ready to Install the Program window, click Install to start installing.

  7. After the installation is completed, click Next without changing the default value in Port Number.

    Note

    The port number configured here needs to be configured again when creating the cluster configuration data. For details on port number, refer to "Parameter details" in the "Reference Guide".

  8. In Filter Settings of Shared Disk, right-click SCSI controller or HBA connected to a shared disk, and click Filtering. Click Next.

    Important

    When a shared disk is used, configure filtering settings for the SCSI controller or HBA to be connected to the shared disk. If the shared disk is connected without configuring filtering settings, data on the shared disk may be corrupted. When the disk path is duplicated, the filter must be configured for all the HBAs physically connected to the shared disk, even though it may look as if the shared disk is connected to only one HBA.

    Important

    When using mirror disk resources, do not configure filtering settings for the SCSI controller/HBA to which an internal disk used as the mirroring target is connected. If the filter is activated for mirror disk resources, starting the mirror disk resources fails. However, filtering settings are essential when shared disks are used for mirroring.

  9. The window that shows the completion of setting is displayed. Click Yes.

  10. License Manager is displayed. Click Register to register the license. For detailed information on the registration procedure, refer to "5. Registering the license" in this guide.

  11. Click Finish to close the License Manager dialog box.

  12. The Complete InstallShield Wizard dialog box is displayed. Select Restarting and click Finish. The server will be rebooted.

Note

When a shared disk is used, it cannot be accessed after the OS reboots because access to it is restricted.

4.2.2. Installing the EXPRESSCLUSTER Server in Silent Mode

In silent mode, the EXPRESSCLUSTER Server is installed automatically, without displaying any dialog box that prompts the user to respond while the installer is running. This installation function is useful when the installation folder and installation options are the same for all server machines. It not only saves the user effort but also prevents installation errors caused by incorrect settings.
Install the EXPRESSCLUSTER Server on all servers that constitute the cluster by following the procedure below.

Note

Installation in silent mode is not available for a shared disk configuration. For a shared disk configuration, install the EXPRESSCLUSTER Server by referring to "Installing the EXPRESSCLUSTER Server for the first time."

Note

Install the EXPRESSCLUSTER Server using an Administrator account.

Note

Installing the EXPRESSCLUSTER Server disables the Windows media sense function, which deactivates an IP address when a link-down occurs because of cable disconnection.

Note

If the Windows SNMP Service has already been installed, the SNMP linkage function will be automatically set up when the EXPRESSCLUSTER Server is installed. If, however, the Windows SNMP Service has not yet been installed, the SNMP linkage function will not be set up.
When setting up the SNMP linkage function after installing the EXPRESSCLUSTER Server, refer to "4.2.4. Setting up the SNMP linkage function manually".

Preparation

If you want to change the installation folder (default: C:\Program Files\EXPRESSCLUSTER), create a response file in advance following the procedure below.

  1. Copy the response file from the installation CD-ROM to any accessible location in the server.
    Copy the following file in the installation CD-ROM.
    Windows\4.2\common\server\x64\response\setup_inst_en.iss
  2. Open the response file (setup_inst_en.iss) by using a text editor and change the folder indicated by (*).

Installation procedure

  1. Execute the following command from the command prompt to start setup.
    # "<Path of silent-install.bat>silent-install.bat" -i <Path of response file>
    * <Path of silent-install.bat>:
    Windows\4.2\common\server\x64\silent-install.bat
    in the installation CD-ROM.
    * When installing the EXPRESSCLUSTER Server in the default directory (C:\Program Files\EXPRESSCLUSTER), omit <Path of response file>.
  2. Restart the server.

  3. Execute the following command from the command prompt to register the license.
    # "<Installation folder>\bin\clplcnsc.exe" -i <Path of license file>
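
When the same installation is repeated on several servers, the two command lines above can be assembled programmatically. The following Python sketch only builds the argument lists; the function names and the example paths (drive letters, file locations) are assumptions for illustration, and the sketch covers only the case where a response file is specified.

```python
from pathlib import PureWindowsPath


def silent_install_command(installer_dir: str, response_file: str) -> list[str]:
    """Build: "<Path of silent-install.bat>silent-install.bat" -i <response file>."""
    installer = PureWindowsPath(installer_dir) / "silent-install.bat"
    return [str(installer), "-i", response_file]


def license_register_command(install_folder: str, license_file: str) -> list[str]:
    """Build: "<Installation folder>\\bin\\clplcnsc.exe" -i <license file>."""
    clplcnsc = PureWindowsPath(install_folder) / "bin" / "clplcnsc.exe"
    return [str(clplcnsc), "-i", license_file]
```

For example, silent_install_command(r"D:\Windows\4.2\common\server\x64", r"C:\temp\setup_inst_en.iss") returns the installer path followed by -i and the response file. Each list can then be handed to a process launcher on the target server, with the restart in step 2 performed between the two commands.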

4.2.3. Upgrading EXPRESSCLUSTER Server from the previous version

Before starting the upgrade, read the following notes.

  • You can upgrade from EXPRESSCLUSTER X 1.0, 2.0, 2.1, 3.0, 3.1, 3.2 or 3.3 to EXPRESSCLUSTER X 4.2.

  • You need the CD-ROM that contains the setup files, and the software licenses, for EXPRESSCLUSTER X 4.2.

  • You cannot use cluster configuration data that was created with a version of EXPRESSCLUSTER X later than the version in use.

  • Cluster configuration data that was created with EXPRESSCLUSTER X 1.0, 2.0, 2.1, 3.0, 3.1, 3.2, 3.3, 4.0, 4.1 or 4.2 for Windows can be used with the EXPRESSCLUSTER X in use.

  • If mirror disk resources or hybrid disk resources are set, cluster partitions require 1 GB or more of space. In addition, a full copy of the mirror disk resources or hybrid disk resources must be executed.

  • If mirror disk resources or hybrid disk resources are set, backing up the data in advance is recommended. For details of the backup procedure, refer to "Performing a snapshot backup" in "The system maintenance information" in the "Maintenance Guide".

  • The EXPRESSCLUSTER Server must be upgraded with an account that has Administrator privileges.

See also

For the update from X 4.0/4.1 to X 4.2, see "Update Procedure Manual".

The following procedures explain how to upgrade from EXPRESSCLUSTER X 1.0, 2.0, 2.1, 3.0, 3.1, 3.2 or 3.3 to EXPRESSCLUSTER X 4.2.

  1. Before upgrading, confirm that the servers in the cluster and all the resources are in normal status by using WebManager or the command.

  2. Save the current cluster configuration file with the Builder or clpcfctrl command. For details about saving the cluster configuration file with clpcfctrl command, refer to "Backing up the cluster configuration data (clpcfctrl --pull)" of "Creating a cluster and backing up configuration data (clpcfctrl command)" in "EXPRESSCLUSTER command reference" in the "Reference Guide".

  3. When the EXPRESSCLUSTER Server service of the target server is configured as Auto Startup, change the settings to Manual Startup.

  4. Shut down the entire cluster.

  5. Start only one server, and uninstall the EXPRESSCLUSTER Server. For details about uninstalling the EXPRESSCLUSTER Server, refer to "10.1.1. Uninstalling the EXPRESSCLUSTER Server" in "10. Uninstalling and reinstalling EXPRESSCLUSTER" in this guide.

  6. Install EXPRESSCLUSTER X 4.2 on the server from which the old version of the EXPRESSCLUSTER Server was uninstalled in step 5, and then register the license as necessary. For details about how to install the EXPRESSCLUSTER Server, refer to "4.2. Installing the EXPRESSCLUSTER Server" in "4. Installing EXPRESSCLUSTER" in this guide.

  7. Shut down the server on which EXPRESSCLUSTER X 4.2 was installed in step 6.

  8. Perform steps 5 to 7 on each server.

  9. Start all the servers.

  10. If mirror disk resources or hybrid disk resources are set, allocate a cluster partition. (The cluster partition should be 1 GB or larger.)

  11. Access the URL below to start the WebManager.
    http://<actual IP address of an installed server>:29003/main.htm
    Import the cluster configuration file that was saved in step 2.
    If the drive letter of the cluster partition differs from the one in the configuration, modify the configuration. For the groups to which mirror disk resources or hybrid disk resources belong, if Startup Attribute is Auto Startup on the Attribute tab of Group Properties, change it to Manual Startup.
    To keep using the Maximum Failover Count values that were set before EXPRESSCLUSTER was upgraded, change Cluster Properties -> Extension tab -> Failover Count Method from Server to Cluster.
  12. Upload the cluster configuration data with the Cluster WebUI.
    When the message "There is difference between the disk information in the configuration information and the disk information in the server. Are you sure you want automatic modification?" appears, select Yes.
    If the fixed-term license is used, run the following command.

    clplcnsc --distribute

  13. Start the cluster on Cluster WebUI.

  14. If mirror disk resources or hybrid disk resources are set, from the mirror disk list, execute a full copy assuming that the server with the latest data is the copy source.

  15. Start the group and confirm that each resource starts normally.

  16. If Startup Attribute was changed from Auto Startup to Manual Startup in step 11, use the config mode of Cluster WebUI to change this to Auto Startup. Then, click Apply the Configuration File to apply the cluster configuration data to the cluster.

  17. This completes the procedure for upgrading the EXPRESSCLUSTER Server. Check that the servers are operating normally as a cluster by using the clpstat command or Cluster WebUI.
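
Which document applies to a given source version, and the URL used in step 11, can be sketched as follows. This Python fragment is illustrative only; the function names and returned strings are hypothetical, while the version lists and the port number 29003 are taken from this section.

```python
# Versions covered by the upgrade procedure in this section.
DIRECT_UPGRADE_VERSIONS = {"1.0", "2.0", "2.1", "3.0", "3.1", "3.2", "3.3"}
# Versions for which the "Update Procedure Manual" applies instead.
UPDATE_MANUAL_VERSIONS = {"4.0", "4.1"}


def upgrade_reference_for(current_version: str) -> str:
    """Return which procedure applies when moving to EXPRESSCLUSTER X 4.2."""
    if current_version in DIRECT_UPGRADE_VERSIONS:
        return "upgrade procedure in this section"
    if current_version in UPDATE_MANUAL_VERSIONS:
        return "Update Procedure Manual"
    raise ValueError(f"version not covered by this section: {current_version}")


def webmanager_url(server_ip: str) -> str:
    """Build the URL accessed in step 11 for an installed server."""
    return f"http://{server_ip}:29003/main.htm"
```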

4.2.4. Setting up the SNMP linkage function manually

Note

If you are using only the SNMP trap transmission function, you do not need to perform this procedure.

To handle information acquisition requests on SNMP, the Windows SNMP Service must be installed separately and the SNMP linkage function must be registered separately.
If the Windows SNMP Service has already been installed, the SNMP linkage function will be automatically registered when the EXPRESSCLUSTER Server is installed. If, however, the Windows SNMP Service has not been installed, the SNMP linkage function will not be registered.

When the Windows SNMP Service has not been installed, follow the procedure below to manually register the SNMP linkage function.

Note

Use an Administrator account to perform the registration.

  1. Install the Windows SNMP Service.

  2. Stop the Windows SNMP Service.

  3. Register the SNMP linkage function of EXPRESSCLUSTER with the Windows SNMP Service.
    3-1. Start the registry editor.
    3-2. Open the following key:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\SNMP\Parameters\ExtensionAgents
    3-3. Specify the following to create a string value in the opened key:
    Value name : mgtmib
    Value type : REG_SZ
    Value data : SOFTWARE\NEC\EXPRESSCLUSTER\SnmpAgent\mgtmib\CurrentVersion
    3-4. Exit the registry editor.
  4. Start the Windows SNMP Service.
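
The registry change in step 3 can also be scripted. The following is a minimal sketch, not an EXPRESSCLUSTER tool: it builds the argument list for the standard Windows `reg add` command from the key, value name, value type, and value data listed in step 3. The function names are hypothetical, and the sketch assumes that it is run from an Administrator account while the Windows SNMP Service is stopped.

```python
import subprocess
import sys

# Values taken from step 3 of the procedure above.
EXTENSION_AGENTS_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\SNMP\Parameters\ExtensionAgents"
)
VALUE_NAME = "mgtmib"
VALUE_TYPE = "REG_SZ"
VALUE_DATA = r"SOFTWARE\NEC\EXPRESSCLUSTER\SnmpAgent\mgtmib\CurrentVersion"


def build_reg_add_command():
    """Return the reg.exe argument list that creates the string value."""
    return [
        "reg", "add", EXTENSION_AGENTS_KEY,
        "/v", VALUE_NAME,
        "/t", VALUE_TYPE,
        "/d", VALUE_DATA,
        "/f",  # overwrite without prompting if the value already exists
    ]


def register_snmp_linkage():
    """Run the command. This is only meaningful on a Windows server."""
    if sys.platform != "win32":
        raise RuntimeError("This registration applies to Windows servers only.")
    subprocess.run(build_reg_add_command(), check=True)


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(" ".join(build_reg_add_command()))
```

Printing the command first makes it possible to compare it with the values in step 3 before calling `register_snmp_linkage()` on the server.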

Note

Specify the settings required for SNMP communication on the Windows SNMP Service.

5. Registering the license

To run EXPRESSCLUSTER as a cluster system, you need to register the license. This chapter describes how to register an EXPRESSCLUSTER license.

This chapter covers:

5.1. Registering the license

EXPRESSCLUSTER licenses can be registered during installation, and can also be added or deleted after installation.

5.1.1. Registering the CPU license

Register the following EXPRESSCLUSTER CPU licenses on the master server of the cluster.

Main Products

  • EXPRESSCLUSTER X 4.2 for Windows

  • EXPRESSCLUSTER X SingleServerSafe 4.2 for Windows

  • EXPRESSCLUSTER X SingleServerSafe for Windows Upgrade

5.1.2. Registering the node license

Register the following EXPRESSCLUSTER node licenses on each server in the cluster.

Main Products

  • EXPRESSCLUSTER X 4.2 for Windows VM

  • EXPRESSCLUSTER X SingleServerSafe 4.2 for Windows VM

  • EXPRESSCLUSTER X SingleServerSafe for Windows VM Upgrade

Optional Products

  • EXPRESSCLUSTER X Replicator 4.2 for Windows

  • EXPRESSCLUSTER X Replicator DR 4.2 for Windows

  • EXPRESSCLUSTER X Replicator DR 4.2 Upgrade for Windows

  • EXPRESSCLUSTER X Database Agent 4.2 for Windows

  • EXPRESSCLUSTER X Internet Server Agent 4.2 for Windows

  • EXPRESSCLUSTER X Application Server Agent 4.2 for Windows

  • EXPRESSCLUSTER X Java Resource Agent 4.2 for Windows

  • EXPRESSCLUSTER X System Resource Agent 4.2 for Windows

  • EXPRESSCLUSTER X Alert Service 4.2 for Windows

Note

If the licenses for optional products have not been installed, the resources and monitor resources corresponding to those licenses are not shown in the list on the Cluster WebUI.

There are two ways to register a license: specifying the license file, or entering the information on the license sheet.
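
The difference between the two license types can be summarized as a small lookup. The following is a minimal sketch under stated assumptions: the labels "cpu" and "node" and the function name are chosen for illustration only and are not part of EXPRESSCLUSTER.

```python
def registration_targets(license_kind, servers, master):
    """Return the servers on which a license of the given kind is registered.

    license_kind: "cpu" or "node" (labels chosen for this sketch).
    servers: names of all servers in the cluster.
    master: name of the master server.
    """
    if master not in servers:
        raise ValueError("the master server must be one of the cluster servers")
    if license_kind == "cpu":
        return [master]        # CPU license: master server only
    if license_kind == "node":
        return list(servers)   # node license: every cluster server
    raise ValueError(f"unknown license kind: {license_kind!r}")


servers = ["server1", "server2"]
print(registration_targets("cpu", servers, "server1"))
print(registration_targets("node", servers, "server1"))
```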

5.1.3. Notes on the CPU license

Notes on using the CPU license are as follows:

  • After the CPU license is registered on the master server, Cluster WebUI on the master server must be used to edit and apply the cluster configuration data as described in "6. Creating the cluster configuration data".

5.1.4. Registering the license by entering the license information

The following describes how to register the license by entering the license information.
Before you register the license, make sure that:

EXPRESSCLUSTER CPU license

  • You have the license sheet you officially obtained from the sales agent. The values on this license sheet are used for registration.

  • You have the administrator privileges to log in to the server intended to be used as the master server in the cluster.

EXPRESSCLUSTER node license

  • You have the license sheet you officially obtained from the sales agent. You need as many license sheets as the number of servers on which the product will be used. The values on this license sheet are used for registration.

  • You have the administrator privileges to log in to the server on which you intend to use the product.

  1. On the Start menu, click License Manager of the EXPRESSCLUSTER Server.

  2. In the License Manager dialog box, click Register.

  3. In the window to select a license method, select Register with License Information.

  4. In the Product selection dialog box, select the product category, and click Next.

  5. In the License Key Entry dialog box, enter the serial number and license key of the license sheet. Click Next.

  6. Confirm what you have entered on the License Registration Confirmation dialog box. Click Next.

  7. Make sure that the pop-up message "The license was registered." is displayed. If the license registration fails, start again from step 2.

5.1.5. Registering the license by specifying the license file

The following describes how to register the license by specifying the license file.

Before you register the license, check that:

EXPRESSCLUSTER CPU license

  • You have the administrator privileges to log in to the server intended to be used as the master server in the cluster.

  • The license file is located on the server intended to be used as the master server in the cluster.

EXPRESSCLUSTER node license

  • You have the administrator privileges to log in to the server on which you intend to use the product.

  • The license file is located on the server on which you intend to use the product, among the servers that constitute the cluster system.

  1. On the Start menu, click License Manager of the EXPRESSCLUSTER Server.

  2. In the License Manager dialog box, click Register.

  3. In the window for selecting a license method, select Register with License File.

  4. In the License File Specification dialog box, select the license file to be registered and then click Open.

  5. The message confirming registration of the license is displayed. Click OK.

  6. Click Finish to close the license manager.

5.2. Referring and/or deleting the license

5.2.1. How to refer to and/or delete the registered license

The following procedure describes how to refer to and delete the registered license.

  1. On the Start menu, click License Manager of the EXPRESSCLUSTER Server.

  2. In the License Manager dialog box, click Refer / Delete.

  3. The registered licenses are listed.

  4. Select the license to delete and click Delete.

  5. The confirmation message to delete the license is displayed. Click OK.

5.3. Registering the fixed term license

EXPRESSCLUSTER licenses can be registered during installation, and can also be added or deleted after installation.
Use the fixed term license to operate the cluster system which you intend to construct for a limited period of time.
This license becomes effective on the date when the license is registered and then will be effective for a certain period of time.
In preparation for the expiration, the license for the same product can be registered multiple times. Extra licenses are saved and a new license will take effect when the current license expires.

The fixed term license applies to the EXPRESSCLUSTER X 4.2 for Windows and optional products as shown below. Among servers that constitute the cluster, use the master server to register the fixed term license.

Main Products

  • EXPRESSCLUSTER X 4.2 for Windows

Optional Products

  • EXPRESSCLUSTER X Replicator 4.2 for Windows

  • EXPRESSCLUSTER X Replicator DR 4.2 for Windows

  • EXPRESSCLUSTER X Database Agent 4.2 for Windows

  • EXPRESSCLUSTER X Internet Server Agent 4.2 for Windows

  • EXPRESSCLUSTER X Application Server Agent 4.2 for Windows

  • EXPRESSCLUSTER X Java Resource Agent 4.2 for Windows

  • EXPRESSCLUSTER X System Resource Agent 4.2 for Windows

  • EXPRESSCLUSTER X Alert Service 4.2 for Windows

Note

If the licenses for optional products have not been installed, the resources and monitor resources corresponding to those licenses are not shown in the list on the Cluster WebUI.

A license is registered by specifying the license file.

5.3.1. Notes on the fixed term license

Notes on using the fixed term license are as follows:

  • The fixed term license cannot be registered to several of the servers constituting the cluster to operate them.

  • After the license is registered on the master server, Cluster WebUI on the master server must be used to edit and apply the cluster configuration data as described in "6. Creating the cluster configuration data".

  • The number of fixed term licenses must be larger than the number of servers constituting the cluster.

  • After the cluster starts operating, additional fixed term licenses must be registered on the master server.

  • Once enabled, a fixed term license cannot be reregistered after the license or server is removed or the server is replaced, even if the license is still within its validity period.

5.3.2. Registering the fixed term license by specifying the license file

The following describes how to register a fixed term license.
Before you register the license, check that:
  • You have the administrator privileges to log in to the server intended to be used as the master server in the cluster.

  • The license files for all the products you intend to use are stored on the server that will be set as the master server among the servers that constitute the cluster system.

Follow the steps below to register all the license files for the products to be used. If you have two or more license files for the same product in preparation for the expiration, register the extra license files by following the same steps.

  1. On the Start menu, click License Manager of the EXPRESSCLUSTER Server.

  2. In the License Manager dialog box, click Register.

  3. In the window for selecting a license method, select Register with License File.

  4. In the License File Specification dialog box, select the license file to be registered and then click Open.

  5. The message confirming registration of the license is displayed. Click OK.

  6. Click Finish to close the license manager.

5.4. Referring and/or deleting the fixed term license

5.4.1. How to refer to and/or delete the registered fixed term license

The procedure for referring and/or deleting the registered fixed term license is the same as that described in "5.2.1. How to refer to and/or delete the registered license".

6. Creating the cluster configuration data

In EXPRESSCLUSTER, data that contains information on how a cluster system is configured is called "cluster configuration data." This data is created using the Cluster WebUI. This chapter describes how to start the Cluster WebUI and how to create the cluster configuration data with it, using a sample cluster configuration.

This chapter covers:

6.1. Creating the cluster configuration data

The cluster configuration data is created by using the config mode of Cluster WebUI, which is the function for creating and modifying cluster configuration data.

Start the Cluster WebUI from the management PC and create the cluster configuration data. The Cluster WebUI then applies the cluster configuration data to the cluster system.

6.2. Starting up the Cluster WebUI

Access to the Cluster WebUI is required to create cluster configuration data. This section gives an overview of the Cluster WebUI and describes how to access it to create cluster configuration data.

See also

For the system requirements of the Cluster WebUI, refer to "Installation requirements for EXPRESSCLUSTER" in the Getting Started Guide.

6.2.1. What is Cluster WebUI?

The Cluster WebUI is a function for setting up the cluster, monitoring its status, starting up or stopping servers and groups, and collecting cluster operation logs through a Web browser. The overview of the Cluster WebUI is shown in the following figures.

When connecting from a Web browser on the management PC, specify the floating IP address or virtual IP address for accessing Cluster WebUI in the URL. These addresses are registered as resources of the management group. When the management group does not exist, you can specify the address of one of the servers constituting the cluster (the fixed address allocated to the server) to connect the management PC to that server. In this case, the Cluster WebUI cannot acquire the status of the cluster if the server to be connected is not working.

6.2.2. Browsers supported by the Cluster WebUI

For information about evaluated Web browsers, refer to the "Getting Started Guide".

6.2.3. Starting the Cluster WebUI

The following describes how to start the Cluster WebUI.

  1. Start your Web browser.

  2. Enter the actual IP address and port number of the server where the EXPRESSCLUSTER Server is installed in the Address bar of the browser.

  3. The Cluster WebUI starts. To create the cluster configuration data, select Config Mode from the drop-down menu of the toolbar.

  4. Click Cluster generation wizard to start the wizard.

See also

For encrypted communication with EXPRESSCLUSTER Server, see "WebManager tab" of "Cluster properties" in "Parameter details" in the "Reference Guide". Enter the following to perform encrypted communication.
https://10.0.0.1:29003/
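
The address entered in the browser can be assembled as follows. This is a minimal sketch: the function name is hypothetical, the port 29003 and the sample address 10.0.0.1 are taken from the URL examples in this guide, and the unencrypted form is assumed to differ from the encrypted example above only in the scheme.

```python
import ipaddress

DEFAULT_PORT = 29003  # port used in the URL examples in this guide


def cluster_webui_url(address, port=DEFAULT_PORT, encrypted=False):
    """Build the URL to enter in the Address bar of the browser."""
    ip = ipaddress.ip_address(address)  # raises ValueError for a malformed address
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    # IPv6 literals must be enclosed in brackets inside a URL.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    scheme = "https" if encrypted else "http"
    return f"{scheme}://{host}:{port}/"


print(cluster_webui_url("10.0.0.1"))
print(cluster_webui_url("10.0.0.1", encrypted=True))
```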

6.3. Checking the values to be configured

Before you create the cluster configuration data using Cluster generation wizard, check the values you are going to enter. Write down the values to confirm that your cluster is efficiently configured and that no information is missing.

6.3.1. Sample cluster environment

As shown below, this chapter uses a typical cluster configuration with two nodes and a hybrid disk configuration with three nodes.

When a shared disk with two nodes is used:

When mirroring disks with two nodes are used:

When mirror disk resources with remotely-constructed two nodes are used:

When hybrid disks with three nodes are used:

The following tables list sample values of the cluster configuration data to achieve the cluster systems shown above. Step-by-step instructions for creating the cluster configuration data with these values are provided in the following sections. When you actually set the values, you may need to modify them according to the cluster you intend to create. For information on how to determine the values, refer to the Reference Guide.

Example of configuration with 2 nodes

| Target | Parameter | Value (For shared disk) | Value (For mirror disk) | Value (For remote construction) |
|---|---|---|---|---|
| Cluster configuration | Cluster name | Cluster | Cluster | Cluster |
| | Number of servers | 2 | 2 | 2 |
| | Number of management groups | 1 | 1 | - |
| | Number of failover groups | 1 | 1 | 1 |
| | Number of monitor resources | 5 | 6 | 1 |
| Heartbeat resources | Number of kernel mode LAN heartbeats | 2 | 2 | 1 |
| First server information (Master server) | Server name | server1 | server1 | server1 |
| | Interconnect IP address (Primary) | 192.168.0.1 | 192.168.0.1 | 10.0.0.1 |
| | Interconnect IP address (Backup) | 10.0.0.1 | 10.0.0.1 | - |
| | Public IP address | 10.0.0.1 | 10.0.0.1 | 10.0.0.1 |
| | Mirror connect I/F | - | 192.168.0.1 | 10.0.0.1 |
| | HBA | HBA connected to a shared disk | - | - |
| Second server information | Server name | server2 | server2 | server2 |
| | Interconnect IP address (Primary) | 192.168.0.2 | 192.168.0.2 | 10.0.0.2 |
| | Interconnect IP address (Backup) | 10.0.0.2 | 10.0.0.2 | - |
| | Public IP address | 10.0.0.2 | 10.0.0.2 | 10.0.0.2 |
| | Mirror connect I/F | - | 192.168.0.2 | 10.0.0.2 |
| | HBA | HBA connected to a shared disk | - | - |
| First NP resolution resource | Type | COM | - | Ping |
| | Ping target | - | - | 10.0.0.254 |
| | Server1 | COM1 | - | Use |
| | Server2 | COM1 | - | Use |
| Second NP resolution resource | Type | DISK | - | - |
| | Ping target | - | - | - |
| | Server1 | E: | - | - |
| | Server2 | E: | - | - |
| Group for management (For the Cluster WebUI) | Type | cluster | cluster | cluster |
| | Group name | ManagementGroup | ManagementGroup | ManagementGroup |
| | Startup server | all servers | all servers | all servers |
| | Number of group resources | 1 | 1 | 1 |
| Group resources for management [1] | Type | Floating IP resource | Floating IP resource | Floating IP resource |
| | Group resource name | ManagementIP | ManagementIP | ManagementIP |
| | IP address | 10.0.0.11 | 10.0.0.11 | 10.0.0.11 |
| Failover group | Type | failover | failover | failover |
| | Group name | failover1 | failover1 | failover1 |
| | Startup server | server1 -> server2 | server1 -> server2 | server1 -> server2 |
| | Number of group resources | 3 | 3 | 3 |
| First group resources | Type | Floating IP resource | Floating IP resource | Floating IP resource |
| | Group resource name | fip1 | fip1 | fip1 |
| | IP address | 10.0.0.12 | 10.0.0.12 | 10.0.0.12 |
| Second group resources | Type | Disk resource | Mirror disk resource | Mirror disk resource |
| | Group resource name | sd1 | md1 | md1 |
| | Disk resource drive letter | F: | - | - |
| | Mirror disk resource cluster partition drive letter | - | E: | E: |
| | Mirror disk resource data partition drive letter | - | F: | F: |
| Third group resources | Type | Application resource | Application resource | Application resource |
| | Group resource name | appli1 | appli1 | appli1 |
| | Resident type | Resident | Resident | Resident |
| | Start path | Path of execution file | Path of execution file | Path of execution file |
| First monitor resource (Created by default) | Type | User-mode monitor | User-mode monitor | User-mode monitor |
| | Monitor resource name | userw | userw | userw |
| Second monitor resources | Type | Disk RW monitor | Disk RW monitor | Disk RW monitor |
| | Monitor resource name | diskw1 | diskw1 | diskw1 |
| | File name | C:\check.txt [2] | C:\check.txt [2] | C:\check.txt [2] |
| | I/O size | 2000000 | 2000000 | 2000000 |
| | Action to be taken when detecting stall error | Intentional stop error occurs | Intentional stop error occurs | Intentional stop error occurs |
| | Action When Diskfull Is Detected | Recover | Recover | Recover |
| | Recovery target | cluster | cluster | cluster |
| | Final action | Intentional stop error occurs | Intentional stop error occurs | Intentional stop error occurs |
| Third monitor resources (Automatically created after the creation of disk resources) | Type | Disk TUR monitor | - | - |
| | Monitor resource name | sdw1 | - | - |
| | Disk resource | sd1 | - | - |
| | Recovery target | failover1sd1 | - | - |
| | Final action | None | - | - |
| Fourth monitor resource (Automatically created after the creation of ManagementIP resources) | Type | Floating IP monitor | Floating IP monitor | Floating IP monitor |
| | Monitor resource name | fipw1 | fipw1 | fipw1 |
| | Monitor target | ManagementIP | ManagementIP | ManagementIP |
| | Recovery target | ManagementIP | ManagementIP | ManagementIP |
| Fifth monitor resource (Automatically created after the creation of fip1 resources) | Type | Floating IP monitor | Floating IP monitor | Floating IP monitor |
| | Monitor resource name | fipw2 | fipw2 | fipw2 |
| | Monitor target | fip1 | fip1 | fip1 |
| | Recovery target | fip1 | fip1 | fip1 |
| Sixth monitor resources | Type | IP monitor | IP monitor | IP monitor |
| | Monitor resource name | ipw1 | ipw1 | ipw1 |
| | Monitored IP address | 10.0.0.254 (Gateway) | 10.0.0.254 (Gateway) | 10.0.0.254 (Gateway) |
| | Recovery target | All Groups | All Groups | All Groups |
| Seventh monitor resource (Automatically created after the creation of application resources when the application resources are of resident type) | Type | Application monitoring | Application monitoring | Application monitoring |
| | Monitor resource name | appliw1 | appliw1 | appliw1 |
| | Target resource | appli1 | appli1 | appli1 |
| | Recovery target | failover1appli1 | failover1 | failover1 |
| Eighth monitor resource (Automatically created after creation of mirror disk resource) | Type | - | Mirror connect monitoring | Mirror connect monitoring |
| | Monitor resource name | - | mdnw1 | mdnw1 |
| | Mirror disk resource | - | md1 | md1 |
| | Recovery target | - | md1 | md1 |
| | Final action | - | None | None |
| Ninth monitor resource (Automatically created after creation of mirror disk resource) | Type | - | Mirror disk monitor | Mirror disk monitor |
| | Monitor resource name | - | mdw1 | mdw1 |
| | Mirror disk resource | - | md1 | md1 |
| | Recovery target | - | md1 | md1 |
| | Final action | - | None | None |

[1] You should have a floating IP address to access the Cluster WebUI. You can access the Cluster WebUI from your Web browser with a floating IP address when an error occurs.

[2] To monitor the local disk, specify the file name on the system partition for the file name of the disk RW monitor resource.
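
Before entering sample values such as those in the shared-disk column of the two-node table above, it can help to write them down in a structured form and run a few sanity checks. The following is a minimal sketch: the dictionary layout and the function name are hypothetical, and the checks (well-formed addresses, no duplicates, matching resource count) are illustrative rather than rules enforced by EXPRESSCLUSTER.

```python
import ipaddress

# Sample values from the shared-disk column of the two-node table.
servers = {
    "server1": {"interconnect_primary": "192.168.0.1",
                "interconnect_backup": "10.0.0.1",
                "public": "10.0.0.1"},
    "server2": {"interconnect_primary": "192.168.0.2",
                "interconnect_backup": "10.0.0.2",
                "public": "10.0.0.2"},
}
floating_ips = {"ManagementIP": "10.0.0.11", "fip1": "10.0.0.12"}
failover_group = {
    "name": "failover1",
    "startup_servers": ["server1", "server2"],
    "declared_resource_count": 3,
    "resources": ["fip1", "sd1", "appli1"],
}


def check_sample(servers, floating_ips, group):
    """Return a list of problems found. An empty list means all checks passed."""
    problems = []
    fixed = set()
    for addrs in servers.values():
        for addr in addrs.values():
            ipaddress.ip_address(addr)  # raises ValueError on a malformed address
            fixed.add(addr)
    # The same role must not use the same address on two servers.
    for role in ("interconnect_primary", "interconnect_backup", "public"):
        values = [addrs[role] for addrs in servers.values()]
        if len(set(values)) != len(values):
            problems.append(f"duplicate {role} address")
    # Floating IP addresses must be valid, unique, and distinct from fixed ones.
    for name, addr in floating_ips.items():
        ipaddress.ip_address(addr)
        if addr in fixed:
            problems.append(f"{name} collides with a fixed server address")
    if len(set(floating_ips.values())) != len(floating_ips):
        problems.append("duplicate floating IP address")
    if len(group["resources"]) != group["declared_resource_count"]:
        problems.append("group resource count does not match the resource list")
    for server in group["startup_servers"]:
        if server not in servers:
            problems.append(f"unknown startup server {server}")
    return problems


print(check_sample(servers, floating_ips, failover_group))
```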

Example of hybrid disk configuration

| Target | Parameter | Value |
|---|---|---|
| Cluster configuration | Cluster name | cluster |
| | Number of servers | 3 |
| | Number of management groups | 1 |
| | Number of failover groups | 1 |
| | Number of monitor resources | 6 |
| Heartbeat resources | Number of kernel mode LAN heartbeats | 2 |
| First server information (Master server) | Server name | server1 |
| | Interconnect IP address (Dedicated) | 192.168.0.1 |
| | Interconnect IP address (Backup) | 10.0.0.1 |
| | Public IP address | 10.0.0.1 |
| | Mirror connect I/F | 192.168.0.1 |
| | HBA | HBA connected to a shared disk |
| Second server information | Server name | server2 |
| | Interconnect IP address (Dedicated) | 192.168.0.2 |
| | Interconnect IP address (Backup) | 10.0.0.2 |
| | Public IP address | 10.0.0.2 |
| | Mirror connect I/F | 192.168.0.2 |
| | HBA | HBA connected to a shared disk |
| Third server information | Server name | server3 |
| | Interconnect IP address (Dedicated) | 10.0.0.3 |
| | Interconnect IP address (Backup) | 192.168.0.3 |
| | Public IP address | 192.168.0.3 |
| | Mirror connect I/F | 192.168.0.3 |
| | HBA | - |
| First NP resolution resource | Type | DISK |
| | Ping target | - |
| | Server1 | E: |
| | Server2 | E: |
| | Server3 | Do not use |
| Second NP resolution resource | Type | Ping |
| | Ping target | 10.0.0.254 (gateway) |
| | Server1 | Use |
| | Server2 | Use |
| | Server3 | Use |
| Third NP resolution resource [3] | Type | Ping |
| | Ping target | 10.0.0.254 (gateway) |
| | Server1 | Use |
| | Server2 | Use |
| | Server3 | Do not use |
| First server group | Server group name | svg1 |
| | Belonging server | server1, server2 |
| Second server group | Server group name | svg2 |
| | Belonging server | server3 |
| Group for management (For the Cluster WebUI) | Type | failover |
| | Group name | ManagementGroup |
| | Startup server | All servers |
| | Number of group resources | 1 |
| Group resource for management [4] | Type | Floating IP resource |
| | Group resource name | ManagementIP |
| | IP address | 192.168.0.11 |
| Failover group | Type | failover |
| | Group name | failover1 |
| | Server group | svg1 -> svg2 |
| | Number of group resources | 3 |
| First group resources | Type | Floating IP resource |
| | Group resource name | fip1 |
| | IP address | 192.168.0.12 |
| Second group resources | Type | Hybrid disk resource |
| | Group resource name | hd1 |
| | Cluster partition drive letter | F: |
| | Data partition drive letter | G: |
| Third group resources | Type | Application resource |
| | Group resource name | appli1 |
| | Resident type | Resident |
| | Start path | Path of execution file |
| First monitor resource (Created by default) | Type | User-mode monitor |
| | Monitor resource name | userw |
| Second monitor resource | Type | Disk RW monitor |
| | Monitor resource name | diskw1 |
| | File name | C:\check.txt [5] |
| | I/O size | 2000000 |
| | Action to be taken when detecting stall error | Intentional stop error occurs |
| | Action When Diskfull Is Detected | Recover |
| | Recovery target | cluster |
| | Final action | Intentional stop error occurs |
| Third monitor resource (Automatically created after the creation of the hybrid disk resource) | Type | Hybrid disk monitor |
| | Monitor resource name | hdw1 |
| | Hybrid disk resource | hd1 |
| | Recovery target | failover1 |
| | Final action | None |
| Fourth monitor resource (Automatically created after the creation of the hybrid disk resource) | Type | Hybrid disk TUR monitor |
| | Monitor resource name | hdtw1 |
| | Hybrid disk resource | hd1 |
| | Recovery target | failover1 |
| | Final action | None |
| Fifth monitor resource (Automatically created after the creation of ManagementIP resources) | Type | Floating IP monitor |
| | Monitor resource name | fipw1 |
| | Monitor target | ManagementIP |
| | Recovery target | ManagementIP |
| Sixth monitor resource (Automatically created after the creation of fip1 resources) | Type | Floating IP monitor |
| | Monitor resource name | fipw2 |
| | Monitor target | fip1 |
| | Recovery target | fip1 |
| Seventh monitor resource | Type | IP monitor |
| | Monitor resource name | ipw1 |
| | Monitor IP address | 10.0.0.254 (gateway) |
| | Recovery target | All Groups |
| Eighth monitor resource (Automatically created after the creation of application resources when the application resources are of resident type) | Type | Application monitor |
| | Monitor resource name | appliw1 |
| | Target resource | appli1 |
| | Recovery target | failover1appli1 |

[3] Only the first and second servers, which are connected to the shared disk, use the ping + shared disk method for network partition resolution. For this reason, two Ping method NP resolution resources are needed: one that is used for the whole cluster, and one that is used only for the first and second servers.

[4] You should have a floating IP address. Even if an error occurs, you can access the Cluster WebUI run by the working server from your Web browser with this floating IP address.

[5] To monitor a local disk, specify the file name on the system partition for the file name of the disk RW monitor resource.
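
The per-server assignment of NP resolution resources in the hybrid disk sample can be checked with a short script. The following is a minimal sketch: the data layout and the function name are hypothetical, and `None` is used here to represent "Do not use".

```python
# NP resolution resources from the hybrid disk sample table.
# None means that the resource is not used on that server.
np_resources = [
    {"type": "DISK", "target": None,
     "servers": {"server1": "E:", "server2": "E:", "server3": None}},
    {"type": "Ping", "target": "10.0.0.254",
     "servers": {"server1": "Use", "server2": "Use", "server3": "Use"}},
    {"type": "Ping", "target": "10.0.0.254",
     "servers": {"server1": "Use", "server2": "Use", "server3": None}},
]
server_groups = {"svg1": ["server1", "server2"], "svg2": ["server3"]}


def np_methods_for(server):
    """Return the NP resolution types that are used on the given server."""
    return [r["type"] for r in np_resources if r["servers"].get(server)]


def servers_using(resource):
    """Return the sorted names of the servers that use the given resource."""
    return sorted(name for name, value in resource["servers"].items() if value)


for name in ("server1", "server2", "server3"):
    print(name, np_methods_for(name))

# In the sample, the servers that use the DISK method are the members of svg1.
print(servers_using(np_resources[0]) == sorted(server_groups["svg1"]))
```

In this sample, server1 and server2 use the DISK resource and both Ping resources, while server3 uses only the cluster-wide Ping resource.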

6.4. Procedure for creating the cluster configuration data

Creating the cluster configuration data involves creating a cluster, group resources, and monitor resources. Use Cluster generation wizard to create new configuration data. The procedure is described below.

Note

The created cluster configuration data can be modified later by using the rename function or properties view function.

6.4.1. Create a cluster

Create a cluster. Add the servers that constitute the cluster and determine the priorities of the servers and heartbeats.

6.4.1.1. Add a cluster

  1. On the Cluster window in Cluster generation wizard, click the Language field and select the language used by the OS.

    Note

    Only one language can be used in one cluster. When OSs with different languages are used in a cluster, specify "English."

  2. Enter the cluster name in the Cluster Name box.

  3. Enter the floating IP address (192.168.0.11) used to connect the Cluster WebUI in the Management IP Address box. Click Next.
    The Basic Settings window for the server is displayed. The server (server1) whose IP address was specified in the URL when starting up the Cluster WebUI is registered in the list.

6.4.1.2. Add a server

Add the second and subsequent servers to the cluster.

  1. In Server Definitions, click Add.

  2. The Add Server dialog box is displayed. Enter the server name, FQDN, or IP address of the second server, and then click OK. The second server (server2) is added to Server Definitions.

  3. For the hybrid disk configuration, add the third server (server3) in the same way.

  4. For the hybrid disk configuration, follow the procedure in "6.4.1.3. Create a server group".

  5. Click Next.

6.4.1.3. Create a server group

For the hybrid disk configuration, before creating a hybrid disk resource, create a group of the servers connected to the disk for each disk to be mirrored.

  1. Click Settings in Server Group Definition.

  2. Click Add in Server group.

  3. The Server Group Definition dialog box is displayed. Enter the server group name svg1 in the Name box.

  4. Click server1 in Available Servers, and then click Add. server1 is added to Servers that can run the Group.
    Likewise, add server2.
  5. Click OK. svg1 is added to Server Group Definitions.

  6. Click Add to open the Server Group Definition dialog box. Enter the server group name svg2 in the Name box.

  7. Click server3 in Available Servers, and then click Add. server3 is added to Servers that can run the Group.

  8. Click OK. svg1 and svg2 are now listed in Server Group Definitions.

  9. Click Close.

  10. Click Next.

6.4.1.4. Set up the network configuration

Set up the network configuration between the servers in the cluster.

  1. Add or delete communication routes by using Add or Delete, click a cell in each server column, and then select or enter the IP address. For a communication route to which some servers are not connected, leave the cells for the unconnected servers blank.

  2. For a communication route used for heartbeat transmission (interconnect), click a cell in the Type column, and then select Kernel Mode. For a communication route that is used only for the data mirroring communication of the mirror disk resource or the hybrid disk resource and not for the heartbeat, select Mirror Communication Only.
    At least one communication route must be specified for the interconnect. Specify as many communication routes for the interconnect as possible.
    If multiple interconnects are set up, the communication route for which the Priority column contains the smallest number is used preferentially for internal communication between the servers in the cluster. To change the priority, change the order of the communication routes by using the arrows.
  3. When using BMC heartbeat, click a cell in the Type column, and then select BMC. Next, click the cell of each server, and then enter the BMC IP address. For servers that do not use BMC heartbeat, leave the cells of those servers blank.

  4. When using Witness heartbeat, click a cell in the Type column, and select Witness. Next, click Properties, and enter the address of Witness server for Target Host. Then enter the port number for Service Port. For servers that do not use Witness heartbeat, click the cells of those servers, and select Do Not Use.

  5. For a communication route used for data mirroring communication for mirror disk resources or hybrid disk resources, click a cell of the MDC column, and then select the mirror disk connect name (mdc1 to mdc16) assigned to the communication route. Select Do Not Use for communication routes not used for data mirroring communication.

  6. Click Next.
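
The priority rule for interconnects in step 2 can be illustrated as follows. This is a minimal sketch under stated assumptions: the `Route` class and the function name are hypothetical, only routes of type Kernel Mode are treated as interconnect candidates, and the addresses are the sample values used in this chapter.

```python
from dataclasses import dataclass


@dataclass
class Route:
    priority: int     # value in the Priority column (smaller is preferred)
    route_type: str   # value in the Type column
    addresses: dict   # server name -> IP address


def preferred_interconnect(routes):
    """Return the Kernel Mode route with the smallest priority number."""
    candidates = [r for r in routes if r.route_type == "Kernel Mode"]
    if not candidates:
        raise ValueError("at least one interconnect route must be specified")
    return min(candidates, key=lambda r: r.priority)


routes = [
    Route(2, "Kernel Mode", {"server1": "10.0.0.1", "server2": "10.0.0.2"}),
    Route(1, "Kernel Mode", {"server1": "192.168.0.1", "server2": "192.168.0.2"}),
]
print(preferred_interconnect(routes).addresses)
```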

6.4.1.5. Set up network partition resolution

Set up the network partition resolution resource.

  1. To use NP resolution in the COM mode, click Add to add a row to NP Resolution List, click Type and select COM, and then click the cell of each server and select the COM port of that server that is connected with a cross cable. If any servers are not connected, leave the cells of those servers blank.
    For the setup example in this chapter, to use the shared disk, add a COM mode row and select COM1 in the cell of each server.
  2. To use NP resolution in the DISK mode, click Add to add a row to NP Resolution List, click Type and select DISK, and then click the cell of each server and select the disk drive to be used as the partition for disk heartbeat. If any servers are not connected to the shared disk, leave the cells of those servers blank.
    For the setup example in this chapter, to use the shared disk, add a DISK mode row, click the cell of each server, and select the E: drive. To use a hybrid disk, add a DISK mode row, click the cells of server1 and server2, and select the E: drive. Leave the server3 cell blank.
  3. To use NP resolution in the PING mode, click Add to add a row to NP Resolution List, click Type and select Ping, click the cell of Ping Target, and enter the IP addresses of the ping target devices (such as a gateway). When multiple IP addresses separated by commas are entered, the server is regarded as isolated from the network if none of them responds to the ping.
    If the PING mode is used only on some servers, set the cells of the servers on which it is not used to Do Not Use.
    For the setup example in this chapter, a row for the PING mode is added and 192.168.0.254 is specified for Ping Target.
  4. To use NP resolution in the HTTP mode, add a row to NP Resolution List by clicking Add, click the cell in Type column, and select HTTP/HTTPS. Then click Properties, enter the address of the Web server in Target Host, and enter the port number in Service Port. If the HTTP mode is used only on some servers, set the cells of the servers not to be used to Do Not Use.
    For the setup example in this chapter, the HTTP mode is not used.
  5. To use the majority method for NP resolution, click Add and add a row to NP Resolution List, click the cell of Type column, and then select Majority.
    For the setup example in this chapter, the majority method is not used.
  6. Click Next.
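
The PING mode rule in step 3 (a server counts as isolated only when no ping target responds) can be sketched as follows. This is a minimal illustration of the decision, not EXPRESSCLUSTER's implementation. The function name is_isolated, the dictionary of ping results, and the second address 192.168.0.253 are hypothetical.

```python
def is_isolated(ping_results):
    """Return True when no ping target responded.

    ping_results maps each Ping Target address to True (responded)
    or False (no response). Hypothetical helper for illustration only.
    """
    if not ping_results:
        raise ValueError("at least one ping target is required")
    return not any(ping_results.values())


# Setup example from this chapter: a single target, 192.168.0.254.
print(is_isolated({"192.168.0.254": False}))  # True

# Two comma-separated targets (the second address is made up):
# one response is enough to be treated as connected.
print(is_isolated({"192.168.0.254": False, "192.168.0.253": True}))  # False
```

Specifying more than one target therefore makes the isolation decision less sensitive to the failure of a single ping destination.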

6.4.2. Create a failover group

Add a failover group that executes an application to the cluster. (Below, failover group is sometimes abbreviated to group.)

6.4.2.1. Add a failover group

Set up a group that works as a unit of failover at the time an error occurs.

  1. Click Add in the Group List to open the Group Definition dialog box.
    For the setup example in this chapter, select the Use Server Group Settings check box if you use a hybrid disk. Enter the group name (failover1) in the Name box, and click Next.
  2. Specify the servers on which the failover group can start up. For the setup example in this chapter, to use the shared disk or the mirror disk, select the Failover is possible at all servers check box, or add server1 and then server2 from Available Servers to Servers that can run the Group. To use the hybrid disk, add svg1 and then svg2 from Available Server Groups to Server Groups that can run the Group.
  3. Click Next.

  4. Specify each attribute value of the failover group. Because all the default values are used for the setup example in this chapter, click Next.
    The Group Resource List is displayed.

6.4.2.2. Add a group resource (Floating IP resource)

Add a group resource, a configuration element of the group, to the failover group you created in "6.4.2.1. Add a failover group".

  1. Click Add in the Group Resource List.

  2. The Resource Definition of Group | failover1 dialog box is displayed. Select the group resource type Floating IP resource in the Type box, and enter the group resource name fip1 in the Name box. Click Next.

  3. The Dependent Resources page is displayed. Specify nothing. Click Next.

  4. The Recovery Operation at Activation Failure Detection and Recovery Operation at Deactivation Failure Detection pages are displayed. Click Next.

  5. Enter the IP address (10.0.0.12) in the IP Address box. Click Finish.
    The floating IP resource is added to Group Resource List.

6.4.2.3. Add a group resource (Disk resource/Mirror disk resource/Hybrid disk resource)

When using a shared disk

Add a shared disk as a group resource.

  1. Click Add in Group Resource List.

  2. The Resource Definition of Group | failover1 dialog box is displayed. In the Resource Definition of Group | failover1 dialog box, select the group resource type disk resource in the Type box, and enter the group resource name sd1 in the Name box. Click Next.

  3. The Dependent Resources page is displayed. Specify nothing. Click Next.

  4. The Recovery Operation at Activation Failure Detection and Recovery Operation at Deactivation Failure Detection pages are displayed. Click Next.

  5. Select server1 in the Servers that can run the Group. Click Add.

  6. The Selection of partition dialog box is displayed. Select the partition F:. Click OK.

    Important

    For disk resource partition, specify an unformatted partition on the shared disk that is connected to the filtering-configured HBA.

    Do not specify, as the disk resource partition, a partition that is used as the disk heartbeat partition, or as the cluster partition or data partition of a mirror disk resource. Otherwise, data on the shared disk may be corrupted.

  7. Similarly, add server2 to Servers that can run the Group, and click Finish.
    The disk resource is added to Group Resource List.

When using a mirror disk

Add a mirror disk as a group resource.

  1. Click Add in Group Resource List.

  2. The Resource Definition of Group | failover1 dialog box is displayed. In the Resource Definition of Group | failover1 dialog box, select the group resource type mirror disk resource in the Type box, and enter the group resource name md1 in the Name box. Click Next.

  3. The Dependent Resources page is displayed. Specify nothing. Click Next.

  4. The Recovery Operation at Activation Failure Detection and Recovery Operation at Deactivation Failure Detection pages are displayed. Click Next.

  5. Select server1 in the Servers that can run the Group. Click Add.

  6. The Selection of Partition dialog box is displayed. Click Connect, and then select the data partition F: and the cluster partition E:. Click OK.

    Important

    Specify different partitions for data partition and cluster partition. If the same partition is specified, data on the mirror disk may be corrupted. Make sure not to specify a partition on the shared disk for the data partition and cluster partition of mirror disk resource.

  7. Similarly, add server2 to Servers that can run the Group, and click Finish.
    The mirror disk resource is added to Group Resource List.

When using a hybrid disk

Add a hybrid disk as a group resource.

  1. Click Add in Group Resource List.

  2. The Resource Definition of Group | failover1 dialog box is displayed. In the Resource Definition of Group | failover1 dialog box, select the group resource type hybrid disk resource in the Type box, and enter the group resource name hd1 in the Name box. Click Next.

  3. The Dependent Resources page is displayed. Specify nothing. Click Next.

  4. The Recovery Operation at Activation Failure Detection and Recovery Operation at Deactivation Failure Detection pages are displayed. Click Next.

  5. Enter the drive letter (G:) of the data partition for mirroring in the Data Partition Drive Letter box, and the drive letter (F:) of the cluster partition in the Cluster Partition Drive Letter box.

    Important

    Specify different partitions for data partition and cluster partition. If the same partition is specified, data on the mirror disk may be corrupted.

  6. Click Obtain information. The GUID information of data and cluster partitions on each server is displayed. Click Finish.
    The hybrid disk resource is added to Group Resource List.
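
The Important notes in this section all warn against using one partition for two purposes. Before entering drive letters in the wizard, a quick check such as the following sketch can catch an accidental overlap. The function name find_partition_conflicts and the role labels are hypothetical. The sketch flags any drive letter assigned to more than one role, which covers the combinations named in the notes and, as a conservative assumption, any other overlap as well.

```python
from collections import defaultdict


def find_partition_conflicts(roles):
    """Return {drive_letter: [roles...]} for letters used by more than one role.

    roles maps a role label (for example "data partition") to a drive
    letter. Letters are compared case-insensitively. Hypothetical helper.
    """
    by_drive = defaultdict(list)
    for role, drive in roles.items():
        by_drive[drive.strip().upper()].append(role)
    return {d: sorted(r) for d, r in by_drive.items() if len(r) > 1}


# Hybrid disk example from this chapter: E: heartbeat, F: cluster, G: data.
planned = {"disk heartbeat": "E:", "cluster partition": "F:", "data partition": "G:"}
print(find_partition_conflicts(planned))  # {}

# The same letter typed for both mirror partitions is reported.
mistake = {"cluster partition": "F:", "data partition": "f:"}
print(find_partition_conflicts(mistake))
# {'F:': ['cluster partition', 'data partition']}
```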

6.4.2.4. Add a group resource (Application resource)

Add an application resource that can start and stop the application.

  1. Click Add in Group Resource List.

  2. The Resource Definition of Group | failover1 dialog box is displayed. In the Resource Definition of Group | failover1 dialog box, select the group resource type Application resource in the Type box, and enter the group resource name appli1 in the Name box. Click Next.

  3. The Dependent Resources page is displayed. Specify nothing. Click Next.

  4. The Recovery Operation at Activation Failure Detection and Recovery Operation at Deactivation Failure Detection pages are displayed. Click Next.

  5. Select Resident in the Resident Type. Specify the path of the execution file for the Start Path.

    Note

    For Start Path and Stop Path, specify either the absolute path of the executable file or the name of an executable file that can be found through the path set in an environment variable. Do not specify a relative path. If you do, the application resource may fail to start.

  6. Click Finish.
    The application resource is added to Group Resource List.
  7. Click Finish.
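
The Note in step 5 distinguishes three kinds of Start Path values: an absolute path, a bare executable name resolved through an environment variable, and a relative path (which should be avoided). The following sketch classifies a value using Python's standard pathlib module. The function name classify_start_path and the sample paths are hypothetical, and the sketch does not check whether the file exists or whether the name can actually be found on the path.

```python
from pathlib import PureWindowsPath


def classify_start_path(path):
    """Classify a Start Path value as 'absolute', 'name-only', or 'relative'."""
    p = PureWindowsPath(path)
    if p.is_absolute():
        return "absolute"
    # A bare file name (no drive, no directory part) relies on the
    # search path configured in an environment variable.
    if not p.drive and not p.root and len(p.parts) == 1:
        return "name-only"
    return "relative"


print(classify_start_path(r"C:\Program Files\MyApp\server.exe"))  # absolute
print(classify_start_path("server.exe"))                          # name-only
print(classify_start_path(r"bin\server.exe"))                     # relative
```

Only the third kind corresponds to the case the Note warns about.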

6.4.3. Create monitor resources

Add a monitor resource that monitors a specified target to the cluster.

6.4.3.1. Add a monitor resource (Disk RW monitor resource)

Add a disk RW monitor resource to monitor the local disk.

  1. Click Next in Group List.

  2. The Monitor Resource List is displayed. In the Monitor Resource List, click Add. Select the monitor resource type disk RW monitor in the Type box, and enter the monitor resource name diskw1 in the Name box. Click Next.

  3. Enter the monitor settings. Select Always in the Monitor Timing box. Click Next.

  4. Set the file name C:/check.txt and I/O size (2000000). Select Action on Stall (Generate an Intentional Stop Error) and Action When Diskfull Is Detected (Recover), and click Next. For File Name, specify a file on the partition where the OS is installed.

  5. Select Execute only the final action in the Recovery Action box.

  6. Select Generate an Intentional Stop Error in the Final Action box, and click Finish.
    The disk RW monitor resource diskw1 is added to the Monitor Resource List.

    Note

    Specifying a file on the local disk as the monitoring target of the disk RW monitor resource makes the resource monitor the local disk. In that case, select Generate an Intentional Stop Error for Final Action.
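
The values chosen in steps 2 to 6 can be written down and reviewed against the Note above before they are entered. The following is a minimal sketch. The dictionary keys and the function name review_local_disk_monitor are hypothetical and do not reflect the format of the cluster configuration data, and the OS drive is assumed to be C:.

```python
# Values from the setup example in this section (keys are made up).
DISKW1 = {
    "name": "diskw1",
    "monitor_timing": "Always",
    "file_name": "C:/check.txt",
    "io_size": 2000000,
    "action_on_stall": "Generate an Intentional Stop Error",
    "action_on_diskfull": "Recover",
    "recovery_action": "Execute only the final action",
    "final_action": "Generate an Intentional Stop Error",
}

STOP_ERROR = "Generate an Intentional Stop Error"


def review_local_disk_monitor(settings, os_drive="C:"):
    """Return a list of findings; an empty list means nothing to fix."""
    findings = []
    # Step 4: the monitored file should be on the partition where the OS is installed.
    if not settings["file_name"].upper().startswith(os_drive.upper()):
        findings.append("File Name is not on the OS partition")
    # Note: local disk monitoring should end with an intentional stop error.
    if settings["final_action"] != STOP_ERROR:
        findings.append("Final Action should be " + STOP_ERROR)
    return findings


print(review_local_disk_monitor(DISKW1))  # []
```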

6.4.3.2. Add a monitor resource (IP monitor resource)

Add a monitor resource that monitors an IP address.

  1. Click Add in the Monitor Resource List dialog box. Select the monitor resource type ip monitor in the Type box, and enter the monitor resource name ipw1 in the Name box. Click Next.

  2. Enter the monitor settings. Change nothing from the default values. Click Next.

  3. Click Add in IP Addresses. Enter the IP address to be monitored (192.168.0.254) in the IP Address box, and click OK.

    Note

    For monitoring target of the IP monitor resource, specify the IP address of a device (for example, gateway) that is assumed to be always active on the public LAN.

  4. The IP address you have entered is set in the IP Addresses. Click Next.

  5. Specify the recovery target. Click Browse.

  6. Click All Groups in the tree view and click OK. All Groups is set in the Recovery Target.

  7. Click Finish.
    The IP monitor resource ipw1 is added to the Monitor Resource List.

6.4.4. Disabling the cluster operation

When you click Finish after creating a monitor resource, the following popup message appears:

Clicking No disables automatic group startup, recovery on the activation/deactivation failure of a group resource, and recovery on the failure of a monitor resource. To start a cluster for the first time after creating the cluster configuration data, it is recommended to disable the automatic start and the recovery and to check the cluster configuration data for errors.

To disable the cluster operation, go to Cluster properties -> Extension tab -> Disable cluster operation.

Note

Even if the cluster operation is disabled, failover is performed upon a server failure.

Disabling recovery on the failure of a monitor resource does not apply to the stall detection function of the disk RW monitor resource.

Creating the cluster configuration data is now complete. Proceed to "6.6. Starting a cluster".

6.5. Saving the cluster configuration data

The cluster configuration data can be saved in a file system or in media such as a floppy disk.

6.5.1. Saving the cluster configuration data

Follow the procedures below to save the cluster configuration.

  1. Click Export in the config mode of Cluster WebUI.

  2. Select a location to save the data and save it.

    Note

    One file (clp.conf) and one directory (scripts) are saved. If any of these are missing, the command to create a cluster does not run successfully. Make sure to treat these two as a set. When new configuration data is edited, clp.conf.bak is created in addition to these two.
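
Because clp.conf and the scripts directory must be kept together, it can be worth checking an exported location before relying on it. The following sketch does only that: it checks that the two items exist. The function name missing_config_items is hypothetical, and the temporary directory merely stands in for the location chosen in step 2.

```python
import tempfile
from pathlib import Path


def missing_config_items(export_dir):
    """Return the names of required items missing from an exported configuration."""
    root = Path(export_dir)
    missing = []
    if not (root / "clp.conf").is_file():
        missing.append("clp.conf")
    if not (root / "scripts").is_dir():
        missing.append("scripts")
    return missing


# Demonstration with a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "clp.conf").write_text("")  # placeholder, not real configuration data
    print(missing_config_items(tmp))          # ['scripts']
    (Path(tmp) / "scripts").mkdir()
    print(missing_config_items(tmp))          # []
```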

Note

If a port number different from the default value was specified in Port Number when EXPRESSCLUSTER was installed, click Cluster Properties, click Port Number, and specify for WebManager HTTP Port Number and Disk Agent Port Number the same values that were specified at installation before saving the cluster configuration data.

6.6. Starting a cluster

After creating or modifying the cluster configuration data, apply the configuration data to the servers that constitute the cluster to create a cluster system.

6.6.1. How to create a cluster

After creating or modifying the cluster configuration data, create a cluster by using the following procedure.

  1. Click Apply the Configuration File in the config mode of Cluster WebUI.
    A popup message asking "Do you want to perform the operations?" is displayed. Click OK.
    When the upload ends successfully, a popup message saying "The application finished successfully." is displayed. Click OK.
    If the upload fails, perform the operations by following the displayed message.
  2. Select Operation Mode from the drop-down menu on the toolbar of Cluster WebUI to switch to the operation mode.

  3. Click Start Cluster in the Status tab of Cluster WebUI.
    Confirm that the cluster system starts and that the status of the cluster is displayed in the Cluster WebUI. If the cluster system does not start normally, take action according to the error message.
    For how to operate and check the Cluster WebUI, see the online manual from the button on the upper right of the screen.

Note

If a port number different from the default value was specified in Port Number when EXPRESSCLUSTER was installed, click Cluster Properties, click Port Number, and specify for WebManager HTTP Port Number and Disk Agent Port Number the same values that were specified at installation before saving the cluster configuration data.

7. Verifying a cluster system

This chapter describes how to verify that the created cluster system runs normally.

7.1. Verifying the status using the Cluster WebUI

This section provides instructions for verifying the cluster system by using the Cluster WebUI. The Cluster WebUI is installed together with the EXPRESSCLUSTER Server, so it does not need to be installed separately. An overview of the Cluster WebUI is given first, followed by a description of how to verify a cluster by accessing the Cluster WebUI.

See also

For system requirements of the Cluster WebUI, see the "Getting Started Guide".

Follow the steps below to verify the operation of the cluster after creating the cluster and connecting to the Cluster WebUI.

See also

For how to operate Cluster WebUI, see the online manual. If any error is detected while checking the status, troubleshoot the error referring to "Troubleshooting" in the "Reference Guide".

  1. Check heartbeat resources
    Check on the Cluster WebUI that each server has been rebooted and that the heartbeat resource status of each server is normal. Check that no alert or error is recorded in the alert view of the Cluster WebUI.
  2. Check monitor resources
    Verify that the status of each monitor resource is normal on the Cluster WebUI.
  3. Start up a group
    Start a group.
    Check on the Cluster WebUI that the group has been started and that group resources included in the group have been started.
    Check that no alert or error is recorded in the alert view of the Cluster WebUI.
  4. Check a disk resource/mirror disk resource/hybrid disk resource
    Check that you can access the resource switching partition or data partition on the server where a disk resource/mirror disk resource/hybrid disk resource is active. Check that you cannot access the resource switching partition or data partition on a server where the resource is not active.
  5. Check a floating IP resource
    Check that you can ping a floating IP address while the floating IP is active.
  6. Check an application resource
    Check that an application is working on the server where an application resource is active.
  7. Check a service resource
    Check that a service is working on the server where a service resource is active.
  8. Stop a group
    Stop a group.
    Verify on the Cluster WebUI that the group has been stopped and that each group resource included in the group has been stopped. Verify that no alert or error is recorded in the alert view of the Cluster WebUI.
  9. Start a group
    Start a group.
    Verify on the Cluster WebUI that the group has been started.
  10. Move a group
    Move a group to another server.
    Check on the Cluster WebUI that the group has been started on the destination server.
    Verify that each group resource has been started successfully and that no alert or error is recorded in the alert view of the Cluster WebUI.
    Move the group to every server included in the failover policy and repeat the checks above on each one.
  11. Perform failover
    Shut down the server where a group is active.
    After the heartbeat timeout, check that the group has failed over. Verify on the Cluster WebUI that the status of the group becomes activated on the failover destination server.
  12. Perform failback
    When the automatic failback is set, start the server that you shut down for checking failover. Verify that the group fails back to the original server after it is started. Check on the Cluster WebUI that the status of group becomes activated on the failback destination server.

    Note

    For groups that include mirror disk resource or hybrid disk resource, auto failback cannot be set because mirror recovery is required.

  13. Check the alert option
    When the alert option is set, check that an alert mail message is sent after checking a failover.
  14. Shut down the cluster
    Shut down the cluster. Verify that all servers in the cluster are successfully shut down. Also, check that all servers start successfully when they are restarted, and that no alert or error is recorded in the alert logs of the Cluster WebUI.

7.2. Verifying status using commands

Follow the steps below to verify the status of the cluster from a server constituting the cluster using command lines after the cluster is created.

See also

For details on how to use commands, see "EXPRESSCLUSTER command reference" in the "Reference Guide". If any error is detected while verifying the status, troubleshoot the error referring to "Troubleshooting" in the "Reference Guide".

  1. Check heartbeat resources
    Check that the status of each server is activated by using the clpstat command.
    Verify that the heartbeat resource status of each server is normal.
  2. Check monitor resources
    Verify that the status of each monitor resource is normal by using the clpstat command.
  3. Start groups
    Start the groups with the clpgrp command.
    Verify that the status of groups is activated by using the clpstat command.
  4. Check a disk resource/mirror disk resource/hybrid disk resource
    Check that you can access the resource switching partition or data partition on the server where a disk resource/mirror disk resource/hybrid disk resource is active. Check that you cannot access the resource switching partition or data partition on a server where the resource is not active.
  5. Check a floating IP resource
    Verify that you can ping a floating IP address while the IP resource is active.
  6. Check an application resource
    Verify that an application is working on the server where the application resource is active.
  7. Check a service resource
    Verify that a service is working on the server where the service resource is active.
  8. Stop a group
    Stop a group by using the clpgrp command. Check that the group is stopped by using the clpstat command.
  9. Start a group
    Start a group by using the clpgrp command. Check that the group is activated by using the clpstat command.
  10. Move a group
    Move a group to another server by using the clpgrp command.
    Verify that the status of the group is activated by using the clpstat command.
    Move the group to all servers in the failover policy and verify that the status changes to activated on each server.
  11. Perform failover
    Shut down a server where a group is active.
    After the heartbeat timeout, use the clpstat command to check that the group has failed over and that its status becomes activated on the failover destination server.
  12. Perform failback (When it is set)
    When the automatic failback is set, start the server which you shut down in the previous step, "11. Perform failover." Verify that the group fails back to the original server after it is started using the clpstat command. Verify that the status of the group becomes activated on the failback destination server using the clpstat command.
  13. Check the alert option (When it is set)
    When the alert option is set, verify that a mail message is sent at failover.
  14. Shut down the cluster
    Shut down the cluster by using the clpstdn command. Verify that all servers in the cluster are successfully shut down.
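
Several of the steps above (for example steps 10 to 12) involve waiting for a status change and then confirming it with the clpstat command. When such checks are scripted, a small polling helper keeps the waiting logic in one place. The following is a generic sketch: the function name wait_until and its parameters are hypothetical, and the check itself is left as a callable because the output format of clpstat is not described in this guide.

```python
import time


def wait_until(check, timeout_sec, interval_sec=1.0):
    """Call check() repeatedly until it returns True or timeout_sec elapses.

    Returns True if check() succeeded, False if the time ran out.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_sec)


# Stand-in for "run clpstat and see whether the group is active on the
# failover destination server"; here it simply succeeds on the third poll.
polls = iter([False, False, True])
print(wait_until(lambda: next(polls), timeout_sec=5, interval_sec=0.01))  # True
```

In a real script, the timeout would need to be longer than the heartbeat timeout plus the time the group takes to start on the destination server.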

8. Verifying operation

This chapter provides information on how to run dummy-failure tests to see the behaviors of your cluster system and how to adjust parameters.


8.1. Operation tests

Perform dummy-failure tests, backup, and restoration of the shared disk to verify that the monitor resource can detect errors normally, and that no unexpected errors occur. Also verify that the recovery operations performed when the monitor resource detects an error are performed as intended.
If monitor resources detect errors incorrectly, or if the server or the OS stops unexpectedly, the time-out value or other settings need to be adjusted.
  1. Transition of recovery operations due to dummy failure
    When Dummy Failure is enabled, a test must be conducted to check that recovery of the monitor resources in which an error was detected is performed as set.
    You can perform this test from Cluster WebUI or with the clpmonctrl command. For details, see the online manual or "EXPRESSCLUSTER command reference" in the "Reference Guide".
  2. Dummy-failure of the shared disks
    (When the shared disk is RAID-configured and dummy-failure tests can be run)
    The test must include error, replacement, and recovery of RAID for the shared disk.
    • Set a dummy-failure to occur on the shared disk.

    • Recover RAID from the degenerated state to normal state.

    For some shared disks, I/O may temporarily stop or be delayed when the disk switches to degenerated operation or when the RAID is reconfigured.
    If any time-out and/or delay occurs in the disk RW monitor resource or disk TUR monitor resource, adjust the time-out value of each monitor resource.
  3. Dummy-failure of the paths to shared disks
    (When the path to the shared disk is redundant and dummy-failure tests can be run.)
    The test must include an error in the paths and switching of one path to another.
    • Set a dummy-failure to occur in the primary path.

    Some path-switching software (drivers) takes time to switch from the failed path to the path that is working normally. In some cases, control may not be returned to the operating system (software).
    If any time-out and/or delay occurs in the disk RW monitor resource or disk TUR monitor resource, adjust the time-out value of each monitor resource.
  4. Backup/Restoration
    If you plan to perform regular backups, run a test backup.
    Some backup software and archive commands place a heavy load on the CPU and/or disk I/O.
    If any heartbeat delay, monitor resource delay, or time-out occurs, adjust the heartbeat time-out value and/or the time-out value of each monitor resource.
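
The tests above repeatedly call for adjusting a time-out value when delays are observed. One way to decide whether an adjustment is worth considering is to compare the delays noted during a test with the configured time-out and flag those that come close to it. The following sketch does this. The function name exceeds_margin, the 50% margin, and the sample figures are all assumptions for illustration, not values recommended by this guide.

```python
def exceeds_margin(delays_sec, timeout_sec, margin=0.5):
    """Return the observed delays that used more than margin * timeout_sec."""
    if timeout_sec <= 0 or not 0 < margin <= 1:
        raise ValueError("timeout_sec must be positive and margin in (0, 1]")
    limit = timeout_sec * margin
    return [d for d in delays_sec if d > limit]


# Hypothetical I/O delays (seconds) noted during a test backup,
# compared with a hypothetical 60-second monitor time-out.
print(exceeds_margin([2.0, 12.5, 41.0], timeout_sec=60))  # [41.0]
```

An empty result suggests the current time-out leaves headroom under the tested load. A non-empty result points to the measurements that deserve a closer look.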

The following describes dummy-failures and what occurs as a result of each, device by device. What occurs varies depending on the system configuration and resource settings. The following shows operational examples for a general setting and configuration.

Device: Disk device SCSI/FC path

  Dummy-failure: Unplug the cable on the active server (for redundant disk cables, unplug both cables).
  What happens: When the shared disk is monitored, an error is detected, and failover to the standby server occurs. When no disk is monitored, the operation stops. Deactivation of a disk resource may fail when performing failover.

  Dummy-failure: Unplug the cable on the standby server (for redundant cables, unplug both cables).
  What happens: When the disk TUR monitor resource monitors the disk path on the standby server, an error is detected. The operation continues on the active server.

  Dummy-failure: Unplug the cable of the primary path when the disk path is redundant. (When an FC switch is used, power it off as well.)
  What happens: Switching of the disk path is performed by the path switching software. No error is detected on EXPRESSCLUSTER and the operation continues.

  Dummy-failure: In the one-side-path state described above, restart the server by moving a group or shutting down the cluster.
  What happens: The disk path operates in the same way as in the normal state.

  Dummy-failure: Degenerate and/or recover the RAID of the disk device.
  What happens: No error is detected on EXPRESSCLUSTER, and the operation continues.

  Dummy-failure: When the disk device controller is duplicated, stop one side.
  What happens: When the path is duplicated, the disk path is switched by the path switching software. No error is detected on EXPRESSCLUSTER, and the operation continues. When the path is not duplicated and each server is connected directly to the disk, an error is detected by the disk TUR monitor resource on the server connected to the stopped controller, and failover to the standby server is performed. (When the controller on the standby server stops, the operation continues.)

Device: Interconnect LAN

  Dummy-failure: Unplug the cable dedicated to LAN.
  What happens: The LAN heartbeat resource on the interconnect becomes offline. A warning is issued to the alert log. Communication between servers continues by using the public LAN, so the operation continues.

Device: Public LAN

  Dummy-failure: Unplug the LAN cable or power off the HUB.
  What happens: Communication with the operational client stops, and the application stalls or an error occurs. The LAN heartbeat resource on the public LAN becomes inactive. A warning is issued to the alert log. An error is detected when using the IP monitor resource and/or NIC Link Up/Down monitor resource. When the cable on the active server is unplugged, a failover occurs. (When the HUB is powered off, failover is repeated up to the largest count configured.) When the public LAN is the only communication channel between servers (such as in the remote cluster configuration), an emergency shutdown due to network partition resolution in the ping method takes place on the server where the LAN cable is unplugged.

Device: Server UPS

  Dummy-failure: Unplug the power cable of the UPS on the active server from the outlet.
  What happens: The active server shuts down. Failover to the standby server occurs.

Device: UPS on a shared disk

  Dummy-failure: When the power of the shared disk is duplicated, unplug one of the power cables from the outlet.
  What happens: No error is detected on EXPRESSCLUSTER and the operation continues. When the UPS supplies power to one server, the server shuts down. (If it is the active server, failover to the standby server takes place.)

Device: LAN for UPS

  Dummy-failure: Unplug the LAN cable.
  What happens: The UPS becomes uncontrollable. However, no error is detected on EXPRESSCLUSTER and the operation continues.

Device: COM

  Dummy-failure: Unplug the RS-232C cable used for COM network partition resolution.
  What happens: A warning is issued to the alert log. The operation continues.

Device: OS error

  Dummy-failure: Run the shutdown command on the active server.
  What happens: The active server shuts down. Failover to a standby server occurs.

Device: Mirror connect

  Dummy-failure: When more than one LAN cable is set up for the mirror connect and one or more of them are connected, unplug only the LAN cable that is being used as the mirror connect.
  What happens: The mirroring operation continues.

  Dummy-failure: When only one LAN cable is set up for the mirror connect, or when more than one LAN cable is set up for the mirror connect but none of them are connected, unplug only the LAN cable that is being used as the mirror connect.
  What happens: A warning is issued to the alert log (mirroring stops). The operation continues on the active server, but switching to a standby server becomes impossible. An error is detected in mirror disk monitor resource/mirror connect disk resource/hybrid disk monitor resource.

Device: Disk resource

  Dummy-failure: Start up the disk resource on the server where the disk path is unplugged.
  What happens: The disk resource is not activated. Failover to a standby server occurs.

Device: Application resource

  Dummy-failure: Start up the application resource on the server where the name of the file or folder configured for the start path of the application resource was temporarily changed.
  What happens: The application resource is not activated. Failover to a standby server occurs.

Device: Application monitor resource

  Dummy-failure: Use the task manager to stop a process to be monitored.
  What happens: An error is detected. The application is restarted or a failover to the standby server occurs.

Device: Service resource

  Dummy-failure: Start up the service resource on the server where the path or name of the service's execution file was temporarily changed.
  What happens: The service resource is not activated. Failover to a standby server occurs.

Device: Service monitor resource

  Dummy-failure: Stop a service to be monitored.
  What happens: An error is detected. The service is restarted or a failover to a standby server occurs.

Device: Floating IP address

  Dummy-failure: Assign the IP address that is set as the floating IP address to a machine in the same segment, and then start up the floating IP resource.
  What happens: The floating IP resource is not activated. Failover to a standby server occurs. (Activation fails at the failover destination. Failover is repeated up to the largest count configured.)

Device: VM resource

  Dummy-failure: Disconnect the shared disk containing the virtual machine image.
  What happens: The VM resource is not activated.

Device: VM monitor resource

  Dummy-failure: Shut down the virtual machine.
  What happens: The virtual machine is started by restarting the resource.

See also

For information on how to change each parameter, see the "Reference Guide".

8.2. Backup and restoration

The following figure illustrates backup and restoration of data. For details on how to back up, see "The system maintenance information" in the "Maintenance Guide" and the manuals of your backup software.

The following is an example of the backup on the uni-directional standby server.

9. Preparing to operate a cluster system

This chapter describes what you have to do before you start operating EXPRESSCLUSTER.

9.1. Operating the cluster

Before you start using your cluster system, check that it works properly and make sure that you can operate it properly. The operations described below can be executed by using Cluster WebUI or EXPRESSCLUSTER commands. For details of the functions of Cluster WebUI, see the online manual. For details of EXPRESSCLUSTER commands, see "EXPRESSCLUSTER command reference" in the "Reference Guide". The following describes procedures to start up and shut down a cluster and to shut down a server.

9.1.1. Activating a cluster

To activate a cluster, follow the instructions below:

  1. When you are using any shared or add-in disk, start the disk.

  2. Start all the servers in the cluster.

After cluster activation synchronization between the servers has been confirmed, a cluster is activated on each server. After the cluster has been activated, a group is activated on an appropriate server according to the settings.

Note

When you start all the servers in the cluster, make sure they are started within the time set for Server Sync Wait Time on the Timeout tab of Cluster Properties in the Cluster WebUI. Note that failover occurs if the startup of any server cannot be confirmed within that time.

Note

The shared disk takes a few minutes to initialize after it starts up. If a server starts up during the initialization, it cannot recognize the shared disk. Make sure that the servers start up after the shared disk initialization is completed.
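
The first Note above ties server startup to the Server Sync Wait Time. When planning a power-on sequence, the expected startup times can be checked against that value as in the following sketch. The function name late_servers, the startup offsets, and the 300-second wait time are hypothetical. The sketch also assumes the wait is measured from the moment the first server starts, so confirm the exact behavior in the "Reference Guide" before relying on it.

```python
def late_servers(start_offsets_sec, sync_wait_sec):
    """Return servers that come up more than sync_wait_sec after the first one.

    start_offsets_sec maps a server name to its expected startup time in
    seconds, measured from any common reference point.
    """
    first = min(start_offsets_sec.values())
    return sorted(
        name for name, t in start_offsets_sec.items() if t - first > sync_wait_sec
    )


# Hypothetical startup times and a hypothetical 300-second wait time.
print(late_servers({"server1": 0, "server2": 120}, sync_wait_sec=300))  # []
print(late_servers({"server1": 0, "server2": 420}, sync_wait_sec=300))  # ['server2']
```

A non-empty result indicates that the planned sequence risks the failover described in the Note.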

9.1.2. Shutting down a cluster and server

To shut down a cluster or server, use EXPRESSCLUSTER commands or shut down through the Cluster WebUI.

Note

When you are using the Replicator/Replicator DR, a mirror break may occur if you shut down a cluster without using EXPRESSCLUSTER commands or Cluster WebUI.

9.1.3. Shutting down the entire cluster

The entire cluster can be shut down by running the clpstdn command, executing cluster shutdown from the Cluster WebUI, or performing cluster shutdown from the Start menu. Shutting down the entire cluster terminates each server after all the groups have stopped, so that all servers in the cluster are stopped properly as a cluster system.

9.1.4. Shutting down a server

Shut down a server by running the clpdown command or executing server shutdown from the Cluster WebUI. Failover occurs when you shut down a server. Mirroring performed by mirror disk resources/hybrid disk resources is interrupted when you are using the Replicator/Replicator DR. If you intend to use a standby server while performing hardware maintenance, shut down the active server.
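
As a rough illustration of the two shutdown scopes described in 9.1.3 and 9.1.4, the following Python sketch returns the command named in this guide for each scope. The function name shutdown_command and the scope labels "cluster" and "server" are hypothetical and exist only in this sketch. Only the command names clpstdn and clpdown come from this guide, and the sketch returns the command line without running it.

```python
def shutdown_command(scope: str) -> list[str]:
    """Return the EXPRESSCLUSTER command line for a shutdown scope.

    The scope labels are hypothetical names used only in this sketch:
      "cluster" -> shut down the entire cluster (clpstdn)
      "server"  -> shut down a single server (clpdown); this causes failover
    """
    commands = {
        "cluster": ["clpstdn"],
        "server": ["clpdown"],
    }
    if scope not in commands:
        raise ValueError(f"unknown shutdown scope: {scope!r}")
    return commands[scope]
```

For example, shutdown_command("cluster") yields the command line to pass to a command prompt on a cluster server when the whole cluster must go down, for instance before maintenance that affects every server.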

9.1.5. Suspending/resuming a cluster

When you want to update the cluster configuration information, you can stop the EXPRESSCLUSTER service without stopping the current operation. Stopping the EXPRESSCLUSTER in this way is referred to as "suspending". Returning from the suspended status to the normal operation status is referred to as "resuming".
When suspending or resuming a cluster, a request for processing is issued to all the servers in the cluster. Suspending must be executed with the EXPRESSCLUSTER service on all the servers in the cluster being active.
Use EXPRESSCLUSTER commands or Cluster WebUI to suspend or resume a cluster.

When a cluster is suspended, some functions are disabled as described below because the EXPRESSCLUSTER service stops while the active resources are kept active.

  • All heartbeat resources stop.

  • All network partition resolution resources stop.

  • All monitor resources stop.

  • Groups or group resources are disabled (cannot be started, stopped, or moved).

  • The cluster status cannot be displayed or operated by Cluster WebUI or the clpstat command.

  • The following commands cannot be used:

    • clpstat

    • clpcl command options other than --resume

    • clpdown

    • clpstdn

    • clpgrp

    • clptoratio

    • clpmonctrl

    • clprsc

    • clpcpufreq
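
The command restrictions above can be expressed as a small check, for example in an operations script that refuses to call a blocked command while the cluster is suspended. This is a minimal Python sketch: the names BLOCKED_WHILE_SUSPENDED and usable_while_suspended are hypothetical, and treating any command that is not in the list as allowed is an assumption of the sketch, not a statement of this guide.

```python
# Commands listed in this section as unusable while the cluster is suspended.
BLOCKED_WHILE_SUSPENDED = {
    "clpstat", "clpdown", "clpstdn", "clpgrp",
    "clptoratio", "clpmonctrl", "clprsc", "clpcpufreq",
}


def usable_while_suspended(argv: list[str]) -> bool:
    """Return True if the command line may be run on a suspended cluster."""
    if not argv:
        raise ValueError("empty command line")
    name, options = argv[0], argv[1:]
    if name == "clpcl":
        # clpcl is accepted only with the --resume option during suspension.
        return "--resume" in options
    # Assumption of this sketch: commands not named in the list are allowed.
    return name not in BLOCKED_WHILE_SUSPENDED
```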

9.1.6. How to suspend a cluster

You can suspend a cluster by executing the clpcl command or by using Cluster WebUI.

9.1.7. How to resume a cluster

You can resume a cluster by executing the clpcl command or by using Cluster WebUI.
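
The following Python sketch shows one way to wrap the suspend and resume operations from 9.1.6 and 9.1.7 in a script. The options --suspend and --resume are the clpcl options referred to in this guide. The function names clpcl_command and run_clpcl are hypothetical, and the sketch assumes that clpcl is on the PATH of the cluster server where the script runs. See "EXPRESSCLUSTER command reference" in the "Reference Guide" for the authoritative syntax.

```python
import subprocess


def clpcl_command(action: str) -> list[str]:
    """Build the clpcl command line for "suspend" or "resume"."""
    options = {"suspend": "--suspend", "resume": "--resume"}
    if action not in options:
        raise ValueError(f"unsupported action: {action!r}")
    return ["clpcl", options[action]]


def run_clpcl(action: str) -> int:
    """Run the command on a cluster server and return its exit code."""
    return subprocess.run(clpcl_command(action), check=False).returncode
```

Keeping the command construction separate from its execution lets the command line be logged or reviewed before anything is sent to the cluster.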

9.2. Suspending EXPRESSCLUSTER

There are two ways to stop running EXPRESSCLUSTER. One is to stop the service of the EXPRESSCLUSTER Server, and the other is to set the Server service to be manually started.

9.2.1. Stopping the EXPRESSCLUSTER Server service

To stop only the EXPRESSCLUSTER Server service without shutting down the operating system, use the clpcl command or Stop cluster from the Cluster WebUI.

See also

For more information on the clpcl command, see "EXPRESSCLUSTER command reference" in the "Reference Guide".

9.2.2. Setting the EXPRESSCLUSTER Server service to be manually activated

To make the EXPRESSCLUSTER Server service not start when the OS starts, make the setting by using the OS service manager so that the Server service is manually started. By doing this, the EXPRESSCLUSTER will not start when the OS is rebooted next time.

9.2.3. Changing the setting of the EXPRESSCLUSTER Server service from the manual startup to automatic startup

The OS service manager is also used to set the EXPRESSCLUSTER Server service to be started automatically. Even if you change the setting, the EXPRESSCLUSTER Server service remains stopped until it is started directly or the server is restarted.

9.3. Modifying the cluster configuration data

The following describes procedures and precautions for modifying the configuration data after creating a cluster.

9.3.1. Modifying the cluster configuration data by using the Cluster WebUI

  1. Start the Cluster WebUI.

  2. Select the Config Mode icon from the drop down menu of the tool bar in Cluster WebUI.

  3. Modify the configuration data after the current cluster configuration data is displayed.

  4. Upload the modified configuration data. Depending on the data modified, it may become necessary to suspend or stop the cluster and/or to restart by shutting down the cluster. In such a case, uploading is canceled once and the required operation is displayed. Follow the displayed message and do as instructed to perform upload again.

9.3.2. Applying the modified cluster configuration data

To upload the modified cluster configuration data by the Cluster WebUI or the clpcfctrl command, select the operation from the following depending on the modification. For the operation required to apply the modified data, refer to "Parameter details" in the "Reference Guide".

The way you apply the changed data may affect the applications on the system and the behavior of the EXPRESSCLUSTER Server. For details, see the table below:

| # | The way to apply changes | Effect |
| --- | --- | --- |
| 1 | Upload only | The operation of the EXPRESSCLUSTER Server is not affected. Heartbeat resources, group resources and monitor resources do not stop. |
| 2 | Upload data and then restart the API service | The operation of the EXPRESSCLUSTER Server is not affected. Heartbeat resources, group resources and monitor resources do not stop. |
| 3 | Restart the WebManager server after uploading | The operation of the EXPRESSCLUSTER Server is not affected. Heartbeat resources, group resources and monitor resources do not stop. |
| 4 | Upload data and then restart the Information Base service | The operation of the EXPRESSCLUSTER Server is not affected. Heartbeat resources, group resources and monitor resources do not stop. |
| 5 | Upload after stopping the group whose setting has been changed | Group resources are stopped. Because of this, the applications on the system that are controlled by the group are stopped until the group is started after uploading. |
| 6 | Upload after suspending the cluster | The EXPRESSCLUSTER is partly stopped. During the period when the EXPRESSCLUSTER Server service is suspended, heartbeat resources and monitor resources are stopped. Because group resources do not stop, the applications on the system continue to operate. |
| 7 | Upload after stopping the cluster | The EXPRESSCLUSTER totally stops. Groups stop as well. Therefore, the applications used on the system are stopped until data is uploaded and the cluster is started. |
| 8 | Shut down and restart the cluster after uploading the data | The applications used on the system are stopped until the cluster restarts and the group is started. |

Note

If the cluster needs to be suspended or stopped to apply the modified data, ensure that the suspension or stop is complete before applying the data.
Check that the Cluster WebUI Alert logs show the message "Type: Info, Module name: pm, Event ID: 2". For more information on messages, see "Error messages" in the "Reference Guide".
When the Cluster WebUI is not available, check the event viewer to see if "Module type: pm, Event type: information, Event ID: 2" is displayed.
After checking the message stated above, apply the cluster configuration data to the EXPRESSCLUSTER environment.
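
To compare the impact of each way of applying changes at a glance, the table in this section can be restated as data. The following Python sketch is only a restatement for planning purposes: the names ApplyMethod, APPLY_METHODS and methods_keeping_applications_running are hypothetical, and the applications_stop flag is a simplification of the Effect column. For way 5, only the applications controlled by the stopped group are affected.

```python
from typing import NamedTuple


class ApplyMethod(NamedTuple):
    description: str
    applications_stop: bool  # True if at least some applications stop


# Restated from the table in this section.
APPLY_METHODS = {
    1: ApplyMethod("Upload only", False),
    2: ApplyMethod("Upload data and then restart the API service", False),
    3: ApplyMethod("Restart the WebManager server after uploading", False),
    4: ApplyMethod("Upload data and then restart the Information Base service", False),
    5: ApplyMethod("Upload after stopping the group whose setting has been changed", True),
    6: ApplyMethod("Upload after suspending the cluster", False),
    7: ApplyMethod("Upload after stopping the cluster", True),
    8: ApplyMethod("Shut down and restart the cluster after uploading the data", True),
}


def methods_keeping_applications_running() -> list[int]:
    """Return the numbers of the ways that leave applications running."""
    return [n for n, m in sorted(APPLY_METHODS.items()) if not m.applications_stop]
```

Which way is actually required for a given parameter is determined by "Parameter details" in the "Reference Guide", not by this sketch.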

10. Uninstalling and reinstalling EXPRESSCLUSTER

This chapter provides instructions for uninstalling and reinstalling EXPRESSCLUSTER.

10.1. Uninstallation

10.1.1. Uninstalling the EXPRESSCLUSTER Server

Note

You must log on as Administrator when uninstalling the EXPRESSCLUSTER Server. It is recommended to extract configuration information before performing uninstallation. For details, refer to "EXPRESSCLUSTER command reference" in the "Reference Guide".

Follow the procedures below to uninstall the EXPRESSCLUSTER Server:

  1. Switch the type of service startup to manual startup.

    clpsvcctrl.bat --disable -a

  2. Shut down the server.

  3. If the shared disk is used, unplug all disk cables connected to the server, because disk filtering will be disabled after uninstallation.

  4. Turn on the server.

  5. In Control Panel in OS, click Programs and Features.

  6. Select EXPRESSCLUSTER Server, and then click Uninstall.

  7. The EXPRESSCLUSTER Server Setup dialog box is displayed.

  8. Click Yes in the uninstallation confirmation dialog box. If you click No, uninstallation will be canceled.

  9. If the SNMP service is started, a message asking you to confirm stopping the SNMP service is displayed. Click Yes. If you click No, uninstallation will be canceled.

  10. A message is displayed asking whether to return the media sense function (TCP/IP disconnection detection) to the state before the EXPRESSCLUSTER Server was installed. Click Yes to return it to that state. If you click No, EXPRESSCLUSTER will be uninstalled with the media sense function left disabled.

  11. When uninstallation is completed, the completion message is displayed in the EXPRESSCLUSTER Server Setup dialog box. Click Finish.

  12. A message asking whether to restart the computer is displayed. Select whether to restart the computer and click Finish. Uninstallation of the EXPRESSCLUSTER Server is completed.

Important

If the shared disk is used, make sure not to start the OS while the shared disk is connected after uninstalling EXPRESSCLUSTER. Data on the shared disk may be corrupted.

Note

If you uninstall EXPRESSCLUSTER while the CPU frequency is changed by the CPU Frequency Control of EXPRESSCLUSTER, the CPU frequency does not return to its previous state. In this case, return the CPU frequency to the defined value as follows.

Select Balanced in Power Options -> Choose or customize a power plan in Control Panel.

10.2. Reinstallation

10.2.1. Reinstalling the EXPRESSCLUSTER Server

To reinstall the EXPRESSCLUSTER Server, you have to prepare the cluster configuration data (or the latest data if you reconfigured the cluster) created by the Cluster WebUI.

After changing the configuration data, make sure to save the latest cluster configuration data. In addition to being saved in the Cluster WebUI when it is created, the configuration data can be backed up with the clpcfctrl command. For details, refer to "Creating a cluster and backing up configuration data (clpcfctrl command)" in "EXPRESSCLUSTER command reference" in the "Reference Guide".

To reinstall EXPRESSCLUSTER Server on the entire cluster

To reinstall the EXPRESSCLUSTER Server, follow the procedures below:

  1. Unplug all disk cables connected to all servers, because access restriction does not function until reinstallation of the EXPRESSCLUSTER Server is completed.

  2. Uninstall the EXPRESSCLUSTER Server on all servers that constitute the cluster system. When reinstalling the OS, it is not necessary to uninstall EXPRESSCLUSTER. However, if EXPRESSCLUSTER will be reinstalled in the folder where it was installed before, all files in the installation folder need to be deleted.
    For details on the uninstallation procedures, refer to "Uninstalling the EXPRESSCLUSTER Server" in this chapter.

  3. Shut down the OS after uninstallation of the EXPRESSCLUSTER Server is completed.

    Important

    When a shared disk is used, make sure not to start the server connected to the shared disk while EXPRESSCLUSTER is uninstalled. Data on the shared disk may be corrupted.

  4. Install the EXPRESSCLUSTER Server and register the license as necessary. Shut down the OS after installing the EXPRESSCLUSTER Server is completed. If the shared disk is used, connect the shared disk and then start the OS. If the shared disk is not used, simply start the OS.
    For details on how to install the EXPRESSCLUSTER Server, refer to "4. Installing EXPRESSCLUSTER" in this guide. For how to register the license, refer to "5. Registering the license" in this guide.

    Important

    When a shared disk is used, make sure not to connect the shared disk to HBA without filtering settings or SCSI controller. Data on the shared disk may be corrupted.

  5. Create the cluster configuration data and a cluster.
    For details on how to create the cluster configuration data and a cluster, refer to "6. Creating the cluster configuration data" in this guide.

To reinstall EXPRESSCLUSTER Server on some servers in the cluster

To reinstall the EXPRESSCLUSTER Server, follow the procedures below:

  1. When a shared disk is used, unplug all disk cables connected to the servers on which you want to reinstall the EXPRESSCLUSTER Server. This is because the access control does not work until the reinstallation is completed.

  2. Uninstall the EXPRESSCLUSTER Server. If you are reinstalling the OS, it is not necessary to uninstall EXPRESSCLUSTER. However, when reinstalling in the folder in which EXPRESSCLUSTER was installed, the files in the installation folder must be deleted.
    For details on uninstallation procedures, refer to "Uninstalling the EXPRESSCLUSTER Server" in this chapter.

  3. Shut down the OS when uninstallation of the EXPRESSCLUSTER Server is completed.

    Important

    When a shared disk is used, make sure not to start the server connected to the shared disk while EXPRESSCLUSTER is uninstalled. Data on the shared disk may be corrupted.

  4. Install the EXPRESSCLUSTER Server to the server where it was uninstalled, and register the license as necessary. Shut down the OS when installing EXPRESSCLUSTER Server is completed. When a shared disk is used, connect the shared disk and then start the OS. If a shared disk is not used, simply start the OS.
    For details on how to install the EXPRESSCLUSTER Server, refer to "4. Installing EXPRESSCLUSTER" in this guide. For how to register the license, refer to "5. Registering the license" in this guide.

    Important

    When a shared disk is used, make sure not to connect the shared disk to HBA without filtering settings or SCSI controller. Data on the shared disk may be corrupted.

  5. Connect to the Cluster WebUI on another server in the cluster and switch to Config mode.

  6. If a shared disk is used and the OS is reinstalled, or if you modify HBA to connect the shared disk, update the filtering information in HBA tab in Server Properties of the server where the OS is reinstalled.

    Important

    To configure the filtering settings, click Server Properties of the server where the EXPRESSCLUSTER Server is installed, click HBA tab, and then click Connect. If the filtering setting is configured without clicking Connect, data on the shared disk may be corrupted.

  7. From the server where the web browser of the Cluster WebUI is connected, run clpcl --suspend --force from the command prompt and suspend the cluster.

  8. Apply the changes in Config mode.

    If the fixed-term license is used, run the following command.

    clplcnsc --reregister <a folder path for saved license files>

  9. The following message is displayed if the changes have been successfully applied.

    The application finished successfully.

  10. Change the Cluster WebUI to Operation mode and resume the cluster from the Service menu.

    Note

    When resuming the cluster from the Cluster WebUI, the message "Failed to resume the cluster. Click the Reload button, or try again later." is displayed. Ignore this message.

  11. Select Start Server Service for the server where EXPRESSCLUSTER Server is reinstalled from Cluster WebUI.

  12. When Off is selected in Auto Return in Cluster Properties, click the server where the EXPRESSCLUSTER Server is reinstalled in the Cluster WebUI and select Recover.

  13. If necessary, move the group.
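
Most of the partial reinstallation procedure is performed in the Cluster WebUI, but two steps use the command prompt: suspending the cluster (clpcl --suspend --force) and, for a fixed-term license, re-registering the license (clplcnsc --reregister). The following Python sketch only collects those two command lines in the order in which the procedure uses them. The function name partial_reinstall_commands and the example folder C:\licenses are hypothetical. Between the two commands, the changes still have to be applied in Config mode as described in the procedure.

```python
from typing import Optional


def partial_reinstall_commands(license_dir: Optional[str] = None) -> list[list[str]]:
    """Return the command-prompt steps of the partial reinstallation, in order.

    license_dir is the folder holding the saved license files. Pass it only
    when a fixed-term license is used; otherwise leave it as None.
    """
    # Suspend the cluster before applying the changes in Config mode.
    commands = [["clpcl", "--suspend", "--force"]]
    if license_dir is not None:
        # Fixed-term license only: run this after applying the changes.
        commands.append(["clplcnsc", "--reregister", license_dir])
    return commands
```

For example, partial_reinstall_commands(r"C:\licenses") returns both command lines, while calling the function without an argument returns only the suspend command.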

11. Troubleshooting

11.1. Error messages when installing the EXPRESSCLUSTER Server

| Behavior and Message | Cause | Action |
| --- | --- | --- |
| "failed to set up Error code: %x" (%x: error code) | Refer to the given error code. | Refer to the action for the error code. |
| "Less than 9.0 has been installed. After uninstalling, reinstall it again." | The old version of the EXPRESSCLUSTER has been installed. | Uninstall the old version of the EXPRESSCLUSTER and install the current version. |
| "Failed to set up (%d) Error code: %x After restart, install it." (%d: internal code, %x: error code) | Refer to the explanation of the given error code. | Refer to the action for the given error code. |

11.2. Licensing

| Behavior and Message | Cause | Action |
| --- | --- | --- |
| When the cluster was shut down and rebooted after distribution of the configuration data created by the Cluster WebUI to all servers, the following message was displayed on the alert log, and the cluster stopped. "The license is not registered. (Product name: %1)" (%1: Product name) | The cluster has been shut down and rebooted without its license being registered. | Register the license according to "Registering the license". |
| When the cluster was shut down and rebooted after distribution of the configuration data created by the Cluster WebUI to all servers, the following message appeared on the alert log, but the cluster is working properly. "The number of licenses is insufficient. The number of insufficient licenses is %1. (Product name:%2)" (%1: The number of licenses in short supply, %2: Product name) | There are not enough licenses. | Obtain a license and register it. |
| While the cluster was operated on the trial license, the following message was displayed and the cluster stopped. "The trial license has expired in %1. (Product name: %2)" (%1: Trial end date, %2: Product name) | The license has already expired. | Ask your sales agent for extension of the trial version license, or obtain and register the product version license. |
| While the cluster was operated on the fixed term license, the following message appeared. "The fixed term license has expired in %1. (Product name:%2)" (%1: Fixed term end day, %2: Product name) | The license has already expired. | Obtain the license for the product version from the vendor, and then register the license. |

12. Glossary

Active server

A server that is running for an application set.
(Related term: Standby server)

Cluster partition

A partition on a mirror disk. Used for managing mirror disks.
(Related term: Disk heartbeat partition)

Cluster shutdown

To shut down an entire cluster system (all servers that configure a cluster system).

Cluster system

Multiple computers that are connected via a LAN (or other network) and behave as if they were a single system.

Data partition

A local disk that can be used as a shared disk for switchable partition. Data partition for mirror disks.
(Related term: Cluster partition)

Disk heartbeat partition

A partition used for heartbeat communication in a shared disk type cluster.

Failback

A process of returning an application back to an active server after an application fails over to another server.

Failover

The process of a standby server taking over the group of resources that the active server was previously handling, due to error detection.

Failover group

A group of cluster resources and attributes required to execute an application.

Failover policy

A priority list of servers that a group can fail over to.

Floating IP address

An IP address that lets clients switch transparently from one server to another when a failover occurs.
Any unassigned IP address with the same network address as the LAN that a cluster server belongs to can be used as a floating IP address.

Heartbeat

Signals that servers in a cluster send to each other to detect a failure in a cluster.
(Related terms: Interconnect, Network partition)

Interconnect

A dedicated communication path for server-to-server communication in a cluster.
(Related terms: Private LAN, Public LAN)

Management client

Any machine that uses the Cluster WebUI to access and manage a cluster system.

Master server

The server displayed at the top of Master Server in Server Common Properties of the config mode of Cluster WebUI.

Mirror connect

LAN used for data mirroring in a data mirror type cluster. Mirror connect can be used with primary interconnect.

Mirror disk type cluster

A cluster system that does not use a shared disk. Local disks of the servers are mirrored.

Moving failover group

Moving an application from an active server to a standby server by a user.

Network partition

A state in which all heartbeats are lost and the network between servers is partitioned.
(Related terms: Interconnect, Heartbeat)

Node

A server that is part of a cluster in a cluster system. In networking terminology, it refers to devices, including computers and routers, that can transmit, receive, or process signals.

Primary (server)

A server that is the main server for a failover group.
(Related term: Secondary server)

Private LAN

LAN in which only servers configured in a clustered system are connected.
(Related terms: Interconnect, Public LAN)

Public LAN

A communication channel between clients and servers.
(Related terms: Interconnect, Private LAN)

Secondary server

A destination server that a failover group fails over to during normal operations.
(Related term: Primary server)

Server Group

A group of servers connected to the same network or the shared disk device.

Shared disk

A disk that multiple servers can access.

Shared disk type cluster

A cluster system that uses one or more shared disks.

Standby server

A server that is not an active server.
(Related term: Active server)

Startup attribute

A failover group attribute that determines whether a failover group should be started up automatically or manually when a cluster is started.

Switchable partition

A disk partition that is connected to multiple computers and can be switched among them.
(Related term: Disk heartbeat partition)

Virtual IP address

IP address used to configure a remote cluster.