3. Group resource details

This chapter provides information on group resources that constitute a failover group.

For an overview of group resources, see "Configuring a cluster system" in the "Installation and Configuration Guide".

3.1. Group resources and supported EXPRESSCLUSTER versions

The following is the number of group resources that can be registered with a group:

| Version | Number of group resources (per group) |
|---|---|
| 4.0.0-1 or later | 256 |

Currently supported group resources are:

| Group resource name | Abbreviation | Functional overview | Supported version |
|---|---|---|---|
| Exec resource | exec | See "Understanding EXEC resources" | 4.0.0-1 or later |
| Disk resource | disk | See "Understanding Disk resource" | 4.0.0-1 or later |
| Floating IP resource | fip | See "Understanding Floating IP resource" | 4.0.0-1 or later |
| Virtual IP resource | vip | See "Understanding Virtual IP resources" | 4.0.0-1 or later |
| Mirror disk resource | md | See "Understanding Mirror disk resources" | 4.0.0-1 or later |
| Hybrid disk resource | hd | See "Understanding Hybrid disk resources" | 4.0.0-1 or later |
| NAS resource | nas | See "Understanding NAS resource" | 4.0.0-1 or later |
| Volume manager resource | volmgr | See "Understanding Volume manager resources" | 4.0.0-1 or later |
| VM resource | vm | See "Understanding VM resources" | 4.0.0-1 or later |
| Dynamic DNS resource | ddns | See "Understanding Dynamic DNS resources" | 4.0.0-1 or later |
| AWS Elastic IP resource | awseip | See "Understanding AWS Elastic IP resources" | 4.0.0-1 or later |
| AWS Virtual IP resource | awsvip | See "Understanding AWS Virtual IP resources" | 4.0.0-1 or later |
| AWS DNS resource | awsdns | See "Understanding AWS DNS resources" | 4.0.0-1 or later |
| Azure probe port resource | azurepp | See "Understanding Azure probe port resources" | 4.0.0-1 or later |
| Azure DNS resource | azuredns | See "Understanding Azure DNS resources" | 4.0.0-1 or later |
| Google Cloud Virtual IP resource | gcvip | See "Understanding Google Cloud Virtual IP resources" | 4.2.0-1 or later |
| Oracle Cloud Virtual IP resource | ocvip | See "Understanding Oracle Cloud Virtual IP resources" | 4.2.0-1 or later |

The group resources that currently support dynamic resource addition are as follows:

| Group resource name | Abbreviation | Functional overview | Supported version |
|---|---|---|---|
| Exec resource | exec | See "Understanding EXEC resources" | 4.0.0-1 or later |
| Disk resource | disk | See "Understanding Disk resource" | 4.0.0-1 or later |
| Floating IP resource | fip | See "Understanding Floating IP resource" | 4.0.0-1 or later |
| Virtual IP resource | vip | See "Understanding Virtual IP resources" | 4.0.0-1 or later |
| Volume manager resource | volmgr | See "Understanding Volume manager resources" | 4.0.0-1 or later |

3.2. Attributes common to group resources

A group is a failover unit. Rules regarding the failover operations (failover policies) can be specified for a group.

3.2.1. Understanding the group type

The following two types of groups exist: virtual machine groups and failover groups.

  • Virtual machine groups
    Failovers (migration) are performed on a virtual machine basis. The following resources can be registered with this group: virtual machine resource, mirror disk resource, disk resource, hybrid disk resource, EXEC resource, NAS resource, and volume manager resource. A virtual machine group automatically follows even when the virtual machine is moved to a different server by a means other than EXPRESSCLUSTER.
  • Failover groups
    Resources necessary to continue operations are grouped and failovers are performed on an operation basis. Up to 256 group resources can be registered with each group. However, no VM resource can be registered.

3.2.2. Understanding the group properties

The following properties can be specified for each group:

  • Servers that can run the Group
    Select a server that can run the group from the servers in the cluster.
    Specify the order of servers that can run the group and the priority according to which the group is started.
  • Startup Attribute
    Specify automatic or manual startup as the group startup attribute.
    For automatic startup, the group is automatically started on the server that can run the group and has the highest priority when the cluster is started.
    For manual startup, the group is not started when the server is started. Manually start the group by using the Cluster WebUI or clpgrp command after the server is started. For details about the Cluster WebUI, see the online manual. For details about the clpgrp command, see "Operating groups (clpgrp command)" in "8. EXPRESSCLUSTER command reference" in this guide.
  • Failover attribute
    The failover attribute can be used to specify the failover mode. The following failover attributes can be specified.

    Automatic failover

    A heartbeat timeout or error detection by a group or monitor resource triggers an automatic failover.

    For an automatic failover, the following options can be specified.

    • Use the startup server settings
      The failover destination is determined according to the priority of the servers that can run the group.
    • Fail over dynamically
      The failover destination is determined by considering the statuses of each server's monitor resource or failover group, and then a failover is performed.

The failover destination is determined in the following way.

| Determination factor | Condition | Result |
|---|---|---|
| Status of exclusion target monitor resource | Error (all servers) | When there is no failover destination, proceed to the forced failover judgment process. |
| | Normal (single server) | A normal server is used as the failover destination. |
| | Normal (multiple servers) | Proceed to the process that compares error levels. |
| Perform a forced failover | Set | Proceed to the process that ignores the status of the exclusion target monitor resource and compares error levels for all the activated servers. |
| | Not set | Failover is not performed. |
| Number of servers with the lowest error level | 1 | The server that has the lowest error level is used as the failover destination. |
| | Two or more | The operation levels are compared for those servers that have the lowest error level. |
| Prioritize failover policy in the server group | Set, and within the same server group as the failover source, there is a server that can perform failover. | The server in the same server group is used as the failover destination. |
| | Set, and within the same server group as the failover source, there is no server that can perform failover. | Proceed to the smart failover judgment process. |
| | Not set | Proceed to the smart failover judgment process. |
| Perform a smart failover | Set, and the number of servers recommended as the failover destination is 1. | The server recommended by the smart failover is used as the failover destination. |
| | Set, and the number of servers recommended as the failover destination is 2 or more. | Proceed to the running level judgment process. |
| | Not set | Proceed to the running level judgment process. |
| Number of servers with the lowest running level | 1 | The server with the lowest running level is used as the failover destination. |
| | Two or more | Of the activated servers, the server with the highest priority is used as the failover destination. |
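When forced failover, server group priority, and smart failover are not in play, the remaining rows of the table amount to a three-key comparison: lowest error level, then lowest running level, then highest priority. The following is a minimal sketch of that comparison only, not the product's implementation. The input format (one candidate per line: server name, error level, running level, priority number with 1 as the highest) and the function name `pick_destination` are assumptions made for illustration.

```sh
#!/bin/sh
# pick_destination reads candidate servers on standard input, one per line:
#   <server> <error_level> <running_level> <priority>
# This line format is hypothetical; priority 1 is the highest priority.
pick_destination() {
    # Sort numerically by error level, then running level, then priority
    # (all ascending) and print the name of the first server.
    sort -k2,2n -k3,3n -k4,4n | awk 'NR == 1 { print $1 }'
}

# server2 and server3 share the lowest error level (0);
# server3 has the lower running level, so it is chosen.
printf '%s\n' \
    'server1 1 0 1' \
    'server2 0 2 2' \
    'server3 0 1 3' | pick_destination
# prints: server3
```

The exclusion target monitor resource, forced failover, server group, and smart failover steps are deliberately left out of this sketch.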

Note

Critical monitor resource
A server on which this monitor resource detected an error is excluded from the failover destinations.
The exclusion list is set with the Cluster WebUI.

Error level
The number of monitor resources that detected errors.

Smart failover
A function that assigns the server with the smallest load as the failover destination, based on the system resource information collected by the System Resource Agent. To enable this function, a System Resource Agent license must be registered on all the servers set as the failover destination, and the system monitor resources must be set as the monitor resource. For details about the system monitor resources, see "Understanding System monitor resources" in "4. Monitor resource details" in this guide.

Running level
The number of failover groups that have been started or are being started, excluding management groups.
  • Prioritize failover policy in the server group
    If a server in the same server group can be used as the failover destination, that server is given preference. The server that can run the failover group and has the highest priority among the running servers is used as the failover destination.
    If no server in the same server group can be used as the failover destination, a server in another server group is used as the failover destination.
  • Allow only a manual failover between server groups
    This can be selected only when the above Prioritize failover policy in the server group is set.
    An automatic failover is performed only if a server within the same server group is the destination.
    If no servers in the same server group can be used as the failover destination, failing over to a server in another server group is not automatically performed.
    To move the group to a server in another server group, use the Cluster WebUI or clpgrp command.

Manual failover

A failover is not automatically performed when a heartbeat timeout occurs. Manually start a failover by using the Cluster WebUI or clpgrp command. However, even when manual failover is specified, an automatic failover is performed if a group resource or monitor resource detects an error.

Note

If Execute Failover to outside the Server Group is set for a message receive monitor resource, the dynamic failover setting and the setting for failover between server groups are ignored. The group fails over to the highest-priority server in a server group other than the one to which the failover source server belongs.

  • Failback attribute

    Specify automatic or manual failback. However, this attribute cannot be specified when either of the following conditions applies.

    • A mirror disk resource or hybrid disk resource is set in the failover group.

    • Failover attribute is Fail over dynamically.

    For automatic failback, an automatic failback is performed when the server that has the highest priority is started after a failover.

    For manual failback, no failback occurs even when the server is started.

3.2.3. Understanding failover policy

A failover policy is the priority order used to choose the failover destination from multiple servers. When you configure the failover policy, take care that a failover does not concentrate the load on particular servers.

The following examples show how groups move between servers under different failover policies, using the lists of servers that can run each group and the failover priorities shown below.

<Symbols and meaning>

| Server status | Description |
|---|---|
| O | Normal (properly working as a cluster) |
| X | Stopped (cluster is stopped) |

3-node configuration (priority order of servers):

| Group | 1st priority server | 2nd priority server | 3rd priority server |
|---|---|---|---|
| A | server1 | server3 | server2 |
| B | server2 | server3 | server1 |

2-node configuration (priority order of servers):

| Group | 1st priority server | 2nd priority server |
|---|---|---|
| A | server1 | server2 |
| B | server2 | server1 |

It is assumed that the group startup attributes are set to auto startup and the failback attributes are set to manual failback for both Group A and B.

  • For groups belonging to exclusion rules whose exclusive attribute is Normal or Absolute, the server on which a group starts or to which it fails over is determined by the failover priority of the servers. If the failover priority is the same, the order is determined by the group names, compared in the order of numbers, specific symbols, and alphabetic characters. For details on the failover exclusive attribute, refer to "Understanding Exclusive Control of Group".

    When Group A and B do not belong to the exclusion rules:

    1. Cluster startup

    2. Cluster shutdown

    3. Failure of server1: Fails over to the next priority server.

    4. Server1 power on

    5. Cluster shutdown

    6. Move group A

    7. Failure of server2: Fails over to the next priority server.

    8. Failure of server2: Fails over to the next priority server.

    9. Failure of server3: Fails over to the next priority server.

    10. Failure of server2: Fails over to the next priority server.

    11. Failure of server3: Fails over to the next priority server.

When Group A and B belong to the exclusion rules in which the exclusive attribute is set to Normal:

  1. Cluster startup

  2. Cluster shutdown

  3. Failure of server1: Fails over to a server where no normal exclusive group is active.

  4. Server1 power on

  5. Cluster shutdown

  6. Move groupA

  7. Failure of server2: Fails over to a server where a normal exclusive group is not active.

  8. Failure of server2: There is no server where a normal exclusive group is not active, but the group fails over because a server that can start it exists.

  9. Failure of server3: There is no server where a normal exclusive group is not active, but the group fails over because a server that can start it exists.

  10. Failure of server2: Fails over to a server where a normal exclusive group is not active.

  11. Failure of server3: Fails over to a server where a normal exclusive group is not active.

When Group A and B belong to the exclusion rules in which the exclusive attribute is set to Absolute:

  1. Cluster startup

  2. Cluster shutdown

  3. Failure of server1: Fails over to the next priority server.

  4. server1 power on

  5. Cluster shutdown

  6. Move groupA

  7. Failure of server2: Fails over to the next priority server.

  8. Failure of server2: Does not fail over (GroupB stops).

  9. Failure of server3: Does not fail over (GroupA stops).

  10. Failure of server2: Fails over to the server where no absolute exclusive group is active.

  11. Failure of server3: Fails over to the server where no absolute exclusive group is active.

For Replicator (two-server configuration), when Group A and B do not belong to the exclusion rules:

  1. Cluster startup

  2. Cluster shutdown

  3. Failure of server1: Fails over to the standby server of GroupA.

  4. Server1 power on

  5. Cluster shutdown

  6. Move groupA

  7. Failure of server2: Fails over to the standby server of GroupB.

  8. Failure of server2

  9. Failure of server3: Fails over to the standby server.

3.2.4. Operations at detection of activation and deactivation failure

When an activation or deactivation error is detected, the following operations are performed:

  • When an error in activation of group resources is detected:

    • When an error in activation of group resources is detected, activation is retried.

    • When activation retries fail as many times as the number set to Retry Count at Activation Failure, a failover takes place.

    • If the failover fails as many times as the number set to Failover Threshold, the final action is performed.

  • When an error in deactivation of group resources is detected:

    • When an error in deactivation of group resources is detected, deactivation is retried.

    • When deactivation retries fail as many times as the number set to Retry Count at Deactivation Failure, the final action is performed.

Note

Activation retries and failovers are counted on a server basis. The Retry Count at Activation Failure and Failover Threshold are the maximum activation retry count and the maximum failover count on a server basis, respectively.
The activation retry count and failover count are reset in a server where the group activation is successful.
Note that a failed recovery action is also counted as one for the activation retry count or failover count.

The following describes the flow when an error in activation of a group resource is detected, with the following settings:

Retry Count at Activation Failure: 3 times
Failover Threshold: 1 time
Final Action: Stop Group
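The per-server counting described in the note above can be traced with a small simulation. The following sketch assumes a two-server cluster in which activation fails on every attempt. It is only an illustration of how the counters interact under that assumption, not a reproduction of the product's internal logic, and the function name `simulate_activation_failure` is hypothetical.

```sh
#!/bin/sh
# simulate_activation_failure RETRY_COUNT FAILOVER_THRESHOLD
# Traces the recovery flow for a resource whose activation always fails
# in a two-server cluster. Failover counts are kept per server.
simulate_activation_failure() {
    retry_max=$1        # Retry Count at Activation Failure
    failover_max=$2     # Failover Threshold
    current=1           # server currently trying to activate (1 or 2)
    fo_count_1=0        # failovers already started from server1
    fo_count_2=0        # failovers already started from server2

    while :; do
        echo "server${current}: activation failed, retried ${retry_max} time(s)"
        # Read the failover count of the current server.
        eval "used=\$fo_count_${current}"
        if [ "$used" -ge "$failover_max" ]; then
            echo "server${current}: failover threshold reached, final action"
            return 0
        fi
        # Record the failover against the current server, then switch servers.
        eval "fo_count_${current}=\$((used + 1))"
        next=$((3 - current))
        echo "server${current}: failover to server${next}"
        current=$next
    done
}

# Settings from the example: 3 retries, failover threshold of 1.
simulate_activation_failure 3 1
```

With these settings the trace shows one failover from each server before the final action runs on server1, because each server's failover count is compared against the threshold separately.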

3.2.5. Script before final action

When a group resource activation error or deactivation error is detected, a script before final action can be executed before the final action is performed.

Environment variables used with a script before final action

When executing a script, EXPRESSCLUSTER sets information such as the state in which it is executed (when an activation error occurs, when a deactivation error occurs) in the environment variables.

In the script, processing that is appropriate for the system operation can be described using the environment variables listed below as branch conditions.

| Environment variable | Value | Description |
|---|---|---|
| CLP_TIMING (execution timing) | START | Executes a script before final action in the event of a group resource activation error. |
| | STOP | Executes a script before final action in the event of a group resource deactivation error. |
| CLP_GROUPNAME (group name) | Group name | Indicates the name of the group containing the group resource in which an error that causes the script before final action to be executed is detected. |
| CLP_RESOURCENAME (group resource name) | Group resource name | Indicates the name of the group resource in which an error that causes the script before final action to be executed is detected. |

Flow used to describe a script before final action

The following explains the environment variables in the previous topic and an actual script, associating them with each other.

Example of a script before final action in the event of an activation error
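As an illustration, a script before final action could branch on CLP_TIMING as in the following minimal sketch. The three environment variables are the ones listed in the table above. The function names `log_trace` and `before_final_action`, the message texts, and the fallback to standard output when clplogcmd is not found are assumptions made so that the sketch can be tried outside a cluster.

```sh
#!/bin/sh
# Sketch of a script before final action.
# CLP_TIMING, CLP_GROUPNAME and CLP_RESOURCENAME are set by EXPRESSCLUSTER.

# Leave a trace with clplogcmd when it is available; otherwise print the
# message so that the sketch also runs on a machine without the command.
log_trace() {
    if command -v clplogcmd >/dev/null 2>&1; then
        clplogcmd -m "$1"
    else
        echo "$1"
    fi
}

before_final_action() {
    target="${CLP_GROUPNAME:-?}/${CLP_RESOURCENAME:-?}"
    case "${CLP_TIMING:-}" in
        START)
            log_trace "activation error: $target"
            # Describe processing for an activation error here.
            ;;
        STOP)
            log_trace "deactivation error: $target"
            # Describe processing for a deactivation error here.
            ;;
        *)
            log_trace "unexpected CLP_TIMING: '${CLP_TIMING:-}'"
            ;;
    esac
}

before_final_action
```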

Tips for creating a script before final action

Note the following when creating a script:

  • If the script contains a command that will take some time to execute, always leave a trace that will indicate the completion of the execution of that command. If a problem occurs, you can use this information to isolate the failure. One way of leaving such a trace is to use clplogcmd.

  • Method of describing in a script by using clplogcmd
    Using clplogcmd, you can output messages to the Alert logs of Cluster WebUI or syslog of the OS. For details on the clplogcmd command, see "Outputting messages (clplogcmd command)" in "8. EXPRESSCLUSTER command reference" in this guide.

    (Example: Script image)

    clplogcmd -m "recoverystart.."
    recoverystart
    clplogcmd -m "OK"
    

Notes on script before final action

  • Stack size of the commands and application to be started from a script

    A script before final action is executed with the stack size set to 2 MB. For this reason, if the commands and applications to be started from the script require a stack size of 2 MB or greater, a stack overflow will occur.
    If a stack overflow occurs, set the stack size before starting the commands and applications.
  • Condition that a script before final action is executed

    A script before final action is executed before the final action upon detection of a group resource activation or deactivation failure. Even if No operation (Next Resources Are Activated/Deactivated) or No operation (Next Resources Are Not Activated/Deactivated) is set as the final action, a script before final action is executed.
    A script before final action is not executed when the final action itself is not executed, either because the maximum restart count has been reached or because the function that suppresses the final action when all other servers are being stopped is in effect.

3.2.6. Script Before and After Activation/Deactivation

An arbitrary script can be executed before and after activation/deactivation of group resources.

Environment variables used with a script before and after activation/deactivation

When executing a script, EXPRESSCLUSTER sets information such as the state in which it is executed (before activation, after activation, before deactivation, or after deactivation) in the environment variables.

| Environment variable | Value | Description |
|---|---|---|
| CLP_TIMING (execution timing) | PRESTART | Executes a script before a group resource is activated. |
| | POSTSTART | Executes a script after a group resource is activated. |
| | PRESTOP | Executes a script before a group resource is deactivated. |
| | POSTSTOP | Executes a script after a group resource is deactivated. |
| CLP_GROUPNAME (group name) | Group name | Indicates the group name of the group resource containing the script. |
| CLP_RESOURCENAME (group resource name) | Group resource name | Indicates the name of the group resource containing the script. |

Flow used to describe a script before and after activation/deactivation

The following explains the environment variables in the previous topic and an actual script, associating them with each other.

Example of a script before and after activation/deactivation
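A script of this kind could branch on the four CLP_TIMING values as in the following minimal sketch. The environment variables are those listed in the table above. The function names `log_trace` and `around_activation`, the message texts, and the fallback to standard output when clplogcmd is not found are assumptions for illustration.

```sh
#!/bin/sh
# Sketch of a script before and after activation/deactivation.
# CLP_TIMING, CLP_GROUPNAME and CLP_RESOURCENAME are set by EXPRESSCLUSTER.

# Leave a trace with clplogcmd when it is available; otherwise print it.
log_trace() {
    if command -v clplogcmd >/dev/null 2>&1; then
        clplogcmd -m "$1"
    else
        echo "$1"
    fi
}

around_activation() {
    target="${CLP_GROUPNAME:-?}/${CLP_RESOURCENAME:-?}"
    case "${CLP_TIMING:-}" in
        PRESTART)  log_trace "before activation: $target" ;;
        POSTSTART) log_trace "after activation: $target" ;;
        PRESTOP)   log_trace "before deactivation: $target" ;;
        POSTSTOP)  log_trace "after deactivation: $target" ;;
        *)         log_trace "unexpected CLP_TIMING: '${CLP_TIMING:-}'" ;;
    esac
}

around_activation
```

Each branch is the place to add the processing that the timing requires; the trace message is left first so that the log shows which branch ran.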

Tips for creating a script before and after activation/deactivation

Note the following when creating a script:

  • If the script contains a command that will take some time to execute, always leave a trace that will indicate the completion of the execution of that command. If a problem occurs, you can use this information to isolate the failure. One way of leaving such a trace is to use clplogcmd.

  • Method of describing in a script by using clplogcmd
    Using clplogcmd, you can output messages to the Alert logs of Cluster WebUI or syslog of the OS. For details on the clplogcmd command, see "Outputting messages (clplogcmd command)" in "8. EXPRESSCLUSTER command reference" in this guide.

    (Example: Script image)

    clplogcmd -m "start.."
    :
    clplogcmd -m "OK"
    

Notes on script before and after activation/deactivation

  • Stack size of the commands and application to be started from a script
    A script before and after activation/deactivation is executed with the stack size set to 2 MB. For this reason, if the commands and applications to be started from the script require a stack size of 2 MB or greater, a stack overflow will occur.
    If a stack overflow occurs, set the stack size before starting the commands and applications.
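One way to set the stack size from within a script is the shell's `ulimit -s` builtin, which common shells such as bash and dash provide (the size is given in kilobytes). The following is a minimal sketch. The helper name `run_with_stack` and the application path in the usage comment are hypothetical, and the soft limit can only be raised as far as the hard limit of the environment allows.

```sh
#!/bin/sh
# run_with_stack KBYTES COMMAND [ARGS...]
# Runs COMMAND with the soft stack size limit set to KBYTES kilobytes.
run_with_stack() {
    stack_kb=$1
    shift
    (
        # The subshell keeps the changed limit from affecting the rest
        # of the script.
        ulimit -S -s "$stack_kb" || exit 1
        "$@"
    )
}

# Usage (hypothetical application path), requesting an 8 MB stack:
#   run_with_stack 8192 /opt/myapp/bin/start
```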

3.2.7. Reboot count limit

If an action that involves an OS reboot is selected as the final action for an activation or deactivation error, you can limit the number of shutdowns or reboots caused by such errors.

The maximum reboot count is the upper limit of the reboot count of each server.

Note

The maximum reboot count is the upper limit of reboot count of a server because the number of reboots is recorded per server.

Reboots performed as the final action for a group activation or deactivation error and reboots performed as the final action of a monitor resource are counted separately.

If the time to reset the maximum reboot count is set to zero (0), the number of reboots will not be reset. Run the clpregctrl command to reset this number. For details on the clpregctrl command, see "Controlling reboot count (clpregctrl command)" in "8. EXPRESSCLUSTER command reference".

The following describes the flow of operations when the reboot count limit is set as in the setting example below:

As a final action, Stop cluster service and reboot OS is executed once because the maximum reboot count is set to one (1).

If group activation is successful at a reboot following the cluster shutdown, the reboot count is reset after 10 minutes because the time to reset the maximum reboot count is set to 10 minutes.

Setting example

Retry Count at Activation Failure: 0 times
Failover Threshold: 0 times
Final Action: Stop cluster service and reboot OS
Max Reboot Count: 1 time
Max Reboot Count Reset Time: 10 minutes
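The effect of the limit in this example can be sketched as a comparison between the reboot count recorded for the server and the maximum reboot count. The following is an illustration only. The function name `final_action_allowed` is hypothetical, the sketch assumes a maximum of 1 or more, and the reset timer is not modeled.

```sh
#!/bin/sh
# final_action_allowed REBOOT_COUNT MAX_REBOOT_COUNT
# Succeeds while the reboot count recorded for the server is still below
# the maximum (assumed to be 1 or more in this sketch).
final_action_allowed() {
    [ "$1" -lt "$2" ]
}

# With Max Reboot Count set to 1, the first error reboots the OS and the
# second one does not.
count=0
for attempt in 1 2; do
    if final_action_allowed "$count" 1; then
        echo "attempt ${attempt}: reboot OS"
        count=$((count + 1))
    else
        echo "attempt ${attempt}: final action not executed"
    fi
done
```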

3.2.8. Resetting the reboot count

Run the clpregctrl command to reset the reboot count. For details on the clpregctrl command, see "Controlling reboot count (clpregctrl command)" in "8. EXPRESSCLUSTER command reference" in this guide.
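As a rough sketch, the reset could be wrapped in a small helper such as the one below. The options shown (`-g` to display the counts, `-c -t <type> -r haltcount` to reset them, with `rc` assumed to select group resource errors and `rm` monitor resource errors) are written from memory of the command reference and should be treated as assumptions; confirm them in "Controlling reboot count (clpregctrl command)" before use. The helper name `reset_reboot_count` is hypothetical.

```sh
#!/bin/sh
# reset_reboot_count TYPE
# TYPE is assumed to be "rc" (group resource errors) or "rm" (monitor
# resource errors); verify against the clpregctrl command reference.
reset_reboot_count() {
    count_type=$1
    if command -v clpregctrl >/dev/null 2>&1; then
        clpregctrl -g                               # show current counts
        clpregctrl -c -t "$count_type" -r haltcount # reset the count
    else
        # Outside a cluster node, only show what would be run.
        echo "clpregctrl not found; would run: clpregctrl -c -t $count_type -r haltcount"
    fi
}

reset_reboot_count rc
```

Because the reboot count is recorded per server, the reset applies to the server on which the command is run.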

3.2.9. Checking a double activation

When a group is started, it is possible to check whether a double activation will occur or not.

  • If a double activation is determined not to occur:
    A group startup begins.
  • If a double activation is determined to occur (if a timeout occurs):
    A group startup does not begin. If the server attempts to start up the group, that group is stopped.

Note

  • If a single resource is started while its relevant group is stopped, a double activation check will be performed. However, if a single resource is started while any resource in the group is activated, a double activation check will not be performed.

  • If there are no floating IP resources for the group for which Detect double activation is selected, a double activation check is not executed and the group startup begins.

  • If a double activation is determined to occur, the statuses of groups and resources may not match among servers.

3.2.10. Understanding setting of group start dependence and group stop dependence

You can set the group start and stop order by setting group start dependence and group stop dependence.

  • When group start dependence is set:

    • For group start, start processing of this group is performed after start processing of the group subject to start dependence completes normally.

    • For group start, if a timeout occurs in the group for which start dependence is set, the group does not start.

  • When group stop dependence is set:

    • For group stop, stop processing of this group is performed after stop processing of the group subject to stop dependence completes normally.

    • If a timeout occurs in the group for which stop dependence is set, the group stop processing continues.

    • Stop dependence is performed according to the conditions specified in Cluster WebUI.

To display the settings made for group start dependence and group stop dependence, click group properties in the config mode of Cluster WebUI and then click the Start Dependency tab and the Stop Dependency tab.

Depths for group start dependence are listed below as an example.

The following explains group start execution using examples of simple status transition.

When two servers have three groups

Group failover policy

groupA: server1
groupB: server2
groupC: server1 -> server2

Group start dependence setting

groupA: Start dependence is not set.
groupB: Start dependence is not set.
groupC: groupA start dependence is set.
groupC: groupB start dependence is set, applied only when groupC is started by the same server as groupB.

  1. When server1 starts groupA and groupC

    server1 starts groupC after groupA has been started normally.

  2. When server1 starts groupA and server2 starts groupC

    server2 starts groupC after server1 has started groupA normally.

    Wait Only when on the Same Server is not set, so groupA start dependence by another server is applied.

  3. When server1 starts groupC and server2 starts groupB

    server1 starts groupC without waiting for the normal start of groupB. groupC is set to wait for groupB start only when it is started by the same server. However, start dependence is not applied to groupC because groupB is set such that it is not started by server1.

  4. When server1 starts groupA and groupC

    If server1 fails in groupA start, groupC is not started.

  5. When server1 starts groupA and groupC

    If server1 fails in groupA start and a failover occurs in server2 due to groupA resource recovery, server2 starts groupA and then server1 starts groupC.

  6. When server1 starts groupA and groupC

    If a groupA start dependence timeout occurs on server1, groupC is not started.

  7. When server1 starts only groupC

    server1 has not started groupA, so a start dependence timeout occurs. If this timeout occurs, groupC is not started.

Note

  • When a group is started, there is no function to automatically start the group for which start dependence is set.

  • The group is not started if a timeout occurs in the group for which start dependence is set.

  • The group is not started if the group for which start dependence is set fails to start.

  • If the group for which start dependence is set contains a normally started and a normally stopped resource, the group is judged to have started normally.

  • When a group is stopped, there is no function to automatically stop the group for which stop dependence is set.

  • The group stop processing continues if a timeout occurs in the group for which stop dependence is set.

  • The group stop processing continues if the group for which stop dependence is set fails to stop.

  • The group stop processing or resource stop processing by the Cluster WebUI or clpgrp command does not apply stop dependence. Stop dependence is applied according to the setting (when the cluster or a server stops) made with the Cluster WebUI.

  • If a start waiting timeout occurs at the time of a failover, the failover fails.

3.2.11. Understanding Exclusive Control of Group

The failover exclusive attribute sets how a group is excluded from other groups at failover. However, it cannot be set under the following conditions:

  • If Virtual machine group is specified as the group type

  • When the failover attribute is one of Fail over dynamically, Prioritize failover policy in the server group, or Allow only a manual failover between server groups.

The settable failover exclusive attributes are as follows:

Off

Exclusion is not performed at failover. Failover is performed on the server of the highest priority among the servers that can fail over.

Normal

Exclusion is performed at failover. Failover is performed on the server on which the other normal exclusion groups are not started and which is given the highest priority among the servers that can run the group.

However, if other normal exclusion groups have already been started on all servers to which the failover can be performed, exclusion is not performed. Failover is performed on the server that is given the highest priority among the servers on which failover can be performed.

Absolute

Exclusion is performed at failover. Failover is performed on the server on which the other absolute exclusion groups are not started and which is given the highest priority among the servers that can run the group.

However, failover is not performed if the other absolute exclusion groups have already been started on all servers on which failover can be performed.
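The three attributes differ only in what happens when a candidate server is already running another group under the same exclusion rule. The following minimal sketch models that difference. The function name `choose_server` and its argument format (space-separated server lists, candidates in priority order) are assumptions for illustration, not part of the product.

```sh
#!/bin/sh
# choose_server ATTRIBUTE "CANDIDATES" "BUSY"
#   ATTRIBUTE   off, normal, or absolute
#   CANDIDATES  servers that can run the group, highest priority first
#   BUSY        servers already running another group that belongs to
#               the same exclusion rule
# Prints the chosen server; returns 1 when no failover is performed.
choose_server() {
    attribute=$1
    candidates=$2
    busy=$3
    first=""
    for server in $candidates; do
        if [ -z "$first" ]; then
            first=$server
        fi
        # Off: no exclusion, the highest-priority server is used.
        if [ "$attribute" = "off" ]; then
            echo "$server"
            return 0
        fi
        # Normal and Absolute: skip servers that are already busy.
        case " $busy " in
            *" $server "*) ;;
            *) echo "$server"; return 0 ;;
        esac
    done
    # Every candidate is busy: Normal falls back to the highest priority,
    # Absolute does not fail over.
    if [ "$attribute" = "normal" ] && [ -n "$first" ]; then
        echo "$first"
        return 0
    fi
    return 1
}

choose_server normal "server1 server2 server3" "server1"
# prints: server2
```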

Note

Exclusion is not performed between groups that belong to different exclusion rules. Exclusive control is performed only among groups that belong to the same exclusion rule, according to the exclusive attribute that is set. In either case, exclusion is not performed with a group that does not belong to any exclusion rule. For details on the failover exclusive attribute, see "Understanding failover policy". For details on the settings of the exclusion rules, see "Group common properties".

3.2.12. Understanding server groups

This section explains server groups.

Server groups are groups of servers that are mainly required when hybrid disk resources are used.

When hybrid disk resources are used on a shared disk device, the servers connected to the same shared disk device are configured as a server group.

When hybrid disk resources are used on a disk that is not shared, a single server is configured as a server group.

3.2.13. Understanding the settings of dependency among group resources

By specifying dependency among group resources, the order of activating them can be specified.

  • When the dependency among group resources is set:

    • When activating a failover group that a group resource belongs to, its activation starts after the activation of the Dependent Resources is completed.

    • When deactivating a group resource, the deactivation of the "Dependent Resources" starts after the deactivation of the group resource is completed.

Depths for group start dependence are listed below as an example.

3.2.14. Setting group resources for individual server

Some setting values of group resources can be configured for individual servers. On the properties of resources which can be set for individual servers, tabs for each server are displayed on the Details tab.

The following resources can be set for individual servers.

| Group resource name | Supported version |
|---|---|
| Disk resource | 4.0.0-1 or later |
| Floating IP resource | 4.0.0-1 or later |
| Virtual IP resource | 4.0.0-1 or later |
| Mirror disk resource | 4.0.0-1 or later |
| Hybrid disk resource | 4.0.0-1 or later |
| Dynamic DNS resource | 4.0.0-1 or later |
| Virtual machine resource | 4.0.0-1 or later |
| AWS Elastic IP resource | 4.0.0-1 or later |
| AWS Virtual IP resource | 4.0.0-1 or later |
| AWS DNS resource | 4.0.0-1 or later |
| Azure DNS resource | 4.0.0-1 or later |

Note

Some parameters of Virtual IP resources, AWS Elastic IP resources, AWS Virtual IP resources, and Azure DNS resources should be configured for individual servers.

For parameters that can be set for individual servers, see the parameter descriptions of each group resource. The Server Individual Setup icon is displayed for those parameters.

In this example, the server individual setup for a Floating IP resource is explained.

Server Individual Setup

Parameters that can be set for individual servers on a Floating IP resource are displayed.

Set Up Individually

Click the tab of the server on which you want to configure the server individual setting, and select this check box. The boxes for parameters that can be configured for individual servers become active. Enter required parameters.

Note

When setting up a server individually, you cannot select Tuning.

3.3. Group common properties

3.3.1. Exclusion tab

Add

Add exclusion rules. Select Add to display the Definition of Exclusion Rule dialog box.

Remove

Remove the selected exclusion rule. A confirmation dialog box is displayed.

Rename

The dialog box for changing the name of the selected exclusion rule is displayed.

There are the following naming rules.

  • Up to 31 characters (31 bytes).

  • Names cannot start or end with a hyphen (-) or a space.

  • A name consisting of only numbers is not allowed.

Names must be unique (case-insensitive) among the exclusion rules.
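
The naming rules above can be checked mechanically before a name is entered. The following sh sketch is an illustration only: the function name is_valid_rule_name is hypothetical, it covers only the three rules listed above, and it does not check uniqueness against existing names.

```shell
#!/bin/sh
# Hypothetical helper, not part of EXPRESSCLUSTER.
# Returns 0 if NAME satisfies the three naming rules listed above.
is_valid_rule_name() {
    name=$1

    # Rule 1: 1 to 31 bytes. wc -c counts bytes; $(( )) strips padding.
    len=$(( $(printf '%s' "$name" | wc -c) ))
    [ "$len" -ge 1 ] && [ "$len" -le 31 ] || return 1

    # Rule 2: must not start or end with a hyphen or a space.
    case $name in
        -*|' '*|*-|*' ') return 1 ;;
    esac

    # Rule 3: must not consist of digits only.
    case $name in
        *[!0-9]*) ;;          # at least one non-digit: acceptable
        *) return 1 ;;
    esac

    return 0
}
```

For example, `is_valid_rule_name "rule-1" && echo "name accepted"` prints the message, while a name such as "12345" is rejected.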

Properties

Display the properties of the selected exclusion rule.

Definition of exclusion rule

Set the name of the exclusion rule and its exclusive attribute. The exclusive attribute can be either Normal or Absolute. Normal can be set for only one exclusion rule, whereas Absolute can be set for multiple rules. If an exclusion rule with Normal already exists, Normal cannot be set for another rule.

Name

Display the exclusion rule name.

Exclusive Attribute

Display the exclusive attribute set in the exclusion rule.

Group

Display the list of failover group names which belong to the exclusion rule.

After selecting a group which you want to register into the exclusion rule from Available Group, press Add. Exclusive Group displays groups registered into the exclusion rule. A failover group added in another exclusion rule is not displayed on Available Group.

3.3.2. Start Dependency tab

Display the start dependency list.

3.3.3. Stop Dependency tab

Display the stop dependency list.

3.4. Group properties

3.4.1. Info tab

Type

The group type is displayed.

Use Server Group Settings

  • When the check box is selected
    Server group settings are used.
  • When not selected
    Server group settings are not used.

Name

The group name is displayed.

Comment (Within 127 bytes)

Enter a comment for the group. Use only one-byte alphanumeric characters.

3.4.2. Startup Server tab

There are two types of settings for the server that starts up the group: starting up the group on all servers or on only the specified servers and server groups that can run the group.

If the group is configured to start on all servers, every server in the cluster can start the group. The group startup priority of the servers is the same as the server priority. For details on the server priority, see "Master server tab" in "Server Common Properties" in "2. Parameter details" in this guide.

When selecting servers and server groups that can run the group, you can select any server or server group from those registered to the cluster. You can also change the startup priority of servers and server groups that can run the group.

To set the server to start up the failover group:

Failover is possible on all servers

Specify the server that starts a group.

  • When the check box is selected:
    All servers registered to a cluster can start a group. The group startup priority is the same as the server priority.
  • When not selected:
    You can select the servers that can start a group, and change the startup priority.

Add

Use this button to add a server. Select a server that you want to add from Available Servers, and then click Add. The server is added to Servers that can run the Group.

Remove

Use this button to remove a server. Select a server that you want to remove from Servers that can run the Group, and then click Remove. The server is added to Available Servers.

Order

Use these buttons to change the priority of the servers that can be started. Select a server whose priority you want to change from Servers that can run the Group. Click the arrows to move the selected row upward or downward.

To use the server group settings:

For a group that includes a hybrid disk resource, the servers that start the failover group must be specified by configuring server groups.

Add

Use Add to add a server group to Server Groups that can run the Group. Select a server group that you want to add from Available Server Groups, and then click Add. The selected server group is added to Server Groups that can run the Group.

Remove

Use Remove to remove a server group from Server Groups that can run the Group. Select a server group that you want to remove from Server Groups that can run the Group, and then click Remove. The selected server group is added to Available Server Groups.

Order

Use these buttons to change the priority of a server group. Select a server group whose priority you want to change from Server Groups that can run the Group. Click the arrows to move the selected row upward or downward.

3.4.3. Attribute tab

Startup Attribute

Select whether to automatically start the group from EXPRESSCLUSTER (auto startup), or to manually start from the Cluster WebUI or by using the clpgrp command (manual startup) at the cluster startup.

  • Auto Startup
    The group will automatically be started at the cluster startup (active state).
  • Manual Startup
    The group will not be started at the cluster startup (inactive state).
    You can start the group from the Cluster WebUI or by using the clpgrp command (active state).
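
With Manual Startup, an operator starts the group after the cluster has started. The following is a minimal sketch of doing this with the clpgrp command. It assumes a failover group named failover1 and a server named server1 (both hypothetical), and it assumes the clpgrp options -s (start a group) and -h (target server); confirm the exact syntax in the EXPRESSCLUSTER command reference before use. The wrapper only prints the command unless CLP_RUN=1 is set.

```shell
#!/bin/sh
# Sketch only. "failover1" and "server1" are hypothetical names.
# Assumes clpgrp -s GROUP [-h HOST]; verify against the command reference.
start_group() {
    group=$1
    host=${2:-}
    if [ -n "$host" ]; then
        set -- clpgrp -s "$group" -h "$host"
    else
        set -- clpgrp -s "$group"
    fi
    if [ "${CLP_RUN:-0}" = "1" ]; then
        "$@"            # actually run the command on a cluster server
    else
        echo "$*"       # dry run: only show the command
    fi
}

start_group failover1
start_group failover1 server1
```

Running the sketch without CLP_RUN set shows the two commands that would be issued, which makes it safe to try outside a cluster.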

Execute Multi-Failover-Service Check

Check whether a double activation will occur before a group is started. If this function is disabled for a group that contains a floating IP resource, the following pop-up window appears when the cluster configuration information is applied.

If Yes is selected, Detect double activation is automatically enabled, and the cluster configuration information is uploaded. If No is selected, the cluster configuration information is uploaded while Detect double activation remains disabled.

Timeout (1 to 9999)

Specify the maximum time to be taken to check a double activation. The default value is set as 300 seconds. Specify a larger value than the one set for Ping Timeout of Floating IP Resource Tuning Properties for the floating IP resource that belongs to the group.

Failover Attribute

Select whether failover is performed automatically when a server fails.

  • Auto Failover
    Failover is executed automatically. In addition, the following options can be selected.
    • Use the startup server settings
      This is the default setting.
    • Fail over dynamically
      The failover destination is determined by considering the statuses of each server's monitor or failover group at the time of the failover.
      If this option button is selected, all the failback attribute parameters are reverted to the default values and grayed out.
      If dynamic failover is selected, each option can be set. For details, see "Understanding the group properties".
    • Prioritize failover policy in the server group
      This function controls failovers between sites (between server groups).
      However, if no server group is specified for the failover group, the display for failovers between sites is grayed out.
      The Enable only manual failover among the server groups check box can be selected only when this option button is selected.
      If the Prioritize failover policy in the server group option button is selected, the failover policies in the same server group take priority when determining the failover destination.
      If the Prioritize failover policy in the server group option button and Enable only manual failover among the server groups check box are selected, failovers across server groups are not automatically performed. Manually move groups between server groups.
  • Manual Failover
    Failover is executed manually.

Failback Attribute

Select whether the group automatically fails back when a server with a higher priority than the server on which the group is currently active starts. For groups that have mirror disk resources or hybrid disk resources, select manual failback.

  • Auto Failback
    Failback is executed automatically.
  • Manual Failback
    Failback is not executed automatically.

Edit exclusive monitor

Dynamic failover excludes the server for which the monitor resource has detected an error, from the failover destinations. If Fail over dynamically is selected as the failover attribute, you can set the monitor resource to be excluded.

The exclusive list can be set with the monitor resource type and monitor resource name.

  • Add exclusive monitor resource type
    Adds the exclusive monitor resource type.
    Any server, in which even one monitor resource of the added monitor resource type is abnormal, is excluded from the failover destinations.

    Adds the selected monitor resource type.

  • Remove exclusive monitor resource type
    Removes the selected exclusive monitor resource type.
  • Add exclusive monitor resource group
    Adds the exclusive monitor resource group.
    The maximum number of exclusive monitor resource groups to be registered is 32.

If multiple monitor resources are registered in a single exclusive monitor resource group, the server in which all the registered monitor resources are abnormal is excluded from the failover destinations.

Moreover, if multiple exclusive monitor resource groups are registered, a server that satisfies at least one of the conditions is excluded from the failover destinations.

Add

Adds the monitor resource selected from Available monitor resource list to Monitor resource list.

Remove

Removes the monitor resource selected with Monitor resource list, from the list.

  • Delete exclusive monitor resource group
    Removes the selected exclusive monitor resource group.
  • Edit exclusive monitor resource group
    Edits the selected exclusive monitor resource group.
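
The rule for exclusive monitor resource groups described above combines two conditions: within one group, all registered monitor resources must be abnormal, and across groups, one matching group is enough. The following sh sketch models only that rule. The function names and the comma-separated status strings are hypothetical, and this is not how EXPRESSCLUSTER implements the check.

```shell
#!/bin/sh
# Illustrative model only; names and input format are hypothetical.

# group_all_abnormal "s1,s2,...": returns 0 only if every status in the
# comma-separated list is "abnormal".
group_all_abnormal() {
    [ -n "$1" ] || return 1
    old_ifs=$IFS
    IFS=,
    set -f                  # disable globbing while splitting on commas
    set -- $1
    set +f
    IFS=$old_ifs
    for status in "$@"; do
        [ "$status" = "abnormal" ] || return 1
    done
    return 0
}

# server_excluded GROUP...: each argument is one exclusive monitor
# resource group. Returns 0 (excluded) if at least one group is
# entirely abnormal.
server_excluded() {
    for group in "$@"; do
        if group_all_abnormal "$group"; then
            return 0
        fi
    done
    return 1
}
```

For example, `server_excluded "abnormal,normal" "abnormal,abnormal"` succeeds because the second group is entirely abnormal, whereas `server_excluded "abnormal,normal"` does not.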

Note

The following monitor resource types cannot be registered as an exclusive monitor resource type. Likewise, monitor resources of these types cannot be registered by resource name in an exclusive monitor resource group.
- User mode monitor
- ARP monitor
- Virtual IP monitor
- Mirror disk connect monitor
- Hybrid disk monitor
- Hybrid disk connect monitor

Note

The monitor resource in the warning status is not handled as being abnormal. The exception to this is the mirror disk monitor resource.
The monitor resource set for monitoring at activation does not enter the abnormal status because it does not perform monitoring for a server other than the group start server.
The monitor resource stopped with the Cluster WebUI or clpmonctrl command enters the normal status.
A server that has not been set to monitor a monitor resource does not enter the abnormal status because it does not perform monitoring.

Note

In the case of the mirror disk monitor resource, a check is made as to whether the mirror disk resource can be activated. There is no dependence on the status of the mirror disk monitor resource.
Even if the mirror disk monitor resource is in the error status, the server on which the mirror disk resource can be activated normally is not excluded from the failover destination.
Even if the mirror disk monitor resource is in the normal or caution status, the server on which the mirror disk resource cannot be activated normally is excluded from the failover destination.

3.4.4. Start Dependency tab

Add

Clicking Add adds the group selected from Available Group to Dependent Group.

Remove

Clicking Remove removes the group selected from Dependent Group.

Start Wait Time (0 to 9999)

Specify how many seconds you want to wait before a timeout in the target group start process. The default value is 1800 seconds.

Property

Clicking Property changes the properties of the group selected from Dependent Group.

Wait Only when on the Same Server

Specify whether the waiting group waits for the target group to start only when both groups start on the same server.

  • When Wait Only when on the Same Server is selected

    • If the server that starts the waiting group is not included in the Startup Server of the target group, the waiting group does not wait.

    • If the target group fails to start on a server other than the server that starts the waiting group, the waiting group does not wait.

3.4.5. Stop Dependency tab

Add

Clicking Add adds the group selected from Available Group to Dependent Group.

Remove

Clicking Remove removes the group selected from Dependent Group.

Stop Wait Time (0 to 9999)

Specify how many seconds to wait before a timeout occurs in the target group stop processing. The default value is 1800 seconds.

Wait the Dependent Groups when a Cluster Stops

Specify whether to wait for the dependent groups to stop when the cluster stops.

Wait the Dependent Groups when a Server Stops

Specify whether to wait for the dependent groups to stop when a single server stops. This option waits for the stop of only those groups running on the same server, among all the dependent groups.

Wait the Dependent Groups when a Group Stops

Specify whether to wait for the dependent groups to stop when the groups are being stopped. This option waits for the stop of only those groups running on the same server, among all the dependent groups.

3.4.6. Entire Dependency tab

Displays the settings of dependency among group resources.

3.5. Resource Properties

3.5.1. Info tab

Name

The resource name is displayed.

Comment (Within 127 bytes)

Enter a comment for the resource. Use only one-byte alphanumeric characters.

3.5.2. Dependency tab

Follow the default dependence

Select if the selected group resource follows the default EXPRESSCLUSTER dependency.

  • When Follow the default dependence is selected:
    The selected group resource depends on the type(s) of resources.
    See "Parameters list" in "2. Parameter details" for the default dependency of each resource.
    When there is more than one resource of the same type, the selected group resource depends on all resources of that type.
  • When Follow the default dependence is not selected:
    The selected group resource depends on the specified resource.

Add

Adds the group resource selected in Available Resources to Dependent Resources.

Remove

Removes the group resource selected in Dependent Resources from Dependent Resources.

3.5.3. Recovery Operation tab

When an error in activation of the group resource is detected

  • When an error is detected while activating the group resource, try activating it again.

  • When the activation retry count exceeds the number of times set in Retry Count at Activation Failure, failover is executed.

  • When the group resource cannot be activated even after executing a failover as many times as specified in Failover Threshold, the final action is taken.

When an error in deactivation of the group resource is detected

  • When an error is detected while deactivating the group resource, try deactivating it again.

  • When the deactivation retry count exceeds the number of times set in Retry Count at Deactivation Failure, the final action is taken.
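
The two recovery sequences above can be summarized as a small decision model. The following sh sketch is a simplification under stated assumptions: the function names are hypothetical, the counters are passed in by the caller, and how EXPRESSCLUSTER tracks these counters across servers is not modeled.

```shell
#!/bin/sh
# Illustrative model only; not EXPRESSCLUSTER's implementation.

# next_activation_step RETRIES_DONE RETRY_COUNT FAILOVERS_DONE FAILOVER_THRESHOLD
# Prints the next recovery step after an activation error.
next_activation_step() {
    if [ "$1" -lt "$2" ]; then
        echo retry              # retry count not yet used up
    elif [ "$3" -lt "$4" ]; then
        echo failover           # retries exhausted, failovers remain
    else
        echo final_action       # both limits reached
    fi
}

# next_deactivation_step RETRIES_DONE RETRY_COUNT
# Deactivation has no failover stage: retry, then final action.
next_deactivation_step() {
    if [ "$1" -lt "$2" ]; then
        echo retry
    else
        echo final_action
    fi
}
```

With a retry count of 3 and a failover threshold of 1, `next_activation_step 3 3 0 1` prints failover, and `next_activation_step 3 3 1 1` prints final_action.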

Execute Script before or after Activation or Deactivation

Select whether to run a script before or after activation/deactivation of group resources. To configure the script settings, click Script Settings.

Select a check box to run the script at the corresponding timing.

Exec Timing

Execute Script before Activation

  • Checkbox is on
    The script is executed before the resource is activated.
  • Checkbox is off
    The script is not executed before the resource is activated.

Execute Script after Activation

  • Checkbox is on
    The script is executed after the resource is activated.
  • Checkbox is off
    The script is not executed after the resource is activated.

Execute Script before Deactivation

  • Checkbox is on
    The script is executed before the resource is deactivated.
  • Checkbox is off
    The script is not executed before the resource is deactivated.

Execute Script after Deactivation

  • Checkbox is on
    The script is executed after the resource is deactivated.
  • Checkbox is off
    The script is not executed after the resource is deactivated.

To configure the script settings, click Script Settings.

User Application

Use an executable file (executable shell script file or execution file) on the server as a script. For the file name, specify the absolute path or the name of an executable file on the local disk of the server. If the absolute path or the file name contains a space, enclose it in double quotation marks ("") as follows.

Example:
"/tmp/user application/script.sh"

Executable files are not included in the cluster configuration information of the Cluster WebUI. Because they cannot be edited or uploaded with the Cluster WebUI, they must be prepared on each server.

Script created with this product

Use a script file which is prepared by the Cluster WebUI as a script. You can edit the script file with the Cluster WebUI if you need. The script file is included in the cluster configuration information.

File (Within 1023 bytes)

Specify a script to be executed (executable shell script file or execution file) when you select User Application.

View

Click here to display the script file when you select Script created with this product.

Edit

Click here to edit the script file when you select Script created with this product. Click Save to apply the change. You cannot modify the name of the script file.

Replace

Click here to replace the contents of a script file with the contents of the script file which you selected in the file selection dialog box when you select Script created with this product. You cannot replace the script file if it is currently displayed or edited. Select a script file only. Do not select binary files (applications), and so on.

Timeout (1 to 9999)

Specify the maximum time to wait for the script to complete.
The default value of the time taken to execute script before and after activation/deactivation is 30 seconds.
The default value of the timeout settable from Settings button of Execute Script before Final Action for Recovery Operation at Activation Failure Detection or Recovery Operation at Deactivation Failure Detection is 5 seconds.

Recovery Operation at Activation Failure Detection

Retry Count at Activation Failure (0 to 99)

Enter how many times to retry activation when an activation error is detected. If this is set to zero (0), the activation will not be retried.

Failover Threshold (0 to 99)

Enter how many times to retry failover after an activation error is detected and activation retries have failed the number of times set in Retry Count at Activation Failure.
If this is set to zero (0), failover will not be executed.

Final Action

Select the action to be taken when, after an activation error is detected, activation retries have failed the number of times specified in Retry Count at Activation Failure and failovers have failed the number of times specified in Failover Threshold.

Select a final action from the following:

  • No Operation (Activate next resource):
    Continues the group start process.
  • No Operation (Not activate next resource):
    Cancels the group start process.
  • Stop Group:
    Deactivates all resources in the group to which the group resource with the activation error belongs.
  • Stop cluster service:
    Stops the cluster service on the server where an activation error is detected.
  • Stop cluster service and shutdown OS:
    Stops the cluster service on the server where an activation error is detected, and shuts down the OS.
  • Stop cluster service and reboot OS:
    Stops the cluster service of the server where an activation error is detected, and restarts the OS.
  • Sysrq Panic:
    Performs the sysrq panic.

    Note

    If performing the sysrq panic fails, the OS is shut down.

  • Keepalive Reset:
    Resets the OS using the clpkhb or clpka driver.

    Note

    If resetting keepalive fails, the OS is shut down. Do not select this action on an OS and kernel where the clpkhb and clpka drivers are not supported.

  • Keepalive Panic:
    Performs the OS panic using the clpkhb or clpka driver.

    Note

    If performing the keepalive panic fails, the OS is shut down. Do not select this action on the OS and kernel where the clpkhb and clpka drivers are not supported.

  • BMC Reset:
    Performs a hardware reset on the server by using the ipmi command.

    Note

    If resetting BMC fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • BMC Power Off:
    Powers off the OS by using the ipmi command. OS shutdown may be performed due to the ACPI settings of the OS.

    Note

    If powering off BMC fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • BMC Power Cycle:
    Performs the power cycle (powering on/off) of the server by using the ipmi command. OS shutdown may be performed due to the ACPI settings of the OS.

    Note

    If performing the power cycle of BMC fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • BMC NMI:
    Uses the ipmi command to cause an NMI to occur on the server. Actions after the NMI occurs depend on the OS settings.

    Note

    If BMC NMI fails, the OS shutdown is performed. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • I/O Fencing(High-End Server Option):
    This action cannot be used.

    Note

    If I/O Fencing(High-End Server Option) fails, the OS shutdown is performed.
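
Several notes in the list above warn against selecting a BMC action on a server where the ipmitool command does not run. The following sh sketch shows one possible pre-check. It is a hypothetical helper, not an EXPRESSCLUSTER tool, and it only tests whether the command is found in PATH; it does not verify that OpenIPMI is installed or that the BMC responds.

```shell
#!/bin/sh
# Hypothetical pre-check, not part of EXPRESSCLUSTER.
# Returns 0 if the given command (default: ipmitool) is found in PATH.
bmc_tool_available() {
    tool=${1:-ipmitool}
    command -v "$tool" >/dev/null 2>&1
}

if bmc_tool_available; then
    echo "ipmitool found: BMC final actions may be usable on this server"
else
    echo "ipmitool not found: do not select BMC final actions on this server"
fi
```

Run the check on every server that can start the group, because the final action is executed on the server where the error is detected.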

Execute Script before Final Action

Select whether to run a script before executing the final action when an activation failure is detected.

  • When the check box is selected:
    A script/command is run before executing final action. To configure the script/command setting, click Settings.
    For the settings of the script, refer to the explanation about the script settings in "Execute Script before or after Activation or Deactivation".
  • When the check box is not selected:
    No script/command is run.

Recovery Operation at Deactivation Failure Detection

Retry Count at Deactivation Failure (0 to 99)

Enter how many times to retry deactivation when an error in deactivation is detected.

If you set this to zero (0), deactivation will not be retried.

Final Action

Select the action to be taken when deactivation retry failed the number of times specified in Retry Count at Deactivation Failure when an error in deactivation is detected.

Select the final action from the following:

  • No Operation (Deactivate next resource):
    Continue the group stop process.

    Note

    If No Operation is selected as the final action when a deactivation error is detected, group does not stop but remains in the deactivation error status.
    Make sure not to set No Operation in the production environment.
  • No Operation (Not deactivate next resource):
    Cancel the group stop process.

    Note

    If No Operation is selected as the final action when a deactivation error is detected, group does not stop but remains in the deactivation error status.
    Make sure not to set No Operation in the production environment.
  • Stop cluster service and shutdown OS:
    Stop the cluster daemon on the server where an error in deactivation is detected, and shut down the OS.
  • Stop cluster service and reboot OS:
    Stop the cluster daemon on the server where an error in deactivation is detected, and restart the OS.
  • Sysrq Panic:
    Performs the sysrq panic.

    Note

    If performing the sysrq panic fails, the OS is shut down.

  • Keepalive Reset:
    Resets the OS using the clpkhb or clpka driver.

    Note

    If resetting keepalive fails, the OS is shut down. Do not select this action on an OS and kernel where the clpkhb and clpka drivers are not supported.

  • Keepalive Panic:
    Performs the OS panic using the clpkhb or clpka driver.

    Note

    If performing the keepalive panic fails, the OS is shut down. Do not select this action on the OS and kernel where the clpkhb and clpka drivers are not supported.

  • BMC Reset:
    Performs a hardware reset on the server by using the ipmi command.

    Note

    If resetting BMC fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • BMC Power Off:
    Powers off the OS by using the ipmi command. OS shutdown may be performed due to the ACPI settings of the OS.

    Note

    If powering off BMC fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • BMC Power Cycle:
    Performs the power cycle (powering on/off) of the server by using the ipmi command. OS shutdown may be performed due to the ACPI settings of the OS.

    Note

    If performing the power cycle of BMC fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • BMC NMI:
    Uses the ipmi command to cause an NMI to occur on the server. Actions after the NMI occurs depend on the OS settings.

    Note

    If BMC NMI fails, the OS is shut down. Do not select this action on the server where OpenIPMI is not installed, or the ipmitool command does not run.

  • I/O Fencing(High-End Server Option):
    This action cannot be used.

    Note

    If I/O Fencing(High-End Server Option) fails, the OS shutdown is performed.

Execute Script before Final Action

Select whether to run a script before executing the final action when a deactivation failure is detected.

  • When the check box is selected:
    A script/command is run before executing final action. To configure the script/command setting, click Settings.
    For the settings of the script, refer to the explanation about the script settings in "Execute Script before or after Activation or Deactivation".
  • When the check box is not selected:
    No script/command is run.

3.5.4. Details tab

The parameters specific to each resource are described in its explanation part.

3.6. Understanding EXEC resources

You can register applications and shell scripts that are managed by EXPRESSCLUSTER and run when groups are started, stopped, failed over, or moved. You can also register your own programs and shell scripts in EXEC resources. Because these shell scripts use the same format as ordinary sh shell scripts, you can write whatever code each application requires.

Note

The same version of the application to be run from EXEC resources must be installed on all servers in the failover policy.

3.6.1. Dependency of EXEC resources

By default, exec resources depend on the following group resource types:

Group resource type

Floating IP resource

Virtual IP resource

Disk resource

Mirror disk resource

Hybrid disk resource

NAS resource

VM resource

Volume manager resource

Dynamic DNS resource

AWS Elastic IP resource

AWS Virtual IP resource

AWS DNS resource

Azure probe port resource

Azure DNS resource

3.6.2. Method of judging EXEC resource activation/deactivation results

The activation/deactivation results are judged based on the results of executing the applications and shell scripts registered in the EXEC resources.
If the end code of an application or a shell script is 0, it is judged that activation/deactivation was performed normally and successfully.
If the end code is other than 0, it is judged that activation/deactivation has failed.
If a start/stop script timeout occurs, it is judged that activation/deactivation has failed.
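
Because the result is judged only by the end code, a script should pass the end code of the command it runs back to its caller. The following sh sketch illustrates that pattern. The function name run_and_judge is hypothetical, and the echoed text merely restates the rule above; it is not EXPRESSCLUSTER output.

```shell
#!/bin/sh
# Illustration of the end-code rule; run_and_judge is a hypothetical helper.
# Usage: run_and_judge COMMAND [ARGS...]
run_and_judge() {
    "$@"                    # run the application or command
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "end code 0: judged as success"
    else
        echo "end code $rc: judged as failure"
    fi
    return "$rc"            # hand the end code back to the caller
}

run_and_judge true
run_and_judge sh -c 'exit 3'
```

In an actual start or stop script, the last step would typically be `exit "$rc"` so that the end code of the application reaches EXPRESSCLUSTER unchanged.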

3.6.3. Scripts in EXEC resources

Types of scripts

A start script and a stop script are provided in EXEC resources. EXPRESSCLUSTER runs a script for each EXEC resource when the cluster needs to change its status. In these scripts, write the procedures for starting, stopping, and restoring applications in your cluster environment.

Start: Start script
Stop: Stop script

3.6.4. Environment variables in EXEC resource script

When EXPRESSCLUSTER runs a script, it records information such as the condition under which the script was run (script starting factor) in environment variables.

You can use the environment variables in the table below as branching conditions when you write code for your system operation.

A stop script receives, in its environment variables, the values that were set for the preceding start script. The CLP_FACTOR and CLP_PID environment variables are not set for a start script.

The environment variable CLP_LASTACTION is set only when the environment variable CLP_FACTOR is CLUSTERSHUTDOWN or SERVERSHUTDOWN.
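
A stop script can branch on CLP_FACTOR and should read CLP_LASTACTION only for the two shutdown factors named above. The following sh sketch shows one way to structure that branching, using the values listed in the table below. The function name describe_stop and the echoed messages are hypothetical placeholders for real stop handling.

```shell
#!/bin/sh
# Sketch of stop-script branching; the echo lines stand in for real handling.
describe_stop() {
    case ${CLP_FACTOR:-} in
        CLUSTERSHUTDOWN|SERVERSHUTDOWN)
            # CLP_LASTACTION is set only for these two factors.
            echo "shutdown stop, last action: ${CLP_LASTACTION:-unset}"
            ;;
        GROUPSTOP|GROUPMOVE|GROUPFAILOVER|GROUPRESTART|RESOURCERESTART)
            echo "group-level stop: $CLP_FACTOR"
            ;;
        *)
            echo "unexpected CLP_FACTOR: ${CLP_FACTOR:-unset}"
            ;;
    esac
}
```

The defaults written as `${VAR:-}` keep the sketch usable with `set -u`, and the catch-all branch makes an unset or unknown factor visible instead of silently ignoring it.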

Environment Variable

Value of environment variable

Meaning

CLP_EVENT
...script starting factor

START

The script was run:

  • by starting a cluster;

  • by starting a group;

  • on the destination server by moving a group;

  • on the same server by restarting a group due to the detection of a monitor resource error; or

  • on the same server by restarting a group resource due to the detection of a monitor resource error.

FAILOVER

The script was run on the failover target server:

  • by the failure of the server;

  • due to the detection of a monitor resource error; or

  • because activation of group resources failed.

CLP_FACTOR
...group stopping factor

CLUSTERSHUTDOWN

The group was stopped by stopping the cluster.

SERVERSHUTDOWN

The group was stopped by stopping the server.

GROUPSTOP

The group was stopped by stopping the group.

GROUPMOVE

The group was stopped by moving the group.

GROUPFAILOVER

The group failed over because an error was detected in monitor resource; or
the group failed over because of activation failure in group resources.

GROUPRESTART

The group was restarted because an error was detected in monitor resource.

RESOURCERESTART

The group resource was restarted because an error was detected in monitor resource.

CLP_LASTACTION
...process after cluster shutdown

REBOOT

The OS is rebooted.

HALT

The OS is halted.

NONE

No action was taken.

CLP_SERVER
...server where the script was run

HOME

The script was run on the primary server of the group.

OTHER

The script was run on a server other than the primary server of the group.

CLP_DISK [1]
...partition connection information on shared or mirror disks

SUCCESS

There was no partition where connection had failed.

FAILURE

There were one or more partitions where connection had failed.

CLP_PRIORITY
...the order in the failover policy of the server where the script is run

1 to the number of servers in the cluster

Represents the priority of the server where the script is run. This number starts from 1 (The smaller the number, the higher the server's priority).
If CLP_PRIORITY is 1, it means that the script is run on the primary server.
CLP_GROUPNAME
...Group name

Group name

Represents the name of the group to which the script belongs.

CLP_RESOURCENAME
...Resource name

Resource name

Represents the name of the resource to which the script belongs.

CLP_PID
...Process ID

Process ID

Represents the process ID of the start script when the start script is set to asynchronous. This environment variable is null when the start script is set to synchronous.

CLP_VERSION_FULL
...EXPRESSCLUSTER
full version

EXPRESSCLUSTER full version

Represents the EXPRESSCLUSTER full version.
(Example) 4.2.0-1
CLP_VERSION_MAJOR
...EXPRESSCLUSTER
major version

EXPRESSCLUSTER major version

Represents the EXPRESSCLUSTER major version.
(Example) 4
CLP_PATH
...EXPRESSCLUSTER
installation path

EXPRESSCLUSTER install path

Represents the path where EXPRESSCLUSTER is installed.
(Example) /opt/nec/clusterpro
CLP_OSNAME
...Server OS name

Server OS name

Represents the OS name of the server where the script was executed.
(Example)
1. When the OS name could be acquired:
Red Hat Enterprise Linux Server release 6.8 (Santiago)
2. When the OS name could not be acquired:
Linux
CLP_OSVER
...Server OS version

Server OS version

Represents the OS version of the server where the script was executed.
(Example)
1. When the OS version could be acquired: 6.8
2. When the OS version could not be acquired: Blank
1

It is available for disk resource, mirror disk resource, hybrid disk resource, NAS resource and volume manager resource.

If the script is executed on the standby server, with Execute on standby server of Exec Resource Tuning Properties enabled, the following information is recorded in environment variables:

Environment variable

Value of environment variable

Meaning

CLP_EVENT
...script starting factor

STANDBY

The script was run on the standby server.
CLP_SERVER
...server where the script was run

HOME

The script was run on the primary server of the group.

OTHER

The script was run on a server other than the primary server of the group.

CLP_PRIORITY
... the order in failover
policy of the server
where the script is run

1 to the number of servers in the cluster

Represents the priority of the server where the script is run. This number starts from 1 (The smaller the number, the higher the server's priority).
If CLP_PRIORITY is 1, it means that the script is run on the primary server.
CLP_GROUPNAME
...Group name

Group name

Represents the name of the group to which the script belongs.

CLP_RESOURCENAME
...Resource name

Resource name

Represents the name of the resource to which the script belongs.

CLP_VERSION_FULL
...Full version of EXPRESSCLUSTER

Full version of EXPRESSCLUSTER

Represents the full version of EXPRESSCLUSTER (e.g. 4.2.0-1 ).

CLP_VERSION_MAJOR
...Major version of EXPRESSCLUSTER

Major version of EXPRESSCLUSTER

Represents the major version of EXPRESSCLUSTER (e.g. 4).

CLP_PATH
...EXPRESSCLUSTER installation path

EXPRESSCLUSTER installation path

Represents the EXPRESSCLUSTER installation path (e.g. /opt/nec/clusterpro).

CLP_OSNAME
...Server OS name

Server OS name

Represents the OS name of the server where the script was executed.
(Example)
1. When the OS name was acquired:
Red Hat Enterprise Linux Server release 6.8 (Santiago)
2. When the OS name was not acquired:
Linux
CLP_OSVER
...Server OS version

Server OS version

Represents the OS version of the server where the script was executed.
(Example)
1. When the OS version was acquired: 6.8
2. When the OS version was not acquired: Blank
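The stop-side values listed earlier in this section (CLP_FACTOR and CLP_LASTACTION) can be used for branching in a stop script. The following is a minimal sketch, not a definitive implementation: the function name describe_stop and the message texts are hypothetical, and only the variable names and values come from the tables above.

```shell
#!/bin/sh
# Sketch: report why a stop script is running, based on CLP_FACTOR.
# describe_stop is a hypothetical helper name, not part of EXPRESSCLUSTER.
describe_stop() {
    case "${CLP_FACTOR:-}" in
        CLUSTERSHUTDOWN)
            # CLP_LASTACTION describes the process after cluster shutdown.
            echo "cluster shutdown, OS action afterwards: ${CLP_LASTACTION:-NONE}"
            ;;
        SERVERSHUTDOWN)
            echo "server shutdown"
            ;;
        GROUPSTOP|GROUPMOVE)
            echo "group operation: ${CLP_FACTOR}"
            ;;
        GROUPFAILOVER|GROUPRESTART|RESOURCERESTART)
            echo "recovery action: ${CLP_FACTOR}"
            ;;
        *)
            echo "unknown factor: ${CLP_FACTOR:-unset}"
            ;;
    esac
}

# Outside a cluster the variable is unset, so this prints the fallback line.
describe_stop
```

In a real stop script, each echo line would be replaced by the processing appropriate to that stopping factor.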

3.6.5. Execution timing of EXEC resource script

This section describes the relationship between the execution timing of start and stop scripts and the environment variables, following the cluster status transition diagram.

  • To simplify the explanations, a 2-server cluster configuration is used as an example. See the supplementary information for the relationship between possible execution timings and environment variables in configurations of three or more servers.

  • O and X in the diagrams represent the server status.

Server

Server status

O

Normal (properly working as a cluster)

X

Stopped (cluster is stopped)

(Example) OA: Group A is working on a normally running server.

  • Each group is started on the top priority server among active servers.

  • Three groups, A, B, and C, are defined in the cluster, and they have the following failover policies:

Group

1st priority server

2nd priority server

A

server1

server2

B

server2

server1

C

server1

server2

  • The upper server is referred to as server1 and the lower one as server2.

<Cluster status transition diagram>

This diagram illustrates a typical status transition of cluster.

Numbers 1. to 11. in the diagram correspond to the following descriptions:

  1. Normal startup

    Normal startup here means that the start script has been run properly on the primary server.

    Each group is started on the server with the highest priority among the active servers.

    Environment variables for Start

    Group

    Environment variable

    Value

    A

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    B

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    C

    CLP_EVENT

    START

    CLP_SERVER

    HOME

  2. Normal shutdown

    Normal shutdown here means a cluster shutdown performed immediately after the start script corresponding to the stop script was run by a normal startup or by moving a group (online failback).

    Environment variables for Stop

    Group

    Environment variable

    Value

    A

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    B

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    C

    CLP_EVENT

    START

    CLP_SERVER

    HOME

  3. Failover at server1 down

    When an error occurs on server1, the start script of a group that has server1 as its primary server is run on a lower priority server (server2). You need to write CLP_EVENT(=FAILOVER) as a branching condition for triggering application startup and recovery processes (such as a database rollback process) in the start script in advance.

    For the process to be performed only on a server other than the primary server, specify CLP_SERVER(=OTHER) as a branching condition and describe the process in the script.

    Environment variables for Start

    Group

    Environment variable

    Value

    A

    CLP_EVENT

    FAILOVER

    CLP_SERVER

    OTHER

    C

    CLP_EVENT

    FAILOVER

    CLP_SERVER

    OTHER

  4. Cluster shutdown after failover of server1

    The stop scripts of Group A and C are run on server2, to which the groups failed over (the stop script of Group B is run by a normal shutdown).

    Environment variables for Stop

    Group

    Environment variable

    Value

    A

    CLP_EVENT

    FAILOVER

    CLP_SERVER

    OTHER

    B

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    C

    CLP_EVENT

    FAILOVER

    CLP_SERVER

    OTHER

  5. Moving of Group A and C

    After the stop scripts of Group A and C are run on server2, to which the groups failed over, their start scripts are run on server1.

    Environment variables for Stop

    Group

    Environment variable

    Value

    A

    CLP_EVENT

    FAILOVER 2

    CLP_SERVER

    OTHER

    C

    CLP_EVENT

    FAILOVER

    CLP_SERVER

    OTHER

    Environment variables for Start

    Group

    Environment variable

    Value

    A

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    C

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    2
    The environment variables in a stop script take the values set for the preceding start script.
    Because the move in "5. Moving of Group A and C" is not immediately preceded by a cluster shutdown, the value of CLP_EVENT here is FAILOVER. If a cluster shutdown is executed before the move in "5. Moving of Group A and C", the value is START.
  6. Error in Group C and failover

    When an error occurs in Group C, its stop script is run on server1 and start script is run on server2.

    Stop for server1

    Group

    Environment variable

    Value

    C

    CLP_EVENT

    START

    CLP_SERVER

    HOME

    Start for server2

    Group

    Environment variable

    Value

    C

    CLP_EVENT

    FAILOVER

    CLP_SERVER

    OTHER

  7. Moving of Group C

Group C, which failed over to server2 in 6., is moved from server2 to server1. The stop script is run on server2, and then the start script is run on server1.

Stop (because the group was failed over in 6.)

Group

Environment variable

Value

C

CLP_EVENT

FAILOVER

CLP_SERVER

OTHER

Start

Group

Environment variable

Value

C

CLP_EVENT

START

CLP_SERVER

HOME

  8. Stopping Group B

    The stop script of Group B is run on server2.

    Stop

    Group

    Environment variable

    Value

    B

    CLP_EVENT

    START

    CLP_SERVER

    HOME

  9. Starting Group B

    The start script of Group B is run on server2.

    Start

    Group

    Environment variable

    Value

    B

    CLP_EVENT

    START

    CLP_SERVER

    HOME

  10. Stopping Group C

The stop script of Group C is run on server2.

Stop

Group

Environment variable

Value

C

CLP_EVENT

FAILOVER

CLP_SERVER

OTHER

  11. Starting Group C

    The start script of Group C is run on server2.

    Start

    Group

    Environment variable

    Value

    C

    CLP_EVENT

    START

    CLP_SERVER

    OTHER
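The branching on CLP_EVENT and CLP_SERVER described in the transitions above can be sketched as follows. This is an illustrative outline only: the function name run_start is hypothetical, and the echo lines are placeholders for the actual application startup and recovery commands.

```shell
#!/bin/sh
# Sketch of start-script branching on CLP_EVENT and CLP_SERVER.
# run_start is a hypothetical name; echo lines are placeholders.
run_start() {
    case "${CLP_EVENT:-}" in
        START)
            echo "normal startup"
            ;;
        FAILOVER)
            # For example, a database rollback before starting the application
            echo "recovery processing"
            echo "startup after failover"
            ;;
        *)
            echo "unexpected CLP_EVENT: ${CLP_EVENT:-unset}" >&2
            return 1
            ;;
    esac

    # Processing needed only on a server other than the primary server
    if [ "${CLP_SERVER:-}" = "OTHER" ]; then
        echo "non-primary server processing"
    fi
    return 0
}

# Demonstration with the values of "3. Failover at server1 down".
# The subshell keeps the sample values from leaking into the caller.
( CLP_EVENT=FAILOVER; CLP_SERVER=OTHER; run_start )
```

In an actual start script the variables are set by EXPRESSCLUSTER, so the script would simply call run_start and exit with its return value.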

Supplementary information 1

For a group that has three or more servers specified in the failover policy to behave differently on servers other than the primary server, use CLP_PRIORITY instead of CLP_SERVER(HOME/OTHER).

Example 1: "3. Failover at server1 down" in the cluster status transition diagram

A group has server1 as its primary server. If an error occurs on server1, its start script is run on server2, which has the next highest priority in the failover policy. You need to write CLP_EVENT(=FAILOVER) as the branching condition for triggering application startup and recovery processes (such as a database rollback process) in the start script in advance.

For a process to be performed only on the server that has the second highest priority in the failover policy, write CLP_PRIORITY(=2) as the branching condition.

Environment variables for Start

Group

Environment variable

Value

A

CLP_EVENT

FAILOVER

CLP_SERVER

OTHER

CLP_PRIORITY

2

C

CLP_EVENT

FAILOVER

CLP_SERVER

OTHER

CLP_PRIORITY

2

Example 2: "7. Moving of Group C" in the cluster status transition diagram

After the stop script of Group C is run on server2 where the group failed over from, the start script is run on server3.

Environment variables for Stop

Group

Environment variable

Value

C

CLP_EVENT

FAILOVER

CLP_SERVER

OTHER

CLP_PRIORITY

2

Environment variables for Start

Group

Environment variable

Value

C

CLP_EVENT

START

CLP_SERVER

OTHER

CLP_PRIORITY

3
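Branching on CLP_PRIORITY, as described in Supplementary information 1, might look like the following sketch. The function name priority_actions and the messages are hypothetical placeholders; only the variable name and its meaning (1 is the primary server) come from this guide.

```shell
#!/bin/sh
# Sketch: branch on CLP_PRIORITY when three or more servers are in the
# failover policy. priority_actions is a hypothetical helper name.
priority_actions() {
    case "${CLP_PRIORITY:-}" in
        1)
            echo "primary server processing"
            ;;
        2)
            echo "second-priority server processing"
            ;;
        ''|*[!0-9]*)
            # Empty or non-numeric value: treat as an error in this sketch
            echo "CLP_PRIORITY is not a number" >&2
            return 1
            ;;
        *)
            echo "lower-priority server processing (priority ${CLP_PRIORITY})"
            ;;
    esac
}

# Demonstration with the value from Example 2 (start on server3).
( CLP_PRIORITY=3; priority_actions )
```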

Supplementary information 2

When a monitor resource starts or restarts a script:

When the start script is run because a monitor resource detected an error in an application, the environment variables are as follows:

Example 1: A monitor resource detects abnormal termination of an application that was running on server1 and restarts Group A on server1.

Environment variable for Stop

Group

Environment variable

Value

A

(1)

CLP_EVENT

The same value as when the start script is run

Environment variable for Start

Group

Environment variable

Value

A

(2)

CLP_EVENT

START

Example 2: A monitor resource detects abnormal termination of an application that was running on server1, fails Group A over to server2, and restarts it on server2.

Environment variable for Stop

Group

Environment variable

Value

A

(1)

CLP_EVENT

The same value as when the start script is run

Environment variable for Start

Group

Environment variable

Value

A

(2)

CLP_EVENT

FAILOVER

Supplementary information 3

With Execute on standby server of Exec Resource Tuning Properties enabled, start and stop scripts can also be executed on another server (standby server) that does not start the group, in accordance with the timing at which these scripts are run on the active server that started the group.

Script execution on the standby server differs from that on the active server as follows:

  • The results (error codes) of executing the scripts do not affect the group-resource statuses.

  • No script before and after activation/deactivation is executed.

  • Monitor resources set for monitoring at activation are not started or stopped.

  • Different types and values of environment variables are set. (Refer to "Environment variables in EXEC resource script" as described above.)

The following describes the relationship between the execution timing of scripts on the standby server and the environment variables, using a cluster status transition diagram.

<Cluster status transition diagram>

Numbers 1. to 4. in the diagram correspond to the following descriptions:

  1. Normal startup

For starting a group, the start script is run on the active server before it is executed on the standby server.
In the start script, describe what is to be done on the standby server, using CLP_EVENT (= STANDBY) as a branch condition.

Environment variables for Start

Server

Environment variable

Value

1

CLP_EVENT

START

CLP_SERVER

HOME

2

CLP_EVENT

STANDBY

CLP_SERVER

OTHER

  2. Normal shutdown

For stopping a group, the stop script is run on the standby server before it is executed on the active server.
In the stop script, describe what is to be done on the standby server, using CLP_EVENT (= STANDBY) as a branch condition.

Environment variables for Stop

Server

Environment variable

Value

1

CLP_EVENT

START

CLP_SERVER

HOME

2

CLP_EVENT

STANDBY

CLP_SERVER

OTHER

  3. Failover at server1 down

When an error occurs in server1, the group is failed over to server2, on which (as the active server) the start script is executed.
You need to write CLP_EVENT (= FAILOVER) as a branch condition for triggering application startup and recovery processes (such as a database rollback process) in the start script in advance.

Because server1 has crashed, the start script is not run on it as the standby server.

Environment variables for Start

Server

Environment variable

Value

2

CLP_EVENT

FAILOVER

CLP_SERVER

OTHER

  4. Moving of Group A

The stop script for Group A is executed on server1 (= standby server) and server2 (= active server). Then the start script is run on server1 (= active server) and server2 (= standby server).

Environment variables for Stop

Server

Environment variable

Value

1

CLP_EVENT

STANDBY

CLP_SERVER

HOME

2

CLP_EVENT

FAILOVER 3

CLP_SERVER

OTHER

3
The environment variables for the stop script take the values set for the last executed start script.
In the transition "4. Moving of Group A", the value is FAILOVER if no cluster shutdown immediately precedes it, or START if a cluster shutdown is done before "4. Moving of Group A".

Environment variables for Start

Server

Environment variable

Value

1

CLP_EVENT

START

CLP_SERVER

HOME

2

CLP_EVENT

STANDBY

CLP_SERVER

OTHER
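When Execute on standby server is enabled, the same start script runs on both the active and the standby server, so the script needs a branch on CLP_EVENT (= STANDBY). The following is a minimal sketch under that assumption; the function name start_on_any_server and the echo lines are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: one start script shared by the active and standby servers.
# start_on_any_server is a hypothetical name; echo lines are placeholders.
start_on_any_server() {
    if [ "${CLP_EVENT:-}" = "STANDBY" ]; then
        # Reached only on the standby server
        echo "standby-side preparation"
        return 0
    fi

    # Active server: recover first when the group was failed over
    if [ "${CLP_EVENT:-}" = "FAILOVER" ]; then
        echo "recovery processing"
    fi
    echo "application startup"
}

# Demonstration with the standby-side value from "1. Normal startup".
( CLP_EVENT=STANDBY; start_on_any_server )
```

Because the results of scripts on the standby server do not affect the group resource status, the standby branch in this sketch simply returns 0.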

3.6.6. Writing EXEC resource scripts

This section relates the script execution timings described in the preceding topic to actual script code.

Numbers in parentheses "(number)" in the following example script code correspond to the actions described in "Execution timing of EXEC resource script".

Group A start script: A sample of start.sh

Group A stop script: A sample of stop.sh
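As an illustration of how a stop script can use the values carried over from the preceding start script, consider the following sketch. It is not the product's sample script: the function name run_stop, the handling of CLP_DISK, and the messages are assumptions made for this example.

```shell
#!/bin/sh
# Sketch of a stop script. CLP_EVENT and CLP_SERVER carry the values that
# were set for the preceding start script. Echo lines are placeholders.
# run_stop is a hypothetical name.
run_stop() {
    if [ "${CLP_DISK:-}" = "FAILURE" ]; then
        # At least one partition could not be connected.
        # Skipping data-dependent cleanup is a choice made for this sketch.
        echo "disk connection failed; skipping data-dependent cleanup"
        return 0
    fi

    case "${CLP_EVENT:-}" in
        START)    echo "stop after normal startup" ;;
        FAILOVER) echo "stop after failover" ;;
        *)        echo "no action for CLP_EVENT=${CLP_EVENT:-unset}" ;;
    esac

    case "${CLP_SERVER:-}" in
        HOME)  echo "primary server cleanup" ;;
        OTHER) echo "non-primary server cleanup" ;;
    esac
}

# Demonstration with the stop values of "5. Moving of Group A and C".
( CLP_EVENT=FAILOVER; CLP_SERVER=OTHER; CLP_DISK=SUCCESS; run_stop )
```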

3.6.7. Tips for creating EXEC resource script

  • If your script has a command that requires some time to complete, it is recommended to always output a message when the command completes. This message can be used to isolate the cause when a problem occurs. There are two ways to output the message:

  • Write the echo command in the script and specify the log output path of the EXEC resource.
    The message can be output with the echo command. Specify the log output path in the properties of the resource that contains the script.
    The message is not logged by default. For how to configure the log output path, see "Maintenance tab" in "Details tab - Tuning Properties". If the Rotate Log check box is not selected, pay attention to the available disk space of the file system because messages are written to the log output destination file regardless of the available disk space.

    (Example: sample script)

    echo "appstart.."
    appstart
    echo "OK"
    
  • Write the clplogcmd command in the script.
    The message can be output to the Alert logs of the Cluster WebUI or to the OS syslog with the clplogcmd command. For details on the clplogcmd command, see "Outputting messages (clplogcmd command)" in "8. EXPRESSCLUSTER command reference" in this guide.

    (Example: sample script)

    clplogcmd -m "appstart.."
    appstart
    clplogcmd -m "OK"
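The echo-based approach can be wrapped in a small helper so that every long-running command reports both its start and its completion with an exit code. The following is a sketch only; run_logged is a hypothetical helper name and the timestamp format is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: wrap a command with start and completion messages.
# run_logged is a hypothetical helper; it relies only on echo and date.
run_logged() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') start: $*"
    "$@"
    rc=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') end: $* (exit code $rc)"
    return "$rc"
}

# Demonstration with a harmless command in place of a real application.
run_logged sleep 1
```

With a log output path configured for the EXEC resource, these messages end up in the log file and show which command was running when a problem occurred.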
    

3.6.8. Notes on EXEC Resource

  • Script Log Rotate

    When the Script Log Rotate function is enabled, a process is generated to mediate the log output. This intermediate process continues to work until the file descriptor is closed (i.e. until all the logs stop being output from the start and stop scripts and from a descendant process that takes over the standard output and/or the standard error output from the start and stop scripts). To exclude output from the descendant process from the log, redirect the standard output and/or the standard error output when the process is generated with the script.

  • The start script and the stop script are executed by the root user.

  • To start an application dependent on an environment variable, the script must set the environment variable as needed.
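The notes above on descendant-process output and environment variables can be sketched as follows. This is an illustration under stated assumptions: start_resident is a hypothetical helper, APP_HOME and its default path are made-up examples, and sleep stands in for a real resident application.

```shell
#!/bin/sh
# Sketch: start a resident process from a start script.
# start_resident, APP_HOME and /opt/myapp are hypothetical examples.
start_resident() {
    # The script itself sets the environment the application depends on.
    APP_HOME="${APP_HOME:-/opt/myapp}"
    export APP_HOME

    # Redirect the descendant's stdout and stderr when it is generated,
    # so that its output is excluded from the script log.
    "$@" >/dev/null 2>&1 &
    echo "started PID $!"
}

# Demonstration with a harmless command in place of a real application.
start_resident sleep 1
```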

3.6.9. Details tab

User Application

Select this option to use executable files (executable shell scripts and binary files) on your server as scripts. Specify the local disk path on the server for each executable file name.

The executable files will not be distributed to each server. They should be placed on each server in advance. The cluster configuration data created by the Cluster WebUI does not contain these files. You cannot edit the script files using the Cluster WebUI.

Script created with this product

Select this option to use script files created by the Cluster WebUI as scripts. You can edit them using the Cluster WebUI as necessary. The cluster configuration data contains these script files.

View

Click here to display the script file when you select Script created with this product.

Edit

Click here to edit the script file when you select Script created with this product. Click Save to apply changes. You cannot rename the script file.

With the User Application option selected, the Enter application path dialog box appears.

Enter application path

Specify an exec resource executable file name.

Start (Within 1023 bytes)

Enter an executable file name to be run when the exec resource starts. The name should begin with "/". Arguments can also be specified.

Stop (Within 1023 bytes)

Enter an executable file name to be run when the exec resource exits. The name should begin with "/". The stop script is optional.

For the executable file name, specify a full path name starting with "/" to a file on your cluster server.

Arguments can also be specified.

Replace

Opens the Open dialog box with the Script created with this product option selected.

The contents of the script file selected in the Resource Properties are replaced with the one selected in the Open dialog box. If the selected script file is being viewed or edited, you cannot replace it. Select a script file, not a binary file such as an application program.

Tuning

Opens the EXEC resource tuning properties dialog box. You can make advanced settings for the EXEC resource. If you want the PID monitor resource to monitor the exec resources, you have to set the start script to asynchronous.

Exec Resource Tuning Properties

Parameter tab

Common to all start scripts and stop scripts

Synchronous

Waits for the script to end when it is run. Select this option for executable files that are not resident (the process returns immediately after the script completes).

Asynchronous

Does not wait for the script to end when it is run. Select this for resident executable files. The script can be monitored by PID monitor resource if Asynchronous is selected.

Timeout (1 to 9999)

When you want to wait for a script termination (when selecting Synchronous), specify how many seconds you want to wait before a timeout. This box is enabled when Synchronous is selected. If the script does not complete within the specified time, it is treated as an error.

Execute on standby server

Set whether the scripts are to be executed on the standby server. Enabling this parameter allows you to specify the timeout value (1 to 9999) for the execution.

Maintenance tab

Log Output Path (Within 1023 bytes)

Specify the redirect destination path of standard output and standard error output for EXEC resource scripts and executable files. If this box is left blank, messages are directed to /dev/null. The name should begin with "/".
If the Rotate Log check box is off, note the amount of available disk space in the file system because no limit is imposed on message output.
If the Rotate Log check box is on, the log file to be output is rotated. Note the following items:

  • Specify a log output path within 1009 bytes. If you specify a path of 1010 bytes or more, the log is not output.

  • Specify a log file name within 31 bytes. If you specify a log file name of 32 bytes or more, the log is not output.

  • When using multiple EXEC resources, the rotation size may not be recognized correctly if you specify the same file name for more than one resource, even if the paths differ (for example, /home/foo01/log/exec.log, /home/foo02/log/exec.log).

Rotate Log

When the Rotate Log check box is not selected, the execution logs of the EXEC resource script and the executable file are output without any limit on the file size. When the Rotate Log check box is selected, the logs are rotated as they are output.

Rotation Size (1 to 999999999)
If the Rotate Log check box is selected, specify a rotation size.
The structures of the log files to be rotated and output are as follows:

File name

Description

file_name for the Log Output Path specification

Newest log

file_name.pre for the Log Output Path specification

Previously rotated log
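The Log Output Path limits that apply when Rotate Log is on (path within 1009 bytes, file name within 31 bytes, path beginning with "/") can be checked before the value is entered. The following sketch assumes a hypothetical helper name, check_log_path, and counts bytes with wc -c.

```shell
#!/bin/sh
# Sketch: check a Log Output Path against the limits that apply when
# Rotate Log is on. check_log_path is a hypothetical helper name.
check_log_path() {
    path=$1
    case "$path" in
        /*) ;;
        *) echo "path must begin with /"; return 1 ;;
    esac

    # $(( ... )) strips any padding that wc may add around the number.
    path_bytes=$(( $(printf '%s' "$path" | wc -c) ))
    name_bytes=$(( $(printf '%s' "${path##*/}" | wc -c) ))

    if [ "$path_bytes" -gt 1009 ]; then
        echo "path is $path_bytes bytes (limit 1009)"
        return 1
    fi
    if [ "$name_bytes" -gt 31 ]; then
        echo "file name is $name_bytes bytes (limit 31)"
        return 1
    fi
    echo "ok"
}

check_log_path /home/foo01/log/exec.log
```

With this path, the newest log would be /home/foo01/log/exec.log and the previously rotated log /home/foo01/log/exec.log.pre.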

3.7. Understanding Disk resource

3.7.1. Dependencies of Disk resource

By default, disk resources depend on the following group resource types.

Group Resource Type

Dynamic DNS resource

Floating IP resource

Virtual IP resource

Volume manager resource

AWS Elastic IP resource

AWS Virtual IP resource

AWS DNS resource

Azure probe port resource

Azure DNS resource

3.7.2. Switching partitions

Switching partitions are partitions on shared disks connected to more than one server in a cluster.
Switching is done for each failover group according to the failover policy. By storing data required for applications on switching partitions, the data can be used automatically after a failover or a move of the failover group.

Note

For "raw" disk type, EXPRESSCLUSTER maps (binds) the switching partition to the raw device of the OS. If Execute Unbind is selected on the Disk Resource Tuning Properties, the unbind process is performed to deactivate the disk resource.

If switching partitions are not accessible with the same device name on all the servers, configure the server individual setup.

3.7.3. Device region expansion on disk resources

Follow the steps below to execute region expansion of the device. Be sure to execute the following steps on the server where the disk resource in question has been activated.

  1. Deactivate a group to which the disk resource in question belongs by using a command such as clpgrp.

  2. Confirm that no disks have been mounted by using a command such as mount or df.

  3. Change the state of the disk from Read Only to Read Write by executing one of the following commands depending on the disk resource type.
    # clproset -w -d <device-name>
  4. Execute region expansion of the device.

  5. Change the state of the disk from Read Write to Read Only by executing one of the following commands depending on the disk resource type.
    # clproset -o -d <device-name>
  6. Activate a group to which the disk resource in question belongs by using a command such as clpgrp.
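The procedure above can be laid out as a command sequence. The following sketch only prints the commands and executes nothing. The clproset invocations are taken from the steps above; the clpgrp options (-t to stop a group, -s to start it) are assumptions in this sketch and should be confirmed in the command reference, and the group and device names are made-up examples.

```shell
#!/bin/sh
# Sketch: print the command sequence for device region expansion.
# Nothing is executed. print_expansion_steps is a hypothetical helper.
# The clpgrp options (-t stop, -s start) are assumptions; verify them
# in the EXPRESSCLUSTER command reference before use.
print_expansion_steps() {
    group=$1
    device=$2
    echo "clpgrp -t $group"
    echo "mount; df    # confirm $device is not mounted"
    echo "clproset -w -d $device"
    echo "# expand the device region here"
    echo "clproset -o -d $device"
    echo "clpgrp -s $group"
}

# failover1 and /dev/sdb5 are example names.
print_expansion_steps failover1 /dev/sdb5
```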

3.7.4. Notes on disk resources

  • EXPRESSCLUSTER controls access to the file system (mount/umount). Therefore, do not configure mount/umount settings on the OS.
    (If an entry in /etc/fstab is required, use the noauto option, not the ignore option.)
  • The partition device name set to the disk resource is in the read-only mode on all servers in a cluster. Read-only status is released when the server is activated.

  • If Exclude Mount/Unmount Commands is selected on the Extension tab of the Cluster Properties, it may take some time to activate or deactivate a disk resource because mount or unmount of disk resource, VxVM volume resource, NAS resource, and mirror resource is performed exclusively in the same server.

  • When a path including a symbolic link is specified for the mount point, the forced operation cannot be performed even if one is selected in Forced operation when failure is detected.
    Similarly, if a path containing "//" is specified, forced termination will also fail.
  • To prevent problems caused by device names changing at OS startup, specify a udev device name as the device name.
    example: /dev/disk/by-label/<device-name>
  • When a change is made at the run level on the OS, some device files of a partition device set as a disk resource might be created again. This may reset the read-only setting for the partition device set as a disk resource.
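Because an /etc/fstab entry for a switching partition should carry the noauto option, it can be useful to check for entries that lack it. The following sketch uses a hypothetical helper, find_auto_mounts, and matches only entries whose first field is the literal device name (entries written with UUID= or LABEL= are not detected).

```shell
#!/bin/sh
# Sketch: print fstab entries for a device that do not have noauto.
# find_auto_mounts is a hypothetical helper. The fstab path is an
# argument so the check can be tried on a copy of the file.
find_auto_mounts() {
    fstab=$1
    device=$2
    awk -v dev="$device" '
        $1 ~ /^#/ { next }    # skip comment lines
        $1 == dev {
            n = split($4, opts, ",")
            has_noauto = 0
            for (i = 1; i <= n; i++) {
                if (opts[i] == "noauto") has_noauto = 1
            }
            if (!has_noauto) print $0
        }
    ' "$fstab"
}

# Example call (device name is an example):
#   find_auto_mounts /etc/fstab /dev/sdb5
```

No output means that no matching entry without noauto was found.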

<When using a resource that has the disk type LVM>

  • When using this setting, it is recommended to control a volume group by using a volume manager resource together. For details, see "Understanding Volume manager resources" of this guide.

  • The volume is not defined on the EXPRESSCLUSTER side.

  • Please do not select [zfs] for the File System.

<When using a resource that has the disk type VXVM>

  • When using this setting, see "Understanding Volume manager resources".

  • The volume is not defined on the EXPRESSCLUSTER side.

  • No disk resource is needed when using only the accessible raw device (/dev/vx/rdsk/<disk-group-name>/<volume-name>) with the disk group imported and the volume started (raw access without setting up a file system on the volume).

  • Please do not select [zfs] for the File System.

3.7.5. Details tab

Disk Type Server Individual Setup

Select a disk type.

Choose one of the types below.

  • DISK

  • RAW

  • LVM

  • VXVM

File System Server Individual Setup

Select the type of file system created on the disk device. Choose one of the types below, or enter the type directly. This setting is required when Disk Type is other than raw.

  • ext3

  • ext4

  • xfs

  • reiserfs

  • vxfs

  • zfs

Device Name (Within 1023 bytes) Server Individual Setup

Select the disk device name to be used for the disk resource, or enter the device name directly. When other than [zfs] is selected for File System, the name should begin with "/". If File System is [zfs], specify the ZFS data set name.

Raw Device Name (Within 1023 bytes) Server Individual Setup

Enter the raw disk device name to be used for the disk resource. This setting is required when Disk Type is raw or vxvm.

Mount Point (Within 1023 bytes) Server Individual Setup

Enter the directory on which to mount the disk device. The name should begin with "/". This setting is required when Disk Type is other than raw.

Tuning

Opens the Disk Resource Tuning Properties dialog box. Make detailed settings on the dialog box.

Disk Resource Tuning Properties (when the setting to Disk Type is other than raw)

Mount tab

The detailed settings related to mount are displayed.

Mount Option

Enter options to give to the mount command when mounting the file system on the disk device. Separate multiple options with a comma (,).

A mount option sample

Setting item

Setting value

Device name

/dev/sdb5

Mount point

/mnt/sdb5

File system

ext3

Mount option

rw,data=journal

The mount command to be run with the above settings is:

mount -t ext3 -o rw,data=journal /dev/sdb5 /mnt/sdb5
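The way the four settings combine into a command line can be shown with a small helper. This is only a sketch: build_mount_command is a hypothetical name, it prints the command without running it, and omitting -o when no option is given is a choice made for this example rather than documented product behavior.

```shell
#!/bin/sh
# Sketch: assemble a mount command line from the settings in the sample.
# build_mount_command is a hypothetical helper; it only prints the command.
build_mount_command() {
    device=$1
    mount_point=$2
    fs_type=$3
    options=$4
    if [ -n "$options" ]; then
        echo "mount -t $fs_type -o $options $device $mount_point"
    else
        # No mount option given: leave out -o (assumption of this sketch)
        echo "mount -t $fs_type $device $mount_point"
    fi
}

build_mount_command /dev/sdb5 /mnt/sdb5 ext3 rw,data=journal
```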

Timeout (1 to 999)

Enter how many seconds to wait for the mount command to complete before it times out when mounting the file system on the disk device.
If the file system is large, the command may take some time to complete. Specify a value large enough for the mount command to complete.

Retry Count (0 to 999)

Enter how many times to retry mounting the file system on the disk device when mounting fails.
If you set this to zero (0), mount will not be retried.

Initialize

Clicking Initialize resets the values of all items to the default values.

Unmount tab

The detailed settings related to unmount are displayed.

Timeout (1 to 999)

Enter how many seconds you want to wait for the umount command completion before its timeout when you unmount the file system on the disk device.

Retry Count (0 to 999)

Enter how many times to retry unmounting the file system on the disk device when unmounting fails. If this is set to zero (0), unmount will not be retried.

Retry Interval (0 to 999)

Enter the interval in which you want to retry unmounting the file system on the disk device when unmounting fails.

Forced operation when failure is detected

Select an action to be taken at an unmount retry if unmount fails.

  • kill
    Select this to try to kill the processes that are accessing the mount point. The processes cannot always be killed.
  • No Operation
    Select this not to try to kill the processes that are accessing the mount point.
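How Retry Count and Retry Interval interact can be illustrated with a generic retry loop: one initial attempt, then up to the specified number of retries with a wait between attempts. The sketch below is not EXPRESSCLUSTER's implementation. retry_command is a hypothetical helper, and treating the interval as seconds is an assumption of this example.

```shell
#!/bin/sh
# Sketch of retry semantics: one initial attempt plus up to retry_count
# retries, waiting retry_interval between attempts (seconds assumed).
# retry_command is a hypothetical helper, not EXPRESSCLUSTER code.
retry_command() {
    retry_count=$1
    retry_interval=$2
    shift 2
    attempt=0
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$retry_count" ]; then
            # Retries exhausted (a count of 0 means no retry at all)
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$retry_interval"
    done
}

# Example call (mount point is an example):
#   retry_command 3 5 umount /mnt/sdb5
```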

Initialize

Clicking Initialize resets the values of all items to the default values.

Fsck tab

The detailed settings related to fsck are displayed. The tab appears only if a file system other than [xfs] is set. If [zfs] is selected for the file system, it will be invalid.

fsck Option (Within 1023 bytes)

Enter options to give to the fsck command when checking the file system on disk device. Options are delimited with a space. Specify options so that the fsck command does not work interactively.
Otherwise, you may not be allowed to mount until the "fsck timeout" elapses. When the file system is reiserfs, the fsck command works interactively. However, it can be avoided if EXPRESSCLUSTER gives "Yes" to reiserfsck.

fsck Timeout (1 to 9999)

Enter how many seconds to wait for the fsck command to complete before it times out when checking the file system on the disk device. If the file system is large, the command may take some time to complete. Specify a value large enough for the fsck command to complete.

fsck action before mount

Select an fsck action before mounting file system on a disk device from the following choices:

  • Always Execute
    fsck is executed before mounting the file system.
  • Execute at Specified Count
    fsck is executed when resource is activated successfully within the count specified by Count.
    = Count (0 to 999)
  • Not Execute
    fsck is not executed before mounting the file system.

Note

The number of times to execute fsck is not related to the check interval managed by a file system.

fsck Action When Mount Failed

Set an fsck action when detecting a mount failure on a disk device.
This setting is enabled when the setting of Mount Retry Count is other than zero.
  • When the check box is selected:
    Mount is retried after running fsck.
  • When the check box is not selected:
    Mount is retried without running fsck.

Note

Setting the fsck action before mount to "Not Execute" is not recommended. With this setting, the disk resource does not execute fsck, so the disk resource cannot be failed over when the switchable partition has an error that fsck could have recovered.

Rebuilding of reiserfs

Specify the action when reiserfsck fails with a recoverable error.

  • When the checkbox is selected
    reiserfsck --fix-fixable is executed.
  • When the checkbox is not selected
    Recovery is not performed even if reiserfsck fails with a recoverable error.

Initialize

Clicking Initialize resets the values of all items to the default values.

xfs_repair tab

The detailed settings related to [xfs_repair] are displayed. The tab appears only if [xfs] is set for the file system.

xfs_repair Option (Within 1023 bytes)

Enter the option to give to the [xfs_repair] command when checking the file system on the disk device. To enter multiple options, delimit each with a space.

xfs_repair Timeout (1 to 9999)

Enter how many seconds you want to wait for the [xfs_repair] command completion before its timeout when you check the file system on the disk device. If the file system has a large size of disk space, it may take some time for the command to complete. Make sure that the value to set is not too small.

xfs_repair Action When Mount Failed

Set the [xfs_repair] action when mounting the file system on the disk device fails. This setting is enabled when the setting of Mount Retry Count is other than zero.

  • When the check box is selected:
    Mount is retried after running [xfs_repair].
  • When the check box is not selected:
    Mount is retried without running [xfs_repair].

Initialize

Clicking Initialize resets the values of all items to the default values.

Disk Resource Tuning Properties (when the setting to Disk Type is raw)

Unbind tab

The detailed settings related to unbind are displayed.

Execute Unbind

Specify whether to unbind the raw disk device.

  • When the check box is selected:
    The raw disk device is unbound.
  • When the check box is not selected:
    The raw disk device is not unbound.

Timeout (1 to 999)

When the Execute Unbind check box is selected, set the timeout for completing the unbind of the raw disk device.

Retry Count (1 to 999)

When the Execute Unbind check box is selected, specify how many times to retry the unbind of the raw disk device when it fails.

Initialize

Clicking Initialize resets the values of all items to the default values.

3.8. Understanding Floating IP resource

3.8.1. Dependencies of Floating IP resource

By default, this function does not depend on any group resource type.

3.8.2. Floating IP

Client applications can use floating IP addresses to access cluster servers. By using floating IP addresses, clients do not need to be aware that the access destination server has switched when a failover or a group move occurs.

Floating IP addresses can be used on the same LAN and over the remote LAN.

Execute the [ifconfig] command or the API to assign an IP address to the OS. The floating IP resource automatically determines whether to execute the [ifconfig] command or the API.

If the output of the [ifconfig] command is in a format other than the following, the API is executed.

eth0    Link encap:Ethernet HWaddr 00:50:56:B7:1B:C0
        inet addr:192.168.1.113 Bcast:192.168.1.255 Mask:255.255.255.0
        inet6 addr: fe80::250:56ff:feb7:1bc0/64 Scope:Link

(The following is omitted.)
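As a rough illustration of this format check, the following Python sketch tests whether a block of [ifconfig] output uses the layout shown above. The function name, the regular expressions, and the second sample string are assumptions made for this example. They do not describe the logic that EXPRESSCLUSTER actually uses.

```python
import re

# Patterns for the layout shown above ("Link encap:" header, "inet addr:" line).
# These expressions are an assumption for this sketch only.
LEGACY_HEADER = re.compile(r"^\S+\s+Link encap:", re.MULTILINE)
LEGACY_INET = re.compile(r"^\s+inet addr:\d+\.\d+\.\d+\.\d+", re.MULTILINE)


def is_legacy_ifconfig_format(output: str) -> bool:
    """Return True if the text looks like the ifconfig layout shown above."""
    return bool(LEGACY_HEADER.search(output)) and bool(LEGACY_INET.search(output))


legacy = (
    "eth0    Link encap:Ethernet HWaddr 00:50:56:B7:1B:C0\n"
    "        inet addr:192.168.1.113 Bcast:192.168.1.255 Mask:255.255.255.0\n"
)
# Illustrative sample of a different output layout (hypothetical input).
other = (
    "eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
    "        inet 192.168.1.113  netmask 255.255.255.0  broadcast 192.168.1.255\n"
)

print(is_legacy_ifconfig_format(legacy))  # True
print(is_legacy_ifconfig_format(other))   # False
```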

Address assignment

An IP address to be assigned as a floating IP address needs to meet the following condition:

  • It is an available host address that belongs to the same network address as the LAN to which the cluster server belongs.

Allocate as many IP addresses that meet the above condition as required (generally as many as there are failover groups). These IP addresses are the same as general host addresses. Therefore, global IP addresses, such as those used on the Internet, can also be assigned.

Switching method

For IPv4, MAC addresses in the ARP table are switched by sending ARP broadcast packets from the server on which FIP resources are activated.

For IPv6, ARP broadcast packets are not sent.

The table below shows the contents of the ARP broadcast packets sent by EXPRESSCLUSTER:

  Offset (bytes)   Length (bytes)   Value
  0                6                ff ff ff ff ff ff
  6                6                MAC address
  12               4                08 06 00 01
  16               4                08 00 06 04
  20               2                00 02
  22               6                MAC address
  28               4                FIP address
  32               6                MAC address
  38               4                FIP address
  42               18               00 (in every byte)
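The packet layout above can be reproduced byte for byte. The following Python sketch only builds the 60-byte frame in memory and does not send it. The function name is hypothetical, and the field names in the comments are an interpretation based on standard Ethernet and ARP framing rather than text from this guide.

```python
import ipaddress


def build_fip_arp_frame(mac: str, fip: str) -> bytes:
    """Build the 60-byte broadcast ARP frame described in the table above."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    fip_bytes = ipaddress.IPv4Address(fip).packed  # 4 bytes
    return b"".join([
        b"\xff" * 6,             # ff x 6: broadcast destination
        mac_bytes,               # MAC address (6 bytes): frame source
        bytes.fromhex("0806"),   # 08 06: EtherType ARP
        bytes.fromhex("0001"),   # 00 01: hardware type Ethernet
        bytes.fromhex("0800"),   # 08 00: protocol type IPv4
        bytes.fromhex("0604"),   # 06 04: address lengths (MAC, IPv4)
        bytes.fromhex("0002"),   # 00 02: ARP operation code 2
        mac_bytes,               # MAC address (6 bytes): sender
        fip_bytes,               # FIP address (4 bytes): sender
        mac_bytes,               # MAC address (6 bytes): target
        fip_bytes,               # FIP address (4 bytes): target
        b"\x00" * 18,            # 00 x 18: padding
    ])


frame = build_fip_arp_frame("00:50:56:B7:1B:C0", "192.168.1.113")
print(len(frame))  # 60
```

Actually transmitting such a frame would require a raw socket and administrator privileges, which is outside the scope of this sketch.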

Routing

You do not need to configure the settings for the routing table.

Conditions to use

Floating IP addresses are accessible to the following machines:

  • Cluster server itself

  • Other servers in the same cluster and the servers in other clusters

  • Clients on the same LAN as the cluster server and clients on remote LANs

If the following conditions are satisfied, machines other than the above can also access floating IP addresses. However, connection is not guaranteed for all models or architectures of machines. Test the connection thoroughly by yourself before using those machines.

  • TCP/IP is used for the communication protocol.

  • ARP protocol is supported.

The floating IP address mechanism works properly even over LANs configured with switching hubs. When a server goes down, the TCP/IP connections to that server are disconnected.

3.8.3. Notes on Floating IP resource

  • Do not execute a network restart on a server on which floating IP resources are active. If the network is restarted, any IP addresses that have been added as floating IP resources are deleted.

  • IP address overlaps due to time-lag of the [ifconfig] command

    If the floating IP resource is configured as follows, the failover of resources may fail:

    • A value smaller than the default is set for Retry Count at Activation Failure.

    • Ping Retry Count and Ping Interval are not set.

    This problem occurs due to the following causes:

    • After the floating IP address is deactivated on the failover source server, releasing the IP address may take time, depending on the specification of the [ifconfig] command.

    • When the floating IP address is activated on the failover destination server, the ping command is run against the IP address to be activated in order to prevent dual activation. Because of the delay described above, the ping reaches the IP address, and a resource activation error occurs.

    Make the following settings to avoid this problem:

    • Set a greater value for Retry Count at Activation Failure of the resource (default: 5 times).

    • Set greater values for Ping Retry Count and Ping Interval.

  • IP address overlaps when OS is stalled

    If OS stalls with the floating IP address activated, the resource failover may fail when the following settings are made:

    • A value other than 0 is set to Ping Timeout.

    • Forced FIP Activation is off.

    This problem occurs due to the following causes:

    • Part of the OS stalls (as in the examples below) with the floating IP address activated.

      • Network modules are running and respond to ping from other nodes

      • A stall cannot be detected in the user-mode monitor resource

    • When the floating IP address is activated on the failover destination server, the ping command is executed against the IP address to be activated in order to prevent redundant activation. Because of the stall described above, the ping reaches the IP address, and a resource activation error occurs.

    In a machine environment where this problem often occurs, it can be prevented by the settings below. However, both groups may be activated depending on the status of the stall, and a server shutdown may occur depending on the timing of the activation of both groups. For details on activation of both groups, see "What causes servers to shut down" - "Recovery from network partition" in "The system maintenance information" in the "Maintenance Guide".

    • Set Ping Timeout to 0
      The overlap check is not performed for the floating IP address.
    • Set Forced FIP Activation to "On"
      The floating IP address is activated forcibly even when the address is used on a different server.

  • MAC address of the virtual NIC to which the floating IP is allocated
    When the floating IP resource fails over, the corresponding MAC address changes, because the MAC address of the virtual NIC to which the floating IP is allocated is the MAC address of the real NIC.
  • Source address of IP communication from the running server while the resource is active
    The source address of communication from the server is basically the real IP address of the server, even when the floating IP resource is active. To change the source address to the floating IP, configure the application accordingly.
  • When Forced FIP Activation is set to ON, if a floating IP address is activated and a machine in the same network segment then connects to the floating IP address, the connection may be established with a machine that previously used that IP address.

  • The floating IP resource is not supported in an environment in which OpenVPN is running.

  • The NIC name (the name of a network interface card, such as eth0) can be up to 15 characters long. If the name exceeds 15 characters, activation fails. In such a case, change the NIC name.

  • Before a floating IP resource is activated, [ping] is issued to check whether the IP address is already in use. Therefore, if a firewall on the network device that uses the duplicated IP address is set to reject ICMP reception, the duplication cannot be detected by the [ping] command, and the floating IP address might be duplicated.

3.8.4. Waiting process for Floating IP resource deactivation

The following process takes place after the floating IP address is deactivated.

  1. Waiting process

    • The [ifconfig] command or the API is executed to acquire the list of IP addresses assigned to the OS. The floating IP resource automatically determines whether to execute the [ifconfig] command or the API. If the floating IP address does not exist in the IP address list, the resource is regarded as deactivated.

    • If the floating IP address exists in the IP address list, the process waits for one second. This waiting time cannot be changed with the Cluster WebUI.

    • The operation mentioned above is repeated up to four times. This number of times cannot be changed with the Cluster WebUI.

    • When the check results in an error, whether the floating IP resource is regarded as having a deactivation error can be changed with Status at Failure under Confirm I/F Deletion on the Deactivity Check tab of the floating IP resource.

  2. Confirming process by the ping command

    • The ping command is executed to check whether there is a response from the floating IP address. If there is no response, the resource is regarded as deactivated.

    • If there is a response from the floating IP address, the process waits for one second. This waiting time cannot be changed with the Cluster WebUI.

    • The operation mentioned above is repeated up to four times. This number of times cannot be changed with the Cluster WebUI.

    • The ping command is executed with a one-second timeout. This timeout cannot be changed with the Cluster WebUI.

    • When the check results in an error, whether the floating IP resource is regarded as having a deactivation error can be changed with Status at Failure under Confirm I/F Response on the Deactivity Check tab of the floating IP resource.
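Both confirmation phases above follow the same pattern: check, wait one second, and repeat up to four times. The following Python sketch models that pattern under stated assumptions. The function and parameter names are hypothetical, the check itself is passed in as a callable, and whether the product waits after the final check is not stated in this guide.

```python
import time
from typing import Callable

MAX_CHECKS = 4       # fixed count described above
WAIT_SECONDS = 1.0   # fixed one-second wait described above


def wait_until_gone(still_present: Callable[[], bool],
                    max_checks: int = MAX_CHECKS,
                    wait_seconds: float = WAIT_SECONDS,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once still_present() reports False, else False after max_checks."""
    for _ in range(max_checks):
        if not still_present():
            return True       # regarded as deactivated
        sleep(wait_seconds)   # address still listed or still answering ping
    return False              # outcome is then decided by Status at Failure


# Simulated check: the address disappears on the third attempt.
responses = iter([True, True, False])
print(wait_until_gone(lambda: next(responses), sleep=lambda s: None))  # True
```

Passing the sleep function as a parameter keeps the sketch testable without real delays.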

Note

Acquisition of the list of IP addresses and activation/deactivation of the floating IP address using the [ifconfig] command time out in 60 seconds (this is the default value).
This timeout value can be changed with the Cluster WebUI. For details, see the Parameter tab of the "Details tab".

3.8.5. Details tab

IP Address Server Individual Setup

Enter the floating IP address to be used. When using bonding, specify the bonding interface name after the address, separated by "%". For details, see "Bonding" in "7. Information on other settings" in this guide.

  • Example: 10.0.0.12%bond0

The floating IP resource searches the local computer for an address on the same subnet, assuming by default 24 mask bits for IPv4 or 128 mask bits for IPv6. Then, it assigns an alias to the relevant network interface to add the floating IP address.

To specify a number of mask bits explicitly, specify the address followed by /number_of_mask_bits. (For an IPv6 address, be sure to specify /number_of_mask_bits.)

Example: fe80::1/8

To specify a network interface explicitly, specify the address followed by %interface_name.

Example: fe80::1/8%eth1

In the above example, a floating IP address with eight mask bits is added to network interface eth1.

When using a tag VLAN, specify the interface name of the tag VLAN after the address, separated by "%".

  • Example for a tag VLAN: 10.0.0.12%eth0.1
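The address notation described above (an address, optionally followed by /number_of_mask_bits and %interface_name) can be illustrated with a small parser. The following Python sketch is an assumption-based illustration. The names FipSpec and parse_fip_spec are hypothetical, and the sketch is not the parser used by EXPRESSCLUSTER.

```python
import ipaddress
from typing import NamedTuple, Optional


class FipSpec(NamedTuple):
    address: str
    mask_bits: int
    interface: Optional[str]


def parse_fip_spec(spec: str) -> FipSpec:
    """Split 'address[/mask_bits][%interface]' into its parts."""
    addr_part, _, iface = spec.partition("%")
    addr, _, bits = addr_part.partition("/")
    ip = ipaddress.ip_address(addr)            # raises ValueError if invalid
    default_bits = 24 if ip.version == 4 else 128  # defaults stated above
    mask_bits = int(bits) if bits else default_bits
    if not 0 <= mask_bits <= ip.max_prefixlen:
        raise ValueError(f"invalid number of mask bits: {mask_bits}")
    return FipSpec(str(ip), mask_bits, iface or None)


print(parse_fip_spec("10.0.0.12%bond0"))
# FipSpec(address='10.0.0.12', mask_bits=24, interface='bond0')
print(parse_fip_spec("fe80::1/8%eth1"))
# FipSpec(address='fe80::1', mask_bits=8, interface='eth1')
```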

In an environment in which an IPv6 address is used and the [ifconfig] command is available, be sure that the notation of the floating IP address matches the output format of the [ifconfig] command, because the comparison is case sensitive.

Tuning

Opens the Floating IP Resource Tuning Properties dialog box where the detailed settings for the floating IP resource can be configured.

Floating IP Resource Tuning Properties

Parameter tab

Detailed settings on parameters for floating IP resource are displayed.

ifconfig

The following are the detailed settings for acquiring IP addresses and for the [ifconfig] command executed to activate and/or deactivate the floating IP resource.

  • Timeout (1 to 999)
    Set the timeout of the [ifconfig] command. This parameter has no effect in an environment in which the [ifconfig] command cannot be used. In such an environment, specify 60 seconds (the default value).

ping

These are the detailed settings of the ping command that is used to check for an overlapped IP address before the floating IP resource is activated.

  • Interval (0 to 999)
    Set the interval at which the ping command is issued.
  • Timeout (0 to 999)
    Set the timeout of the ping command.
    If zero is set, the ping command is not run.
  • Retry Count (0 to 999)
    Set the retry count of the ping command.
  • Forced FIP Activation
    Specify whether to forcibly activate the floating IP address when an overlapped IP address is detected by the ping command check.
    • When the check box is selected
      Forced activation is performed.
    • When the check box is not selected
      Forced activation is not performed.

ARP Send Count (0 to 999)

Specify how many times you want to send ARP packets when activating floating IP resources.

If this is set to zero (0), ARP packets will not be sent.

Judge NIC Link Down as Failure

Specify whether to check for an NIC Link Down before the floating IP resource is activated. In some NIC boards and drivers, the required ioctl( ) may not be supported. To check the availability of the NIC Link Up/Down monitor, use the [ethtool] command provided by the distributor. For the check method using the [ethtool] command, see "Note on NIC Link Up/Down monitor resources" in "Understanding NIC Link Up/Down monitor resources" in this guide.

For bonding devices, a failure is judged to have occurred when all the NICs that make up the bonding are in the Link Down state at activation.

  • When the check box is selected
    In the case of an NIC Link Down, the floating IP resource is not activated.
  • When the check box is not selected
    Even in the case of an NIC Link Down, the floating IP resource is activated.

Initialize

Clicking Initialize resets the values of all items to the default values.

Deactivity Check tab

Detailed settings on deactivity check of floating IP resource are displayed.

Confirm I/F Deletion

  • Confirm I/F Deletion
    Specify whether to confirm that the target floating IP address has been deleted successfully after the floating IP is deactivated.
    • When the check box is selected
      Confirmation is performed.
    • When the check box is not selected
      Confirmation is not performed.
  • Status at Failure
    Specify how to handle a deactivation error of the floating IP resource.
    • Failure:
      The error is treated as a deactivity failure of the floating IP resource.
    • Not Failure:
      The error is not treated as a deactivity failure of the floating IP resource.

Confirm I/F Response

  • Confirm I/F Response
    Specify whether to confirm, using the ping command, that the target floating IP address has been deleted successfully after the floating IP is deactivated.
    • When the check box is selected
      Confirmation is performed.
    • When the check box is not selected
      Confirmation is not performed.
  • Status at Failure
    Specify how to handle a deactivation error of the floating IP resource if the floating IP can be reached by the ping command.
    • Failure:
      The error is treated as a deactivity failure of the floating IP resource.
    • Not Failure:
      The error is not treated as a deactivity failure of the floating IP resource.

3.9. Understanding Virtual IP resources

3.9.1. Dependencies of Virtual IP resources

By default, this function does not depend on any group resource type.

3.9.2. Virtual IP resources

Client applications can be connected to a cluster server by using a virtual IP address. The servers can be connected to each other by using a virtual IP address. By using a virtual IP address, switching from one server to the other to which a client is connecting remains transparent even if failover or moving of a failover group occurs. The graphic in the next page shows how virtual IP resources work in the cluster system.

Execute the [ifconfig] command or the API to assign an IP address to the OS. The virtual IP resource automatically determines whether to execute the [ifconfig] command or the API. The following shows an example:

  • For an environment such as RHEL 7 or later (including RHEL compatible operating systems) on which the [ifconfig] command cannot be used, the API is executed.

  • For an environment such as RHEL 7 or later (including RHEL compatible operating systems) on which the net-tools package enables execution of the [ifconfig] command, the API is executed because the output format of the [ifconfig] command is not compatible with that of RHEL 6 or earlier.

  • For an environment such as RHEL 6 on which the [ifconfig] command can be used, the [ifconfig] command is executed.

3.9.3. Determining virtual IP address

An IP address used as a virtual IP address should satisfy the following conditions:

  • The IP address should not be within the network address of the LAN to which the cluster belongs.

  • The IP address should not conflict with existing network addresses.

Select one of the following allocation methods to meet the requirements above:

  • Obtain a new network IP address for the virtual IP address and allocate the virtual IP address from it.

  • Determine a network IP address from the private IP address space and allocate the virtual IP address from it. The following procedure is given as an example.

    • Select one network address from 192.168.0 to 192.168.255 for the virtual IP address.

    • Allocate up to 64 host IP addresses for virtual IP addresses from the network address you have selected. (For example, select the network address 192.168.10 and allocate two host IP addresses: 192.168.10.1 and 192.168.10.254.)

    • Specify 255.255.255.0 as the netmask of the virtual IP address.

    • When you configure multiple virtual IP addresses, dummy virtual IP addresses may be required. For details, see "Preparing for using Virtual IP resources".

    Note the following about private IP addresses:

    • Private IP addresses are addresses for a closed network, so a virtual IP address allocated from them cannot be accessed from outside the network through Internet providers.

    • Do not disclose path information of private IP addresses outside the organization.

    • Adjust the private IP addresses to avoid conflicts with other addresses.
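The two conditions for a virtual IP address can be checked mechanically before configuring the resource. The following Python sketch uses the standard ipaddress module. The function name and the sample networks are assumptions for illustration, and "does not conflict with existing network addresses" is interpreted here as "does not fall inside any network already in use".

```python
import ipaddress
from typing import List


def check_virtual_ip(vip: str, cluster_lan: str,
                     existing_networks: List[str]) -> List[str]:
    """Return a list of problems found; an empty list means both checks passed."""
    problems = []
    address = ipaddress.ip_address(vip)
    # Condition 1: not within the network address of the cluster LAN.
    if address in ipaddress.ip_network(cluster_lan):
        problems.append(f"{vip} is inside the cluster LAN {cluster_lan}")
    # Condition 2: no conflict with existing network addresses.
    for net in existing_networks:
        if address in ipaddress.ip_network(net):
            problems.append(f"{vip} conflicts with existing network {net}")
    return problems


# Hypothetical environment: cluster LAN 10.0.0.0/24, one other network in use.
print(check_virtual_ip("192.168.10.1", "10.0.0.0/24", ["172.16.0.0/16"]))  # []
print(check_virtual_ip("10.0.0.50", "10.0.0.0/24", ["172.16.0.0/16"]))
# ['10.0.0.50 is inside the cluster LAN 10.0.0.0/24']
```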

3.9.4. Preparing for using Virtual IP resources

If your cluster configuration satisfies the following conditions, you need to set, on each server, a dummy virtual IP address that has the same network address as the virtual IP address.

  • Multiple virtual IP resources exist in the cluster.

  • Among them are virtual IP resources that have the same network address and the same NIC alias name.

Note

If a dummy virtual IP address cannot be configured, other virtual IP addresses assigned to the same NIC alias might be deleted by the OS when any virtual IP resource is deactivated.

A dummy virtual IP address should satisfy the following conditions:

  • The IP address has the same network address as the virtual IP resource, and is unique.

  • An IP address can be prepared for each server in the cluster.

  • An IP address is prepared for each NIC alias.

In the following settings, a dummy virtual IP address should be configured on each server.

  • Virtual IP resource 1
    IP address 10.0.1.11/24
    NIC alias name eth1
  • Virtual IP resource 2
    IP address 10.0.1.12/24
    NIC alias name eth1

For example, set a dummy virtual IP address as follows:

  • Dummy virtual IP address of server1
    IP address 10.0.1.100/24
    NIC alias name eth1:0
  • Dummy virtual IP address of server2
    IP address 10.0.1.101/24
    NIC alias name eth1:0
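Whether a configuration needs dummy virtual IP addresses can be worked out by grouping the virtual IP resources by network address and NIC alias name. The following Python sketch does this for the example settings above. The function name and the input shape (a list of address/prefix and NIC alias pairs) are assumptions for illustration.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List, Tuple


def groups_needing_dummy(
    resources: List[Tuple[str, str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Group (address/prefix, NIC alias) pairs by (network, NIC alias).

    A group with more than one virtual IP address shares a network address
    and a NIC alias name, so it needs a dummy virtual IP address per server.
    """
    groups = defaultdict(list)
    for cidr, nic in resources:
        iface = ipaddress.ip_interface(cidr)
        groups[(str(iface.network), nic)].append(str(iface.ip))
    return {key: ips for key, ips in groups.items() if len(ips) > 1}


vips = [("10.0.1.11/24", "eth1"), ("10.0.1.12/24", "eth1")]
print(groups_needing_dummy(vips))
# {('10.0.1.0/24', 'eth1'): ['10.0.1.11', '10.0.1.12']}
```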

Configure the OS by the following procedure so that dummy virtual IP addresses are enabled at OS startup.

In the following procedure, eth1 of server 1 is set to 10.0.1.100/24 as an example.

  1. Perform one of the following procedures according to your distribution.

  • For SUSE LINUX Enterprise Server:
    Edit the file at the following path. Add the IPADDR_1, NETMASK_1, NETWORK_1, and LABEL_1 lines shown in the setting information.

    Path

    /etc/sysconfig/network/ifcfg-eth1-"MAC_address_of_eth1"

    Setting information

    BOOTPROTO='static'
    BROADCAST='10.0.0.255'
    IPADDR='10.0.0.1'
    MTU=''
    NETMASK='255.255.255.0'
    NETWORK='10.0.0.0'
    IPADDR_1='10.0.1.100'
    NETMASK_1='255.255.255.0'
    NETWORK_1='10.0.1.0'
    LABEL_1=1
    REMOTE_IPADDR=''
    STARTMODE='onboot'
    UNIQUE='xxxx'
    _nm_name='xxxx'
    
  • For other than SUSE LINUX Enterprise Server:
    Create a file at the following path, and add the setting information.

    Path

    /etc/sysconfig/network-scripts/ifcfg-eth1:0

    Setting information

    DEVICE=eth1:0
    BOOTPROTO=static
    BROADCAST=10.0.1.255
    HWADDR=MAC_address_of_eth1
    IPADDR=10.0.1.100
    NETMASK=255.255.255.0
    NETWORK=10.0.1.0
    ONBOOT=yes
    TYPE=Ethernet
    
  2. Restart the OS.

Dummy virtual IP addresses are enabled after the OS restart. Configure server 2 in the same manner.

Follow the procedure below when the settings above are required due to a cluster configuration change.

  1. Stop the cluster. For the procedure, see "Suspending EXPRESSCLUSTER" - "Stopping the EXPRESSCLUSTER daemon" in "Preparing to operate a cluster system" in the "Installation and Configuration Guide".

  2. Disable the cluster daemon. For the procedure, see "Suspending EXPRESSCLUSTER" - "Disabling the EXPRESSCLUSTER daemon" in "Preparing to operate a cluster system" in the "Installation and Configuration Guide".

  3. Change the settings above.

  4. Restart the OS, and check that the settings are applied.

  5. Enable the cluster daemon. For the procedure, see "Suspending EXPRESSCLUSTER" - "Enabling the disabled EXPRESSCLUSTER daemon" in "Preparing to operate a cluster system" in the "Installation and Configuration Guide".

  6. Modify the cluster configuration. For the procedure, see "Modifying the cluster configuration data" in the "Installation and Configuration Guide".

3.9.5. Controlling path

To access a virtual IP address from a remote LAN, path information of the virtual IP address must be effective on all routers on the path from the remote LAN to the LAN of the cluster servers. Specifically, the following conditions must be satisfied:

  • Routers on the LAN of the cluster servers interpret host RIP.

  • Routers on the path from a cluster server to the remote server have dynamic routing settings, or information on the virtual IP address routes has been configured as static routing settings.

3.9.6. Requirement to use virtual IP address

Environments where virtual IP address can be used

Virtual IP addresses can be accessed from the machines listed below. The virtual IP address mechanism functions properly even in a LAN where switching hubs are used. However, when a server goes down, established TCP/IP connections will be disconnected.

When using virtual IP addresses with a switching hub that cannot be configured to create a host routing table by receiving host RIP, you need to reserve one new network address and configure virtual IP addresses so that the IP address of each server belongs to a different network address.

  • Cluster servers that belong to the same LAN as the server on which the virtual IP is activated

    Virtual IP addresses can be used if the following conditions are satisfied:

    • Machines that can change the path by receiving RIP packets.

    • Machines that can resolve the path information of a virtual IP address by accessing a router.

  • Cluster servers that belong to a different LAN from the server on which the virtual IP is activated

    Virtual IP addresses can be used if the following condition is satisfied:

    • Machines that can resolve path information of the virtual IP address by accessing a router.

  • Clients that belong to the same LAN as the cluster servers

    Virtual IP addresses can be used if the following conditions are satisfied:

    • Machines that can change the path by receiving RIP packets.

    • Machines that can resolve the path information of a virtual IP address by accessing a router.

  • Clients on remote LAN

    Virtual IP addresses can be used if the following condition is satisfied:

    • Machines that can resolve path information of the virtual IP address by accessing a router.

3.9.7. Notes on Virtual IP resources

  • Do not execute a network restart on a server on which virtual IP resources are active. If the network is restarted, any IP addresses that have been added as virtual IP resources are deleted.

The following rule applies to virtual IP addresses.

  • If virtual IP resources are not deactivated properly (e.g. when a server goes down), the path information of the virtual IP resources is not deleted. If virtual IP resources are activated while their path information remains, the virtual IP addresses cannot be accessed until the path information is reset by a router or a routing daemon.
    Thus, you need to configure the flush timer of the router or the routing daemon. For the flush timer, specify a value within the heartbeat timeout value. For details on the heartbeat timeout, see "Cluster properties" in "2. Parameter details" in this guide.
  • MAC address of the virtual NIC to which the virtual IP is allocated
  • MAC address of virtual NIC to which virtual IP is allocated.

    When the virtual IP resource fails over, the corresponding MAC address changes, because the MAC address of the virtual NIC to which the virtual IP is allocated is the MAC address of the real NIC.

  • Source address of IP communication from the running server while the resource is active

    The source address of communication from the server is basically the real IP address of the server, even when the virtual IP resource is active. To change the source address to the virtual IP, configure the application accordingly.

  • Routing protocol used

    If the routing protocol is set to "RIPver2", the subnet mask for transmitted RIP packets is "255.255.255.255".
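To make the 255.255.255.255 subnet mask concrete, the following Python sketch builds a RIP version 2 response message that advertises a virtual IP address as a host route. The byte layout follows the RIP version 2 specification (RFC 2453). The function name is hypothetical, and whether EXPRESSCLUSTER populates the remaining fields in exactly this way is an assumption. The sketch only builds the message and does not send it.

```python
import ipaddress
import struct


def build_ripv2_host_route(vip: str, metric: int = 1,
                           next_hop: str = "0.0.0.0") -> bytes:
    """Build a RIPv2 response carrying one host route for the virtual IP."""
    if not 1 <= metric <= 15:
        raise ValueError("metric must be between 1 and 15")
    # Header: command 2 (response), version 2, two zero bytes.
    header = struct.pack("!BBH", 2, 2, 0)
    # Route entry: address family 2 (IP), route tag 0, address,
    # subnet mask 255.255.255.255 (host route), next hop, metric.
    entry = struct.pack(
        "!HH4s4s4sI",
        2,
        0,
        ipaddress.IPv4Address(vip).packed,
        ipaddress.IPv4Address("255.255.255.255").packed,
        ipaddress.IPv4Address(next_hop).packed,
        metric,
    )
    return header + entry


packet = build_ripv2_host_route("192.168.10.1", metric=1)
print(len(packet))  # 24
```

A next hop of 0.0.0.0 corresponds to omitting the next hop address, which in RIP version 2 means that routing goes through the sender of the message.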

3.9.8. Details tab

IP Address Server Individual Setup

Enter the virtual IP address to use. To specify a number of mask bits explicitly, specify the address followed by /number_of_mask_bits. (For an IPv6 address, be sure to specify /number_of_mask_bits.)

NIC Alias Name Server Individual Setup

Enter the NIC interface name that activates the virtual IP address to be used.

Destination IP Address Server Individual Setup

Enter the destination IP address of RIP packets. IPv4 specifies the broadcast address and IPv6 specifies the router IPv6 address.

Source IP Address Server Individual Setup

Enter the IP address to bind when sending RIP packets. Specify the actual IP address activated on NIC which activates the virtual IP address.

To use an IPv6 address, specify a link local address as the source IP address.

Note

Set the source IP address individually for each server, using the actual IP address of that server. Virtual IP resources do not operate properly if the source address is invalid. On the Common tab, enter the source IP address of one of the servers, and configure the other servers individually.

Send Interval (1 to 30) Server Individual Setup

Specify the send interval of RIP packets.

Use Routing Protocol Server Individual Setup

Specify the RIP version to use. For an IPv4 environment, select RIPver1 or RIPver2. For an IPv6 environment, select RIPngver1, RIPngver2, or RIPngver3. You can select more than one routing protocol.

Tuning

Opens the Virtual IP Resource Tuning Properties dialog box, where you can make the advanced settings for the virtual IP resource.

Virtual IP Resource Tuning Properties

Parameter tab

Detailed settings on parameters for the virtual IP resource are displayed.

ifconfig

The following are the detailed settings for acquiring IP addresses and for the [ifconfig] command executed to activate and/or deactivate the virtual IP resource.

  • Timeout (1 to 999)
    Set the timeout of the [ifconfig] command. This parameter has no effect in an environment in which the [ifconfig] command cannot be used. In such an environment, specify 60 seconds (the default value).

Ping

In this box, make detailed settings of the ping command used to check for any overlapped IP address before activating the virtual IP resource.

  • Interval (0 to 999)
    Specify the interval to issue the ping command in seconds.
  • Timeout (0 to 999)
    Specify the time-out for the ping command in seconds.
    When 0 is specified, the ping command is not run.
  • Retry Count (0 to 999)
    Specify how many times issuing the ping command is retried.
  • Forced VIP Activation
    Specify whether to forcibly activate the virtual IP address when an overlapped IP address is found using the ping command.
    • When the check box is selected
      The virtual IP address is forcibly activated.
    • When the check box is not selected
      The virtual IP address is not forcibly activated.

ARP Send Count (0 to 999)

Specify how many times you want to send ARP packets when activating virtual IP resources.

If this is set to zero (0), ARP packets will not be sent.

Judge NIC Link Down as Failure

Specify whether to check for an NIC Link Down before the virtual IP resource is activated. In some NIC boards and drivers, the required ioctl( ) may not be supported. To check the availability of the NIC Link Up/Down monitor, use the [ethtool] command provided by the distributor. For the check method using the [ethtool] command, see "Note on NIC Link Up/Down monitor resources" in "Understanding NIC Link Up/Down monitor resources" in this guide.

  • When the check box is selected
    In the case of an NIC Link Down, the virtual IP resource is not activated.
  • When the check box is not selected
    Even in the case of an NIC Link Down, the virtual IP resource is activated. This operation is the same as before.

Initialize

Click Initialize to reset the values of all items to their default values.

Deactivity Check tab

Detailed settings on deactivity check of virtual IP resource are displayed.

Confirm I/F Deletion

After the virtual IP is deactivated, the cluster confirms that the given virtual IP address has been deleted successfully. Specify whether a failure of this confirmation is treated as a deactivity failure of the virtual IP resource.

  • Failure:
    The error is treated as a deactivity failure of the virtual IP resource.
  • Not Failure:
    The error is not treated as a deactivity failure of the virtual IP resource.

Confirm I/F Response

After deactivating a virtual IP, the cluster confirms that the virtual IP address can no longer be reached by the ping command. Configure whether reaching the virtual IP address by the ping command is treated as a deactivity failure.

  • Failure:
    Treats as a deactivity failure of a virtual IP resource.
  • Not Failure:
    Does not treat as a deactivity failure of a virtual IP resource.

RIP tab

Detailed settings on RIP of virtual IP resource are displayed.

Next Hop IP Address

Enter the next hop address (address of the next router). Next hop IP address can be omitted. It can be specified for RIPver2 only. You cannot specify a netmask or prefix.

Metric (1 to 15)

Enter a metric value of RIP. A metric is a hop count to reach the destination address.

Port

On Port Number, a list of communication ports used for sending RIP is displayed.

Add

Add a port number used for sending RIP. Clicking this button displays the dialog box to enter a port number.

Port No.

Enter a port number to be used for sending RIP, and click OK.

Edit

A dialog box to enter a port number is displayed. The port selected in the Port Number is displayed. Edit it and click OK.

Remove

Click Remove to remove the selected port on the Port Number.

RIPng tab

Detailed settings on RIPng of virtual IP resource are displayed.

Metric (1 to 15)

Enter a metric value of RIPng. A metric is a hop count to reach the destination address.

Port

On Port Number, a list of ports used for sending RIPng is displayed.

Add

Add a port number used for sending RIPng. Clicking this button displays the dialog box to enter a port number.

Port No.

Enter a port number to be used for sending RIPng, and click OK.

Edit

A dialog box to enter a port number is displayed. The port selected in the Port Number is displayed. Edit it and click OK.

Remove

Click Remove to remove the selected port on the Port Number.

3.10. Understanding Mirror disk resources

3.10.1. Dependencies of Mirror disk resource

By default, this function depends on the following group resource types.

Group resource type

Floating IP resource

Virtual IP resource

AWS Elastic IP resource

AWS Virtual IP resource

AWS DNS resource

Azure probe port resource

Azure DNS resource

3.10.2. Mirror disk

Mirror disk

Mirror disks are a pair of disks that mirror disk data between two servers in a cluster.

Data partition

Partitions where data to be mirrored (such as application data) is stored are referred to as data partitions. Allocate data partitions as follows:

  • Data partition size
    The size of data partition should be 1GB or larger but smaller than 1TB.
    (Less than 1TB size is recommended from the viewpoint of the construction time and the restoration time of data.)
  • Partition ID
    83(Linux)
  • If Execute initial mkfs is selected in the cluster configuration information, a file system is automatically created when a cluster is generated.
  • EXPRESSCLUSTER is responsible for the access control (mount/umount) of file system. Do not configure the settings that allow the OS to mount or unmount a data partition.

Cluster partition

Dedicated partitions that EXPRESSCLUSTER uses to control mirror partitions are referred to as cluster partitions.

Allocate cluster partitions as follows:

  • Cluster partition size
    1024MB or more. Depending on the geometry, the size may be larger than 1024MB, but that is not a problem.
  • Partition ID
    83(Linux)
  • A cluster partition and data partition for data mirroring should be allocated in a pair.
  • Do not make the file system on cluster partitions.
  • EXPRESSCLUSTER controls file system access (mount/umount) using the mirror partition device as the device to mount. Therefore, do not configure the OS to mount or unmount the cluster partition.

Mirror Partition Device (/dev/NMPx)

One mirror disk resource provides the file system of the OS with one mirror partition. If a mirror disk resource is registered to the failover group, it can be accessed from only one server (it is generally the primary server of the resource group).

Typically, the mirror partition device (/dev/NMPx) remains invisible to users (AP) because they perform I/O via a file system. The device name is assigned so that it does not overlap with other names when the information is created by the Cluster WebUI.

  • EXPRESSCLUSTER is responsible for the access control (mount/umount) of file system. Do not configure the settings that allow the OS to mount or unmount a data partition.
    Applications access a mirror partition (mirror disk resource) in the same way as a switching partition (disk resource) that uses a shared disk.
  • Mirror partition switching is done for each failover group according to the failover policy.

Mirror disk connect

A maximum of two mirror disk connects can be registered per mirror disk resource.

  • When two mirror disk connects are registered, operations such as switching behave as follows:

    • The paths used to synchronize mirror data can be made redundant. With this setting, mirror data can be synchronized even when one of the mirror disk connects becomes unavailable, for example due to disconnection.

    • The speed of mirroring does not change.

    • When mirror disk connects switch during data writing, mirror break may occur temporarily. After switching mirror disk connects completes, differential mirror recovery may be performed.

    • When mirror disk connects switch during mirror recovery, mirror recovery may be suspended. If automatic mirror recovery is enabled, mirror recovery resumes automatically after the switching of mirror disk connects completes. If automatic mirror recovery is disabled, you need to perform mirror recovery again after the switching of mirror disk connects completes.

For the mirror disk connect settings, see "Cluster properties" - "Interconnect tab" in "2. Parameter details" in this guide.

  • Disk partition

    It is possible to allocate a mirror disk partition (cluster partition, data partition) on a disk where the OS (such as the root partition or swap partition) is located.

    • When maintainability at a failure is important:
      It is recommended to allocate a disk for mirror which is not used by the OS (such as root partition, swap partition).
    • If LUN cannot be added due to H/W RAID specifications:
      If you are using hardware/RAID preinstall model where the LUN configuration cannot be changed, you can allocate a mirror partition (cluster partition, data partition) in the disk where the OS (root partition, swap partition) is located.

    Example: Adding a SCSI disk to both servers to create a pair of mirroring disks.

    Example: Using available area of the IDE disks of both servers on which the OS is stored to create a pair of mirroring disks.

  • Disk allocation
    You may use more than one disk for mirror disk. You may also allocate multiple mirror partition devices to a single disk.

    Example: Adding two SCSI disks to both servers to create two pairs of mirroring disks.

    Example: Adding a SCSI disk for both servers to create two mirroring partitions.

3.10.3. Understanding mirror parameters

Mirror Data Port Number

Set the TCP port number used for sending and receiving mirror data between servers. It needs to be configured for individual mirror disk resources.

The default value is displayed when a mirror disk resource is added in Cluster WebUI based on the following condition:

  • The smallest unused port number that is 29051 or greater

Heartbeat Port Number

Set the port number that a mirror driver uses to communicate control data between servers. It needs to be configured for individual mirror disk resources.

The default value is displayed when a mirror disk resource is added in Cluster WebUI based on the following condition:

  • The smallest unused port number that is 29031 or greater

ACK2 Port Number

Set the port number that a mirror driver uses to communicate control data between servers. It needs to be configured for individual mirror disk resources.

The default value is displayed when a mirror disk resource is added in Cluster WebUI based on the following condition:

  • The smallest unused port number that is 29071 or greater
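
The default-selection rule shared by the three port settings above (the smallest unused number at or above a base value) can be sketched as follows. This is only an illustration and not the Cluster WebUI implementation: the function name and the `used` collection are hypothetical, and this guide does not describe how Cluster WebUI decides which ports count as unused.

```python
from typing import Iterable


def smallest_unused_port(base: int, used: Iterable[int], upper: int = 65535) -> int:
    """Return the smallest port number >= base that is not in `used`."""
    taken = set(used)
    for port in range(base, upper + 1):
        if port not in taken:
            return port
    raise ValueError(f"no unused port in {base}..{upper}")


# Hypothetical situation: one mirror disk resource already holds the base ports.
used_ports = [29051, 29031, 29071]
print(smallest_unused_port(29051, used_ports))  # Mirror Data Port Number candidate
print(smallest_unused_port(29031, used_ports))  # Heartbeat Port Number candidate
print(smallest_unused_port(29071, used_ports))  # ACK2 Port Number candidate
```

Under this assumption, a second mirror disk resource would be offered the next number above each base value.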

The maximum number of request queues

Configure the number of queues for I/O requests (write requests) from the higher layer of the OS to the mirror disk driver. If a larger value is selected, the write performance will improve but more physical memory will be required.

Note the following when setting the number of queues:

  • The improvement in the performance is expected when a larger value is set under the following conditions:

    • Large amount of physical memory is installed on the server and there is plenty of available memory.

Connection Timeout

This timeout is the maximum time to wait for a connection between servers to be established when recovering the mirror or synchronizing data.

Send timeout

This timeout is used:

  • For the time passed waiting for the write data to be completely sent from the active server to the standby server from the beginning of the transmission at mirror recovery or data synchronization.

    In detail, this timeout is to wait for write data to be completely stored in the send buffer of a network (TCP) once data storing begins. If the TCP buffer is full and there is no free space, a timeout occurs.

  • For the time interval for checking if the ACK send (in which the active server notifies the standby server of write completion) is necessary.

Receiving timeout

  • This timeout is used for the time passed waiting for the standby server to completely receive the write data from the active server from the beginning of the transmission.

Ack timeout

  • This timeout is used for the time passed waiting for the active server to receive the ACK notifying the completion of write once the active server begins sending write data to the standby server.
    If the ACK is not received within the specified timeout time, the difference information is accumulated to the bitmap for difference on the active server.

    If you use the synchronous mode, the response to an application may be held until the ACK is received or the timeout elapses.
    If you use the asynchronous mode, a response to an application is returned after writing to the active server's disk. (This response does not wait for the ACK.)
  • This timeout is used for the time passed waiting for the standby server to receive the ACK from the active server after the standby server completely sent the ACK notifying the completion of write.
    If the ACK for the active server is not received within the specified timeout time, the difference information is accumulated to the bitmap for difference on the standby server.

  • This timeout is used for the time passed waiting for the copy source server to receive the ACK notifying completion from the copy destination server after it began the data transmission when recovering mirror.

    Each time the amount of recovery data sent reaches the Recovery Data Size, one ACK is returned. (Recovery Data Size is described below.)
    Therefore, a larger Recovery Data Size makes sending more efficient, but if an ACK timeout occurs, the amount of data to be resent also becomes larger.

Heartbeat Interval (1 to 600)

Heartbeat interval (sec) for checking the soundness of the mirror disk connect between the mirror drivers of two servers. Use the default whenever possible.

ICMP Echo Reply Receive Timeout (1 to 100)

Value used for heartbeat that is performed to check the soundness of the mirror disk connect between the mirror drivers of two servers. The maximum wait time from when ICMP Echo Request is sent until ICMP Echo Reply is received from the destination server. If ICMP Echo Reply is not received even if this timeout elapses, the reception is repeated for up to the ICMP Echo Request retry count, explained later. Use the default whenever possible.

ICMP Echo Request Retry Count (1 to 50)

Enter the maximum number of times to resend the ICMP Echo Request when no ICMP Echo Reply is received from the destination server before the ICMP Echo Reply receive timeout elapses. Use the default whenever possible.

Adjustment between the ICMP Echo Reply receive timeout and ICMP Echo Request retry count.

You can adjust the sensitivity that determines mirror disk connect disconnection by adjusting the ICMP Echo Reply receive timeout and ICMP Echo Request retry count.

  • Increasing the value

    • Case in which a network delay occurs in a remote location

    • Case in which a temporary failure occurs in a network

  • Decreasing the value

    • Case in which the time for detecting a network failure is to be reduced
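
As a rough aid for choosing the two values, the following sketch estimates the longest wait before a mirror disk connect is judged disconnected. It assumes, as a simplification that this guide does not confirm, that the request is sent once plus up to the retry count and that each attempt waits for the full receive timeout; the actual driver timing may differ. The function name is hypothetical.

```python
def worst_case_wait(reply_timeout: int, retry_count: int) -> int:
    """Estimate the longest wait, in the same unit as the timeout setting.

    Assumption: one initial request plus `retry_count` retries, each
    waiting for the full ICMP Echo Reply receive timeout.
    """
    if not 1 <= reply_timeout <= 100:
        raise ValueError("ICMP Echo Reply Receive Timeout must be 1 to 100")
    if not 1 <= retry_count <= 50:
        raise ValueError("ICMP Echo Request Retry Count must be 1 to 50")
    return reply_timeout * (1 + retry_count)


# Larger values tolerate network delay; smaller values detect failures sooner.
print(worst_case_wait(reply_timeout=2, retry_count=8))
print(worst_case_wait(reply_timeout=1, retry_count=2))
```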

Difference Bitmap Update Interval

Information to be written to the difference bitmap is temporarily accumulated in memory and written to the cluster partition at regular intervals. The standby server uses this interval to check whether there is information to write to the bitmap and, if so, to write it.

Difference Bitmap Size

Users can set the difference bitmap size.

If the data partition is large, enlarging the difference bitmap can sometimes make differential copy more efficient.

However, memory efficiency may deteriorate. Use the default value under normal conditions.

This setting must be configured before a mirror disk resource or a hybrid disk resource is created in the cluster. If a mirror disk resource or a hybrid disk resource already exists in the cluster, the setting cannot be changed.

Initial Mirror Construction

Specify whether to perform initial mirror construction [4] when the cluster is activated for the first time after it is created.

  • Execute the initial mirror construction
    Initial mirror construction is performed when the cluster is activated for the first time after it is created.
    The time required for initial mirror construction differs between ext2/ext3/ext4 and other file systems.
  • Do not execute initial mirror construction
    Initial mirror construction is not performed after the cluster is created.
    Before constructing the cluster, you must make the content of the mirror disks identical without using EXPRESSCLUSTER.

[4] Regardless of the existence of the FastSync Option, the entire data partition is copied.

Initial mkfs

Specify whether to create an initial file system in the data partition of the mirror disk when the cluster is activated for the first time after it is created.

  • Execute initial mkfs
    The first file system is created when activating cluster for the first time immediately after the cluster is created.
  • Do not execute initial mkfs
    Does not create a first file system to the data partition in the mirror disk when activating cluster for the first time immediately after the cluster is created.
    You can configure the settings so that the initial mkfs setting is not executed when a file system has been set up in the data partition of the mirror disk and contains data to be duplicated, which does not require file system construction or initialization by mkfs.
    The mirror disk partition [5] configuration should fulfill mirror disk resource requirements.

[5] There must be a cluster partition in a mirror disk. If you cannot allocate a cluster partition when the disk of a single server is the mirroring target, take a backup and then allocate the partition.

If Do not execute initial mirror construction is selected, Execute initial mkfs cannot be chosen. (Even when mkfs is run on both the active and standby data partitions, the two partitions differ immediately after mkfs. Therefore, whenever initial mkfs is executed, initial mirror construction (copying between the active data partition and the standby data partition) is also required. If Execute initial mirror construction is selected, Execute initial mkfs can be chosen.)
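
The dependency between the two settings can be expressed as a small check. This is a minimal sketch; the function and parameter names are hypothetical and are not EXPRESSCLUSTER configuration keys.

```python
def initial_settings_valid(execute_initial_mirror_construction: bool,
                           execute_initial_mkfs: bool) -> bool:
    """Initial mkfs may be selected only together with initial mirror construction."""
    return execute_initial_mirror_construction or not execute_initial_mkfs


# Enumerate all four combinations of the two settings.
for construct in (True, False):
    for mkfs in (True, False):
        state = "allowed" if initial_settings_valid(construct, mkfs) else "not allowed"
        print(f"initial mirror construction={construct}, initial mkfs={mkfs}: {state}")
```

Only the combination of no initial mirror construction with initial mkfs is rejected; the other three are permitted.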

Number of Queues

In the Asynchronous mode, specify the maximum number of queues in which write requests to the remote disk are held. For details on asynchronous mode setting, see "Details tab".
In cases such as when a slow network is used or if the amount of data requiring transmission (synchronization) increases as the amount written to the mirror increases, those data waiting for transmission (waiting for synchronization to be complete) are accumulated in these queues. Then, if the network speed becomes fast or if the amount of data transmitted (synchronized) decreases along with reduced writes to the mirror, data in queues waiting for transmission are transmitted. In this way, queues are used to absorb the increase and decrease in written data and the network speed change and to transmit data to the network.
If a larger value is set for the number of queues to absorb the increase and decrease in synchronous data, usually, the maximum time until synchronization is complete (Ack timeout) should also be set to a larger value.
These queues are created in the memory space. However, if the number of data units waiting for synchronization to be completed exceeds the maximum number of queues, then the excess is recorded and stored as a file.
By setting a larger maximum number of queues, the I/O performance may be improved, but more memory space will be used. For information on the required memory size, see "Installation requirements for EXPRESSCLUSTER" - "Software" - "Required memory and disk size" in the "Getting Started Guide".
In the case that the maximum number of queues is too large, if a synchronization timeout (Ack timeout) or a mirror communication break occurs while writing a large amount of data, an enormous volume of queue processes will arise at a time, possibly leading to extremely high load.

Rate limitation of Mirror Connect

In the Asynchronous mode, the server tries to transfer data that has been temporarily queued to the standby server as quickly as possible. For this reason, if the channel for mirror disk connect is used for other applications, the communication band may become busy, hindering other communications.
In this case, by imposing bounds on the communication band for mirror connect communication, the impact on other communications can be reduced.
If, however, the communication band for mirror disk connect is smaller than the average amount of data to be written to the mirror disk, the queued data cannot be fully transferred to the standby server, and eventually the maximum number of queues is reached, causing a mirroring interruption (mirror break). The bandwidth should be large enough for the amount of data that the business application writes.

Note

This function imposes a limit on the communication band by having a maximum one-second pause when the total amount of data to be transferred per second exceeds the configured value. If the amount of data to be written to the disk at one time exceeds the configured value, the expected level of performance may not be achieved. For example, when the amount of data to be transferred to a copy of a mirror disk at one time is 64 KB, even if you set a communication band limit of 64 KB or less per second, the actual amount of communication during copy can be greater than the configured value.
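
The relationship between the band limit and a mirror break can be illustrated with a simple steady-rate model. This is a sketch under stated assumptions: writes and transfers are treated as constant rates, and the backlog capacity is expressed in KB as a hypothetical stand-in for the combined limit of the queues and the history file. All names are illustrative.

```python
import math


def seconds_until_backlog_full(write_kb_per_sec: float,
                               limit_kb_per_sec: float,
                               capacity_kb: float) -> float:
    """Return the time until the untransferred backlog reaches capacity.

    Returns math.inf when the band limit is at least the write rate,
    because the backlog does not grow in this model.
    """
    growth = write_kb_per_sec - limit_kb_per_sec
    if growth <= 0:
        return math.inf
    return capacity_kb / growth


# Limit below the average write rate: the backlog fills up eventually.
print(seconds_until_backlog_full(1000, 800, 20000))
# Limit above the average write rate: the backlog never fills up.
print(seconds_until_backlog_full(500, 800, 20000))
```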

History File Store Directory

Specify the directory of a file in which, if the maximum number of queues created in the memory is exceeded in the Asynchronous mode, the excess is recorded.
It is recommended to prepare a disk for storing the history file and set the History File Store Directory on the disk, because the amount of I/O to/from the mirror disk may increase the I/O load on the History File Store Directory.

Size Limitation of History File

Specify the maximum accumulation in the history file in the Asynchronous mode. When the accumulation reaches the maximum, a mirror break occurs.

Compress Data

Specify whether to compress mirror synchronous data (in the case of Asynchronous mode) or mirror recovery data before transmission. If a slow network is used, compressing transmission data can reduce the amount of data to be transmitted.

Note

  • Compression may increase the CPU load at data transmission.

  • In a slow network, compression reduces the amount of data transmitted, so a reduction in time can be expected compared to uncompressed data. Conversely, in a fast network, increases in compression processing time as well as load are more noticeable than a reduction in transfer time, so a reduction in time might not be expected.

  • If most of data has a high compression efficiency, compression reduces the amount of data transmitted, so a reduction in time can be expected compared to uncompressed data. Conversely, if most of data has a low compression efficiency, not only the amount of data transmitted is not reduced, but also the compression processing time and load increase, in which case a reduction in time might not be expected.
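
The trade-off described in the notes above can be tried out with the following sketch. It uses Python's zlib only as a stand-in, because this guide does not state which compression method EXPRESSCLUSTER uses; the function names and the timing model (compression time plus transfer time) are assumptions for illustration.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    return len(zlib.compress(data)) / len(data)


def transfer_seconds(size_bytes: float, ratio: float,
                     bandwidth_bytes_per_sec: float,
                     compress_seconds: float = 0.0) -> float:
    """Simple model: time spent compressing plus time spent sending."""
    return compress_seconds + size_bytes * ratio / bandwidth_bytes_per_sec


repetitive = b"mirror data block " * 4096      # compresses very well
random_like = os.urandom(len(repetitive))      # practically incompressible

print(compressed_ratio(repetitive))
print(compressed_ratio(random_like))

# Slow link: halving the data outweighs one second of compression time.
print(transfer_seconds(1000, 1.0, 100), transfer_seconds(1000, 0.5, 100, 1.0))
# Fast link: the same compression time dominates and the total gets longer.
print(transfer_seconds(1000, 1.0, 10000), transfer_seconds(1000, 0.5, 10000, 1.0))
```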

Mirror agent send time-out

Time-out for the mirror agent waiting to complete processing data after sending a request to the other server.

Mirror agent receiving time-out

Time-out for the mirror agent waiting to start receiving data after the mirror agent creates a communication socket with the other server.

Recovery Data Size (64 to 32768)

Specify the amount of data exchanged between the two servers in a single mirror recovery operation. Use the default size in general.

  • Specify a larger size

    • It takes less time to completely process mirror recovery because the number of data exchanges between two servers decreases.

    • During mirror recovery, disk performance may degrade.
      (This is because, if the disk read range for mirror recovery data and the disk write range for a file system overlap, access is excluded and a wait occurs until the first processing is complete.
      In a slow network environment, if there is a large amount of recovery data, a single data transfer for mirror recovery will take more time. If a normal disk access for mirror data and this data transfer range for mirror recovery overlap, disk access is awaited until the transfer is complete. This may lead to degraded disk performance.
      Therefore, specify a smaller size, especially for a slow network environment.)
  • Specify a smaller size

    • Sending/receiving data between two servers gets segmented and the possibility for a timeout to occur is decreased with a slow network speed or a high server load.

    • Because the number of exchanges between two servers increases, mirror recovery takes more time, especially in a network where delay occurs easily.
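
Because one ACK is returned each time the amount of recovery data sent reaches the Recovery Data Size (see Ack timeout), the number of exchanges for a given amount of recovery data can be estimated as follows. This is a sketch only: the function name is hypothetical, and both arguments must be given in the same unit as the Recovery Data Size setting.

```python
def recovery_exchanges(total_recovery_data: int, recovery_data_size: int) -> int:
    """Estimate the number of send/ACK exchanges needed for mirror recovery."""
    if not 64 <= recovery_data_size <= 32768:
        raise ValueError("Recovery Data Size must be 64 to 32768")
    if total_recovery_data < 0:
        raise ValueError("total_recovery_data must not be negative")
    # Ceiling division: a final partial chunk still needs its own exchange.
    return -(-total_recovery_data // recovery_data_size)


total = 1_048_576  # hypothetical amount of data to recover
print(recovery_exchanges(total, 64))      # many small exchanges
print(recovery_exchanges(total, 32768))   # few large exchanges
```

Fewer exchanges shorten recovery on a network with high latency, while each exchange then holds a larger range of the disk, as described above.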

3.10.4. Examples of mirror disk construction

If you are using a disk that has been used as a mirror disk in the past, you must format the disk because old data exists in its cluster partition. For the initialization of a cluster partition, refer to the "Installation and Configuration Guide".

  • Execute the initial mirror construction
    Executing initial mkfs

  • Execute the initial mirror construction
    Not executing initial mkfs

  • Do not execute initial mirror construction
    Not executing initial mkfs

    The following is an example of making the mirror disks of both servers identical. (This cannot be done after constructing the cluster. Be sure to perform this before the cluster construction.)

    Example 1

    Copying partition images of a disk

    Example 2

    Copying by a backup device

3.10.5. Notes on mirror disk resources

  • If both servers cannot access the identical partitions under the identical device name, configure the server individual setting.

  • If Exclude Mount/Unmount Commands is selected on the Extension tab in Cluster Properties, activation/deactivation of mirror resource may take time because mount/umount is performed exclusively to disk resource, VxVM volume resource, NAS resource, and mirror resource in the same server.

  • If a path that includes a symbolic link is specified as the mount point, Force Operation cannot be performed even if it is selected as the operation in Detecting Failure.
    Similarly, if a path containing "//" is specified, forced termination also fails.
  • Disks using stripe set, volume set, mirroring, stripe set with parity by Linux md cannot be specified for the cluster partition and data partition.

  • Volumes by Linux LVM can be specified for the cluster partition and data partition.
    For SuSE Linux, volumes by LVM or MultiPath cannot be used for the cluster partition or data partition.
  • Mirror disk resources (mirror partition devices) cannot be the targets of stripe set, volume set, mirroring, stripe set with parity by Linux md or LVM.

  • When the geometries of the disks used as mirror disks differ between the servers:

    The size of a partition allocated by the fdisk command is aligned by the number of blocks (units) per cylinder.

    Allocate data partitions to achieve the following data partition size and direction of the initial mirror construction.

    Source server <= Destination server

    "Source server" refers to the server with the higher failover policy in the failover group to which a mirror resource belongs.
    "Destination server" refers to the server with the lower failover policy in the failover group to which a mirror resource belongs.

    If the data partition sizes differ significantly between the copy source and the copy destination, initial mirror construction may fail. Be careful, therefore, to secure data partitions of similar sizes.
    Make sure that the data partition sizes do not cross over 32GiB, 64GiB, 96GiB, and so on (multiples of 32GiB) on the source server and the destination server. For sizes that cross over multiples of 32GiB, initial mirror construction may fail.

    Examples)

    Combination  Data partition size        Description
                 On server 1  On server 2
    OK           30GiB        31GiB         OK because both are in the range of 0 to 32GiB.
    OK           50GiB        60GiB         OK because both are in the range of 32GiB to 64GiB.
    NG           30GiB        39GiB         Error because they are crossing over 32GiB.
    NG           60GiB        70GiB         Error because they are crossing over 64GiB.
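
The two size rules above (source no larger than destination, and both sizes between the same pair of 32GiB multiples) can be checked with a short sketch. The function names are hypothetical. How a size that is exactly a multiple of 32GiB is classified is not stated in this guide; the code assumes it belongs to the upper range.

```python
BAND_GIB = 32  # range width taken from the "multiples of 32GiB" rule


def same_band(size1_gib: float, size2_gib: float) -> bool:
    """True when both sizes lie between the same pair of 32GiB multiples."""
    return size1_gib // BAND_GIB == size2_gib // BAND_GIB


def data_partitions_ok(source_gib: float, destination_gib: float) -> bool:
    """Check 'source <= destination' and the 32GiB range rule together."""
    return source_gib <= destination_gib and same_band(source_gib, destination_gib)


# The four combinations from the examples above.
for src, dst in [(30, 31), (50, 60), (30, 39), (60, 70)]:
    print(src, dst, "OK" if data_partitions_ok(src, dst) else "NG")
```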

  • Do not use the O_DIRECT flag of the open() system call for a file used in a mirror disk resource.
    Examples include the Oracle parameter filesystemio_options = setall.
  • Do not specify a mirror partition device (such as /dev/NMP1) as the monitor target in the READ (O_DIRECT) disk monitoring mode.

  • For the data partition and the cluster partition of mirror disk resources, use disk devices with the same logical sector size on all servers. If you use devices with different logical sector sizes, they do not operate normally. The data partition and the cluster partition may have different logical sector sizes from each other.

Examples)

             Logical sector size of the partition
             Server 1              Server 2
Combination  Data       Cluster    Data       Cluster    Description
             partition  partition  partition  partition
OK           512B       512B       512B       512B       The logical sector sizes are uniform.
OK           4KB        512B       4KB        512B       The data partitions have a uniform size of 4 KB, and the cluster partitions have a uniform size of 512 bytes.
NG           4KB        512B       512B       512B       The logical sector sizes for the data partitions are not uniform.
NG           4KB        4KB        4KB        512B       The logical sector sizes for the cluster partitions are not uniform.
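
The rule illustrated by these examples can be written as a small check. This is a sketch with hypothetical names; the sector sizes are passed in as plain numbers of bytes, which on Linux can typically be obtained per device with the `blockdev --getss` command.

```python
from typing import NamedTuple, Sequence


class PartitionSectors(NamedTuple):
    data: int      # logical sector size of the data partition, in bytes
    cluster: int   # logical sector size of the cluster partition, in bytes


def sector_sizes_ok(servers: Sequence[PartitionSectors]) -> bool:
    """Data partitions must match across servers, and so must cluster partitions.

    The data and cluster partitions may differ from each other.
    """
    data_sizes = {s.data for s in servers}
    cluster_sizes = {s.cluster for s in servers}
    return len(data_sizes) == 1 and len(cluster_sizes) == 1


# The four combinations from the examples above (server 1, server 2).
rows = [
    (PartitionSectors(512, 512), PartitionSectors(512, 512)),
    (PartitionSectors(4096, 512), PartitionSectors(4096, 512)),
    (PartitionSectors(4096, 512), PartitionSectors(512, 512)),
    (PartitionSectors(4096, 4096), PartitionSectors(4096, 512)),
]
for row in rows:
    print("OK" if sector_sizes_ok(row) else "NG")
```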

  • Do not use HDDs and SSDs in combination for the disks used for the data partition and the cluster partition of mirror disk resources. If you use them in combination, optimum performance cannot be obtained. The data partition and the cluster partition may use different disk types from each other.

    Examples)

                 Disk type of the partition
                 Server 1              Server 2
    Combination  Data       Cluster    Data       Cluster    Description
                 partition  partition  partition  partition
    OK           HDD        HDD        HDD        HDD        The disk types are uniform.
    OK           SSD        HDD        SSD        HDD        The data partitions are of the uniform disk type of SSD, and the cluster partitions are of the uniform type of HDD.
    NG           SSD        HDD        HDD        HDD        As the data partitions, both HDD and SSD are used.
    NG           SSD        SSD        SSD        HDD        As the cluster partitions, both HDD and SSD are used.

  • The bit64 format of an ext4 filesystem is not supported.
    To format ext4 manually on RHEL7, Asianux Server 7, and Ubuntu, add the option to disable bit64 to the mkfs command.

3.10.6. mount processing flow

The mount processing needed to activate the mirror disk resource is performed as follows:

With none specified for the file system, the mount processing does not occur.

  1. Is the device already mounted?

    When already mounted -> To X

  2. Is fsck set to be run before mounting?

    Timing at which to run fsck -> Run fsck for the device.

  3. Mount the device.

    Mounted successfully -> To O

  4. Is mounting set to be retried?

    When retry is not set -> To X

  5. When fsck(xfs_repair) is set to be run if mounting fails:

    When fsck has run successfully in 2. -> Go to 6.

    When mounting fails due to a timeout in 3. -> Go to 6.

    Other than the above -> Run fsck(xfs_repair) for the device.

  6. Retry mounting of the device.

    Mounted successfully -> To O

  7. Has the retry count for mounting been exceeded?

    Within the retry count -> Go to 6.

    The retry count has been exceeded -> To X

O The resource is activated (mounted successfully).

X The resource activation has failed (not mounted).
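
The flow above can be traced in code as follows. This is only a sketch of the documented decision steps, not EXPRESSCLUSTER's implementation: the callables stand in for the real checks and commands, all names are hypothetical, a retry count of 0 is assumed to mean that retry is not set, and the flow is assumed to continue even if the fsck in step 2 fails. The case in which none is specified for the file system is outside this sketch.

```python
from enum import Enum
from typing import Callable


class MountResult(Enum):
    OK = "ok"
    TIMEOUT = "timeout"
    ERROR = "error"


def activate_mirror_mount(
    is_mounted: Callable[[], bool],
    run_fsck: Callable[[], bool],
    mount: Callable[[], MountResult],
    *,
    fsck_before_mount: bool,
    fsck_on_mount_failure: bool,
    retry_count: int,
) -> bool:
    """Return True for "O" (activated) and False for "X" (activation failed)."""
    # Step 1: a device that is already mounted leads to X.
    if is_mounted():
        return False
    # Step 2: optionally run fsck before mounting.
    fsck_succeeded = run_fsck() if fsck_before_mount else False
    # Step 3: first mount attempt.
    first = mount()
    if first is MountResult.OK:
        return True
    # Step 4: no retry configured (assumed to be retry_count <= 0).
    if retry_count <= 0:
        return False
    # Step 5: run fsck after the failure unless it already succeeded in
    # step 2 or the first mount failed because of a timeout.
    if (fsck_on_mount_failure and not fsck_succeeded
            and first is not MountResult.TIMEOUT):
        run_fsck()
    # Steps 6 and 7: retry mounting until success or the count is used up.
    for _ in range(retry_count):
        if mount() is MountResult.OK:
            return True
    return False
```

Passing the checks as callables keeps the sketch testable without touching a real device; for example, `mount` can be a function that returns a scripted sequence of results.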

3.10.7. umount processing flow

The umount processing to deactivate the mirror disk resource is performed as follows:

With none specified for the file system, the umount processing does not occur.

  1. Is the device already unmounted?

    When already unmounted -> To X

  2. Unmount the device.

    Unmounted successfully -> To O

  3. Is unmount set to be retried?

    When retry is not set -> To X

  4. Is the device still mounted? (Is the mount point removed from the mount list and is the mirror device in the unused status?)

    No longer mounted -> To O

  5. Try KILL for the process using the mount point.

  6. Retry unmount of the device.

    Unmounted successfully -> To O

  7. Is the result other than the unmount timeout and is the mount point removed from the mount list?

    The mount point has already been removed.

    -> Wait until the mirror device is no longer used.

    (Wait no more than a length of time equal to the unmount timeout.)

  8. Has the retry count for unmount been exceeded?

    Within the retry count -> Go to 4.

    The retry count is exceeded -> To X

O The resource is stopped (unmounted successfully).

X The resource stop has failed (still mounted, or already unmounted).
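
The umount flow can be traced in the same way. Again this is a sketch under assumptions rather than the product's implementation: all names are hypothetical, "already unmounted" in step 1 is judged only from the mount list, a retry count of 0 is assumed to mean that retry is not set, and after the wait in step 7 the flow is assumed to continue with step 8.

```python
from enum import Enum
from typing import Callable


class UmountResult(Enum):
    OK = "ok"
    TIMEOUT = "timeout"
    ERROR = "error"


def deactivate_mirror_mount(
    in_mount_list: Callable[[], bool],
    device_in_use: Callable[[], bool],
    umount: Callable[[], UmountResult],
    kill_users: Callable[[], None],
    wait_until_unused: Callable[[], None],
    *,
    retry_count: int,
) -> bool:
    """Return True for "O" (stopped) and False for "X" (stop failed)."""
    # Step 1: a device that is already unmounted leads to X.
    if not in_mount_list():
        return False
    # Step 2: first unmount attempt.
    if umount() is UmountResult.OK:
        return True
    # Step 3: no retry configured (assumed to be retry_count <= 0).
    if retry_count <= 0:
        return False
    for _ in range(retry_count):
        # Step 4: mount point gone from the list and mirror device unused.
        if not in_mount_list() and not device_in_use():
            return True
        # Step 5: try KILL for the processes using the mount point.
        kill_users()
        # Step 6: retry the unmount.
        result = umount()
        if result is UmountResult.OK:
            return True
        # Step 7: not a timeout and the mount point is already removed, so
        # wait (up to the unmount timeout) for the device to become unused.
        if result is not UmountResult.TIMEOUT and not in_mount_list():
            wait_until_unused()
    # Step 8: the retry count has been exceeded.
    return False
```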

3.10.8. Conditions under which the mirror status becomes abnormal

The following lists the most common situations in which the status of a mirror disk resource changes from normal (GREEN) to abnormal (RED).

  • Due to the disconnection of communication (mirror disconnect), stoppage of the standby server, etc., mirror synchronization between the active and standby servers fails, leading to differences between the servers.
    The standby server does not retain the latest data, so enters the abnormal (RED) state.
  • Settings are made so that mirror data is not synchronized, causing differences between the active and standby servers.
    The standby server does not retain the latest data, so enters the abnormal (RED) state.
  • A mirror disk disconnection (mirroring interruption) operation is performed.
    The standby server enters the abnormal (RED) state.
  • Mirror recovery is interrupted during mirror recovery (during mirror re-synchronization).
    The standby server has not completed copying, so enters the abnormal (RED) state.
  • The active server does not execute cluster shutdown normally due to server down, etc.
    (The activated mirror disk resource stops without switching to the deactivated state.)
    The mirror disk of the server enters the abnormal (RED) state after the server starts.
  • After a mirror disk is activated by starting only one server, the server is stopped without performing mirror synchronization, and then the other server is started and the mirror disk is activated.
    Because the mirror disks of the two servers are updated individually,
    those disks enter the abnormal (RED) state.
    If the mirror disks of the two servers are updated individually as described above, it is not possible to automatically judge the mirror disk of which server should act as the copy source, so automatic mirror recovery is not performed. In this case it is necessary to execute forced mirror recovery.
  • Due to the disconnection of communication (mirror disconnect), reboot of the standby server, etc., mirror synchronization between the active and standby servers fails, causing differences between the servers and, later, the active server fails to execute cluster shutdown normally due to a server down, etc.
    In this case, if the server normally fails over to the standby server later, both servers enter the abnormal (RED) state after the servers start.
    In this case, automatic mirror recovery is not performed, either. Rather, it is necessary to execute forced mirror recovery.

For details on how to refer to the status of a mirror, see the following:

For details on how to perform the mirror recovery or forcible mirror recovery, see the following:

3.10.9. Details tab

Mirror Partition Device Name

Select a mirror partition device name to be associated with the mirror partition.

Device names of mirror disk resource/hybrid disk resource that have already been configured are not displayed on the list.

Mount Point (Within 1023 bytes) Server Individual Setup

Specify the directory on which to mount the mirror partition device. The name should begin with "/".

Data Partition Device Name (Within 1023 bytes) Server Individual Setup

Specify a data partition device name to be used for a disk resource.

The name should begin with "/".

Cluster Partition Device Name (Within 1023 bytes) Server Individual Setup

Specify a cluster partition device name to be paired with the data partition.

The name should begin with "/".

File System

Select the file system type to be used on the mirror partition. Choose one from the list box, or enter the type directly.

  • ext2

  • ext3

  • ext4

  • xfs

  • jfs

  • reiserfs

  • none (no file system)

Mirror Disk Connect

Add, delete or modify mirror disk connects. In the Mirror Disk Connects list, I/F numbers of the mirror disk connects used for mirror disk resources are displayed.

In Available Mirror Disk Connect, mirror disk connect I/F numbers that are currently not used are displayed.

  • Set mirror disk connects on the Cluster Properties.

  • A maximum of two mirror disk connects can be used per mirror disk resource. For the behavior when two mirror disk connects are used, see "Mirror disk".

  • For details on how to configure mirror disk connects, see the "Installation and Configuration Guide".

Add

Use Add to add a mirror disk connect. Select the I/F number you want to add from Available Mirror Disk Connect and then click Add. The selected number is added to the Mirror Disk Connects list.

Remove

Use Remove to remove a mirror disk connect that is currently used. Select the I/F number you want to remove from the Mirror Disk Connects list and then click Remove. The selected number is added to Available Mirror Disk Connect.

Order

Use the arrows to change the priority of the mirror disk connects to be used. Select the I/F number whose priority you want to change from the Mirror Disk Connects list and then click the arrows.

Tuning

Opens the Mirror Disk Resource Tuning Properties dialog box. You make detailed settings for the mirror disk resource there.

Mirror disk resource tuning properties

Mount tab

The advanced settings of mount are displayed.

This does not appear with none selected from File System under the Details tab of the Resource Properties dialog box.

Mount Option (Within 1023 bytes)

Enter options to give the mount command when mounting the file system on the mirror partition device. Use a comma "," to separate multiple options.

Mount option example

Setting item                    Setting value
Mirror partition device name    /dev/NMP5
Mirror mount point              /mnt/sdb5
File system                     ext3
Mount option                    rw,data=journal

The mount command to be run with the above settings is:

mount -t ext3 -o rw,data=journal /dev/NMP5 /mnt/sdb5
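
As a minimal sketch of how the settings map onto that command line, the following Python function assembles the same arguments. The function name is hypothetical, and omitting -o when no option is set is an assumption; how EXPRESSCLUSTER invokes mount internally is not described here.

```python
import shlex


def build_mount_command(device, mount_point, fs_type, options=""):
    """Return the argument list for the mount command shown above."""
    argv = ["mount", "-t", fs_type]
    if options:
        # Multiple options are passed as one comma-separated string.
        argv += ["-o", options]
    argv += [device, mount_point]
    return argv


cmd = build_mount_command("/dev/NMP5", "/mnt/sdb5", "ext3", "rw,data=journal")
print(shlex.join(cmd))
```

Building an argument list instead of a single string avoids shell quoting problems if a mount point contains spaces.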

Timeout (1 to 999)

Enter how many seconds to wait for the mount command to complete before it times out when mounting the file system on the mirror partition device. Choose the value carefully, because the command may take some time to complete if the file system is large.

Retry Count (0 to 999)

Enter how many times to retry mounting the file system on the mirror partition device when mounting fails. If you set this to zero (0), mount will not be retried.

Initialize

Clicking Initialize resets the values of all items to the default values.

Unmount tab

The advanced settings for unmounting are displayed.

This does not appear with none selected from File System under the Details tab of the Resource Properties dialog box.

Timeout (1 to 999)

Enter how many seconds you want to wait for the unmount command completion before its timeout when you unmount the file system on the mirror partition device.

Retry Count (0 to 999)

Enter how many times to retry unmounting the file system on the mirror partition device when unmounting fails. If you set this to zero (0), unmount will not be retried.

Retry Interval (0 to 999)

Enter the interval in which you want to retry unmounting the file system from the mirror partition device when unmounting fails.

Forced operation when failure is detected

Select an action to be taken at an unmount retry if unmount fails.

  • kill:
    Select this option to try to forcibly terminate the processes that are accessing the mount point. Not all processes can be terminated.
  • No Operation:
    Select this option not to try killing the processes that are accessing the mount point.

Initialize

Clicking Initialize resets the values of all items to the default values.

Fsck tab

The advanced settings of fsck are displayed.

This does not appear with xfs or none selected from File System under the Details tab of the Resource Properties dialog box.

fsck Option (Within 1023 bytes)

Enter options to give the fsck command when checking the file system on the mirror partition device. Use a space to separate multiple options. Specify options so that the fsck command does not run interactively. Otherwise, resource activation results in an error once the time specified for fsck Timeout elapses.

fsck Timeout (1 to 9999)

Enter how many seconds to wait for the fsck command to complete before it times out when checking the file system on the mirror partition device. Choose the value carefully, because the command may take some time to complete if the file system is large.

fsck Action Before Mount

Select an fsck action before mounting file system on a disk device from the following choices:

  • Always Execute:
    fsck is executed before mounting the file system.
  • Execute at Specified Count:
    fsck is executed when the resource has been activated successfully the number of times specified by Count.
    Count (0 to 999)
  • Not Execute:
    fsck is not executed before mounting the file system.

Note

The specified count for fsck is not related to the check interval managed by a file system.

fsck Action When Mount Failed

Set an fsck action to take when detecting a mount failure on a disk device.
This setting is enabled when the setting of Mount Retry Count is other than zero.
  • When the check box is selected:
    Mount is retried after running fsck.
  • When the check box is not selected:
    Mount is retried without running fsck.

Note

Setting fsck Action Before Mount to "Not Execute" is not recommended. With this setting, the disk resource does not execute fsck, so the disk resource cannot be failed over if the switchable partition has an error that fsck could have recovered.

Rebuilding of reiserfs

Specify the action when reiserfsck fails with a recoverable error.

  • When the check box is selected:
    reiserfsck --fix-fixable is executed.
  • When the check box is not selected:
    Recovery is not performed even if reiserfsck fails with a recoverable error.

Initialize

Clicking Initialize resets the values of all items to the default values.

xfs_repair tab

The detailed settings related to [xfs_repair] are displayed. The tab appears only if [xfs] is set for the file system.

xfs_repair Option (Within 1023 bytes)

Enter the option to give to the [xfs_repair] command when checking the file system on the disk device. To enter multiple options, delimit each with a space.

xfs_repair Timeout (1 to 999)

Enter how many seconds you want to wait for the [xfs_repair] command completion before its timeout when you check the file system on the disk device. If the file system has a large size of disk space, it may take some time for the command to complete. Make sure that the value to set is not too small.

xfs_repair Action When Mount Failed

Set the [xfs_repair] action when mounting the file system on the disk device fails. This setting is enabled when the setting of Mount Retry Count is other than zero.

  • When the check box is selected:
    Mount is retried after running [xfs_repair].
  • When the check box is not selected:
    Mount is retried without running [xfs_repair].

Initialize

Clicking Initialize resets the values of all items to the default values.

Mirror tab

The advanced settings of mirror disks are displayed.

Execute the initial mirror construction

Specify if an initial mirror configuration is constructed when constructing a cluster.

  • When the check box is selected:
    An initial mirror configuration will be constructed.
    The time it takes to construct the initial mirror differs between ext2/ext3/ext4 and other file systems.
  • When the check box is not selected:
    An initial mirror configuration will not be constructed.

Execute initial mkfs

Specify whether an initial mkfs is executed when constructing a cluster. This option can be set only if the initial mirror is being constructed.

In the case of hybrid disk resources, the behavior of the clphdinit command is applied instead of the initial mkfs behavior upon cluster construction.

  • When the check box is selected:
    An initial mkfs will be run.
  • When the check box is not selected:
    An initial mkfs will not be run.

Perform Data Synchronization

Specify whether mirror data synchronization is executed when the mirror disk resource is activated.

  • When the check box is selected:
    Mirror data synchronization is executed. The write data is passed from the active server to the standby server. You can use the clpmdctrl command and the clphdctrl command to stop synchronizing mirror data.
  • When the check box is not selected:
    Mirror data synchronization is not executed. The write data is not passed from the active server to the standby server and is accumulated as a difference. You can use the clpmdctrl command and the clphdctrl command to switch to the state in which mirror data is synchronized.

Mode

Specify the synchronization mode of mirror data.

  • Synchronous
    Select when LAN is mainly used for mirror connect.
  • Asynchronous
    Select when WAN is mainly used for mirror connect. Specify Number of Queues when Asynchronous is chosen. Specify it for each mirror disk resource.
    • Unlimited:
      Queues are allocated as long as memory can be allocated. If memory allocation fails, a mirror break occurs.
    • Set Number (1 to 999999):
      Specify the maximum number of queues to be allocated. When the data to be synchronized exceeds this number, the excess is recorded in a history file.

    When Asynchronous is selected, the Rate limitation of Mirror Connect check box can be selected.

    • When the check box is selected (1 to 999999)
      The upper rate limitation of mirror connect is set.
    • When the check box is cleared
      The upper rate limitation of mirror connect is not set.

    With Asynchronous selected, you can edit the setting in the History File Store Directory text box to specify the directory of a file in which, if the maximum number of queues is exceeded, the excess is recorded. Without specifying the directory here, the file is generated under the following directory: (EXPRESSCLUSTER-installed directory)/work.

    With Asynchronous selected, you can edit the setting in the Size Limitation of History File text box. When the accumulation in the history file reaches the size specified here, a mirror break occurs. Specifying the value as 0 or nothing makes the size unlimited.

    When Asynchronous is selected, the Compress data check box can be selected.

    • When the check box is selected
      Mirror synchronous communication data is compressed.
    • When the check box is cleared
      Mirror synchronous communication data is not compressed.
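
The Number of Queues and Size Limitation of History File settings described above interact roughly as in the following toy model. This is a sketch under stated assumptions, not the mirror driver's implementation: the class and method names are hypothetical, sizes are counted in bytes for simplicity, and draining of the queue to the standby server is not modeled.

```python
from collections import deque


class AsyncMirrorBuffer:
    """Toy model of asynchronous mode: queue first, then history file, then mirror break."""

    def __init__(self, max_queues, history_limit_bytes=0):
        self.max_queues = max_queues               # None models "Unlimited"
        self.history_limit = history_limit_bytes   # 0 models an unlimited history file
        self.queue = deque()
        self.history_bytes = 0
        self.mirror_break = False

    def write(self, data: bytes) -> str:
        if self.mirror_break:
            return "break"
        if self.max_queues is None or len(self.queue) < self.max_queues:
            self.queue.append(data)                # still within the number of queues
            return "queued"
        self.history_bytes += len(data)            # excess goes to the history file
        if self.history_limit and self.history_bytes >= self.history_limit:
            self.mirror_break = True               # size limit reached -> mirror break
            return "break"
        return "history"
```

With `max_queues=2` and a 10-byte limit, two writes are queued, the third is recorded in the history file, and a further write that brings the history to 10 bytes causes a mirror break.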

Compress data when recovering

Specify whether to compress mirror recovery communication data.

Initialize

Clicking Initialize resets the values of all items to the default values.

Mirror Driver tab

Advanced settings for the mirror driver are displayed.

Mirror Data Port Number (1 to 65535 [6])

Set the TCP port number used for sending and receiving disk data between servers. The first mirror disk resource or hybrid disk resource created is assigned the default value 29051. Each subsequent mirror disk resource or hybrid disk resource is assigned the previous value plus one (29052, 29053, ...).

[6] It is not recommended to use well-known ports, especially reserved ports from 1 to 1023.

Heartbeat Port Number (1 to 65535 [7])

Set the port number that a mirror driver uses to communicate control data between servers. The first mirror disk resource or hybrid disk resource created is assigned the default value 29031. Each subsequent mirror disk resource or hybrid disk resource is assigned the previous value plus one (29032, 29033, ...).

[7] It is not recommended to use well-known ports, especially reserved ports from 1 to 1023.

ACK2 Port Number (1 to 65535 [8])

Set the port number that a mirror driver uses to communicate control data between servers. The first mirror disk resource or hybrid disk resource created is assigned the default value 29071. Each subsequent mirror disk resource or hybrid disk resource is assigned the previous value plus one (29072, 29073, ...).

[8] It is not recommended to use well-known ports, especially reserved ports from 1 to 1023.

Send Timeout (10 to 99)

Set the delivery time-out for write data.

Connection Timeout (5 to 99)

Set the time-out for connection.

Ack Timeout (1 to 600)

Set the time-out which waits for Ack response when mirror recovers and data is synchronized.

Receive Timeout (1 to 600)

Set the receive time-out for write confirmation.

Heartbeat Interval (1 to 600)

Set the heartbeat interval between mirror disk connects by the mirror driver.

ICMP Echo Reply Receive Timeout (1 to 100)

Set the heartbeat timeout between mirror disk connects by the mirror driver. If no response is received within the time set here for the number of attempts set in ICMP Echo Request Retry Count, the mirror disk connect is assumed to be disconnected.

ICMP Echo Request Retry Count (1 to 50)

Set the heartbeat retry count between mirror disk connects by the mirror driver. Together with ICMP Echo Reply Receive Timeout, this value determines how sensitively a mirror connect disconnection is detected.

Initialize

Clicking Initialize resets the following values to the default values.

  • Send Timeout

  • Connection Timeout

  • Ack Timeout

  • Receive Timeout

  • Heartbeat Interval

  • ICMP Echo Reply Receive Timeout

  • ICMP Echo Request Retry Count

Note

For Mirror Data Port Number, Heartbeat Port Number and ACK2 Port Number, different port numbers should be configured for each resource. Also, those should not be the same as other port numbers used on a cluster. Thus, the initial values are not set even when you click Initialize.
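
The default numbering and the uniqueness requirement can be sketched as follows. The base values come from the descriptions above; the function names and the dictionary layout are hypothetical and exist only for this illustration.

```python
# Default port of the first mirror disk or hybrid disk resource created.
BASE_PORTS = {"mirror_data": 29051, "heartbeat": 29031, "ack2": 29071}


def default_ports(resource_index):
    """Default ports for the n-th resource created (1 = first resource)."""
    if resource_index < 1:
        raise ValueError("resource_index starts at 1")
    ports = {name: base + resource_index - 1 for name, base in BASE_PORTS.items()}
    for port in ports.values():
        if not 1 <= port <= 65535:
            raise ValueError(f"port {port} is out of range")
    return ports


def find_conflicts(resources, other_ports=()):
    """Return the sorted port numbers that are used more than once."""
    seen = set(other_ports)        # ports already used elsewhere on the cluster
    conflicts = set()
    for ports in resources.values():
        for port in ports.values():
            if port in seen:
                conflicts.add(port)
            seen.add(port)
    return sorted(conflicts)


print(default_ports(2))
```

A check like `find_conflicts` is useful after ports are edited by hand, since Initialize does not reset them.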

High Speed SSD tab

The detailed settings for the high-speed SSD specifications in mirror disk resources are displayed.

Data Partition

Select the check box when you use a high-speed SSD for the data partition of mirror disk resources. Make sure that the disk devices used for the data partitions on all nodes are all HDDs or all SSDs. If HDDs and SSDs are used in combination, optimum performance cannot be achieved.

Cluster Partition

Select the check box when you use a high-speed SSD for the cluster partition of mirror disk resources. Make sure that the disk devices used for the cluster partitions on all nodes are all HDDs or all SSDs. If HDDs and SSDs are used in combination, optimum performance cannot be achieved.

3.11. Understanding Hybrid disk resources

3.11.1. Dependencies of Hybrid disk resource

By default, this function depends on the following group resource types.

Group resource type

Floating IP resource

Virtual IP resource

AWS Elastic IP resource

AWS Virtual IP resource

AWS DNS resource

Azure probe port resource

Azure DNS resource

3.11.2. What is hybrid disk?

A hybrid disk is a resource which performs data mirroring between two server groups. A server group consists of one or two servers. When a server group consists of two servers, a shared disk is used. When a server group consists of one server, a non-shared disk (e.g. a built-in disk, or an external disk chassis that is not shared between servers) is used.

Data partition

Partitions where data to be mirrored (such as application data) is stored are referred to as data partitions.

Allocate data partitions as follows:

  • Data partition size
    The size of data partition should be 1GB or larger but smaller than 1TB.
    (Less than 1TB size is recommended from the viewpoint of the construction time and the restoration time of data.)
  • Partition ID
    83(Linux)
  • Create a file system on the data partition if you need one. Automatic initial mkfs is not executed.
  • EXPRESSCLUSTER is responsible for the access control (mount/umount) of file system. Do not configure the settings that allow the OS to mount or unmount a data partition.

Cluster partition

Dedicated partitions used by EXPRESSCLUSTER to control hybrid disks are referred to as cluster partitions.

Allocate cluster partitions as follows:

  • Cluster partition size
    1024MB or more. Depending on the geometry, the size may be larger than 1024MB but that is not a problem.
  • Partition ID
    83(Linux)
  • A cluster partition and data partition for data mirroring should be allocated in a pair.

  • Do not make the file system on cluster partitions.

Mirror Partition Device (/dev/NMPx)

One hybrid disk resource provides the file system of the OS with one mirror partition. If a hybrid disk resource is registered with the failover group, it can be accessed only from one server (it is generally the primary server of the resource group).

Typically, the mirror partition device (/dev/NMPx) remains transparent to users (AP) because I/O is performed via a file system. When the configuration information is created by the Cluster WebUI, device names should be assigned so that they do not overlap with each other.

  • EXPRESSCLUSTER is responsible for the access control (mount/umount) of file system. Do not configure the settings that allow the OS to mount or unmount a data partition.
    Mirror partition's (hybrid disk resource's) accessibility to applications is the same as switching partition (disk resources) that uses shared disks.
  • Mirror partition switching is performed on a failover group basis according to the failover policy.

  • /dev/NMPx (x is a number between 1 and 8) is used for the special device name of mirror partition. Do not use /dev/NMPx in other device drivers.

  • The major number 218 is used for mirror partition. Do not use the major number 218 in other device drivers.

Example 1) When two servers use the shared disk and the third server uses the built-in disk

  • When a non-shared disk is used (i.e. when there is one server in the server group), it is possible to secure a partition for the hybrid disk resource (cluster partition and data partition) on the same disk where the OS (root partition and swap partition) is located.

    • When maintainability at a failure is important:
      It is recommended to allocate a disk for mirror which is not used by the OS (such as root partition, swap partition).
    • If LUN cannot be added due to H/W RAID specifications:
      If you are using hardware/RAID preinstall model where the LUN configuration cannot be changed, you can allocate a mirror partition (cluster partition, data partition) in the disk where the OS (root partition, swap partition) is located.

Mirror disk connect

See "Mirror disk connect" in "Mirror disk".

3.11.3. Mirror parameter settings

The following parameters are the same as those of mirror disk resources. See "mirror disk resources".

  • Mirror data port number

  • Heartbeat port number

  • ACK2 port number

  • The maximum number of request queues

  • Connection timeout

  • Send timeout

  • Receiving timeout

  • Ack timeout

  • Difference bitmap update interval (cluster properties)

  • Difference Bitmap size (cluster properties)

  • Mirror agent send timeout (cluster properties)

  • Mirror agent receiving timeout (cluster properties)

  • Recovery data size (cluster properties)

  • Initial mirror construction

  • Number of Queues

  • Mode of Communication Band

  • History File Store Directory

  • Size Limitation of History File

  • Heartbeat Interval

  • ICMP Echo Reply Receive Timeout

  • ICMP Echo Request Retry Count

The following parameter is different from mirror disk resource.

  • Initial mkfs
    Automatic initial mkfs is not executed. Please execute mkfs manually.

3.11.4. Notes on hybrid disk resources

  • If device names for the cluster partitions or the data partitions differ between servers, set up each server separately. In addition, if the device names differ between servers belonging to the same server group, set by-id to the device name.

  • If Exclude Mount/Unmount Commands is selected on the Extension tab in Cluster Properties, activation or deactivation of a hybrid disk resource may take time, because mount and umount operations for disk resources, VxVM volume resources, NAS resources, mirror resources, and hybrid disk resources on the same server are executed one at a time (mutually exclusively).

  • If a path that includes a symbolic link is specified for the mount point, the forced operation cannot be performed even if it is selected as the action to take when a failure is detected.
    Similarly, if a path containing "//" is specified, forced termination will also fail.
  • Disks using stripe set, volume set, mirroring, stripe set with parity by Linux md cannot be specified for the cluster partition and data partition.

  • Hybrid disk resources (mirror partition devices) cannot be the targets of stripe set, volume set, mirroring, stripe set with parity by Linux md or LVM.

  • When the geometries of the disks used as hybrid disks differ between the servers:
    The size of a partition allocated by the fdisk command is aligned by the number of blocks (units) per cylinder. Allocate data partitions to achieve the following data partition size and direction of the initial mirror construction.

    Source server <= Destination server

    "Source server" refers to the server with the higher failover policy in the failover group to which a hybrid disk resource belongs.
    "Destination server" refers to the server with the lower failover policy in the failover group to which a hybrid disk resource belongs.
    If the data partition sizes differ significantly between the copy source and the copy destination, initial mirror construction may fail. Be careful, therefore, to secure data partitions of similar sizes.
    Make sure that the data partition sizes do not cross over 32GiB, 64GiB, 96GiB, and so on (multiples of 32GiB) on the source server and the destination server. For sizes that cross over multiples of 32GiB, initial mirror construction may fail.

Examples)

Combination  Data partition size         Description
             On server 1   On server 2
OK           30GiB         31GiB         OK because both are in the range of 0 to 32GiB.
OK           50GiB         60GiB         OK because both are in the range of 32GiB to 64GiB.
NG           30GiB         39GiB         Error because they are crossing over 32GiB.
NG           60GiB         70GiB         Error because they are crossing over 64GiB.
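
The two sizing rules above (source no larger than destination, and both sizes within the same 32GiB band) can be checked with a short calculation. The following Python sketch is illustrative: the function names are hypothetical, sizes are given in bytes, and treating an exact multiple of 32GiB as belonging to the upper band is an assumption.

```python
GIB = 1024 ** 3
BAND = 32 * GIB


def same_32gib_band(size_a, size_b):
    """True when both sizes fall between the same pair of 32GiB multiples."""
    return size_a // BAND == size_b // BAND


def check_data_partitions(source_bytes, dest_bytes):
    """Return a list of problems; an empty list means the pair looks acceptable."""
    problems = []
    if source_bytes > dest_bytes:
        problems.append("source is larger than destination")
    if not same_32gib_band(source_bytes, dest_bytes):
        problems.append("sizes cross a multiple of 32GiB")
    return problems


for src, dst in [(30, 31), (50, 60), (30, 39), (60, 70)]:
    print(src, dst, check_data_partitions(src * GIB, dst * GIB) or "OK")
```

The four pairs in the loop are the combinations from the table above.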

  • Do not use the O_DIRECT flag of the open() system call for a file used in a hybrid disk resource.
    Examples include the Oracle parameter filesystemio_options = setall.
  • Do not specify a mirror partition device (such as /dev/NMP1) as the monitor target in the READ (O_DIRECT) disk monitoring mode.

  • For a cluster configuration that uses a hybrid disk, do not set the final action of a monitor resource, etc., to Stop the cluster service.

  • For the data partition and the cluster partition of hybrid disk resources, use disk devices with the same logical sector size on all servers. Devices with different logical sector sizes do not operate normally. The data partition and the cluster partition may, however, have logical sector sizes that differ from each other.

Examples)

Logical sector size of the partition

Combination  Server 1        Server 1           Server 2        Server 2           Description
             data partition  cluster partition  data partition  cluster partition
OK           512B            512B               512B            512B               The logical sector sizes are uniform.
OK           4KB             512B               4KB             512B               The data partitions have a uniform size of 4 KB, and the cluster partitions have a uniform size of 512 bytes.
NG           4KB             512B               512B            512B               The logical sector sizes for the data partitions are not uniform.
NG           4KB             4KB                4KB             512B               The logical sector sizes for the cluster partitions are not uniform.
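
The rule behind these combinations is that each partition role must be uniform across servers, while the two roles may differ from each other. A minimal Python sketch of that check follows; the function name and dictionary layout are hypothetical. On Linux the logical sector size of a disk can typically be read from /sys/block/<device>/queue/logical_block_size, but the sketch takes the values as input so that it stays self-contained.

```python
def non_uniform_roles(sizes_by_server):
    """Return the partition roles whose logical sector size differs between servers.

    sizes_by_server maps a server name to {"data": bytes, "cluster": bytes}.
    """
    problems = []
    for role in ("data", "cluster"):
        values = {sizes[role] for sizes in sizes_by_server.values()}
        if len(values) > 1:
            problems.append(role)
    return problems


config = {
    "server1": {"data": 4096, "cluster": 512},
    "server2": {"data": 512, "cluster": 512},
}
print(non_uniform_roles(config))
```

The same comparison applies to any per-role property that has to match across servers.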

  • Do not use HDDs and SSDs in combination for the disks used for the data partition and the cluster partition of hybrid disk resources. If you use them in combination, optimum performance cannot be obtained. The disk type used for the data partition may, however, differ from the disk type used for the cluster partition.

Examples)

Disk type of the partition

Combination  Server 1        Server 1           Server 2        Server 2           Description
             data partition  cluster partition  data partition  cluster partition
OK           HDD             HDD                HDD             HDD                The disk types are uniform.
OK           SSD             HDD                SSD             HDD                The data partitions are of the uniform disk type of SSD, and the cluster partitions are of the uniform type of HDD.
NG           SSD             HDD                HDD             HDD                As the data partitions, both HDD and SSD are used.
NG           SSD             SSD                SSD             HDD                As the cluster partitions, both HDD and SSD are used.

  • The bit64 format of an ext4 filesystem is not supported.
    To format ext4 manually on RHEL7, Asianux Server 7, and Ubuntu, add the option to disable bit64 to the mkfs command.
  • Behavior of mirror recovery after the active server goes down abnormally

    When the active server goes down abnormally, depending on the timing of the server failure, full mirror recovery or differential mirror recovery is performed.

    • When a resource is activated by a server connected via a shared disk (a server in the same server group)

    • When a resource is activated by a server in the remote server group
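
Regarding the ext4 note above: with the standard e2fsprogs tools, the 64bit feature is disabled by passing -O ^64bit to mkfs.ext4, and the feature list of an existing file system appears on the "Filesystem features:" line of dumpe2fs -h output. The Python sketch below only builds the command and parses such output; the function names, the device name /dev/sdb2, and the sample header are hypothetical.

```python
def mkfs_ext4_without_64bit(device):
    """Argument list for creating ext4 with the 64bit feature disabled."""
    return ["mkfs.ext4", "-O", "^64bit", device]


def has_64bit_feature(dumpe2fs_header):
    """Check the 'Filesystem features:' line of `dumpe2fs -h` output for 64bit."""
    for line in dumpe2fs_header.splitlines():
        if line.startswith("Filesystem features:"):
            return "64bit" in line.split(":", 1)[1].split()
    return False


sample = (
    "Filesystem volume name:   <none>\n"
    "Filesystem features:      has_journal ext_attr dir_index extent 64bit flex_bg\n"
)
print(" ".join(mkfs_ext4_without_64bit("/dev/sdb2")))
print(has_64bit_feature(sample))
```

Checking an already formatted data partition this way before registering it can catch the unsupported format early.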

3.11.5. mount processing flow

The mount processing needed to activate the hybrid disk resource is performed as follows:

With none specified for the file system, the mount processing does not occur.

  1. Is the device already mounted?

    When already mounted -> To X

  2. Is fsck set to be run before mounting?

    Timing at which to run fsck -> Run fsck for the device.

  3. Mount the device.

    Mounted successfully -> To O

  4. Is mounting set to be retried?

    When retry is not set -> To X

  5. When fsck(xfs_repair) is set to be run if mounting fails:

    When fsck is executed in 2. and mount is successful -> Go to 6.

    When mount fails in 3. due to a timeout -> Go to 6.

    Other than the above -> Execute fsck(xfs_repair) for the device.

  6. Retry mounting of the device.

    Mounted successfully -> To O

  7. Has the retry count for mounting been exceeded?

    Within the retry count -> Go to 6.

    The retry count has been exceeded -> To X

O The resource is activated (mounted successfully).

X The resource activation has failed (not mounted).
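
The mount flow above can be sketched in Python as follows. This is an illustration under assumptions, not the product's implementation: the `MountOps` callables and the result codes are hypothetical, the first condition of step 5 is read as "fsck already ran successfully in step 2", and the sketch simply continues to mount if the fsck in step 2 fails, since the flow does not say otherwise.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical result codes for one mount attempt.
OK, FAILED, TIMEOUT = "ok", "failed", "timeout"


@dataclass
class MountOps:
    is_mounted: Callable[[], bool]
    fsck: Callable[[], bool]    # fsck or xfs_repair; returns True on success
    mount: Callable[[], str]    # returns OK, FAILED or TIMEOUT


def activate(ops, retry_count, fsck_before_mount, fsck_on_mount_failure):
    """Return True for outcome O (activated) and False for outcome X (failed)."""
    if ops.is_mounted():                      # step 1: already mounted -> X
        return False
    fsck_ok_in_step2 = False
    if fsck_before_mount:                     # step 2
        fsck_ok_in_step2 = ops.fsck()
    result = ops.mount()                      # step 3
    if result == OK:
        return True
    if retry_count == 0:                      # step 4: retry is not set -> X
        return False
    # step 5: repair once, unless fsck already succeeded or the mount timed out
    if fsck_on_mount_failure and not fsck_ok_in_step2 and result != TIMEOUT:
        ops.fsck()
    for _ in range(retry_count):              # steps 6 and 7
        if ops.mount() == OK:
            return True
    return False
```

Because step 7 returns to step 6 rather than step 5, the repair in step 5 runs at most once per activation in this reading.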

3.11.6. umount processing flow

The umount processing to deactivate the hybrid disk resource is performed as follows:

With none specified for the file system, the umount processing does not occur.

  1. Is the device already unmounted?

    When already unmounted -> To X

  2. Unmount the device.

    Unmounted successfully -> To O

  3. Is unmount set to be retried?

    When retry is not set -> To X

  4. Is the device still mounted? (Is the mount point removed from the mount list and is the mirror device in the unused status?)

    No longer mounted -> To O

  5. Try KILL for the process using the mount point.

  6. Retry unmount of the device.

    Unmounted successfully -> To O

  7. Is the result other than the unmount timeout and is the mount point removed from the mount list?

    The mount point has already been removed.

    -> Wait until the mirror device is no longer used.

    (Wait no more than a length of time equal to the unmount timeout.)

  8. Has the retry count for unmount been exceeded?

    Within the retry count -> Go to 4.

    The retry count is exceeded -> To X

O The resource is stopped (unmounted successfully).

X The resource stop has failed (still mounted, or already unmounted).

3.11.7. Details tab

The following are the same as those of mirror disk resources. Refer to "mirror disk resource".

  • Hybrid disk detail tab (See mirror disk detail tab)

  • Mirror disk connect selection

  • Hybrid disk adjustment properties (See mirror disk adjustment properties)

    • Mount tab

    • Unmount tab

    • Fsck tab

    • xfs_repair tab

    • Mirror tab (parameter other than the one for executing the initial mkfs)

    • Mirror Driver tab

    • High-speed SSD tab

The following tab is different from that of mirror disk resource:

  • Mirror tab of hybrid disk adjustment properties [execute initial mkfs]

Execute initial mkfs

In this version, automatic initial mkfs is not executed for hybrid disk resources.

3.12. Understanding NAS resource

3.12.1. Dependencies of the NAS resource

By default, this function depends on the following group resource type:

Group resource type

Dynamic DNS resource

Floating IP resource

Virtual IP resource

AWS Elastic IP resource

AWS Virtual IP resource

AWS DNS resource

Azure probe port resource

Azure DNS resource

3.12.2. NAS resource

  • The NAS resource controls the resources in the NFS server.

  • When the data needed for business operations is stored on the NFS server, it is automatically handed over to the destination server when the failover group moves, for example during failover.

3.12.3. Notes on NAS resource

  • EXPRESSCLUSTER controls access (mount and umount) to the file system. Therefore, do not configure the OS to run the mount or umount command.

  • On the NFS server, it is necessary to configure the settings that allow servers in the cluster for access to NFS resources.

  • On the EXPRESSCLUSTER X, configure the settings that start the portmap service.

  • If the host name is specified as the NAS server name, make the settings for name resolving.

  • If Exclude Mount/Unmount Commands is selected on the Extension tab of the Cluster Properties, it may take some time to activate or deactivate the NAS resource because the mount or unmount of the disk resource, NAS resource, and mirror resource is performed exclusively in the same server.

  • When a path that includes a symbolic link is specified as the mount point, the forced operation cannot be performed even if it is selected in Forced operation when failure is detected.
    Similarly, if a path containing "//" is specified, forced termination will also fail.
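The last note can be checked before a mount point is entered. The following is a minimal sketch, not part of EXPRESSCLUSTER: the function name mount_point_problems is hypothetical, and the check only covers the three conditions named in this chapter (leading "/", no "//", no symbolic link in the path).

```python
import os


def mount_point_problems(path):
    """Return a list of reasons why `path` may be unsuitable as a mount point.

    Hypothetical helper: it mirrors the notes in this section and is not
    an EXPRESSCLUSTER API.
    """
    problems = []
    if not path.startswith("/"):
        problems.append("does not start with '/'")
        return problems  # the checks below only make sense for absolute paths
    if "//" in path:
        problems.append("contains '//'")
    # Walk each prefix of the path and report components that are symbolic links.
    current = "/"
    for part in [p for p in path.split("/") if p]:
        current = os.path.join(current, part)
        if os.path.islink(current):
            problems.append("symbolic link in path: " + current)
    return problems
```

An empty list means that none of the three conditions was found. The sketch checks the path on the local server only, so it would have to be run on every server that can activate the resource.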

3.12.4. Details tab

Server Name (Within 255 bytes)

Enter the IP address or the host name of the NFS server. If you specify a host name, configure name resolution on the OS (for example, by adding an entry to /etc/hosts).

Shared Name (Within 1023 bytes)

Enter the share name on the NFS server.

Mount Point (Within 1023 bytes)

Enter the directory where the NFS resource will be mounted. This must start with "/".

File System (Within 15 bytes)

Enter the type of file system of the NFS resource. You may also directly enter the type.

  • nfs

Tuning

Displays the NAS Resource Tuning Properties dialog box. Configure the NAS resource detailed settings.

NAS Resource Tuning Properties

Mount tab

The advanced settings for mounting are displayed.

Mount Option (Within 1023 bytes)

Enter the option that is passed to the mount command when mounting a file system. If you are entering more than one option, use "," to separate them.

Examples of the mount option

Setting item    Setting value
Server Name     nfsserver1
Shared Name     /share1
Mount Point     /mnt/nas1
File System     nfs
Mount Option    rw

The mount command that is run when the option shown above is set:

mount -t nfs -o rw nfsserver1:/share1 /mnt/nas1
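How the settings map to the command line can be sketched as follows. This is an illustration only: build_mount_command is a hypothetical helper, and the argument order simply follows the example command above.

```python
def build_mount_command(server, share, mount_point, fs_type="nfs", options=""):
    """Assemble a mount command line from the NAS resource settings.

    Hypothetical helper; `options` is the comma-separated Mount Option
    string exactly as it would be entered in the dialog box.
    """
    cmd = ["mount", "-t", fs_type]
    if options:
        cmd += ["-o", options]
    cmd.append(f"{server}:{share}")
    cmd.append(mount_point)
    return cmd


cmd = build_mount_command("nfsserver1", "/share1", "/mnt/nas1", "nfs", "rw")
print(" ".join(cmd))  # mount -t nfs -o rw nfsserver1:/share1 /mnt/nas1
```

When more than one option is needed, the whole comma-separated string (for example "rw,hard") is passed as a single value after -o.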

Timeout (1 to 999)

Set how long to wait for the mount command to complete when mounting a file system.

Mounting may take a while depending on the network load. Be careful not to set too small a value, because a timeout may then be detected while the command is still running.

Retry Count (0 to 999)

Set the number of mount retries when mounting the file system fails.

When zero is set, mounting is not retried.

Initialize

Clicking Initialize resets the values of all items to the default values.

Unmount tab

The advanced settings for unmounting are displayed.

Timeout (1 to 999)

Set how long to wait for the umount command to complete when unmounting a file system.

Retry Count (0 to 999)

Set the number of unmount retries to be made when unmounting the file system fails. When zero is set, unmounting is not retried.

Retry Interval (0 to 999)

Enter the interval between unmount retries when unmounting the file system fails.

Forced operation when failure is detected

Select an action to be taken when retrying unmount after unmount fails from the following.

  • kill:
    Attempts the forceful termination of the process that is accessing the mount point. This does not always mean that the processes can be forcibly terminated.
  • No Operation:
    Does not attempt the forceful termination of the process that is accessing the mount point.
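How Retry Count, Retry Interval and the forced operation interact can be sketched as a simple loop. This is a simplified model, not the product's implementation: the function and parameter names are hypothetical, the per-command timeout is not modeled, and running the forced operation before waiting for the interval is an assumption.

```python
import time


def unmount_with_retry(try_unmount, retry_count, retry_interval,
                       forced_operation=None, sleep=time.sleep):
    """Call try_unmount() until it succeeds or the retries are used up.

    try_unmount:      callable returning True when the unmount succeeded
    retry_count:      number of retries after the first attempt (0 = no retry)
    retry_interval:   seconds to wait before each retry
    forced_operation: optional callable run before each retry (models "kill");
                      None models "No Operation"
    """
    retries_done = 0
    while True:
        if try_unmount():
            return True
        if retries_done >= retry_count:
            return False
        retries_done += 1
        if forced_operation is not None:
            forced_operation()
        sleep(retry_interval)
```

With retry_count set to 0, the unmount is attempted once and is not retried, which matches the description of Retry Count above.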

Initialize

Clicking Initialize resets the values of all items to the default values.

NAS tab

The advanced settings for NAS are displayed.

Ping Timeout (0 to 999)

Set the timeout of the ping command that is used to check the connection with the server when NAS resources are activated and deactivated. If zero is specified, the ping command is not used.

Initialize

Clicking Initialize sets all the items to their default values.

3.13. Understanding Volume manager resources

3.13.1. Dependencies of Volume manager resources

The volume manager resources depend on the following group resource types by default.

Group resource type

Dynamic DNS resource

Floating IP resource

Virtual IP resource

AWS Elastic IP resource

AWS Virtual IP resource

AWS DNS resource

Azure probe port resource

Azure DNS resource

3.13.2. What is a Volume manager resource?

  • The volume manager is disk management software that handles multiple storage devices and disks as one logical disk.

  • Volume manager resources control logical disks managed by the volume manager.

  • If data necessary for operation is stored in a logical disk, it is automatically taken over, for example, when there is a failover or a failover group is moved.

3.13.3. Notes on Volume manager resources

<General>

  • Do not use volume manager resources to manage a mirror disk.

  • Disk resources control each volume.

  • Do not specify the import or export settings on the OS because EXPRESSCLUSTER performs access control (importing or exporting) for logical disks.

<Notes on using resources with the volume manager lvm>

  • Volume groups are not defined on the EXPRESSCLUSTER side.

  • At least one disk resource is required because each volume must be controlled.

  • The volume groups included in the EXPRESSCLUSTER configuration data are automatically exported when the OS is started.

  • Other volume groups are not exported.

  • When a VG created by using a shared disk is specified as a target volume, the import/export status of the VG is recorded on the shared disk according to the LVM specification. Therefore, if activation (import) or deactivation (export) is performed on the active server, it might be assumed that the same operation is performed on the standby server.

  • When controlling the LVM by using the volume manager resource in an environment of Red Hat Enterprise Linux 7 or later, the LVM metadata daemon must be disabled.

  • The following commands are run when the resource is activated.

    Command     Option                Timing when using command
    vgs         -P                    Verifying volume group status
                --noheadings          Verifying volume group status
                -o vg_attr,vg_name    Verifying volume group status
    vgimport    (Nothing)             Importing volume group
    vgscan      (Nothing)             Activating volume group
    vgchange    -ay                   Activating volume group

  • The resource activation sequence is shown below.

  • The following commands are run when the resource is deactivated.

    Command     Option                Timing when using command
    vgs         -P                    Verifying volume group status
                --noheadings          Verifying volume group status
                -o vg_attr,vg_name    Verifying volume group status
    vgchange    -an                   Deactivating volume group
    vgexport    (Nothing)             Exporting volume group

  • The resource deactivation sequence is shown below.
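The LVM commands listed in the notes above can be written out as command lines. The following is a sketch under stated assumptions and does not reproduce the official sequence: the function names are hypothetical, passing the volume group name as the last argument is an assumption, and the order simply follows the order of the tables. The parser relies on the LVM convention that the third character of vg_attr is "x" for an exported volume group.

```python
def lvm_status_command():
    """Command line for checking volume group status (options from the tables)."""
    return ["vgs", "-P", "--noheadings", "-o", "vg_attr,vg_name"]


def lvm_activation_commands(vg_name):
    """Commands in the order of the activation table (order is an assumption)."""
    return [
        lvm_status_command(),
        ["vgimport", vg_name],
        ["vgscan"],
        ["vgchange", "-ay", vg_name],
    ]


def lvm_deactivation_commands(vg_name):
    """Commands in the order of the deactivation table (order is an assumption)."""
    return [
        lvm_status_command(),
        ["vgchange", "-an", vg_name],
        ["vgexport", vg_name],
    ]


def parse_vgs_output(text):
    """Map each volume group name to True if it is exported.

    Expects lines of the form "<vg_attr> <vg_name>", as printed by the
    status command above.
    """
    status = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        attr, name = fields
        status[name] = len(attr) >= 3 and attr[2] == "x"
    return status
```

For example, parse_vgs_output applied to a line such as "wzx-n- vg1" reports vg1 as exported, which is the state expected on a server where the resource is not active.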

<Notes on using resources with the volume manager vxvm>

  • Disk groups are not defined on the EXPRESSCLUSTER side.

  • The disk groups included in the EXPRESSCLUSTER configuration data are automatically deported when the OS is started.

  • Other disk groups are not deported.

  • If the Clear host ID option is not selected, disk groups cannot be imported to the failover destination server due to VxVM specifications if the failover source server fails to normally deport the disk groups.

  • Even if an import timeout occurs, importing might be successfully completed. This problem can be avoided by specifying the Clear host ID or Forced Option at Import option, which retries importing.

  • The following commands are run when a resource is activated.

    Command      Option    When to use
    vxdg         import    When importing a disk group
                 -t        When importing a disk group
                 -C        When importing a disk group fails and the Clear host ID option is selected
                 -f        When importing a disk group fails and the Forced Activation option is selected
    vxrecover    -g        When the volume for the specified disk group is started
                 -sb       When the volume for the specified disk group is started

  • The resource activation sequence is shown below.

  • The following commands are run when a resource is deactivated.

    Command    Option     When to use
    vxdg       deport     When deporting a disk group
               flush      When flushing data
    vxvol      -g         When the volume of the specified disk group is stopped
               stopall    When the volume of the specified disk group is stopped

  • The resource deactivation sequence is shown below.
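The VxVM import options described above can be sketched as a list of import attempts. This is an illustration, not the product's implementation: the function names are hypothetical, and two points are assumptions: that -C and -f are added together to a single retry when both options are selected, and that deactivation stops the volumes, flushes, and then deports.

```python
def vxvm_import_attempts(disk_group, clear_host_id=False, forced_import=False):
    """Return the vxdg command lines to try, in order.

    The first attempt is a plain temporary import (-t). A second attempt
    is added only when Clear host ID (-C) or Forced Import (-f) is selected.
    """
    attempts = [["vxdg", "-t", "import", disk_group]]
    retry_flags = []
    if clear_host_id:
        retry_flags.append("-C")
    if forced_import:
        retry_flags.append("-f")
    if retry_flags:
        attempts.append(["vxdg", "-t"] + retry_flags + ["import", disk_group])
    return attempts


def vxvm_start_volumes_command(disk_group):
    """Command that starts the volumes of the imported disk group."""
    return ["vxrecover", "-g", disk_group, "-sb"]


def vxvm_deactivation_commands(disk_group):
    """Stop volumes, flush, then deport (the order is an assumption)."""
    return [
        ["vxvol", "-g", disk_group, "stopall"],
        ["vxdg", "flush", disk_group],
        ["vxdg", "deport", disk_group],
    ]
```

With neither option selected there is only one attempt, which corresponds to the note that a disk group cannot be imported on the failover destination if the failover source did not deport it normally.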

<Notes on using resources with the volume manager zfspool>

  • When a ZFS storage pool is used in an iSCSI environment, exporting and other ZFS processes may be significantly delayed if the iSCSI connection is lost (OS restriction).
    ZFS behavior when the iSCSI connection is lost is determined by the ZFS property failmode. With EXPRESSCLUSTER, failmode=panic is recommended. When failmode=panic is set, the OS panics by itself within a given time after iSCSI
  • For a dataset whose ZFS property mountpoint is set to legacy, the file system is not mounted just by importing the storage pool. In this case, the ZFS file system must be mounted and unmounted by using a disk resource in addition to the volume manager resource.
  • On Ubuntu 16.04 or later, depending on the timing of OS startup, a failover group may be activated on more than one server (in other words, a "network partition" state). Even if the storage pool is automatically imported at OS startup, prevent the file system from being automatically mounted.
    Automatic mounting can be avoided in either of the following ways.
    • Set ZFS property value mountpoint to legacy.

    • Set ZFS property value canmount to noauto.

    With either setting, automatic mounting is avoided even when the storage pool is automatically imported at OS startup, which prevents the network partition. In this case, the ZFS file system must be mounted and unmounted by using a disk resource.
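The ZFS property settings named in these notes can be applied with the standard zfs set and zpool set commands. The following sketch only builds the command lines; the function names, pool name and dataset name are hypothetical.

```python
def zfs_no_automount_command(dataset, method="legacy"):
    """Build the command that prevents automatic mounting of a dataset.

    method="legacy" sets mountpoint=legacy; method="noauto" sets
    canmount=noauto. These are the two ways listed in the notes.
    """
    if method == "legacy":
        prop = "mountpoint=legacy"
    elif method == "noauto":
        prop = "canmount=noauto"
    else:
        raise ValueError("method must be 'legacy' or 'noauto'")
    return ["zfs", "set", prop, dataset]


def zpool_failmode_panic_command(pool):
    """Build the command that sets the recommended failmode=panic on a pool."""
    return ["zpool", "set", "failmode=panic", pool]


# Hypothetical pool "tank" with dataset "tank/data"
print(" ".join(zfs_no_automount_command("tank/data", "noauto")))
print(" ".join(zpool_failmode_panic_command("tank")))
```

Whichever method is chosen, the file system itself is then mounted and unmounted by a disk resource, as stated above.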

3.13.4. Details tab

Volume Manager

Specify the volume manager to use. The following volume managers can be selected:

  • lvm (LVM volume group control)

  • vxvm (VxVM disk group control)

  • zfspool (ZFS storage pool control)

Target Name (within 1023 bytes)

Specify the volume name in the <VG name> format (only the target name is used).

Combo box options collect volume group information from all the servers and display all the volume groups on one or more servers.

When the volume manager is lvm, multiple volumes can be controlled together. Separate multiple volumes with a one-byte space.

Tuning

This displays the Volume Manager Resource Tuning Properties dialog box. Specify detailed settings for the volume manager resource.

Volume Manager Resource Tuning Properties (When Volume Manager is other than [zfspool])

Import Tab

The detailed import settings are displayed.

Import Timeout (1 to 9999)

Specify how long the system waits for completion of the volume import command before it times out.

Start Volume Timeout (1 to 9999)

Specify the startup command timeout.

Volume Status Check Timeout (1 to 9999)

Specify the volume status check command timeout.

This option can be used when the volume manager is lvm.

Clear Host ID

When normal importing fails, the clear host ID flag is set and importing is retried. The host ID is cleared when the check box is selected.

This option can be used when the volume manager is vxvm.

Forced Import

Specify whether to forcibly import data when importing fails. Data is forcibly imported if the check box is selected.

This option can be used when the volume manager is vxvm.

Initialize

Clicking Initialize resets the values of all items to the defaults.

Export Tab

The detailed export settings are displayed.

Stop Volume Timeout (1 to 9999)

Specify the volume deactivation command timeout.

Flush Timeout (1 to 9999)

Specify the flush command timeout.

This option can be used when the volume manager is vxvm.

Export Timeout (1 to 9999)

Specify the export/deport command timeout.

Volume Status Check Timeout (1 to 9999)

Specify the volume status check command timeout.

This option can be used when the volume manager is lvm.

Initialize

Clicking Initialize resets the values of all items to the defaults.

Volume Manager Resource Tuning Properties (When Volume Manager is [zfspool])

Import Tab

The detailed import settings are displayed.

Import Timeout (1 to 9999)

Specify how long the system waits for completion of the volume import command before it times out.

Forced Import

Specify whether to forcibly import data when importing fails. Data is forcibly imported if the check box is selected.

Execute Ping Check

This setting is enabled only when Forced Import is set to ON.

If the import fails because another host has already imported the pool, Execute Ping Check specifies whether to check, by using ping, that the host is active before the forced import is performed. If the host is found to be active, the forced import is not performed. This prevents more than one host from importing a single pool at the same time. When the check box is selected, whether the host is active is checked.

Note

When this setting is enabled, and a considerable time elapses between EXPRESSCLUSTER stopping and the OS shutting down, failover may fail. For example, if a monitor resource detects an abnormality and shuts down the operating server, and if the standby system starts activation of the volume manager before the operating server has stopped, a ping check will cause the activation to fail.
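The relationship between Forced Import and Execute Ping Check can be summarized as a small decision function. This is a simplified model with hypothetical names; it treats every import failure as the "another host has already imported the pool" case, which the product may distinguish more finely.

```python
def decide_import_action(import_succeeded, forced_import, ping_check, other_host_alive):
    """Return "imported", "force_import" or "fail" for a zfspool import.

    import_succeeded: result of the normal import
    forced_import:    Forced Import check box
    ping_check:       Execute Ping Check check box (only meaningful
                      when forced_import is True)
    other_host_alive: True if the host that imported the pool answers ping
    """
    if import_succeeded:
        return "imported"
    if not forced_import:
        return "fail"
    if ping_check and other_host_alive:
        # The other host is still running, so importing would risk
        # two hosts using the same pool at once.
        return "fail"
    return "force_import"
```

The "fail" result in the ping check branch is the situation described in the note above: if the operating server is still answering ping while it shuts down, activation on the standby server fails.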

Initialize

Clicking Initialize resets the values of all items to the defaults.

Export Tab

The detailed export settings are displayed.

Export Timeout (1 to 9999)

Specify how long the system waits for completion of the volume export command before it times out.

Forced Export

Specify whether to forcibly export data when exporting fails. Data is forcibly exported if the check box is selected.

Initialize

Clicking Initialize resets the values of all items to the defaults.

3.14. Understanding VM resources

3.14.1. Dependencies of VM resources

VM resources do not depend on any group resource type by default.

3.14.2. What is a VM resource?

The VM resources control the virtual machines (guest OSs) in the virtualization infrastructure.

The management OS under which EXPRESSCLUSTER is installed starts and stops the virtual machines. For vSphere, EXPRESSCLUSTER can be installed and used under the guest OS of the virtual machine which was prepared for management.

Migration of the virtual machines can also be performed. If, however, vSphere is used, the configuration must also use vCenter.

Fig. 1 : Configuration when EXPRESSCLUSTER is installed under the management OS for the virtualization infrastructure

Fig. 2 : Configuration when EXPRESSCLUSTER is installed under the OS on a virtual machine for management (vSphere only)

3.14.3. Notes on VM resources

  • If the virtualization infrastructure type is XenServer or KVM, the VM resources are valid only when EXPRESSCLUSTER is installed under the host OS in the virtualization infrastructure.

  • If the virtualization infrastructure type is vSphere, the VM resources can be used even if EXPRESSCLUSTER is installed under the guest OS. In this case, however, vCenter must always be used.

  • A VM resource can be registered with a group for which the group type is virtual machine.

  • Only one VM resource can be registered per group.

  • If vSphere is selected as the virtualization infrastructure, Use vCenter must be selected (on) to perform migration.

  • Confirm the start time of the virtual machine (guest OS) to be controlled by the VM resource, and set Virtual Machine Start Waiting Time in VM Resource Tuning Properties accordingly.
    The default value of Virtual Machine Start Waiting Time is 0 seconds, so if it is not changed, the virtual machine monitor resource may mistakenly detect a monitor error.

3.14.4. Details tab

For vSphere

Virtual Machine Type

Specify the virtualization infrastructure type.

Installation Destination of the Cluster Service

Specify the type of OS under which EXPRESSCLUSTER is installed. Selecting the guest OS automatically selects the Use vCenter check box.

Virtual Machine Name (within 255 bytes)

Enter the virtual machine name. This setting is not required if the virtual machine path is entered. Specify the virtual machine path if the virtual machine name might be changed in the virtualization infrastructure.

Data Store Name (within 255 bytes)

Specify the name of data store containing the virtual machine configuration information.

VM Configuration File Path (within 1,023 bytes)

Specify the path where the virtual machine configuration information is stored.

IP Address of Host Server Individual Setup

Specify the management IP address of the host. You must specify the IP address of host for each server, using individual server settings.

User Name (within 255 bytes) Server Individual Setup

Specify the user name used to start the virtual machine.

Password (within 255 bytes) Server Individual Setup

Specify the password used to start the virtual machine.

Use vCenter

Specify whether to use vCenter. Use vCenter when performing migration.

vCenter (within 1,023 bytes)

Specify the vCenter host name.

User Name for vCenter (within 255 bytes)

Specify the user name used to connect to vCenter.

Password for vCenter (within 255 bytes)

Specify the password used to connect to vCenter.

Resource Pool Name (within 255 bytes) Server Individual Setup

Specify the resource pool name for starting the virtual machine.

For XenServer

Virtual Machine Type

Specify the virtualization infrastructure type.

Virtual Machine Name (within 255 bytes)

Enter the virtual machine name. This setting is not required if the UUID is specified. Specify the UUID if the virtual machine name might be changed in the virtualization infrastructure.

UUID

Specify the UUID (Universally Unique Identifier) for identifying the virtual machine.

Library Path (within 1,023 bytes)

Specify the library path used to control XenServer.

User Name (within 255 bytes)

Specify the user name used to start the virtual machine.

Password (within 255 bytes)

Specify the password used to start the virtual machine.

For KVM

Virtual Machine Type

Specify the virtualization infrastructure type.

Virtual Machine Name (within 255 bytes)

Enter the virtual machine name. This setting is not required if the UUID is specified.

UUID

Specify the UUID (Universally Unique Identifier) for identifying the virtual machine.

Library Path (within 1,023 bytes)

Specify the library path used to control KVM.

Tuning

This displays the VM Resource Tuning Properties dialog box. Specify detailed settings for the VM resource.

VM Resource Tuning Properties

Request Timeout

Specify how long the system waits for completion of a request such as to start or stop a virtual machine.

If the request is not completed within this time, a timeout occurs and resource activation or deactivation fails.

Virtual Machine Start Waiting Time

The system always waits for this length of time after requesting the virtual machine to start.

Virtual Machine Stop Waiting Time

The maximum time to wait for the virtual machine to stop. Deactivation completes when the virtual machine has stopped.

3.15. Understanding Dynamic DNS resources

3.15.1. Dependencies of Dynamic DNS resources

By default, Dynamic DNS resources depend on the following group resource types:

Group resource type

Virtual IP resource

Floating IP resource

AWS Elastic IP resource

AWS Virtual IP resource

Azure probe port resource

3.15.2. What is a Dynamic DNS resource?

  • A Dynamic DNS resource registers the virtual host name and the IP address of the active server with the Dynamic DNS server. Client applications can connect to a cluster server by using the virtual host name. When the virtual host name is used, the client does not have to be aware that the connection destination server is switched when a failover occurs or a group is moved.

3.15.3. Preparing to use Dynamic DNS resources

Set up the DDNS server before using Dynamic DNS resources.

The description below assumes the use of BIND9.

One of the two types of /etc/named.conf settings below is used depending on the Dynamic DNS resource use mode when the DDNS server is set up.
Specify /etc/named.conf on the DDNS server in the desired mode.
  • When using Dynamic DNS resources with authentication

    Create a shared key on the BIND9 server by using the dnssec-keygen command. Add the shared key to /etc/named.conf and allow the zone file to be updated. When adding a Dynamic DNS resource, enter the shared key name in Authentication Key Name and the shared key value in Authentication Key Value.

    Note

    For details about setting up the DDNS server, using the dnssec-keygen command, and specifying setting other than allow-update, see the BIND manual.

    Example:

    1. Generate a shared key.

      # dnssec-keygen -a HMAC-MD5 -b 256 -n HOST example

      example is the shared key name.

      When the dnssec-keygen command is executed, the two files below are generated. Both files contain the same shared key.

      Kexample.+157+09088.key
      Kexample.+157+09088.private

      The named.conf setting below uses the shared key extracted from Kexample.+157+09088.key, but using Kexample.+157+09088.private leads to the same result.
      The shared key value in Kexample.+157+09088.key is the last field of the record shown below (iuBgSUEIBjQUKNJ36NocAgaB).

      # cat Kexample.+157+09088.key
      example. IN KEY 512 3 157 iuBgSUEIBjQUKNJ36NocAgaB

    2. Add the shared key information to /etc/named.conf.

      key "example" {
          algorithm hmac-md5;
          secret "iuBgSUEIBjQUKNJ36NocAgaB";
      };
      
    3. Add the shared key information to the zone statement in /etc/named.conf.

      zone "example.jp" {
          :
          allow-update{
               key example;
          };
          :
      };
      
    4. When adding a Dynamic DNS resource by using the Cluster WebUI, enter the shared key name (example) in Authentication Key Name and the shared key value (iuBgSUEIBjQUKNJ36NocAgaB) in Authentication Key Value.

  • When using Dynamic DNS resources without authentication

    Be sure to specify the IP addresses of all servers in the cluster as the IP address range in which the zone file can be updated (allow-update {xxx.xxx.xxx.xxx}) in /etc/named.conf.

    Example:

    IP address for server1 in the cluster: 192.168.10.110
    IP address for server2 in the cluster: 192.168.10.111
    1. Add the IP address range in which updates are allowed to the zone statement in /etc/named.conf.

      zone "example.jp" {
          :
          //IP address range in which updates are allowed
          allow-update {
              192.168.10.0/24;
          };
          :
      };
      

      or

      zone "example.jp" {
          :
          //IP address range in which updates are allowed
          allow-update {
              192.168.10.110;
              192.168.10.111;
          };
          :
      };

    2. When adding a Dynamic DNS resource, do not enter any values in Authentication Key Name or Authentication Key Value.
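For the procedure with authentication, extracting the shared key from the .key file and writing the matching key statement can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes the single-line record layout shown in the example above.

```python
def parse_key_record(line):
    """Return (key_name, secret) from a record in a K<name>.+157+<id>.key file.

    Expected layout: <name>. IN KEY <flags> <protocol> <algorithm> <secret>
    """
    fields = line.split()
    if len(fields) < 7 or fields[1:3] != ["IN", "KEY"]:
        raise ValueError("unexpected key record format")
    name = fields[0].rstrip(".")
    # The secret can be split across whitespace for long keys; join the parts.
    secret = "".join(fields[6:])
    return name, secret


def named_conf_key_statement(name, secret, algorithm="hmac-md5"):
    """Render the key statement to add to /etc/named.conf."""
    return (
        f'key "{name}" {{\n'
        f"    algorithm {algorithm};\n"
        f'    secret "{secret}";\n'
        "};\n"
    )


name, secret = parse_key_record("example. IN KEY 512 3 157 iuBgSUEIBjQUKNJ36NocAgaB")
print(named_conf_key_statement(name, secret))
```

The same two values, name and secret, are the ones entered in Authentication Key Name and Authentication Key Value.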

3.15.4. Notes on Dynamic DNS resources

  • When using Dynamic DNS resources, the bind-utils package is necessary on each server.

  • The Dynamic DNS server to be used must be configured in /etc/resolv.conf on each server.

  • When the IP addresses of the servers are in different segments, an FIP address cannot be set as the IP address of a Dynamic DNS resource.

  • To register each server IP address with the DDNS server, specify the addresses in the settings for each server.

  • When clients connect by using the virtual host name, they may need to reconnect (for example, by restarting the browser) after a failover of the group that contains the Dynamic DNS resource.

  • This method, which authenticates resources, applies only to a DDNS server set up using BIND9. To use the method without authentication, do not enter any values in Authentication Key Name or Authentication Key Value.

  • The behavior when the Cluster WebUI is connected depends on the Dynamic DNS resource settings.

    • When the IP address of each server is specified for Dynamic DNS resources on a server basis
      If the Cluster WebUI is connected by using the virtual host name from the client, this connection is not automatically switched if a failover occurs for a group containing Dynamic DNS resources.
      To switch the connection, restart the browser, and then connect to the Cluster WebUI again.
    • When the FIP address is specified for the Dynamic DNS resource
      If the Cluster WebUI is connected by using the virtual host name from the client, this connection is automatically switched if a failover occurs for a group containing Dynamic DNS resources.
  • If Dynamic DNS resources are used with the method with authentication, the difference between the time of every server in the cluster and that of the DDNS server must be less than five minutes.
    If the time difference is five minutes or more, the virtual host name cannot be registered with the DDNS server.

3.15.5. Details tab

Virtual Host Name

Enter the virtual host name to register with the DDNS service.

IP Address Server Individual Setup

Enter the IP address for the virtual host name.
When also using FIP resources, enter the IP address of the resources on the Common tab.
When using an IP address for each server, enter the IP address on each server tab.

DDNS Server

Enter the IP address of the DDNS server.

Port No.

Enter the port number of the DDNS server. The default value is 53.

Authentication Key Name

Enter the shared key name if a shared key was generated using the dnssec-keygen command.

Authentication Key Value

Enter the value of the shared key generated using the dnssec-keygen command.
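To show how the fields on this tab relate to a dynamic update, the following sketch builds an input script for the nsupdate utility from bind-utils. That the registration is done with nsupdate is an assumption (the source only states that bind-utils is required), and the function name, the TTL value and the sample host name are hypothetical.

```python
def build_nsupdate_script(virtual_host, ip_address, ddns_server, port=53,
                          key_name="", key_value="", ttl=60):
    """Build nsupdate input that replaces the A record of the virtual host.

    key_name / key_value correspond to Authentication Key Name and
    Authentication Key Value; leave both empty for the method without
    authentication. The TTL of 60 seconds is a hypothetical value.
    """
    lines = [f"server {ddns_server} {port}"]
    if key_name and key_value:
        lines.append(f"key {key_name} {key_value}")
    lines.append(f"update delete {virtual_host} A")
    lines.append(f"update add {virtual_host} {ttl} A {ip_address}")
    lines.append("send")
    return "\n".join(lines) + "\n"


print(build_nsupdate_script(
    "vhost.example.jp", "192.168.10.110", "192.168.10.1",
    key_name="example", key_value="iuBgSUEIBjQUKNJ36NocAgaB",
))
```

The resulting text could be passed to nsupdate on standard input for a manual test of the DDNS server settings before the resource is created.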

3.16. Understanding AWS Elastic IP resources

3.16.1. Dependencies of AWS Elastic IP resources

By default, this function does not depend on any group resource type.

3.16.2. What is an AWS Elastic IP resource?

Client applications can use an AWS Elastic IP address (referred to as the EIP) to access the Amazon Virtual Private Cloud (referred to as the VPC) in the Amazon Web Services (referred to as AWS) environment.

By using the EIP, clients do not need to be aware that the access destination server has been switched when a failover occurs or a group is moved.

An AWS Elastic IP resource, an AWS Virtual IP resource, and an AWS DNS resource can be used together.

HA cluster with EIP control

This is used to place instances on public subnets (release business operations inside the VPC).

A configuration such as the following is assumed: Instances to be clustered are placed on public subnets in each Availability Zone (referred to as AZ), and each instance can access the Internet via the gateway.

3.16.4. Applying environment variables to AWS CLI run from the AWS Elastic IP resource

Specify environment variables in the environment variable configuration file to apply environment variables to the AWS CLI run from the AWS Elastic IP resource, AWS Virtual IP resource, AWS Elastic IP monitor resource and AWS AZ monitor resource.

This feature is useful when using a proxy server in an AWS environment.

The environment variable configuration file is stored in the following location.

<EXPRESSCLUSTER Installation path>/cloud/aws/clpaws_setting.conf

The format of the environment variable configuration file is as follows:

Environment variable name = Value

(Example)

[ENVIRONMENT]
HTTP_PROXY = http://10.0.0.1:3128
HTTPS_PROXY = http://10.0.0.1:3128

To specify multiple values for a parameter, enter them in comma-delimited format. The following shows an example of specifying more than one non-destination for the environment variable NO_PROXY:

(Example)

NO_PROXY = 169.254.169.254,ec2.ap-northeast-1.amazonaws.com

The specifications of the environment variable configuration file are as follows:

  • Write [ENVIRONMENT] on the first line. If this is not set, the environment variables will not be set.

  • If the environment variable configuration file does not exist or you do not have read permission for the file, the variables are ignored. This does not cause an activation failure or a monitor error.

  • If the same environment variables already exist in the file, the values are overwritten.

  • More than one environment variable can be set. Set one environment variable on each line.

  • The settings are valid regardless of whether there are spaces before and after "=" or not.

  • The settings are invalid if there is a space or tab in front of the environment variable name or if there are tabs before and after "=".

  • Environment variable names are case sensitive.

  • Even if a value contains spaces, you do not have to enclose the value in "" (double quotation marks).

  • The environment variables configured with the environment variable configuration file are propagated only to the AWS CLI executed from an AWS Elastic IP resource, an AWS Virtual IP resource, an AWS DNS resource, an AWS Elastic IP monitor resource, an AWS Virtual IP monitor resource, an AWS DNS monitor resource, and an AWS AZ monitor resource. Therefore, the configured variables are not propagated to any other script (e.g. a script before final action, a script before and after activation/deactivation, and a script to be run from EXEC resources). To execute the AWS CLI with such a script, configure necessary environment variables with the corresponding script.
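The file specifications listed above can be illustrated with a small parser. This is a sketch of the rules as written, not the product's parser: the function name is hypothetical, a tab on either side of "=" is treated as invalid (one reading of "tabs before and after"), and spaces at the end of a value are dropped, which the source does not specify.

```python
def parse_clpaws_setting(text):
    """Parse the contents of clpaws_setting.conf into a dict.

    Returns an empty dict when the first line is not [ENVIRONMENT].
    Lines that break the rules are skipped rather than reported.
    """
    lines = text.splitlines()
    if not lines or lines[0].rstrip() != "[ENVIRONMENT]":
        return {}
    env = {}
    for line in lines[1:]:
        if not line.strip() or "=" not in line:
            continue
        if line[0] in " \t":
            continue  # space or tab in front of the variable name
        name, value = line.split("=", 1)
        name = name.rstrip(" ")
        value = value.strip(" ")
        if name.endswith("\t") or value.startswith("\t"):
            continue  # tab next to "="
        if name:
            env[name] = value  # names are case sensitive; values may contain spaces
    return env


sample = (
    "[ENVIRONMENT]\n"
    "HTTP_PROXY = http://10.0.0.1:3128\n"
    "NO_PROXY=169.254.169.254,ec2.ap-northeast-1.amazonaws.com\n"
)
print(parse_clpaws_setting(sample))
```

Both "NAME = value" and "NAME=value" are accepted, matching the rule that spaces around "=" do not matter.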

3.16.5. Details tab

EIP ALLOCATION ID (Within 45 bytes)

For EIP control, specify the ID of the EIP to replace.

ENI ID (Within 45 bytes) Server Individual Setup

For EIP control, specify the ID of the ENI to which the EIP is allocated. On the Common tab, enter the ENI ID of one of the servers, and configure the other servers by using individual server settings.

AWS Elastic IP Resource Tuning Properties

Parameter tab

Timeout (1 to 999)

Set the timeout of the AWS CLI command to be executed for AWS Elastic IP resource activation/deactivation.
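The relationship between the two IDs and the AWS CLI can be sketched with the standard aws ec2 associate-address command. Whether EXPRESSCLUSTER issues exactly this command is an assumption; the function name and the sample IDs are hypothetical.

```python
def build_associate_address_command(allocation_id, eni_id):
    """Build an AWS CLI command that attaches the EIP to the given ENI.

    allocation_id: value of EIP ALLOCATION ID (within 45 bytes)
    eni_id:        value of ENI ID for the server being activated
                   (within 45 bytes)
    """
    for label, value in (("EIP ALLOCATION ID", allocation_id), ("ENI ID", eni_id)):
        if not value or len(value.encode("utf-8")) > 45:
            raise ValueError(label + " must be 1 to 45 bytes")
    return [
        "aws", "ec2", "associate-address",
        "--allocation-id", allocation_id,
        "--network-interface-id", eni_id,
        "--allow-reassociation",
    ]


# Hypothetical IDs for illustration only
print(" ".join(build_associate_address_command(
    "eipalloc-0123456789abcdef0", "eni-0123456789abcdef0")))
```

If such a command were run from a script, the Timeout value could be modeled by passing it as the timeout argument of subprocess.run.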

3.17. Understanding AWS Virtual IP resources

3.17.1. Dependencies of AWS Virtual IP resources

By default, this function does not depend on any group resource type.

3.17.2. What is an AWS Virtual IP resource?

Client applications can use an AWS Virtual IP address (referred to as the VIP) to access the VPC in the AWS environment.

By using the VIP, clients do not need to be aware that the access destination server has been switched when a failover occurs or a group is moved.

When an AWS Virtual IP resource is activated, the AWS CLI is executed to update the route table information.

An AWS Elastic IP resource, an AWS Virtual IP resource, and an AWS DNS resource can be used together.

HA cluster with VIP control

This is used to place instances on private subnets (release business operations inside the VPC).

A configuration such as the following is assumed: Instances to be clustered, as well as the instance group accessing the instances, are placed on private subnets in each Availability Zone (referred to as AZ), and each instance can access the Internet via the NAT instance placed on the public subnet.

3.17.4. Applying environment variables to AWS CLI run from the AWS Virtual IP resource

Specify environment variables in the environment variable configuration file to apply environment variables to the AWS CLI run from the AWS Elastic IP resource, AWS Virtual IP resource, AWS Elastic IP monitor resource and AWS AZ monitor resource.

This feature is useful when using a proxy server in an AWS environment.

The environment variable configuration file is stored in the following location.

<EXPRESSCLUSTER Installation path>/cloud/aws/clpaws_setting.conf

The format of the environment variable configuration file is as follows:

Environment variable name = Value

(Example)

[ENVIRONMENT]
HTTP_PROXY = http://10.0.0.1:3128
HTTPS_PROXY = http://10.0.0.1:3128

To specify multiple values for a parameter, separate them with commas. The following example specifies more than one excluded destination for the environment variable NO_PROXY:

(Example)

NO_PROXY = 169.254.169.254,ec2.ap-northeast-1.amazonaws.com

The specifications of the environment variable configuration file are as follows:

  • Write [ENVIRONMENT] on the first line. If this is not set, the environment variables will not be set.

  • If the environment variable configuration file does not exist or you do not have read permission for the file, the variables are ignored. This does not cause an activation failure or a monitor error.

  • If the same environment variables already exist in the file, the values are overwritten.

  • More than one environment variable can be set. Set one environment variable on each line.

  • The settings are valid regardless of whether there are spaces before and after "=" or not.

  • The settings are invalid if there is a space or tab in front of the environment variable name or if there are tabs before and after "=".

  • Environment variable names are case sensitive.

  • Even if a value contains spaces, you do not have to enclose the value in "" (double quotation marks).

  • The environment variables configured with the environment variable configuration file are propagated only to the AWS CLI executed from an AWS Elastic IP resource, an AWS Virtual IP resource, an AWS DNS resource, an AWS Elastic IP monitor resource, an AWS Virtual IP monitor resource, an AWS DNS monitor resource, and an AWS AZ monitor resource. Therefore, the configured variables are not propagated to any other script (e.g. a script before final action, a script before and after activation/deactivation, and a script to be run from EXEC resources). To execute the AWS CLI with such a script, configure necessary environment variables with the corresponding script.
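The file specifications above can be sketched as a small parser. This is not the parser used by EXPRESSCLUSTER; it is a minimal illustration of the stated rules. The function names are hypothetical, and it assumes that a tab on either side of "=" invalidates the line and that a later line overrides an earlier one with the same name.

```python
def parse_env_text(text):
    """Parse configuration text into a dict, following the rules listed above."""
    lines = text.splitlines()
    # The first line must be [ENVIRONMENT]; otherwise no variables are set.
    if not lines or lines[0].rstrip() != "[ENVIRONMENT]":
        return {}
    variables = {}
    for line in lines[1:]:
        # Skip lines without "=" and lines that start with a space or tab.
        if "=" not in line or line[0] in " \t":
            continue
        name, value = line.split("=", 1)
        if not name.strip():
            continue
        # Whitespace directly before and after "=": spaces are fine, tabs are not.
        before = name[len(name.rstrip(" \t")):]
        after = value[:len(value) - len(value.lstrip(" \t"))]
        if "\t" in before + after:
            continue
        # Names are case sensitive, so they are stored exactly as written.
        variables[name.rstrip(" ")] = value.strip(" ")
    return variables


def load_env_file(path):
    """Read the file; a missing or unreadable file yields no variables."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_env_text(handle.read())
    except OSError:
        return {}


sample = (
    "[ENVIRONMENT]\n"
    "HTTP_PROXY = http://10.0.0.1:3128\n"
    "NO_PROXY = 169.254.169.254,ec2.ap-northeast-1.amazonaws.com\n"
)
print(parse_env_text(sample))
```

A script that runs the AWS CLI itself (for example, from an EXEC resource) could merge the returned dictionary into a copy of os.environ before starting the command, since the configured variables are not propagated to such scripts automatically.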

3.17.5. Details tab

IP Address (Within 45 bytes)

For VIP control, specify the VIP address to use. As the VIP address, an IP address not belonging to a CIDR in the VPC must be specified.
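The requirement that the VIP lie outside every CIDR of the VPC can be checked before the value is entered. The following is a minimal sketch using the Python standard library; the function name and the sample addresses are hypothetical.

```python
import ipaddress


def vip_is_outside_vpc(vip, vpc_cidrs):
    """Return True if the VIP does not belong to any of the given VPC CIDRs."""
    address = ipaddress.ip_address(vip)
    return all(address not in ipaddress.ip_network(cidr) for cidr in vpc_cidrs)


# Hypothetical VPC with a single CIDR block.
print(vip_is_outside_vpc("10.1.0.20", ["10.0.0.0/16"]))
print(vip_is_outside_vpc("10.0.1.20", ["10.0.0.0/16"]))
```

The first call reports an address that is acceptable as a VIP for this VPC; the second reports one that is not, because it falls inside the VPC CIDR.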

VPC ID (Within 45 bytes) Server Individual Setup

For VIP control, specify the VPC ID to which the server belongs. To specify an individual VPC ID for the servers, enter the VPC ID of any server on the Common tab and specify a VPC ID for the other servers individually.
For how to configure the routing, see the following:

" Configuring the VPC Environment" in the "EXPRESSCLUSTER X HA Cluster Configuration Guide for Amazon Web Services (Linux)"

ENI ID (Within 45 bytes) Server Individual Setup

For VIP control, specify the ENI ID of the VIP routing destination. Source/Dest. Check must be disabled beforehand for the specified ENI. This must be set for each server: on the Common tab, enter the ENI ID of any server, and specify an ENI ID for the other servers individually.

AWS Virtual IP Resource Tuning Properties

Parameter tab

Timeout (1 to 999)

Set the timeout of the AWS CLI command to be executed for AWS Virtual IP resource activation/deactivation.

3.18. Understanding AWS DNS resources

3.18.1. Dependencies of AWS DNS resources

By default, this function does not depend on any group resource type.

3.18.2. What is an AWS DNS resource?

An AWS DNS resource registers an IP address corresponding to the virtual host name (DNS name) used in Amazon Web Services (hereinafter, referred to as "AWS") by executing AWS CLI at activation, and deletes it by executing AWS CLI at deactivation.

A client can access the node on which failover groups are active with the virtual host name.

By using AWS DNS resources, clients do not need to be aware that the access destination node has switched when a failover or a group migration occurs.

An AWS Elastic IP resource, an AWS Virtual IP resource, and an AWS DNS resource can be used together.

To use AWS DNS resources, make the following preparations before establishing a cluster.

  • Creating Hosted Zone of Amazon Route 53

  • Installing AWS CLI

3.18.3. Notes on AWS DNS resources

3.18.4. Applying environment variables to AWS CLI run from the AWS DNS resource

Specify environment variables in the environment variable configuration file to apply environment variables to the AWS CLI run from the AWS Elastic IP resource, AWS Virtual IP resource, AWS Elastic IP monitor resource and AWS AZ monitor resource.

This feature is useful when using a proxy server in an AWS environment.

The environment variable configuration file is stored in the following location.

<EXPRESSCLUSTER Installation path>/cloud/aws/clpaws_setting.conf

The format of the environment variable configuration file is as follows:

Environment variable name = Value

(Example)

[ENVIRONMENT]
HTTP_PROXY = http://10.0.0.1:3128
HTTPS_PROXY = http://10.0.0.1:3128

To specify multiple values for a parameter, separate them with commas. The following example specifies more than one excluded destination for the environment variable NO_PROXY:

(Example)

NO_PROXY = 169.254.169.254,ec2.ap-northeast-1.amazonaws.com

The specifications of the environment variable configuration file are as follows:

  • Write [ENVIRONMENT] on the first line. If this is not set, the environment variables will not be set.

  • If the environment variable configuration file does not exist or you do not have read permission for the file, the variables are ignored. This does not cause an activation failure or a monitor error.

  • If the same environment variables already exist in the file, the values are overwritten.

  • More than one environment variable can be set. Set one environment variable on each line.

  • The settings are valid regardless of whether there are spaces before and after "=" or not.

  • The settings are invalid if there is a space or tab in front of the environment variable name or if there are tabs before and after "=".

  • Environment variable names are case sensitive.

  • Even if a value contains spaces, you do not have to enclose the value in "" (double quotation marks).

  • The environment variables configured with the environment variable configuration file are propagated only to the AWS CLI executed from an AWS Elastic IP resource, an AWS Virtual IP resource, an AWS DNS resource, an AWS Elastic IP monitor resource, an AWS Virtual IP monitor resource, an AWS DNS monitor resource, and an AWS AZ monitor resource. Therefore, the configured variables are not propagated to any other script (e.g. a script before final action, a script before and after activation/deactivation, and a script to be run from EXEC resources). To execute the AWS CLI with such a script, configure necessary environment variables with the corresponding script.

3.18.5. Details tab

Hosted Zone ID (within 255 bytes)

Specify a Hosted Zone ID of Amazon Route 53.

Resource Record Set Name (within 255 bytes)

Specify the name of the DNS A record. Put a dot (.) at the end of the name. If Resource Record Set Name contains an escape character, a monitor error occurs, so specify a name without escape characters. Specify the value in lowercase letters.
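The naming rules for Resource Record Set Name can be checked with a short script before the value is entered. The following is a sketch only: the function name is hypothetical, and it assumes that "escape character" means a backslash-style escape, which this manual does not define.

```python
def check_record_set_name(name):
    """Return a list of rule violations for a Resource Record Set Name."""
    problems = []
    if len(name.encode("utf-8")) > 255:
        problems.append("longer than 255 bytes")
    if not name.endswith("."):
        problems.append("does not end with a dot")
    if name != name.lower():
        problems.append("contains uppercase letters")
    # Assumption: an escape character is introduced by a backslash.
    if "\\" in name:
        problems.append("contains an escape character")
    return problems


# Hypothetical record names.
print(check_record_set_name("srv.example.com."))
print(check_record_set_name("Srv.example.com"))
```

An empty list means that the name satisfies all of the rules checked here.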

IP Address (within 39 bytes) Server Individual Setup

Specify the IPv4 address corresponding to the virtual host name (DNS name). To use the IP address of each server, enter the IP address on the tab of each server. To configure a setting for each server, enter the IP address of an arbitrary server on the Common tab, and configure individual settings for the other servers.

TTL (0 to 2147483647)

Specify the time to live (TTL) of the cache.

Delete a record set at deactivation

  • When the check box is selected (default):
    The record set is deleted when it is deactivated.
  • When the check box is not selected:
    The record set is not deleted when it is deactivated. If it is not deleted, the remaining virtual host name (DNS name) may be accessed from a client.
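The settings on this tab correspond closely to the fields of an Amazon Route 53 change batch. The following sketch shows how an A record could be registered at activation and removed at deactivation through "aws route53 change-resource-record-sets". This manual does not state the exact command that the resource issues, so the command choice, the function names, and the sample values are assumptions.

```python
import json


def build_change_batch(action, record_name, ip_address, ttl):
    """Build a Route 53 change batch for one A record ("UPSERT" or "DELETE")."""
    return {
        "Changes": [{
            "Action": action,
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "A",
                "TTL": ttl,
                "ResourceRecords": [{"Value": ip_address}],
            },
        }]
    }


def build_route53_command(hosted_zone_id, change_batch):
    """Return an argv list for the AWS CLI call that applies the change batch."""
    return [
        "aws", "route53", "change-resource-record-sets",
        "--hosted-zone-id", hosted_zone_id,
        "--change-batch", json.dumps(change_batch),
    ]


def deactivation_command(hosted_zone_id, record_name, ip_address, ttl,
                         delete_record_set=True):
    """Return the delete command, or None when the record set is to be kept."""
    if not delete_record_set:
        return None
    # A DELETE change must match the values of the existing record set.
    batch = build_change_batch("DELETE", record_name, ip_address, ttl)
    return build_route53_command(hosted_zone_id, batch)


# Hypothetical Hosted Zone ID, record name, address, and TTL.
batch = build_change_batch("UPSERT", "srv.example.com.", "10.0.0.10", 300)
print(build_route53_command("Z0000000EXAMPLE", batch))
```

When the check box is not selected, deactivation_command returns None and the record remains, which is why a client may still resolve the virtual host name after deactivation.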

Tuning

Opens the AWS DNS Resource Tuning Properties dialog box where you can make detailed settings for the AWS DNS resource.

AWS DNS Resource Tuning Properties

Parameter tab

Timeout (1 to 999)

Set the timeout of the AWS CLI command executed for the activation and/or deactivation of the AWS DNS resource.

3.19. Understanding Azure probe port resources

3.19.1. Dependencies of Azure probe port resources

By default, this function does not depend on any group resource type.

3.19.2. What is an Azure probe port resource?

Client applications can use the global IP address called a public virtual IP (VIP) address (referred to as a VIP in the remainder of this document) to access virtual machines on an availability set in the Microsoft Azure environment.

By using the VIP, clients do not need to be aware that the access destination server has switched when a failover or a group migration occurs.

To access the cluster created on the Microsoft Azure environment in the figure above, specify the end point for communicating from the outside with VIP or the end point for communicating from the outside with the DNS name. The active and standby nodes of the cluster are switched by controlling the Microsoft Azure load balancer (Load Balancer in the figure above) from EXPRESSCLUSTER. For control, Health Check is used.

At activation, the probe port control process is started to wait for alive monitoring (access to the probe port) from the Azure load balancer.

At deactivation, the probe port control process that waits for alive monitoring (access to the probe port) is stopped.
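The activation and deactivation behavior described above can be pictured as a TCP listener that exists only while the resource is active, so that the load balancer probe succeeds only on the active node. The following sketch is not the EXPRESSCLUSTER probe port control process; the class name and its structure are hypothetical.

```python
import socket
import threading


class ProbePortListener:
    """Accepts TCP probes on a port while started; refuses them once stopped."""

    def __init__(self, port, host="0.0.0.0"):
        self._address = (host, port)
        self._server = None
        self._thread = None
        self._stopping = threading.Event()

    @property
    def port(self):
        """Port actually bound (useful when port 0 was requested)."""
        return self._server.getsockname()[1]

    def start(self):
        """Corresponds to activation: open the port and wait for probes."""
        self._server = socket.create_server(self._address)
        self._server.settimeout(0.2)
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        while not self._stopping.is_set():
            try:
                connection, _ = self._server.accept()
            except socket.timeout:
                continue  # No probe yet; check the stop flag again.
            except OSError:
                break
            connection.close()  # A completed TCP connection is the probe.

    def stop(self):
        """Corresponds to deactivation: stop waiting and close the port."""
        self._stopping.set()
        self._thread.join()
        self._server.close()
```

For example, ProbePortListener(26001).start() would open a hypothetical probe port 26001 on the active node; calling stop() on that instance closes it again, after which a TCP probe to that port is refused.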

Azure probe port resources also support the Internal Load Balancing of Microsoft Azure. For Internal Load Balancing, the VIP is the private IP address of Azure.

3.19.3. Notes on Azure probe port resources

3.19.4. Details tab

Probeport (1 to 65535)

Specify the port number used by the Azure load balancer for the alive monitoring of each server. Specify the value specified for Probe Port when creating an end point. For Probe Protocol, specify TCP.

Tuning

Displays the Azure Probe Port Resource Tuning Properties dialog box, where you can specify detailed settings for the Azure probe port resource.

Azure Probe Port Resource Tuning Properties

Parameter tab

Probe wait timeout (5 to 999999999)

Specify the timeout for waiting for alive monitoring from the Azure load balancer. This is used to check whether the Azure load balancer performs alive monitoring periodically.

3.20. Understanding Azure DNS resources

3.20.1. Dependencies of Azure DNS resources

By default, this function does not depend on any group resource type.

3.20.2. What is an Azure DNS resource?

An Azure DNS resource controls an Azure DNS record set and DNS A record so that the configured IP address can be obtained from the virtual host name (DNS name).

A client can access the node on which failover groups are active with the virtual host name.

By using Azure DNS resources, clients do not need to be aware that the access destination node on Azure DNS has switched when a failover or a group migration occurs.

To use Azure DNS resources, make the following preparations before establishing a cluster. For details, see "EXPRESSCLUSTER X HA Cluster Configuration Guide for Microsoft Azure (Linux)".

  • Creating Microsoft Azure Resource Group and DNS zone

  • Installing Azure CLI

    Use Azure CLI (Azure CLI 1.0) for Red Hat Enterprise Linux 6 and compatible operating systems.

    Use Azure CLI (Azure CLI 2.0) for Red Hat Enterprise Linux 7 and compatible operating systems.

  • Installing Python (only when Azure CLI 2.0 is used)

3.20.3. Notes on Azure DNS resources

3.20.4. Details tab

Record Set Name (within 253 bytes)

Specify the name of the record set in which Azure DNS A record is registered.

Zone Name (within 253 bytes)

Specify the name of the DNS zone to which the record set of Azure DNS belongs.

IP Address (within 39 bytes) Server Individual Setup

Specify the IPv4 address corresponding to the virtual host name (DNS name). To use the IP address of each server, enter the IP address on the tab of each server. To configure a setting for each server, enter the IP address of an arbitrary server on the Common tab, and configure individual settings for the other servers.

TTL (0 to 2147483647)

Specify the time to live (TTL) of the cache.

Resource Group Name (within 180 bytes)

Specify the name of Microsoft Azure Resource Group to which the DNS zone belongs.

User URI (within 2083 bytes)

Specify the user URI to log on to Microsoft Azure.

Tenant ID (within 36 bytes)

Specify the tenant ID to log on to Microsoft Azure.

File Path of Service Principal (within 1023 bytes)

Specify the file name of the service principal used to log in to Microsoft Azure (the file name of the credential). Specify it with an absolute path.

Thumbprint of Service Principal (within 256 bytes)

Specify the service principal to log in to Microsoft Azure (Thumbprint on Certificate). Enter only when using Azure CLI 1.0.

Azure CLI File Path (within 1023 bytes)

Specify the installation path of Azure CLI and the file name. Specify with an absolute path.

Delete a record set at deactivation

  • When the check box is selected (default):
    The record set is deleted when it is deactivated.
  • When the check box is not selected:
    The record set is not deleted when it is deactivated. If it is not deleted, the remaining virtual host name (DNS name) may be accessed from a client.
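To show how the Details tab values relate to Azure CLI operations, the following sketch builds Azure CLI 2.0 command lines that add an A record to a record set and delete the record set. This manual does not state the commands that the resource runs, the login step using the service principal is omitted, and Azure CLI 1.0 uses a different syntax; the function names and sample values are hypothetical.

```python
def build_add_record_command(az_path, resource_group, zone_name,
                             record_set_name, ip_address):
    """Return an argv list that adds an A record (Azure CLI 2.0 syntax)."""
    return [
        az_path, "network", "dns", "record-set", "a", "add-record",
        "--resource-group", resource_group,
        "--zone-name", zone_name,
        "--record-set-name", record_set_name,
        "--ipv4-address", ip_address,
    ]


def build_delete_record_set_command(az_path, resource_group, zone_name,
                                    record_set_name):
    """Return an argv list that deletes the A record set without prompting."""
    return [
        az_path, "network", "dns", "record-set", "a", "delete",
        "--resource-group", resource_group,
        "--zone-name", zone_name,
        "--name", record_set_name,
        "--yes",
    ]


# Hypothetical CLI path, resource group, zone, record set, and address.
print(build_add_record_command(
    "/usr/bin/az", "cluster-rg", "example.com", "srv", "10.0.0.10"
))
```

The first argument plays the role of Azure CLI File Path, which is why it is passed as an absolute path rather than resolved through PATH.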

Tuning

Opens the Azure DNS Resource Tuning Properties dialog box where you can make detailed settings for the Azure DNS resource.

Server separate setting

Opens the Server Separate Setting dialog box, where you can set a different IP address for each server.

Azure DNS Resource Tuning Properties

Parameter tab

Timeout (1 to 999)

Set the timeout of the Azure CLI command executed for the activation and/or deactivation of the Azure DNS resource.

3.21. Understanding Google Cloud Virtual IP resources

3.21.1. Dependencies of Google Cloud Virtual IP resources

By default, this function does not depend on any group resource type.

3.21.2. What is a Google Cloud Virtual IP resource?

For virtual machines in the Google Cloud Platform environment, client applications can use a virtual IP (VIP) address to connect to the node that constitutes a cluster. Using the VIP address eliminates the need for clients to be aware of switching between the virtual machines even after a failover or a group migration occurs.

To access the cluster created in the Google Cloud Platform environment as in the figure above, specify the port for communicating from the outside as well as the VIP address or DNS name. The active and standby nodes of the cluster are switched by controlling the load balancer of Google Cloud Platform (Cloud Load Balancing in the figure above) from EXPRESSCLUSTER. For this control, Health Check (in the figure above) is used.

At activation, start the control process for awaiting a health check from the load balancer of Google Cloud Platform, and open the port specified in Port Number.

At deactivation, stop the control process for awaiting the health check, and close the port specified in Port Number.

Google Cloud virtual IP resources support the internal load balancing of Google Cloud Platform.

3.21.3. Notes on Google Cloud Virtual IP resources

3.21.4. Details tab

Port Number (1 to 65535)

Specify a port number to be used by the load balancer of Google Cloud Platform for the health check of each node: the value specified as the port number in configuring the load balancer for health checks. For the load balancer, specify TCP load balancing.

Tuning

Displays the Google Cloud Virtual IP Resource Tuning Properties dialog box, where you can make advanced settings for the Google Cloud virtual IP resource.

Google Cloud Virtual IP Resource Tuning Properties

Health check timeout (5 to 999999999)

Specify a timeout value for awaiting a health check from the load balancer of Google Cloud Platform, in order to check whether the load balancer periodically performs health checks.
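The purpose of this timeout can be illustrated with a small watchdog that records when the last health check arrived and reports when none has been seen for longer than the configured value. This is a sketch only: the class is hypothetical, and it assumes the value is in seconds, which this section does not state.

```python
import time


class HealthCheckWatchdog:
    """Tracks whether health checks keep arriving within a timeout."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # Injectable so the logic can be tested.
        self._last_seen = clock()

    def record_health_check(self):
        """Call this each time the load balancer connects to the port."""
        self._last_seen = self._clock()

    def timed_out(self):
        """Return True if no health check arrived within the timeout."""
        return self._clock() - self._last_seen > self._timeout


# Hypothetical timeout of 30 seconds.
watchdog = HealthCheckWatchdog(30)
watchdog.record_health_check()
print(watchdog.timed_out())
```

A monotonic clock is used so that adjustments to the system time do not cause a false timeout.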

3.22. Understanding Oracle Cloud Virtual IP resources

3.22.1. Dependencies of Oracle Cloud Virtual IP resources

By default, this function does not depend on any group resource type.

3.22.2. What is an Oracle Cloud Virtual IP resource?

For virtual machines in the Oracle Cloud Infrastructure environment, client applications can use a public virtual IP (VIP) address to connect to the node that constitutes a cluster. Using the VIP address eliminates the need for clients to be aware of switching between the virtual machines even after a failover or a group migration occurs.

To access the cluster created in the Oracle Cloud Infrastructure environment as in the figure above, specify the port for communicating from the outside as well as the VIP (global IP) address or DNS name. The active and standby nodes of the cluster are switched by controlling the load balancer of Oracle Cloud Infrastructure (Load Balancer in the figure above) from EXPRESSCLUSTER. For this control, Health Check (in the figure above) is used.

At activation, start the control process for awaiting a health check from the load balancer of Oracle Cloud Infrastructure, and open the port specified in Port Number.

At deactivation, stop the control process for awaiting the health check, and close the port specified in Port Number.

Oracle Cloud virtual IP resources also support private load balancers of Oracle Cloud Infrastructure. For a private load balancer, the VIP address is the private IP address of Oracle Cloud Infrastructure.

3.22.3. Notes on Oracle Cloud Virtual IP resources

3.22.4. Details tab

Port Number (1 to 65535)

Specify a port number to be used by the load balancer of Oracle Cloud Infrastructure for the health check of each node: the value specified as the port number in configuring the backend set for health checks. For the health check protocol, specify TCP.

Tuning

Displays the Oracle Cloud Virtual IP Resource Tuning Properties dialog box, where you can make advanced settings for the Oracle Cloud virtual IP resource.

Oracle Cloud Virtual IP Resource Tuning Properties

Health check timeout (5 to 999999999)

Specify a timeout value for awaiting a health check from the load balancer of Oracle Cloud Infrastructure, in order to check whether the load balancer periodically performs health checks.