1. Preface¶
1.1. Who Should Use This Guide¶
EXPRESSCLUSTER X Getting Started Guide is intended for first-time users of the EXPRESSCLUSTER. The guide covers topics such as product overview of the EXPRESSCLUSTER, how the cluster system is installed, and the summary of other available guides. In addition, latest system requirements and restrictions are described.
1.2. How This Guide is Organized¶
2. What is a cluster system?: Helps you to understand the overview of the cluster system.
3. Using EXPRESSCLUSTER: Provides instructions on how to use EXPRESSCLUSTER and other related information.
4. Installation requirements for EXPRESSCLUSTER: Provides the latest information that needs to be verified before starting to use EXPRESSCLUSTER.
5. Latest version information: Provides information on the latest version of EXPRESSCLUSTER.
6. Notes and Restrictions: Provides information on known problems and restrictions.
1.3. EXPRESSCLUSTER X Documentation Set¶
The EXPRESSCLUSTER X manuals consist of the following four guides. The title and purpose of each guide are described below:
Getting Started Guide
This guide is intended for all users. The guide covers topics such as product overview, system requirements, and known problems.
Installation and Configuration Guide
This guide is intended for system engineers and administrators who want to build, operate, and maintain a cluster system. Instructions for designing, installing, and configuring a cluster system with EXPRESSCLUSTER are covered in this guide.
Reference Guide
This guide is intended for system administrators. The guide covers topics such as how to operate EXPRESSCLUSTER, the function of each module, and troubleshooting. The guide is a supplement to the "Installation and Configuration Guide".
Maintenance Guide
This guide is intended for administrators and for system administrators who want to build, operate, and maintain EXPRESSCLUSTER-based cluster systems. The guide describes maintenance-related topics for EXPRESSCLUSTER.
1.4. Conventions¶
In this guide, Note, Important, and See also are used as follows:
Note
Used when the information given is important, but not related to data loss or damage to the system and machine.
Important
Used when the information given is necessary to avoid data loss or damage to the system and machine.
See also
Used to describe the location of the information given at the reference destination.
The following conventions are used in this guide.
Convention | Usage | Example |
---|---|---|
Bold | Indicates graphical objects, such as fields, list boxes, menu selections, buttons, labels, icons, etc. | In User Name, type your name. On the File menu, click Open Database. |
Angled bracket within the command line | Indicates that the value specified inside of the angled bracket can be omitted. | |
Monospace | Indicates path names, commands, system output (message, prompt, etc.), directory, file names, functions and parameters. | |
bold | Indicates the value that a user actually enters from a command line. | Enter the following: clpcl -s -a |
italic | Indicates that users should replace italicized part with values that they are actually working with. | |
In the figures of this guide, this icon represents EXPRESSCLUSTER.
1.5. Contacting NEC¶
For the latest product information, visit our website below:
2. What is a cluster system?¶
This chapter provides an overview of the cluster system.
2.1. Overview of the cluster system¶
A key to success in today's computerized world is to provide services without interruption. A single machine going down due to a failure or overload can stop all of the services you provide to customers. This results not only in enormous damage but also in a loss of the credibility you once had.
Introducing a cluster system allows you to minimize the period during which your system stops (down time) or to improve availability by load distribution.
As the word "cluster" suggests, a cluster system aims to increase reliability and performance by clustering a group (or groups) of multiple computers. There are various types of cluster systems, which can be classified into the three types listed below. EXPRESSCLUSTER is categorized as a high availability cluster.
- High Availability (HA) Cluster: In this cluster configuration, one server operates as an active server. When the active server fails, a stand-by server takes over the operation. This cluster configuration aims for high availability. The high availability cluster is available in the shared disk type and the mirror disk type.
- Load Distribution Cluster: This is a cluster configuration where requests from clients are allocated to each of the nodes according to appropriate load distribution rules. This cluster configuration aims for high scalability. Generally, data cannot be passed. The load distribution cluster is available in a load balance type or parallel database type.
- High Performance Computing (HPC) Cluster: This is a cluster configuration where the computation amount is huge and a single operation is performed as on a supercomputer. The CPUs of all nodes are used to perform a single operation.
2.2. High Availability (HA) cluster¶
To enhance the availability of a system, it is generally considered that having redundancy for components of the system and eliminating a single point of failure is important. "Single point of failure" is a weakness of having a single computer component (hardware component) in the system. If the component fails, it will cause interruption of services. The high availability (HA) cluster is a cluster system that minimizes the time during which the system is stopped and increases operational availability by establishing redundancy with multiple nodes.
The HA cluster is called for in mission-critical systems where downtime is fatal. The HA cluster can be divided into two types: shared disk type and mirror disk type. The explanation for each type is provided below.
2.2.2. Mirror disk type¶
The shared disk type cluster system is good for large-scale systems. However, creating a system with this type can be costly because shared disks are generally expensive. The mirror disk type cluster system provides the same functions as the shared disk type with smaller cost through mirroring of server disks.
The mirror disk type is not recommended for large-scale systems that handle a large volume of data since data needs to be mirrored between servers.
When a write request is made by an application, the data mirror engine writes data in the local disk and sends the written data to the stand-by server via the interconnect. Interconnect is a cable connecting servers. It is used to monitor whether the server is activated or not in the cluster system. In addition to this purpose, interconnect is sometimes used to transfer data in the data mirror type cluster system. The data mirror engine on the stand-by server achieves data synchronization between stand-by and active servers by writing the data into the local disk of the stand-by server.
For read requests from an application, data is simply read from the disk on the active server.
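The write and read paths described above can be illustrated with a minimal sketch. It is not the EXPRESSCLUSTER data mirror engine: the class names `MirrorNode` and `MirrorEngine` are hypothetical, each disk is modeled as an in-memory dictionary, and a direct method call stands in for the transfer over the interconnect.

```python
class MirrorNode:
    """One server with a local disk, modeled as a block number -> bytes dict."""

    def __init__(self, name):
        self.name = name
        self.local_disk = {}

    def apply_write(self, block, data):
        # Write the data into this server's local disk.
        self.local_disk[block] = data


class MirrorEngine:
    """Toy data mirror engine running on the active server (hypothetical)."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = standby

    def write(self, block, data):
        # 1. Write to the local disk of the active server.
        self.active.apply_write(block, data)
        # 2. Send the written data to the stand-by server. A direct method
        #    call stands in for the transfer over the interconnect.
        self.standby.apply_write(block, data)

    def read(self, block):
        # Reads are served from the disk on the active server only.
        return self.active.local_disk[block]


active = MirrorNode("server1")
standby = MirrorNode("server2")
engine = MirrorEngine(active, standby)
engine.write(0, b"order-1001")
```

After the write, both dictionaries hold the same block, which is the property that lets the stand-by server take over the data.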
Snapshot backup is an applied use of data mirroring. Because the data mirror type cluster system holds the same data in two locations, you can keep the data of the stand-by server as a snapshot backup simply by separating that server from the cluster.
HA cluster mechanism and problems
The following sections describe cluster implementation and related problems.
2.3. System configuration¶
In a shared disk-type cluster, a disk array device is shared between the servers in a cluster. When an error occurs on a server, the standby server takes over the applications using the data on the shared disk.
In the mirror disk type cluster, a data disk on the cluster server is mirrored via the network. When an error occurs on a server, the applications are taken over using the mirror data on the stand-by server. Data is mirrored for every I/O. Therefore, from the viewpoint of a high-level application, the mirror disk type cluster appears the same as the shared disk.
The following is the shared disk type cluster configuration.
A failover-type cluster can be divided into the following categories depending on the cluster topologies:
Uni-Directional Standby Cluster System
In the uni-directional standby cluster system, the active server runs applications while the other server, the standby server, does not. This is the simplest cluster topology and you can build a high-availability system without performance degradation after failing over.
Multi-directional standby cluster system with the same application
In a multi-directional standby cluster system with the same application, the same application is activated on multiple servers, and these servers also operate as standby servers. Each instance of the application operates on its own. When a failover occurs, multiple instances of the same application are activated on one server. Therefore, the application used must be able to run in this way. When the application data can be split into multiple partitions, you can build a load distribution system on a per-partition basis by changing the server a client connects to depending on the data to be accessed.
Multi-directional standby cluster system with different applications
In the different application multi-directional standby cluster system, different applications are activated on multiple servers and these servers operate as standby servers. When a failover occurs, two or more applications are activated on one server. Therefore, these applications need to be able to coexist. You can build a load distribution system per application unit basis.
Application A and Application B are different applications.
N-to-N Configuration
The configuration can be expanded with more nodes by applying the configurations introduced thus far. In the N-to-N configuration described below, three different applications run on three servers, and one standby server takes over an application if any problem occurs. In a uni-directional standby cluster system, the stand-by server does not run anything, so one of the two servers functions only as a stand-by server. In an N-to-N configuration, however, only one of the four servers functions as a stand-by server. Performance deterioration is not anticipated if an error occurs on only one server.
2.4. Error detection mechanism¶
Cluster software executes failover (for example, passing operations) when a failure that can affect continued operation is detected. The following section gives you a quick view of how the cluster software detects a failure.
EXPRESSCLUSTER regularly checks whether other servers are properly working in the cluster system. This function is called "heartbeat communication."
Heartbeat and detection of server failures
Failures that must be detected in a cluster system are failures that can cause all servers in the cluster to stop. Server failures include hardware failures such as power supply and memory failures, and OS panic. To detect such failures, the heartbeat is used to monitor whether the server is active or not.
Some cluster software programs use heartbeat not only for checking if the target is active through ping response, but for sending status information on the local server. Such cluster software programs begin failover if no heartbeat response is received in heartbeat transmission, determining no response as server failure. However, grace time should be given before determining failure, since a highly loaded server can cause delay of response. Allowing grace period results in a time lag between the moment when a failure occurred and the moment when the failure is detected by the cluster software.
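The trade-off between a grace period and detection delay can be seen in a minimal sketch of timeout-based failure detection. This is a generic illustration, not EXPRESSCLUSTER code: the class `HeartbeatMonitor` and its methods are hypothetical, and the 30-second timeout is only an example value. Timestamps are passed in explicitly so that the behavior is easy to follow.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat received from each server (hypothetical)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}

    def record_heartbeat(self, server, now):
        # Remember when this server was last heard from.
        self.last_seen[server] = now

    def failed_servers(self, now):
        # A server is treated as failed only after the grace period has passed
        # without any heartbeat, so a slow but living server is not failed over.
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.record_heartbeat("server2", now=0)
print(monitor.failed_servers(now=25))  # [] : still within the grace period
print(monitor.failed_servers(now=31))  # ['server2'] : timeout exceeded
```

If the server actually stopped right after its heartbeat at time 0, the failure is not noticed until time 30 has passed. This is the time lag described above.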
Detection of resource failures
Factors that stop operations are not limited to the stop of all servers in the cluster. Failure in disks used by applications, NIC failure, and failure in applications themselves can also stop operations. These resource failures need to be detected as well so that failover can be executed for improved availability.
If the target is a physical device, resource failures are detected by actually accessing the target resource. For monitoring applications, in addition to monitoring whether application processes are active, trying to access service ports within a range that does not affect operation is a way of detecting an error.
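As an illustration of checking a service port, the following sketch opens a TCP connection and closes it immediately, which keeps the load on the service small. The function name `service_port_is_open` is hypothetical, and a successful connection only shows that something is listening; it does not prove that the application returns correct results.

```python
import socket


def service_port_is_open(host, port, timeout_seconds=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError on refusal, timeout, or name errors.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        return False
```

A monitor would call this at a regular interval, for example `service_port_is_open("10.0.0.12", 80)` with an address and port chosen for your environment, and treat repeated failures as an error.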
2.4.2. Network partition (Split-Brain Syndrome)¶
When all interconnects between servers are disconnected, it is not possible to tell whether a server is down only by monitoring whether it is active through a heartbeat. In this state, if a failover is performed on the assumption that the other server has shut down and multiple servers mount a file system simultaneously, data on the shared disk may be corrupted.
The problem explained in the section above is referred to as "network partition" or "Split Brain Syndrome." To resolve this problem, the failover cluster system is equipped with various mechanisms to ensure shared disk lock at the time when all interconnects are disconnected.
2.5. Inheriting cluster resources¶
As mentioned earlier, resources to be managed by a cluster include disks, IP addresses, and applications. The functions used in the failover cluster system to inherit these resources are described below.
2.5.1. Inheriting data¶
In the shared disk type cluster, data to be passed from one server to another in a cluster system is stored in a partition on a shared disk. This means that inheriting data amounts to re-mounting, from a healthy server, the file system containing the files that the application uses. All the cluster software has to do is mount the file system, because the shared disk is physically connected to the server that inherits the data.
The diagram above (Figure 2.16 Inheriting data) may look simple. Consider the following issues in designing and creating a cluster system.
One issue to consider is recovery time for a file system or database. A file to be inherited may have been in use by another server, or may have been in the middle of an update, just before the failure occurred. For this reason, a cluster system may need to run consistency checks on the data it is moving on some file systems, and it may need to roll back data for some database systems. These checks are not specific to cluster systems; they are required in many recovery processes, including when you reboot a single server that has been shut down due to a power failure. If this recovery takes a long time, that time is added in full to the time for failover (time to take over operation), and this reduces system availability.
Another issue you should consider is write assurance. When an application writes data to the shared disk, the data is usually written through a file system. However, if the application has written the data but the file system has only stored it in a disk cache and not yet written it to the shared disk, the data in the disk cache is not inherited by the stand-by server when the active server shuts down. For this reason, important data that needs to be inherited by a stand-by server must be written to the disk by using a function such as synchronous writing. This is the same as preventing data from being lost when a single server shuts down. In other words, only the data recorded on the shared disk is inherited by a stand-by server; data held in memory, such as in a disk cache, is not. The cluster system needs to be configured with these issues in mind.
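One common way for an application to request synchronous writing is to flush its own buffers and then ask the operating system to write its cache to the device. The sketch below shows this with Python's standard `os.fsync`; the function name `write_durably` is hypothetical, and whether the data truly reaches stable storage also depends on the file system and the disk hardware.

```python
import os


def write_durably(path, data):
    """Write data to path and return only after it has been pushed to the disk."""
    with open(path, "wb") as f:
        f.write(data)
        # Move the data from the application's buffer to the OS.
        f.flush()
        # Ask the OS to write its cache for this file to the device.
        os.fsync(f.fileno())
```

Data written this way is on the disk when the call returns, so it can be inherited by a stand-by server. Data that is only written with `f.write` may still be in a cache when the active server goes down.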
2.5.2. Inheriting IP addresses¶
By inheriting IP addresses, clients do not have to be concerned with which server is running operations when a failover occurs. The cluster software inherits the IP addresses for this purpose.
2.5.3. Inheriting applications¶
The last thing cluster software inherits is applications. Unlike fault tolerant computers (FTC), typical failover cluster systems do not inherit process state such as the contents of memory. The applications running on a failed server are inherited by rerunning them on a healthy server.
For example, when a database instance is failed over, the database started on the stand-by server cannot continue the exact processes and transactions that were running on the failed server, and transactions are rolled back in the same way as when restarting the database after it went down. Clients need to connect to the database again. The time needed for this database recovery is typically a few minutes, though it can be controlled to a certain extent by configuring the DBMS checkpoint interval.
Many applications can resume operations simply by being re-executed. Some applications, however, require recovery procedures if a failure occurs. For these applications, cluster software allows scripts to be started instead of applications so that the recovery process can be written. In a script, the recovery process, including cleanup of half-updated files, is written as necessary according to the reason the script is executed and information on the execution server.
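The idea of a start script that performs recovery only when needed can be sketched as follows. Everything here is an assumption made for illustration: the function `run_start_script`, the `start_factor` values "normal" and "failover", and the convention that half-updated files end in `.tmp` are all hypothetical and do not describe the script interface of EXPRESSCLUSTER.

```python
from pathlib import Path


def run_start_script(start_factor, data_dir, start_application):
    """Clean up after a failure if necessary, then start the application.

    start_factor is a hypothetical value telling the script why it is being
    executed: "normal" for an ordinary start, "failover" after a failure.
    """
    removed = []
    if start_factor == "failover":
        # Assumption: the application writes updates to *.tmp files first, so
        # any that remain were left half updated by the failed server.
        for leftover in sorted(Path(data_dir).glob("*.tmp")):
            leftover.unlink()
            removed.append(leftover.name)
    start_application()
    return removed
```

The script branches on the reason it was executed, so an ordinary start skips the cleanup while a start after failover removes leftovers before the application runs.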
2.5.4. Summary of failover¶
To summarize the behavior of cluster software:
Detects a failure (heartbeat/resource monitoring)
Performs fencing (resolves a network partition (NP resolution) and disconnects the failed server)
Passes data
Passes IP addresses
Passes applications
Cluster software is required to complete each task quickly and reliably (see Figure 2.17 Failover time chart). Cluster software achieves high availability with due consideration of what has been described so far.
2.6. Eliminating single point of failure¶
Having a clear picture of the availability level required or aimed is important in building a high availability system. This means when you design a system, you need to study cost effectiveness of countermeasures, such as establishing a redundant configuration to continue operations and recovering operations within a short period, against various failures that can disturb system operations.
Single point of failure (SPOF), as described previously, is a component where failure can lead to stop of the system. In a cluster system, you can eliminate the system's SPOF by establishing server redundancy. However, components shared among servers, such as shared disk may become a SPOF. The key in designing a high availability system is to duplicate or eliminate this shared component.
A cluster system can improve availability, but a failover takes a few minutes to switch systems. That means the time for failover is a factor that reduces availability. Technical measures that improve the availability of a single server, such as ECC memory and a redundant power supply, are important; however, the sections that follow discuss solutions for the following three components, which are likely to become SPOFs.
Shared disk
Access path to the shared disk
LAN
2.6.3. LAN¶
In any systems that run services on a network, a LAN failure is a major factor that disturbs operations of the system. If appropriate settings are made, availability of cluster system can be increased through failover between nodes at NIC failures. However, a failure in a network device that resides outside the cluster system disturbs operation of the system.
In the case of the figure above, even if a NIC on the server fails, a failover keeps the access from the PC to the service on the server.
In the case of the figure above, if the router fails, the access from the PC to the service on the server cannot be maintained (the router becomes a SPOF).
LAN redundancy is a solution for device failures outside the cluster system and improves availability. You can apply the methods used for a single server to increase LAN availability. For example, a primitive way is to keep a spare network device with its power off and manually replace a failed device with this spare. Another way is to multiplex the network path through a redundant configuration of high-performance network devices and switch paths automatically. Yet another option is to use a driver that supports NIC redundant configuration, such as Intel's ANS driver.
Load balancing appliances and firewall appliances are also network devices that are likely to become SPOF. Typically, they allow failover configurations through standard or optional software. Having redundant configuration for these devices should be regarded as requisite since they play important roles in the entire system.
2.7. Operation for availability¶
2.7.1. Evaluation before starting operation¶
Given that many of the factors causing system troubles are said to be the product of incorrect settings or poor maintenance, evaluation before actual operation is important to realize a high availability system and its stable operation. Carrying out the following before actual operation of the system is key to improving availability:
Clarify and list failures, study actions to be taken against them, and verify effectiveness of the actions by creating dummy failures.
Conduct an evaluation according to the cluster life cycle and verify performance (such as in degenerated mode).
Arrange a guide for system operation and troubleshooting based on the evaluation mentioned above.
Having a simple design for a cluster system contributes to simplifying verification and improvement of system availability.
2.7.2. Failure monitoring¶
Despite the above efforts, failures still occur. If you use a system for a long time, you cannot escape failures: hardware deteriorates with age, and software produces failures and errors through memory leaks or operation beyond the originally intended capacity. Improving the availability of hardware and software is important, yet monitoring for failures and troubleshooting problems is more important. For example, in a cluster system, you can keep the system running at the cost of a few minutes for switching even if a server fails. However, if you leave the failed server as it is, the system no longer has redundancy, and the cluster system becomes meaningless should the next failure occur.
If a failure occurs, the system administrator must immediately take actions such as removing a newly emerged SPOF to prevent another failure. Functions for remote maintenance and reporting failures are very important in supporting services for system administration.
To achieve high availability with a cluster system, you should:
Remove or have complete control over single points of failure.
Have a simple design that has tolerance and resistance for failures, and be equipped with a guide for operation and troubleshooting.
Detect a failure quickly and take appropriate action against it.
3. Using EXPRESSCLUSTER¶
This chapter explains the components of EXPRESSCLUSTER, how to design a cluster system, and how to use EXPRESSCLUSTER.
3.1. What is EXPRESSCLUSTER?¶
EXPRESSCLUSTER is software that enables the HA cluster system.
3.2. EXPRESSCLUSTER modules¶
EXPRESSCLUSTER consists of the following two modules:
- EXPRESSCLUSTER Server: A core component of EXPRESSCLUSTER. Install this on the server machines that constitute the cluster system. This includes all high availability functions of EXPRESSCLUSTER. The server functions of the Cluster WebUI are also included.
- Cluster WebUI: This is a tool to create the configuration data of EXPRESSCLUSTER and to manage EXPRESSCLUSTER operations. It uses a Web browser as a user interface. The Cluster WebUI is installed in EXPRESSCLUSTER Server, but it is distinguished from the EXPRESSCLUSTER Server because the Cluster WebUI is operated from the Web browser on the management PC.
3.3. Software configuration of EXPRESSCLUSTER¶
The software configuration of EXPRESSCLUSTER should look similar to the figure below. Install the EXPRESSCLUSTER Server (software) on a server that constitutes a cluster. Because the main functions of Cluster WebUI are included in EXPRESSCLUSTER Server, it is not necessary to separately install them. The Cluster WebUI can be used through the web browser on the management PC or on each server in the cluster.
EXPRESSCLUSTER Server (Main module)
Cluster WebUI
3.3.1. How an error is detected in EXPRESSCLUSTER¶
There are three kinds of monitoring in EXPRESSCLUSTER: (1) server monitoring, (2) application monitoring, and (3) internal monitoring. These monitoring functions let you detect an error quickly and reliably. The details of the monitoring functions are described below.
3.3.2. What is server monitoring?¶
- Primary Interconnect: A LAN dedicated to communication between the cluster servers. This is used to exchange information between the servers as well as to perform heartbeat communication.
- Secondary Interconnect: This is used as a path for communicating with clients. It is also used for exchanging data between the servers and as a backup interconnect.
- Witness: This is used by the external Witness server running the Witness server service to check, through communication with them, whether the other servers constituting the failover-type cluster exist.
3.3.3. What is application monitoring?¶
Application monitoring is a function that monitors applications and factors that cause a situation where an application cannot run.
- Monitoring applications and/or protocols to see if they are stalled or failed by using the monitoring option: In addition to the basic monitoring of successful startup and existence of applications, you can monitor stalls and failures in applications, including specific databases (such as Oracle, DB2), protocols (such as FTP, HTTP), and application servers (such as WebSphere, WebLogic), by introducing optional monitoring products of EXPRESSCLUSTER. For details, see "Monitor resource details" in the "Reference Guide".
- Monitoring activation status of applications: An error can be detected by starting up an application with an application-starting resource of EXPRESSCLUSTER (called application resource and service resource) and regularly checking whether the process is active with an application-monitoring resource (called application monitor resource and service monitor resource). This is effective when the application stops because of abnormal termination of the application.
Note
An error in a resident process cannot be detected in an application started up by EXPRESSCLUSTER.
Note
An internal application error (for example, application stalling and result error) cannot be detected.
- Resource monitoring: An error can be detected by monitoring the cluster resources (such as disk partitions and IP addresses) and the public LAN using the monitor resources of EXPRESSCLUSTER. This is effective when the application stops because of an error in a resource that the application needs in order to operate.
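The second item in the list above, starting an application and then regularly checking whether its process is active, can be illustrated generically with Python's standard `subprocess` module. The function names `start_application` and `application_is_active` are hypothetical and are not EXPRESSCLUSTER resources. As the notes above point out, this kind of check sees only whether the process exists, not whether it is stalled.

```python
import subprocess
import sys


def start_application(command):
    """Start the application as a child process and return its handle."""
    return subprocess.Popen(command)


def application_is_active(process):
    """Return True while the started process is still running."""
    # poll() returns None while the process is running, and its exit code
    # once the process has terminated.
    return process.poll() is None
```

For example, `start_application([sys.executable, "-c", "import time; time.sleep(60)"])` returns a handle for which `application_is_active` stays true until the process ends or is terminated.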
3.3.4. What is internal monitoring?¶
Internal monitoring refers to mutual monitoring of modules within EXPRESSCLUSTER. It monitors whether each monitoring function of EXPRESSCLUSTER is working properly. The activation status of EXPRESSCLUSTER processes is monitored within EXPRESSCLUSTER.
Monitoring activation status of an EXPRESSCLUSTER process
3.3.5. Monitorable and non-monitorable errors¶
There are monitorable and non-monitorable errors in EXPRESSCLUSTER. It is important to know what kind of errors can or cannot be monitored when building and operating a cluster system.
3.3.6. Detectable and non-detectable errors by server monitoring¶
Monitoring conditions: A heartbeat from a server with an error is stopped
Examples of errors that can be monitored:
Hardware failure (of which OS cannot continue operating)
Stop error
Example of an error that cannot be monitored:
Partial failure on OS (for example, only a mouse or keyboard does not function)
3.3.7. Detectable and non-detectable errors by application monitoring¶
Monitoring conditions: Termination of application with errors, continuous resource errors, disconnection of a path to the network devices.
Examples of errors that can be monitored:
Abnormal termination of an application
Failure to access the shared disk (such as HBA failure)
Public LAN NIC problem
Examples of errors that cannot be monitored:
- Application stalling and resulting in error: EXPRESSCLUSTER cannot monitor application stalling and error results [1]. However, it is possible to perform failover by creating a program that monitors the application and terminates itself when an error is detected, starting that program using the application resource, and monitoring it using the application monitor resource.
[1] Stalling and error results can be monitored for the database applications (such as Oracle, DB2), the protocols (such as FTP, HTTP), and the application servers (such as WebSphere and WebLogic) that are handled by a monitoring option.
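A program that monitors an application and terminates itself when an error is detected could look like the following minimal sketch. The names `watch` and `main`, the check interval, and the rule of three consecutive failures are assumptions chosen for illustration. The health check itself is passed in as a function, because what counts as a stall or an error result depends on the application.

```python
import sys
import time


def watch(check_health, interval_seconds=5.0, max_failures=3, sleep=time.sleep):
    """Return once check_health() has failed max_failures times in a row."""
    consecutive_failures = 0
    while consecutive_failures < max_failures:
        if check_health():
            # A successful check resets the counter, so one slow reply
            # does not lead to a failover.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
        if consecutive_failures < max_failures:
            sleep(interval_seconds)
    return consecutive_failures


def main(check_health):
    watch(check_health)
    # Terminating is the signal: whatever monitors this process now sees
    # that it is gone and can treat this as an application error.
    sys.exit(1)
```

Because the watcher only exits after several failed checks in a row, a single slow response does not cause a failover.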
3.4. Fencing Function¶
EXPRESSCLUSTER's fencing function consists of network partition resolution and forced stopping.
3.4.1. Network partition resolution¶
PING method
HTTP method
Shared disk method
PING + shared disk method
Majority method
Not solving the network partition
See also
For the details on the network partition resolution method, see "Details on network partition resolution resources" in the "Reference Guide".
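As a generic illustration of the idea behind a majority-based decision, the sketch below lets a server continue only if it can still communicate with more than half of the servers in the cluster. This is a common quorum rule and is not a description of how EXPRESSCLUSTER implements its methods; the function name `has_majority` is hypothetical.

```python
def has_majority(reachable_servers, cluster_size):
    """Return True if this partition holds more than half of the servers.

    reachable_servers counts the local server itself.
    """
    if not 1 <= reachable_servers <= cluster_size:
        raise ValueError("reachable_servers must be between 1 and cluster_size")
    return 2 * reachable_servers > cluster_size


print(has_majority(2, 3))  # True: two of three servers can still see each other
print(has_majority(1, 3))  # False: an isolated server must not take over
print(has_majority(1, 2))  # False: in a two-server cluster neither side has a majority
```

The last case shows why a majority rule alone cannot decide a partition in a two-server cluster.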
3.4.2. Forced stop¶
When a server failure is detected, a healthy server can send a stop request to the failed server. Making the failed server stop eliminates the possibility of simultaneously starting business applications on two or more servers. The forced stop is made before a failover is started.
See also
For the details on the forced stop function, see "Forced stop resource details" in the "Reference Guide".
3.5. Failover mechanism¶
Upon detecting that a heartbeat from a server is interrupted, EXPRESSCLUSTER determines whether the cause of this interruption is an error in a server or a network partition before starting a failover. Then a failover is performed by activating various resources and starting up applications on a properly working server.
The group of resources which fail over at the same time is called a "failover group." From a user's point of view, a failover group appears as a virtual computer.
Note
In a cluster system, a failover is performed by restarting the application from a properly working node. Therefore, what is saved in an application memory cannot be failed over.
It takes a few minutes from the occurrence of an error to the completion of failover. See the time chart below:
Heartbeat timeout
The time for a standby server to detect an error after that error occurred on the active server.
The setting values of the cluster properties should be adjusted depending on the delay caused by application load. (The default value is 30 seconds.)
Fencing
The time for network partition resolution and forced stopping.
For network partition resolution, EXPRESSCLUSTER checks whether the stop of heartbeat (heartbeat timeout) detected from the other server is due to a network partition or an error in the other server. Confirmation completes immediately. For forced stopping, a stop request is sent to the server that is recognized to be the failure source. How long it takes varies depending on the cluster's operating environment, such as a physical one, a virtual one, or the cloud.
Activating resources
The time to activate the resources necessary for operating an application.
The file system recovery, transfer of the data in disks, and transfer of IP addresses are performed.
The resources can be activated in a few seconds in ordinary settings, but the required time changes depending on the type and the number of resources registered to the failover group. For more information, see the "Installation and Configuration Guide".
Recovering and restarting applications
The startup time of the application to be used in operation. The data recovery time such as a roll-back or roll-forward of the database is included.
The time for roll-back or roll-forward can be predicted by adjusting the check point interval. For more information, refer to the document that comes with each software product.
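The four phases above add up to the total failover time, which a short calculation makes concrete. Only the 30-second heartbeat timeout comes from this guide (it is the default value mentioned above). The function name and the other three durations are hypothetical example values that you would replace with measurements from your own environment.

```python
# Heartbeat timeout: 30 seconds is the default value stated in this guide.
HEARTBEAT_TIMEOUT_SECONDS = 30


def estimated_failover_seconds(fencing, resource_activation, application_restart,
                               heartbeat_timeout=HEARTBEAT_TIMEOUT_SECONDS):
    """Sum the four phases of the failover time chart, all in seconds."""
    return heartbeat_timeout + fencing + resource_activation + application_restart


# Hypothetical example values for the three environment-dependent phases.
total = estimated_failover_seconds(fencing=5, resource_activation=10,
                                   application_restart=60)
print(total)  # 105
```

In this example the application restart dominates, which is why adjusting the checkpoint interval of a database can have a large effect on the overall failover time.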
3.5.2. Hardware configuration of the mirror disk type cluster configured by EXPRESSCLUSTER¶
The mirror disk type cluster replaces the shared disk device by mirroring partitions on the server disks. Compared to the shared disk type cluster, it suits smaller-scale, lower-budget systems.
Note
To use a mirror disk, you must purchase the Replicator option or the Replicator DR option.
Sample cluster environment with mirror disks used (when the cluster partitions and data partitions are allocated to the OS-installed disks)
In the following configuration, free partitions of the OS-installed disks are used as cluster partitions and data partitions.
FIP1 | 10.0.0.11 (Access destination from the Cluster WebUI client) |
FIP2 | 10.0.0.12 (Access destination from the operation client) |
NIC1-1 | 192.168.0.1 |
NIC1-2 | 10.0.0.1 |
NIC2-1 | 192.168.0.2 |
NIC2-2 | 10.0.0.2 |
Drive letter of the cluster partition | E |
File system | RAW |
Drive letter of the data partition | F |
File system | NTFS |
Sample cluster environment with mirror disks used (when disks are prepared for cluster partitions and data partitions)
In the following configuration, disks are prepared for cluster partitions and data partitions and connected to the servers.
FIP1 | 10.0.0.11 (Access destination from the Cluster WebUI client) |
FIP2 | 10.0.0.12 (Access destination from the operation client) |
NIC1-1 | 192.168.0.1 |
NIC1-2 | 10.0.0.1 |
NIC2-1 | 192.168.0.2 |
NIC2-2 | 10.0.0.2 |
Drive letter of the cluster partition | E |
File system | RAW |
Drive letter of the data partition | F |
File system | NTFS |
3.5.3. Hardware configuration of the hybrid disk type cluster configured by EXPRESSCLUSTER¶
By combining the shared disk type and the mirror disk type and mirroring the partitions on the shared disk, this configuration allows the ongoing operation even if a failure occurs on the shared disk device. Mirroring between remote sites can also serve as a disaster countermeasure.
Note
To use the hybrid disk type configuration, you must purchase the Replicator DR option.
Sample cluster environment with hybrid disks used (a shared disk is used by two servers and the data is mirrored to the normal disk of the third server)
FIP1 | 10.0.0.11 (Access destination from the Cluster WebUI client) |
FIP2 | 10.0.0.12 (Access destination from the operation client) |
NIC1-1 | 192.168.0.1 |
NIC1-2 | 10.0.0.1 |
NIC2-1 | 192.168.0.2 |
NIC2-2 | 10.0.0.2 |
NIC3-1 | 192.168.0.3 |
NIC3-2 | 10.0.0.3 |
Shared disk
Drive letter of the partition for heartbeat | E |
File system | RAW |
Drive letter of the cluster partition | F |
File system | RAW |
Drive letter of the data partition | G |
File system | NTFS |
The above figure shows a sample of the cluster environment where a shared disk is mirrored in the same network. While the hybrid disk type configuration mirrors between server groups that are connected to the same shared disk device, the sample above mirrors the shared disk to the local disk in server3. Because of this, the stand-by server group svg2 has only one member server, server3.
VIP1 | 10.0.0.11 (Access destination from the Cluster WebUI client) |
VIP2 | 10.0.0.12 (Access destination from the operation client) |
NIC1-1 | 192.168.0.1 |
NIC1-2 | 10.0.0.1 |
NIC2-1 | 192.168.0.2 |
NIC2-2 | 10.0.0.2 |
NIC3-1 | 192.168.0.3 |
NIC3-2 | 10.0.0.3 |
Shared disk
Drive letter of the partition for heartbeat | E |
File system | RAW |
Drive letter of the cluster partition | F |
File system | RAW |
Drive letter of the data partition | G |
File system | NTFS |
The above sample shows a cluster environment where mirroring is performed between remote sites. This sample uses virtual IP addresses instead of floating IP addresses because the server groups are on different network segments of the Public-LAN. When a virtual IP address is used, all the routers located in between must be configured to pass on the host route. The mirror disk connect communication transfers the data written to the disk as it is. It is therefore recommended to use a VPN with a dedicated line or to enable the compression and encryption functions.
3.5.4. What is a cluster object?¶
In EXPRESSCLUSTER, the various resources are managed as the following groups:
- Cluster object: Configuration unit of a cluster.
- Server object: Indicates the physical server and belongs to the cluster object.
- Server group object: Indicates a group that bundles servers and belongs to the cluster object. This object is required when a hybrid disk resource is used.
- Heartbeat resource object: Indicates the network part of the physical server and belongs to the server object.
- Network partition resolution resource object: Indicates the network partition resolution mechanism and belongs to the server object.
- Group object: Indicates a virtual server and belongs to the cluster object.
- Group resource object: Indicates resources (network, disk) of the virtual server and belongs to the group object.
- Monitor resource object: Indicates the monitoring mechanism and belongs to the cluster object.
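The ownership relationships in the list above can be pictured as a small data model. The sketch below is only an illustration of that hierarchy, not EXPRESSCLUSTER's internal representation or configuration format. All class names, field names, and object names (such as `lankhb1` or `failover1`) are hypothetical, and the server group object is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GroupResource:
    """A resource (network, disk) of the virtual server; belongs to a group."""
    name: str
    kind: str  # short type name such as "fip" or "md"


@dataclass
class Group:
    """A failover group, which appears to clients as a virtual server."""
    name: str
    resources: List[GroupResource] = field(default_factory=list)


@dataclass
class Server:
    """A physical server with its heartbeat and NP resolution resources."""
    name: str
    heartbeats: List[str] = field(default_factory=list)
    np_resolutions: List[str] = field(default_factory=list)


@dataclass
class Cluster:
    """Top-level configuration unit that owns servers, groups, and monitors."""
    name: str
    servers: List[Server] = field(default_factory=list)
    groups: List[Group] = field(default_factory=list)
    monitors: List[str] = field(default_factory=list)


def build_sample_cluster() -> Cluster:
    """Build a two-server cluster with one failover group (names are made up)."""
    servers = [
        Server("server1", heartbeats=["lankhb1"], np_resolutions=["pingnp1"]),
        Server("server2", heartbeats=["lankhb1"], np_resolutions=["pingnp1"]),
    ]
    failover = Group(
        "failover1",
        resources=[GroupResource("fip1", "fip"), GroupResource("md1", "md")],
    )
    return Cluster("cluster", servers, [failover], monitors=["fipw1", "mdw1"])


if __name__ == "__main__":
    cluster = build_sample_cluster()
    for group in cluster.groups:
        kinds = ", ".join(resource.kind for resource in group.resources)
        print(f"{group.name} on {cluster.name}: {kinds}")
```

Heartbeat and network partition resolution resources are attached to servers, group resources to groups, and monitor resources directly to the cluster, which mirrors the "belongs to" statements in the list.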
3.6. What is a resource?¶
In EXPRESSCLUSTER, the entities that perform monitoring and the entities that are monitored are both called "resources," and they are classified and managed separately. Separating them makes it clearer what is monitoring and what is being monitored, and it makes building a cluster and handling errors easier. There are four types of resources: heartbeat resources, network partition resolution resources, group resources, and monitor resources.
See also
For the details of each resource, see the "Reference Guide".
3.6.1. Heartbeat resources¶
Servers use heartbeat resources to verify whether the other servers are working properly. The following heartbeat resources are currently supported:
- LAN heartbeat resource: Uses Ethernet for communication.
- Witness heartbeat resource: Uses the external server running the Witness server service to show the status (of communication with each server) obtained from the external server.
3.6.2. Network partition resolution resources¶
The following resources are used to resolve a network partition:
- DISK network partition resolution resource: This is a network partition resolution resource by the DISK method and can be used only for the shared disk configuration.
- PING network partition resolution resource: This is a network partition resolution resource by the PING method.
- HTTP network partition resolution resource: Uses the external server running the Witness server service to show the status (of communication with each server) obtained from the external server.
- Majority network partition resolution resource: This is a network partition resolution resource by the majority method.
3.6.3. Group resources¶
A group resource constitutes a unit when a failover occurs. The following group resources are currently supported:
- Application resource (appli): Provides a mechanism for starting and stopping an application (including user-created applications).
- Floating IP resource (fip): Provides a virtual IP address. A client can access a virtual IP address the same way as accessing a regular IP address.
- Mirror disk resource (md): Provides a function to mirror a specific partition on the local disk and control access to it. It can be used only in a mirror disk configuration.
- Registry synchronization resource (regsync): Provides a mechanism to synchronize specific registries of two or more servers, so that applications and services are set up in the same way among the servers that constitute a cluster.
- Script resource (script): Provides a mechanism for starting and stopping a script (BAT) such as a user-created script.
- Disk resource (sd): Provides a function to control access to a specific partition on the shared disk. This can be used only when the shared disk device is connected.
- Service resource (service): Provides a mechanism for starting and stopping a service such as a database or Web service.
- Virtual computer name resource (vcom): Provides a virtual computer name. This can be accessed from a client in the same way as a general computer name.
- Dynamic DNS resource (ddns): Registers a virtual host name and the IP address of the active server to the dynamic DNS server.
- Virtual IP resource (vip): Provides a virtual IP address. This can be accessed from a client in the same way as a general IP address. This can be used in the remote cluster configuration among different network addresses.
- CIFS resource (cifs): Provides a function to disclose and share folders on the shared disk and mirror disks.
- Hybrid disk resource (hd): A resource in which the disk resource and the mirror disk resource are combined. Provides a function to perform mirroring on a certain partition on the shared disk or the local disk and to control access.
- AWS elastic ip resource (awseip): Provides a system for giving an elastic IP (referred to as EIP) when EXPRESSCLUSTER is used on AWS.
- AWS virtual ip resource (awsvip): Provides a system for giving a virtual IP (referred to as VIP) when EXPRESSCLUSTER is used on AWS.
- AWS secondary ip resource (awssip): Provides a system for giving a secondary IP when EXPRESSCLUSTER is used on AWS.
- AWS DNS resource (awsdns): Registers the virtual host name and the IP address of the active server to Amazon Route 53 when EXPRESSCLUSTER is used on AWS.
- Azure probe port resource (azurepp): Provides a system for opening a specific port on a node on which the operation is performed when EXPRESSCLUSTER is used on Microsoft Azure.
- Azure DNS resource (azuredns): Registers the virtual host name and the IP address of the active server to Azure DNS when EXPRESSCLUSTER is used on Microsoft Azure.
- Google Cloud virtual IP resource (gcvip): Provides a system for opening a specific port on a node on which the operation is performed when EXPRESSCLUSTER is used on Google Cloud.
- Google Cloud DNS resource (gcdns): Registers the virtual host name and the IP address of the active server to Cloud DNS when EXPRESSCLUSTER is used on Google Cloud.
- Oracle Cloud virtual IP resource (ocvip): Provides a system for opening a specific port on a node on which the operation is performed when EXPRESSCLUSTER is used on Oracle Cloud Infrastructure.
- Oracle Cloud DNS resource (ocdns): Registers the virtual host name and the IP address of the active server to Oracle Cloud DNS when EXPRESSCLUSTER is used on Oracle Cloud Infrastructure.
3.6.4. Monitor resources¶
A monitor resource monitors a cluster system. The following monitor resources are currently supported:
- Application monitor resource (appliw): Provides a monitoring mechanism to check whether a process started by an application resource is active or not.
- Disk RW monitor resource (diskw): Provides a monitoring mechanism for the file system and a function to perform a failover by resetting the hardware or by an intentional stop error when file system I/O stalls. This can be used for monitoring the file system of the shared disk.
- Floating IP monitor resource (fipw): Provides a monitoring mechanism of the IP address started by a floating IP resource.
- IP monitor resource (ipw): Provides a mechanism for monitoring the network communication.
- Mirror disk monitor resource (mdw): Provides a monitoring mechanism of the mirroring disks.
- NIC Link Up/Down monitor resource (miiw): Provides a monitoring mechanism for the link status of the LAN cable.
- Multi target monitor resource (mtw): Provides a status with multiple monitor resources.
- Registry synchronization monitor resource (regsyncw): Provides a monitoring mechanism of the synchronization process by a registry synchronization resource.
- Disk TUR monitor resource (sdw): Provides a mechanism to monitor the operation of the access path to the shared disk by the TestUnitReady command of SCSI. This can be used for the shared disk of FibreChannel.
- Service monitor resource (servicew): Provides an alive monitoring mechanism for services.
- Virtual computer name monitor resource (vcomw): Provides a monitoring mechanism of the virtual computer started by a virtual computer name resource.
- Dynamic DNS monitor resource (ddnsw): Periodically registers a virtual host name and the IP address of the active server to the dynamic DNS server.
- Virtual IP monitor resource (vipw): Provides a monitoring mechanism of the IP address started by a virtual IP resource.
- CIFS monitor resource (cifsw): Provides a monitoring mechanism of the shared folder disclosed by a CIFS resource.
- Hybrid disk monitor resource (hdw): Provides a monitoring mechanism of the hybrid disk.
- Hybrid disk TUR monitor resource (hdtw): Provides a monitoring mechanism for the behavior of the access path to the shared disk device used as a hybrid disk by the TestUnitReady command. It can be used for a shared disk using FibreChannel.
- Custom monitor resource (genw): Provides a monitoring mechanism to monitor the system by the operation result of commands or scripts which perform monitoring, if any.
- Process name monitor resource (psw): Provides a monitoring mechanism for checking whether a process specified by a process name is active.
- DB2 monitor resource (db2w): Provides a monitoring mechanism for the IBM DB2 database.
- ODBC monitor resource (odbcw): Provides a monitoring mechanism for the database that can be accessed by ODBC.
- Oracle monitor resource (oraclew): Provides a monitoring mechanism for the Oracle database.
- PostgreSQL monitor resource (psqlw): Provides a monitoring mechanism for the PostgreSQL database.
- SQL Server monitor resource (sqlserverw): Provides a monitoring mechanism for the SQL Server database.
- FTP monitor resource (ftpw): Provides a monitoring mechanism for the FTP server.
- HTTP monitor resource (httpw): Provides a monitoring mechanism for the HTTP server.
- IMAP4 monitor resource (imap4w): Provides a monitoring mechanism for the IMAP server.
- POP3 monitor resource (pop3w): Provides a monitoring mechanism for the POP server.
- SMTP monitor resource (smtpw): Provides a monitoring mechanism for the SMTP server.
- Tuxedo monitor resource (tuxw): Provides a monitoring mechanism for the Tuxedo application server.
- WebLogic monitor resource (wlsw): Provides a monitoring mechanism for the WebLogic application server.
- WebSphere monitor resource (wasw): Provides a monitoring mechanism for the WebSphere application server.
- WebOTX monitor resource (otxw): Provides a monitoring mechanism for the WebOTX application server.
- External link monitor resource (mrw): Specifies the action to take when an error message is received and how the message is displayed on the Cluster WebUI.
- JVM monitor resource (jraw): Provides a monitoring mechanism for Java VM.
- System monitor resource (sraw): Provides a monitoring mechanism for the resources of the whole system.
- Process resource monitor resource (psrw): Provides a monitoring mechanism for running processes on the server.
- User mode monitor resource (userw): Provides a stall monitoring mechanism for the user space and a function for performing failover by an intentional STOP error or an HW reset at the time of a user space stall.
- AWS Elastic Ip monitor resource (awseipw): Provides a monitoring mechanism for the elastic ip given by the AWS elastic ip (referred to as EIP) resource.
- AWS Virtual Ip monitor resource (awsvipw): Provides a monitoring mechanism for the virtual ip given by the AWS virtual ip (referred to as VIP) resource.
- AWS Secondary Ip monitor resource (awssipw): Provides a monitoring mechanism for the secondary ip given by the AWS secondary ip resource.
- AWS AZ monitor resource (awsazw): Provides a monitoring mechanism for an Availability Zone (referred to as AZ).
- AWS DNS monitor resource (awsdnsw): Provides a monitoring mechanism for the virtual host name and IP address provided by the AWS DNS resource.
- Azure probe port monitor resource (azureppw): Provides a monitoring mechanism for ports for alive monitoring for the node where an Azure probe port resource has been activated.
- Azure load balance monitor resource (azurelbw): Provides a mechanism for monitoring whether the port number that is the same as the probe port is open for the node where an Azure probe port resource has not been activated.
- Azure DNS monitor resource (azurednsw): Provides a monitoring mechanism for the virtual host name and IP address provided by the Azure DNS resource.
- Google Cloud virtual IP monitor resource (gcvipw): Provides a mechanism for monitoring the alive-monitoring port for the node where a Google Cloud virtual IP resource has been activated.
- Google Cloud load balance monitor resource (gclbw): Provides a mechanism for monitoring whether the same port number as the health-check port number has already been used, for the node where a Google Cloud virtual IP resource has not been activated.
- Google Cloud DNS monitor resource (gcdnsw): Provides a monitoring mechanism for the virtual host name and IP address provided by the Google Cloud DNS resource.
- Oracle Cloud virtual IP monitor resource (ocvipw): Provides a mechanism for monitoring the alive-monitoring port for the node where an Oracle Cloud virtual IP resource has been activated.
- Oracle Cloud load balance monitor resource (oclbw): Provides a mechanism for monitoring whether the same port number as the health-check port number has already been used, for the node where an Oracle Cloud virtual IP resource has not been activated.
- Oracle Cloud DNS monitor resource (ocdnsw): Provides a monitoring mechanism for the virtual host name and IP address provided by the Oracle Cloud DNS resource.
3.7. Getting started with EXPRESSCLUSTER¶
Refer to the following guides when building a cluster system with EXPRESSCLUSTER:
3.7.1. Latest information¶
Refer to "4. Installation requirements for EXPRESSCLUSTER", "5. Latest version information" and "6. Notes and Restrictions" in this guide.
3.7.2. Designing a cluster system¶
Refer to "Determining a system configuration" and "Configuring a cluster system" in the "Installation and Configuration Guide" and "Group resource details", "Monitor resource details", "Heartbeat resources", "Details on network partition resolution resources", and "Information on other settings" in the "Reference Guide".
3.7.3. Configuring a cluster system¶
Refer to the "Installation and Configuration Guide".
3.7.4. Troubleshooting the problem¶
Refer to "The system maintenance information" in the "Maintenance Guide", and "Troubleshooting" and "Error messages" in the "Reference Guide".
4. Installation requirements for EXPRESSCLUSTER¶
This chapter provides information on system requirements for EXPRESSCLUSTER.
4.1. System requirements for hardware¶
EXPRESSCLUSTER operates on the following server architectures:
x86_64
4.1.1. General server requirements¶
Required specifications for the EXPRESSCLUSTER Server are the following:
Ethernet port: 2 or more ports
Mirror disk or empty partition for mirror (required when the Replicator is used)
DVD-ROM drive
4.2. System requirements for the EXPRESSCLUSTER Server¶
4.2.1. Supported operating systems¶
EXPRESSCLUSTER Server only runs on the operating systems listed below.
x86_64 version
OS | Remarks |
---|---|
Windows Server 2016 Standard | |
Windows Server 2016 Datacenter | |
Windows Server 2019 Standard | |
Windows Server 2019 Datacenter | |
Windows Server 2022 Standard | |
Windows Server 2022 Datacenter | |
4.2.2. Required memory and disk size¶
Required memory size (User mode) | 384MB (2) |
---|---|
Required memory size (Kernel mode) | 32 MB + 4 MB (3) x (number of mirror/hybrid resources) |
Required disk size (Right after installation) | 100MB |
Required disk size (During operation) | 5.0GB + 9.0GB (4) |
- (2) Excluding optional products.
- (3) A single mirror/hybrid disk resource needs 4 MB of RAM.
- (4) The disk capacity required to use mirror disk resources and hybrid disk resources.
When changing to the asynchronous method, changing the queue size, or changing the difference bitmap size, more memory must be added. Memory usage also increases as disk load increases, because memory is used in proportion to mirror disk I/O.
For the required size of a partition for a DISK network partition resolution resource, see "Partition for shared disk".
For the required size of a cluster partition, see "Partition for mirror disk" and "Partition for hybrid disk".
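The kernel-mode memory formula in this section can be turned into a small calculation. The sketch below assumes only the baseline figures given here (384 MB for user mode, 32 MB plus 4 MB per mirror/hybrid disk resource for kernel mode). The function names are hypothetical, and the result does not include the extra memory needed for the asynchronous method, larger queue sizes, or larger difference bitmaps.

```python
USER_MODE_MB = 384          # baseline stated in this guide, excluding optional products
KERNEL_BASE_MB = 32         # fixed kernel-mode portion
KERNEL_PER_RESOURCE_MB = 4  # per mirror/hybrid disk resource


def kernel_mode_memory_mb(mirror_hybrid_resources: int) -> int:
    """Return the baseline kernel-mode memory for the given resource count."""
    if mirror_hybrid_resources < 0:
        raise ValueError("the number of resources cannot be negative")
    return KERNEL_BASE_MB + KERNEL_PER_RESOURCE_MB * mirror_hybrid_resources


def total_memory_mb(mirror_hybrid_resources: int) -> int:
    """Return the baseline user-mode plus kernel-mode memory in MB."""
    return USER_MODE_MB + kernel_mode_memory_mb(mirror_hybrid_resources)


if __name__ == "__main__":
    for count in (0, 1, 3):
        print(f"{count} mirror/hybrid resources: {total_memory_mb(count)} MB")
```

For example, a server with three mirror disk resources needs 32 + 4 x 3 = 44 MB in kernel mode, or 428 MB in total with the user-mode baseline.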
4.2.3. Application supported by the monitoring options¶
The monitoring options support the following applications as monitoring targets.
x86_64 version
Monitor resource | Application to be monitored | EXPRESSCLUSTER Version | Remarks |
---|---|---|---|
Oracle monitor | Oracle Database 19c (19.3) | 13.00 or later | |
DB2 monitor | DB2 V11.5 | 13.00 or later | |
PostgreSQL monitor | PostgreSQL 14.1 | 13.00 or later | |
PostgreSQL monitor | PostgreSQL 15.1 | 13.10 or later | |
PostgreSQL monitor | PostgreSQL 16.3 | 13.21 or later | |
PostgreSQL monitor | PowerGres on Windows V13 | 13.00 or later | |
SQL Server monitor | SQL Server 2019 | 13.00 or later | |
SQL Server monitor | SQL Server 2022 | 13.10 or later | |
Tuxedo monitor | Tuxedo 12c Release 2 (12.1.3) | 12.00 or later | |
Tuxedo monitor | Tuxedo 22c (22.1.0) | 13.20 or later | |
WebLogic monitor | WebLogic Server 11g R1 | 12.00 or later | |
WebLogic monitor | WebLogic Server 11g R2 | 12.00 or later | |
WebLogic monitor | WebLogic Server 12c R2 (12.2.1) | 12.00 or later | |
WebLogic monitor | WebLogic Server 14c (14.1.1) | 12.20 or later | |
WebSphere monitor | WebSphere Application Server 8.5 | 12.00 or later | |
WebSphere monitor | WebSphere Application Server 8.5.5 | 12.00 or later | |
WebSphere monitor | WebSphere Application Server 9.0 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V9.1 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V9.2 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V9.3 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V9.4 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V9.5 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V10.1 | 12.00 or later | |
WebOTX monitor | WebOTX Application Server V10.3 | 12.30 or later | |
WebOTX monitor | WebOTX Application Server V11.1 | 13.20 or later | |
JVM monitor | WebLogic Server 11g R1 | 12.00 or later | |
JVM monitor | WebLogic Server 11g R2 | 12.00 or later | |
JVM monitor | WebLogic Server 12c R2 (12.2.1) | 12.00 or later | |
JVM monitor | WebLogic Server 14c (14.1.1) | 12.20 or later | |
JVM monitor | WebOTX Application Server V9.1 | 12.00 or later | |
JVM monitor | WebOTX Application Server V9.2 | 12.00 or later | |
JVM monitor | WebOTX Application Server V9.3 | 12.00 or later | |
JVM monitor | WebOTX Application Server V9.4 | 12.00 or later | |
JVM monitor | WebOTX Application Server V9.5 | 12.00 or later | |
JVM monitor | WebOTX Application Server V10.1 | 12.00 or later | |
JVM monitor | WebOTX Application Server V10.3 | 12.30 or later | |
JVM monitor | WebOTX Application Server V11.1 | 13.20 or later | |
JVM monitor | WebOTX Enterprise Service Bus V8.4 | 12.00 or later | |
JVM monitor | WebOTX Enterprise Service Bus V8.5 | 12.00 or later | |
JVM monitor | WebOTX Enterprise Service Bus V10.3 | 12.30 or later | |
JVM monitor | WebOTX Enterprise Service Bus V11.1 | 13.20 or later | |
JVM monitor | Apache Tomcat 8.0 | 12.00 or later | |
JVM monitor | Apache Tomcat 8.5 | 12.00 or later | |
JVM monitor | Apache Tomcat 9.0 | 12.00 or later | |
JVM monitor | Apache Tomcat 10.0 | 13.02 or later | |
JVM monitor | WebSAM SVF for PDF 9.1 | 12.00 or later | |
JVM monitor | WebSAM SVF for PDF 9.2 | 12.00 or later | |
JVM monitor | WebSAM SVF PDF Enterprise 10.1 | 13.10 or later | |
JVM monitor | WebSAM Report Director Enterprise 9.1 | 12.00 or later | |
JVM monitor | WebSAM Report Director Enterprise 9.2 | 12.00 or later | |
JVM monitor | WebSAM RDE SUITE 10.1 | 13.10 or later | |
JVM monitor | WebSAM Universal Connect/X 9.1 | 12.00 or later | |
JVM monitor | WebSAM Universal Connect/X 9.2 | 12.00 or later | |
JVM monitor | WebSAM SVF Connect SUITE Standard 10.1 | 13.10 or later | |
System monitor | N/A | 12.00 or later | |
Process resource monitor | N/A | 12.10 or later | |
Note
The above monitor resources run as 64-bit applications in an x86_64 environment. Therefore, the target applications must be 64-bit binaries.
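The "or later" entries in the table above amount to a comparison of internal version numbers. The following sketch shows one way to check an installed version against a minimum, assuming that internal versions are dot-separated integers such as 13.21. The function names are hypothetical, and the three sample entries are copied from the table.

```python
from typing import Dict, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a version string such as "13.21" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def is_supported(installed: str, minimum: str) -> bool:
    """Return True if the installed version satisfies "minimum or later"."""
    return parse_version(installed) >= parse_version(minimum)


# A few minimum versions copied from the table above.
MINIMUM_VERSION: Dict[str, str] = {
    "PostgreSQL 16.3": "13.21",
    "SQL Server 2022": "13.10",
    "Apache Tomcat 10.0": "13.02",
}

if __name__ == "__main__":
    installed = "13.20"
    for application, minimum in MINIMUM_VERSION.items():
        status = "supported" if is_supported(installed, minimum) else "not supported"
        print(f"{application} with EXPRESSCLUSTER {installed}: {status}")
```

Comparing integer tuples rather than raw strings avoids surprises when the number of digits differs between versions.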
4.2.4. Operation environment for SNMP linkage functions¶
EXPRESSCLUSTER with the SNMP Service of Windows has been validated on the following OS.
x86_64 version
OS | Remarks |
---|---|
Windows Server 2016 | |
4.2.5. Operation environment for JVM monitor¶
The use of the JVM monitor requires a Java runtime environment.
4.2.6. Operation environment for system monitor or process resource monitor or function of collecting system resource information¶
Note
On Windows Server 2016 or later, .NET Framework 4.6.2 or later is pre-installed (the pre-installed version varies depending on the OS).
4.2.7. Operation environment for AWS Elastic IP resource, AWS Elastic IP monitor resource and AWS AZ monitor resource¶
The use of the AWS elastic ip resource, AWS elastic IP monitor resource and AWS AZ monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
AWS CLI | 1.12.0 or later; 2.0.0 or later | |
4.2.8. Operation environment for AWS Virtual IP resource and AWS Virtual IP monitor resource¶
The use of the AWS virtual ip resource and AWS virtual IP monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
AWS CLI | 1.12.0 or later; 2.0.0 or later | |
4.2.9. Operation environment for AWS secondary IP resource and AWS Secondary IP monitor resource¶
The use of the AWS secondary IP resource and AWS secondary IP monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
AWS CLI | 1.12.0 or later; 2.0.0 or later | |
4.2.10. Operation environment for AWS DNS resource and AWS DNS monitor resource¶
The use of the AWS DNS resource and AWS DNS monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
AWS CLI | 1.12.0 or later; 2.0.0 or later | |
4.2.11. Operation environment for AWS forced stop resource¶
The use of the AWS forced stop resource requires the following software.
Software | Version | Remarks |
---|---|---|
AWS CLI | 1.15.0 or later; 2.0.0 or later | |
4.2.12. Operation environment for Azure probe port resource, Azure probe port monitor resource and Azure load balance monitor resource¶
The following are the Microsoft Azure deployment models with which the operation of the Azure probe port resource, Azure probe port monitor resource, and Azure load balance monitor resource has been verified.
For the method to configure a load balancer, refer to "EXPRESSCLUSTER X HA Cluster Configuration Guide for Microsoft Azure (Windows)".
x86_64
Deployment model | EXPRESSCLUSTER Version | Remarks |
---|---|---|
Resource Manager | 12.00 or later | Load balancer is required |
4.2.13. Operation environment for Azure DNS resource and Azure DNS monitor resource¶
The use of the Azure DNS resource and Azure DNS monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
Azure CLI | 2.0 or later | |
x86_64
Deployment model | EXPRESSCLUSTER Version | Remarks |
---|---|---|
Resource Manager | 12.00 or later | Azure DNS is required. |
4.2.14. Operation environment for Azure forced stop resource¶
The use of the Azure forced stop resource requires the following software.
Software | Version | Remarks |
---|---|---|
Azure CLI | 2.0 or later | |
4.2.15. Operation environments for Google Cloud virtual IP resource, Google Cloud virtual IP monitor resource, and Google Cloud load balance monitor resource¶
The following lists the versions of the OSs on Google Cloud on which the operation of the Google Cloud virtual IP resource, the Google Cloud virtual IP monitor resource, and the Google Cloud load balance monitor resource was verified.
Distribution | EXPRESSCLUSTER Version | Remarks |
---|---|---|
Windows Server 2016 | 12.20 or later | |
Windows Server 2019 | 12.20 or later | |
4.2.16. Operation environments for Google Cloud DNS resource, Google Cloud DNS monitor resource¶
The use of the Google Cloud DNS resource and Google Cloud DNS monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
Google Cloud SDK | 295.0.0 or later | |
4.2.17. Operation environments for Oracle Cloud virtual IP resource, Oracle Cloud virtual IP monitor resource, and Oracle Cloud load balance monitor resource¶
The following lists the versions of the OSs on Oracle Cloud Infrastructure on which the operation of the Oracle Cloud virtual IP resource, the Oracle Cloud virtual IP monitor resource, and the Oracle Cloud load balance monitor resource was verified.
Distribution | EXPRESSCLUSTER Version | Remarks |
---|---|---|
Windows Server 2016 | 12.20 or later | |
4.2.18. Operation environments for Oracle Cloud DNS resource, Oracle Cloud DNS monitor resource¶
The use of the Oracle Cloud DNS resource and Oracle Cloud DNS monitor resource requires the following software.
Software | Version | Remarks |
---|---|---|
OCI CLI | 3.27.1 or later | |
4.2.19. Operation environment for OCI forced stop resource¶
The use of the OCI forced stop resource requires the following software.
Software | Version | Remarks |
---|---|---|
OCI CLI | 3.5.3 or later | |
4.2.20. Operation environment for enabling encryption¶
For EXPRESSCLUSTER components, enabling communication encryption requires the following software:
Software | Version | Remarks |
---|---|---|
OpenSSL | 1.1.1 (1.1.1a or later); 3.0 (3.0.0 or later); 3.1 (3.1.0 or later) | |
The following components support communication encryption using the above software:
Cluster WebUI
RESTful API
FTP monitor resource
Mail reporting function
Mirror disk resource
Hybrid disk resource
4.3. System requirements for the Cluster WebUI¶
4.3.1. Supported operating systems and browsers¶
Browser | Language |
---|---|
Firefox | English/Japanese/Chinese |
Google Chrome | English/Japanese/Chinese |
Microsoft Edge (Chromium) | English/Japanese/Chinese |
Note
When using an IP address to connect to Cluster WebUI, the IP address must be registered as a Local intranet site in advance.
Note
No mobile devices, such as tablets and smartphones, are supported.
4.3.2. Required memory size and disk size¶
Required memory size: 500MB or more
Required disk size: 200MB or more
4.4. System requirements for the Witness server¶
4.4.1. Operation verified environment for Witness server service¶
Its operation has been verified in the following environments.
Version of witness server | Requirement |
---|---|
5.2.0 | Node.js 18.13.0; Node.js 20.10.0 |
4.4.2. Required memory and disk size¶
Required memory size | 50MB + (Number of nodes * 0.5 MB) |
---|---|
Required disk size | 1GB |
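The memory requirement of the Witness server grows linearly with the number of nodes. The sketch below evaluates the formula from the table above. The function name `witness_memory_mb` is hypothetical, and the result is only the stated minimum, not a sizing recommendation.

```python
WITNESS_BASE_MB = 50.0      # fixed portion from the table above
WITNESS_PER_NODE_MB = 0.5   # additional memory per node


def witness_memory_mb(nodes: int) -> float:
    """Return the required Witness server memory in MB for the given node count."""
    if nodes < 0:
        raise ValueError("the number of nodes cannot be negative")
    return WITNESS_BASE_MB + WITNESS_PER_NODE_MB * nodes


if __name__ == "__main__":
    for nodes in (2, 10, 32):
        print(f"{nodes} nodes: {witness_memory_mb(nodes):.1f} MB")
```

For a two-node cluster the formula gives 50 + 2 x 0.5 = 51 MB, so the per-node term only becomes noticeable for large numbers of nodes.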
5. Latest version information¶
This chapter provides the latest information on EXPRESSCLUSTER. The latest information on the upgraded and improved functions is described in detail.
5.1. Correspondence list of EXPRESSCLUSTER and a manual¶
The descriptions in this manual assume the following version of EXPRESSCLUSTER. Check how the EXPRESSCLUSTER versions correspond to the editions of the manuals.
EXPRESSCLUSTER Internal Version | Manual | Edition | Remarks |
---|---|---|---|
13.21 | Getting Started Guide | 4th Edition | |
13.21 | Installation and Configuration Guide | 2nd Edition | |
13.21 | Reference Guide | 4th Edition | |
13.21 | Maintenance Guide | 2nd Edition | |
5.2. New features and improvements¶
The following features and improvements have been released.
No.
|
Internal
Version
|
Contents
|
---|---|---|
1 |
13.00 |
Windows Server 2022 is now supported. |
2 |
13.00 |
Along with the major upgrade, some functions have been removed. For details, refer to the list of removed functions. |
3 |
13.00 |
Added a function to suppress the automatic failover against a server crash, collectively in the whole cluster. |
4 |
13.00 |
Added a function to give a notice in an alert log that the server restart count was reset as the final action against the detected activation error or deactivation error of a group resource or against the detected error of a monitor resource. |
5 |
13.00 |
Added a function to exclude a server (with an error detected by a specified monitor resource) from the failover destination, for the automatic failover other than dynamic failover. |
6 |
13.00 |
Added the clpfwctrl command for adding a firewall rule. |
7 |
13.00 |
Added AWS secondary IP resources and AWS secondary IP monitor resources. |
8 |
13.00 |
The forced stop function using BMC has been redesigned as a BMC forced-stop resource. |
9 |
13.00 |
Redesigned the function for forcibly stopping virtual machines as a vCenter forced-stop resource. |
10 |
13.00 |
The forced stop function in the AWS environment has been added to forced stop resources. |
11 |
13.00 |
The forced stop function in the OCI environment has been added to forced stop resources. |
12 |
13.00 |
Redesigned the forced stop script as a custom forced-stop resource. |
13 |
13.00 |
Added a function to collectively change actions (followed by OS shutdowns such as a recovery action following an error detected by a monitor resource) into OS reboots. |
14 |
13.00 |
Improved the alert message regarding the wait process for start/stop between groups. |
15 |
13.00 |
The clpstat option for displaying configuration information now displays the setting value of the resource start attribute. |
16 |
13.00 |
The -h option of the clpcl/clpstdn command can now be specified even when the cluster service on the local server is stopped. |
17 |
13.00 |
A warning message is now displayed when Cluster WebUI is connected via a non-actual IP address and is switched to config mode. |
18 |
13.00 |
In the config mode of Cluster WebUI, cluster configuration data can now be applied and exported even when information on the partition to be excluded cannot be obtained. |
19 |
13.00 |
In the config mode of Cluster WebUI, a group can now be deleted with the group resource registered. |
20 |
13.00 |
Changed the content of the error message that a communication timeout occurred in Cluster WebUI. |
21 |
13.00 |
Changed the content of the error message that executing the full copy failed on the mirror disk screen in Cluster WebUI. |
22 |
13.00 |
Added a function to copy a group, group resource, or monitor resource registered in the config mode of Cluster WebUI. |
23 |
13.00 |
Added a function to move a group resource registered in the config mode of Cluster WebUI, to another group. |
24 |
13.00 |
The settings can now be changed at the group resource list of [Group Properties] in the config mode of Cluster WebUI. |
25 |
13.00 |
The settings can now be changed at the monitor resource list of [Monitor Common Properties] in the config mode of Cluster WebUI. |
26 |
13.00 |
The dependency during group resource deactivation is now displayed in the config mode of Cluster WebUI. |
27 |
13.00 |
Added a function to display a dependency diagram at the time of group resource activation/deactivation in the config mode of Cluster WebUI. |
28 |
13.00 |
Added a function to narrow down a range of display by type or resource name of a group resource or monitor resource on the status screen of Cluster WebUI. |
29 |
13.00 |
The default value for [Errors in restoring file share setting are treated as activity failure] of CIFS resource has been changed from [On] to [Off]. |
30 |
13.00 |
An intermediate certificate can now be used as a certificate file when HTTPS is used for communication in the WebManager service. |
31 |
13.00 |
Added the clpcfconv command, which changes the cluster configuration data file from the old version to the current one. |
32 |
13.00 |
Added a function to delay the start of the cluster service at OS startup. |
33 |
13.00 |
Details such as measures can now be displayed for error results of checking cluster configuration data in Cluster WebUI. |
34 |
13.00 |
The OS type can now be specified with the create option of the clpcfset command. |
35 |
13.00 |
Added a function to delete a resource or parameter from cluster configuration data, which is enabled by adding the del option to the clpcfset command. |
36 |
13.00 |
Added the clpcfadm.py command, which enhances the interface for the clpcfset command. |
37 |
13.00 |
An AWS DNS resource now completes its start only after confirming that the record set has been propagated to AWS Route 53. |
38 |
13.00 |
Changed the default value for [Wait Time to Start Monitoring] of AWS DNS monitor resources to 300 seconds. |
39 |
13.00 |
Multiple instances of the clpstat command can now be run at the same time. |
40 |
13.00 |
Added the Node Manager service. |
41 |
13.00 |
Added a function for statistical information on heartbeat. |
42 |
13.00 |
A proxy server can now be used for an HTTP NP resolution resource even when a Witness heartbeat resource is not used. |
43 |
13.00 |
HTTP monitor resources now support digest authentication. |
44 |
13.00 |
FTP monitor resources can now monitor an FTP server that uses FTPS. |
45 |
13.00 |
Multiple system monitor resources can now be registered. |
46 |
13.00 |
Multiple process resource monitor resources can now be registered. |
47 |
13.00 |
Added a function to target only specific processes for a process resource monitor resource. |
48 |
13.00 |
A single service monitor resource alone can now monitor any service. |
49 |
13.00 |
The options for the clpmdctrl and clpmdstat commands have been made the same as those for the clphdctrl and clphdstat commands. |
50 |
13.02 |
JVM monitor resource supports Apache Tomcat 10.0. |
51 |
13.10 |
Added protection against vulnerabilities (CVE-2022-34824 and CVE-2022-34825): a feature for appropriately giving permission to the installation folder during installation. |
52 |
13.10 |
Added SMTPS and STARTTLS support for the mail reporting function. |
53 |
13.10 |
Added a forced stop function for the Azure environment to the forced stop resource. |
54 |
13.10 |
Added a forced stop function for the vCenter forced stop resource used with vSphere Automation APIs. |
55 |
13.10 |
Allowed specifying a log-file storage period. |
56 |
13.10 |
Expanded the check items of cluster configuration data. |
57 |
13.10 |
Allowed changing the transmission source IP address of a floating IP resource. |
58 |
13.10 |
Accelerated the initial construction and full copying of a ReFS-based mirror disk resource. |
59 |
13.10 |
Added a feature for allowing a failover on a mirror break. |
60 |
13.10 |
Allowed registering the following monitor resources with the multi target monitor resource:
- AWS Elastic IP monitor resource
- AWS Virtual IP monitor resource
- AWS Secondary IP monitor resource
- AWS AZ monitor resource
- AWS DNS monitor resource
- Azure probe port monitor resource
- Azure load balance monitor resource
- Azure DNS monitor resource
- Google Cloud Virtual IP monitor resource
- Google Cloud load balance monitor resource
- Google Cloud DNS monitor resource
- Oracle Cloud Virtual IP monitor resource
- Oracle Cloud load balance monitor resource
|
61 |
13.10 |
Added a feature to custom monitor resources for treating a value returned from the specified script as a warning. |
62 |
13.10 |
Added support for SQL Server 2022 for SQL Server monitor resources. |
63 |
13.10 |
Added support for PostgreSQL 15.1 for PostgreSQL monitor resources. |
64 |
13.10 |
Eliminated the need for Python for configurations in AWS environments where only AWS Virtual IP resources and AWS Virtual IP monitor resources are used. |
65 |
13.10 |
Allowed using Cluster WebUI to specify environment variables for AWS-related features to access instance metadata and the AWS CLI. |
66 |
13.10 |
Added a feature for specifying command line options for the AWS CLI accessed by AWS-related features. |
67 |
13.10 |
Added support for WebSAM SVF PDF Enterprise 10.1 for JVM monitor resources. |
68 |
13.10 |
Added support for WebSAM RDE SUITE 10.1 for JVM monitor resources. |
69 |
13.10 |
Added support for WebSAM SVF Connect SUITE Standard 10.1 for JVM monitor resources. |
70 |
13.10 |
Added a feature for outputting process resource statistics. |
71 |
13.10 |
Added support for client authentication for HTTP monitor resources. |
72 |
13.10 |
Added support for OpenSSL 3.0 for FTP monitor resources. |
73 |
13.10 |
Added a feature for JVM monitor resources to output retry count information to the operation log. |
74 |
13.10 |
Added support for Java 17 for JVM monitor resources. |
75 |
13.10 |
Removed support for Java 7 for JVM monitor resources. |
76 |
13.10 |
Allowed a shutdown in case of an NP state due to the abnormal statuses of all PING NP resolution resources. |
77 |
13.10 |
Added an option for the clpbackup command not to perform a server shutdown or restart. |
78 |
13.10 |
Added an option in the clpcfadm.py command to create a backup file of existing cluster configuration data. |
79 |
13.10 |
Allowed Cluster WebUI to display its operation log. |
80 |
13.10 |
Implemented measures to safeguard against changes in cluster configuration data during the mirroring process. |
81 |
13.10 |
Added support for OpenSSL 3.0 for Cluster WebUI. |
82 |
13.10 |
Disabled TLS 1.1 for the HTTPS connection of Cluster WebUI. |
83 |
13.10 |
Added a feature for Cluster WebUI to apply cluster configuration data only to servers that can be communicated with. |
84 |
13.10 |
Added a feature for the status screen of Cluster WebUI to list settings with which cluster operation is disabled. |
85 |
13.10 |
Added features for the config mode of Cluster WebUI to display or hide and to sort the following:
- Group resource list in [Group Properties]
- Monitor resource list in [Monitor Resources Common Properties]
|
86 |
13.10 |
Renamed [Accessible number of clients] of cluster properties to [Number of sessions which can be established simultaneously], and changed its lower limit value. |
87 |
13.10 |
Hid [Received time] by default in the Alert logs of Cluster WebUI. |
88 |
13.10 |
Changed the description of the [Restart the manager] button on the status screen of Cluster WebUI to "Restart WebManager service". |
89 |
13.10 |
Allowed [Copy the group] in the config mode of Cluster WebUI to copy group resources' dependency on a case-by-case basis as well. |
90 |
13.10 |
Implemented safeguards in Cluster WebUI to prevent configuration errors of AWS DNS resources. |
91 |
13.10 |
Implemented safeguards in Cluster WebUI to prevent configuration errors with [Monitor Type] of custom monitor resources set to [Asynchronous]. |
92 |
13.10 |
Implemented safeguards in Cluster WebUI to prevent configuration errors of the PING NP resolution resource. |
93 |
13.10 |
Allowed distinguishing in cluster statistics between automatic failover due to error detection and manual failover. |
94 |
13.11 |
Added support for OpenSSL 3.0 for RESTful API. |
95 |
13.11 |
Added support for OpenSSL 3.0 for Witness heartbeat resources. |
96 |
13.11 |
Added support for OpenSSL 3.0 for HTTP network partition resolution resources. |
97 |
13.12 |
Added support for OpenSSL 3.1 for the following functions:
- Cluster WebUI
- RESTful API
- Mirror disk resources
- Hybrid disk resources
- FTP monitor resources
- Mail report
|
98 |
13.20 |
Allowed collecting log files for investigation when a failure is detected in a group/monitor/forced-stop resource, and downloading the log files from Alert logs of Cluster WebUI. |
99 |
13.20 |
Changed the action against a stop failure of a group targeted for awaiting its stop: The stop timeout is no longer awaited. |
100 |
13.20 |
Added Oracle Cloud DNS resources and Oracle Cloud DNS monitor resources. |
101 |
13.20 |
Changed the default dependency values of the following group resources:
- Azure probe port resources
- Google Cloud virtual IP resources
- Oracle Cloud virtual IP resources
- Script resources
- Application resources
- Service resources
- Dynamic DNS resources
- Registry synchronization resources
- Virtual computer name resources
|
102 |
13.20 |
Eliminated the need for Python for the following AWS-related resources and monitor resources:
- AWS Elastic IP resources
- AWS DNS resources
- AWS Elastic IP monitor resources
- AWS DNS monitor resources
- AWS AZ monitor resources
|
103 |
13.20 |
Supported environments where HTTP/1.1 is required for HTTP monitor resources. |
104 |
13.20 |
Added POP3S as an authentication method of POP3 monitor resources. |
105 |
13.20 |
Changed the operation environment for system monitoring, process resource monitoring, and system resource information, to Microsoft .NET Framework 4.6.2 or higher. |
106 |
13.20 |
Supported WebOTX V11.1 for WebOTX monitor resources. |
107 |
13.20 |
Supported WebOTX V11.1 for JVM monitor resources. |
108 |
13.20 |
Supported Oracle Tuxedo 22c (22.1.0) for Tuxedo monitor resources. |
109 |
13.20 |
Allowed specifying a URI as a target for an HTTP network partition resolution resource. |
110 |
13.20 |
Supported giving a notice in an alert log, in an environment where an AWS forced-stop resource is set, that protection against stopping an EC2 instance is enabled. |
111 |
13.20 |
Added, for the forced-stop action of the Azure forced-stop resource, an option to immediately stop without resource deallocation. |
112 |
13.20 |
Allowed checking the status of a mirror/hybrid disk resource with the value returned by the clpmdstat/clphdstat command. |
113 |
13.20 |
Changed the folder that stores script files specified with the --script option of the clprexec command from work\trnreq to work\rexec, so that the folder name fits the command name. |
114 |
13.20 |
Provided more error messages about cloud-related functions. |
115 |
13.20 |
Changed the type of the message output when a server goes down. |
116 |
13.20 |
Allowed outputting the RESTful API operation log to the server. |
117 |
13.20 |
Added an API for getting the following metrics information with the RESTful API:
- Group's continuous operation time
- Date and time when cluster configuration data was last applied
|
118 |
13.20 |
Provided more check items for cluster configuration data to be checked. |
119 |
13.20 |
Reduced the process time for cluster configuration data to be checked. |
120 |
13.20 |
Added time data to the name of a cluster configuration data file (.zip) to be saved with [Exporting the setting] of Cluster WebUI. |
121 |
13.20 |
Supported making a warning pop up with [Action at NP Occurrence] changed to any of the following options in the config mode of Cluster WebUI:
- Stop the cluster service
- Stop the cluster service and shutdown OS
- Stop the cluster service and reboot OS
|
122 |
13.20 |
Supported displaying server statuses in color in the status tab of Cluster WebUI. |
123 |
13.20 |
Changed the display position of a pop-up alert in Cluster WebUI, from the upper right to the lower right. |
124 |
13.20 |
Supported displaying the expiry date and remaining days of the license in the operation mode of Cluster WebUI. |
125 |
13.21 |
The HBA filtering settings configured during the installation of EXPRESSCLUSTER are now automatically imported when the cluster configuration data is uploaded. |
126 |
13.21 |
Added support for OpenSSL 3.2 and OpenSSL 3.3 for the following functions:
- Cluster WebUI
- RESTful API
- Witness heartbeat resources
- HTTP network partition resolution resources
- FTP monitor resources
- POP3 monitor resources
|
127 |
13.21 |
Added support for PostgreSQL 16.3 for PostgreSQL monitor resources. |
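To see which of the entries above apply to a particular upgrade, the table can be filtered by internal version. The following is a minimal Python sketch of that idea, not a tool shipped with EXPRESSCLUSTER. The names `FEATURES`, `parse_version`, and `features_gained` are hypothetical, and only a handful of rows are copied from the table for illustration.

```python
from typing import List, Tuple

# A few rows copied from the table above: (No., internal version, contents).
# The list is deliberately incomplete; it only illustrates the filtering.
FEATURES: List[Tuple[int, str, str]] = [
    (1, "13.00", "Windows Server 2022 is now supported."),
    (50, "13.02", "JVM monitor resource supports Apache Tomcat 10.0."),
    (52, "13.10", "Added SMTPS and STARTTLS support for the mail reporting function."),
    (100, "13.20", "Added Oracle Cloud DNS resources and Oracle Cloud DNS monitor resources."),
    (127, "13.21", "Added support for PostgreSQL 16.3 for PostgreSQL monitor resources."),
]


def parse_version(text: str) -> Tuple[int, int]:
    """Turn an internal version such as '13.02' into a comparable tuple (13, 2)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def features_gained(current: str, target: str) -> List[Tuple[int, str, str]]:
    """Return rows released after `current`, up to and including `target`."""
    low, high = parse_version(current), parse_version(target)
    return [row for row in FEATURES if low < parse_version(row[1]) <= high]


if __name__ == "__main__":
    # Entries gained when moving from internal version 13.02 to 13.21.
    for number, version, contents in features_gained("13.02", "13.21"):
        print(f"{number:>3}  {version}  {contents}")
```

Comparing versions as integer tuples rather than as strings keeps the ordering correct for versions such as 9.00, which would sort after 13.00 in a plain string comparison.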
5.3. Corrected information
The following problems have been corrected in the minor versions listed below.
Critical level:
- L
Operation may stop. Data destruction or mirror inconsistency may occur. Setup may not be executable.
- M
Operation stop should be planned for recovery. The system may stop if duplicated with another fault.
- S
A matter of displaying messages. Recovery can be made without stopping the system.
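Each row of the table that follows pairs the version in which a problem was solved with the version or versions in which it occurred, written in forms such as `12.10 to 12.32` or `9.00 to 12.32,13.00`. As a rough illustration of how to read that column, the Python sketch below checks whether an installed internal version falls inside such a range. The function names `parse_version` and `is_affected` are hypothetical, and the accepted text format is an assumption based only on the cells shown in this table.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Turn an internal version such as '12.32' into a comparable tuple (12, 32)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def is_affected(installed: str, occurred: str) -> bool:
    """Check `installed` against an 'occurred' cell such as '9.00 to 12.32,13.00'.

    The cell is assumed to be a comma-separated list of single versions
    and inclusive 'A to B' ranges.
    """
    version = parse_version(installed)
    for part in occurred.split(","):
        bounds = [parse_version(bound) for bound in part.split(" to ")]
        low, high = bounds[0], bounds[-1]  # a single version is its own range
        if low <= version <= high:
            return True
    return False


if __name__ == "__main__":
    # Row 18 of the table lists '9.00 to 12.32,13.00' as the affected versions.
    print(is_affected("12.20", "9.00 to 12.32,13.00"))
    print(is_affected("13.01", "9.00 to 12.32,13.00"))
```

A version that passes this check is a candidate for updating to the "solved" version of the same row; the occurrence condition in the last column still determines whether the problem can actually appear in a given environment.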
No. |
Version in which the problem has been solved / Version in which the problem occurred |
Phenomenon |
Level |
Occurrence condition / Occurrence frequency |
---|---|---|---|---|
1 |
13.00
/ 9.00 to 12.32
|
In a group, when a group resource alone is successfully activated, the restoration of another group resource may be executed. |
S |
This problem occurs in a group where a group resource alone is activated with another group resource failing in activation. |
2 |
13.00
/ 12.10 to 12.32
|
In the config mode of Cluster WebUI, modifying a comment on a group resource may not be applied. |
S |
This problem occurs in the following case: A comment on a group resource is modified, the [Apply] button is clicked, the change is undone, and then the [OK] button is clicked. |
3 |
13.00
/ 12.10 to 12.32
|
In the config mode of Cluster WebUI, modifying a comment on a monitor resource may not be applied. |
S |
This problem occurs in the following case: A comment on a monitor resource is modified, the [Apply] button is clicked, the change is undone, and then the [OK] button is clicked. |
4 |
13.00
/ 12.10 to 12.32
|
When Cluster WebUI is connected to a stopped server, the [Recover] button remains disabled for a server restarting after its crash. |
S |
This problem occurs in the following case: When Cluster WebUI is connected to a stopped server, there is a server restarting after its crash. |
5 |
13.00
/ 12.10 to 12.32
|
In the config mode of Cluster WebUI, the [Install Path] item is not required to be entered in the [Monitor (special)] tab of a WebLogic monitor resource. |
S |
This problem always occurs. |
6 |
13.00
/ 12.00 to 12.32
|
In the status screen of Cluster WebUI, a communication timeout during the operation of a cluster causes a request to be repeatedly issued. |
M |
This problem always occurs when a communication timeout occurs between Cluster WebUI and a cluster server. |
7 |
13.00
/ 12.10 to 12.32
|
Cluster WebUI may freeze when dependency is set in the config mode of Cluster WebUI. |
S |
This problem occurs when two group resources are made dependent on each other. |
8 |
13.00
/ 12.20 to 12.32
|
The response of the clpstat command may be delayed. |
S |
This problem may occur when communication with other servers is cut off. |
9 |
13.00
/ 11.10 to 12.32
|
In the alert log for a delay warning of a monitor resource, the response time may read zero (0). |
S |
This problem may occur when the alert log for a delay warning of a monitor resource is outputted. |
10 |
13.00
/ 12.20 to 12.32
|
An AP error of clpwebmc may occur. |
S |
This problem rarely occurs when cluster configuration data with a server removed is applied in the config mode of Cluster WebUI. |
11 |
13.00
/ 12.00 to 12.32
|
A monitor resource may mistakenly detect a monitoring timeout. |
M |
This problem very rarely occurs when a monitoring process is executed by a monitor resource. |
12 |
13.00
/ 12.20 to 12.32
|
An error occurs when the status code of a target response is 301 in an HTTP NP resolution resource. |
S |
This problem occurs when the response status code is 301. |
13 |
13.00
/ 12.00 to 12.32
|
In [Monitoring usage of memory] for process resource monitor resources, [Duration time (min)] has been replaced with [Maximum Refresh Count (time)]. |
S |
This problem occurs when the properties are displayed with Cluster WebUI or the clpstat command. |
14 |
13.00
/ 12.00 to 12.32
|
In an HTTP monitor resource, a warning instead of an error is issued in the following case: The status code of a response to an issued HEAD request is in the 400s or 500s, and a non-default URI is specified as the monitor URI. |
S |
This problem occurs in the following case: The status code of a response to an issued HEAD request is in the 400s or 500s, and a non-default URI is specified as the monitor URI. |
15 |
13.00
/ 12.10 to 12.32
|
In a custom monitor resource, when the process of a script to be monitored is cleared, the corresponding monitor resource name is not outputted to the alert message. |
S |
This problem occurs when the process of a script to be monitored is cleared in a custom monitor resource. |
16 |
13.00
/ 11.01 to 12.32
|
A response to a mirror-related command may take time. |
S |
This problem occurs when a mirror disk connection is disconnected or when some of servers constituting a cluster are down. |
17 |
13.00
/ 12.20 to 12.32
|
The EXPRESSCLUSTER Information Base service may abend. |
S |
This problem very rarely occurs when one of the following is performed:
- Cluster startup
- Cluster stop
- Cluster suspension
- Cluster resumption
|
18 |
13.01
/ 9.00 to 12.32, 13.00
|
The vulnerabilities CVE-2021-20700 to CVE-2021-20707 may allow the following acts by third parties:
- Execution of an arbitrary code
- Upload of an arbitrary file
- Reading of an arbitrary file
|
L |
These problems occur when a specific process in EXPRESSCLUSTER receives a packet crafted by a malicious third party against the internal protocol of EXPRESSCLUSTER. |
19 |
13.01
/ 13.00
|
For the clprexec command, the --script option does not work. |
S |
This problem occurs when the clprexec command is executed with the --script option specified. |
20 |
13.01
/ 13.00
|
After a forced-stop resource is added by executing the clpcfset command, the cluster fails to start up. |
S |
This problem occurs during an attempt to start up a cluster to which cluster configuration data (including a forced-stop resource added by executing the clpcfset command) was applied. |
21 |
13.02
/ 13.00 to 13.01
|
The EXPRESSCLUSTER Node Manager service starts without waiting for a service startup delay time. |
S |
This problem occurs with [Service Startup Delay Time] set to a value larger than zero seconds. |
22 |
13.02
/ 13.01
|
Update installation registers the EXPRESSCLUSTER Old API Support service. |
S |
This problem occurs with the internal version 13.00 updated to 13.01. |
23 |
13.02
/ 13.00 to 13.01
|
After a server is removed from the [Servers that can run the Group] list of the failover group, trying to apply the configuration data does not lead to a group-stop request. |
S |
This problem occurs in the following case: After a server is removed from the [Servers that can run the Group] list of the failover group, applying the configuration data is tried. |
24 |
13.02
/ 13.00 to 13.01
|
The STOP error may occur during the application of cluster configuration data including a mirror/hybrid disk resource. |
M |
This problem occurs with the mirror/hybrid disk resource named with eight or more characters. |
25 |
13.02
/ 13.00 to 13.01
|
A monitor resource may detect a monitoring timeout by mistake. |
S |
This problem occurs on very rare occasions during a monitoring process by the monitor resource. |
26 |
13.02
/ 13.00 to 13.01
|
When [Recovery Action tab] for a monitor resource is set with [Generate an intentional stop error], the recovery action may not be performed. |
S |
This problem occurs on rare occasions when the recovery action is tried. |
27 |
13.02
/ 13.00 to 13.01
|
An initialization error may occur in a kernel mode LAN heartbeat resource during a cluster service start. |
M |
This problem occurs when the kernel mode LAN heartbeat resource starts up with the network device yet to become available. |
28 |
13.02
/ 12.00 to 13.01
|
A cluster service stop as an action at NP occurrence is not completed. |
M |
This problem occurs with [Action at NP Occurrence] set to [Stop the cluster service]. |
29 |
13.02
/ 13.00 to 13.01
|
Forcibly stopping more than one server may fail. |
S |
This problem occurs on rare occasions when one of three or more servers in a cluster tries to forcibly stop other servers. |
30 |
13.02
/ 9.00 to 13.01
|
An application error may occur with the clpstat command. |
S |
This problem occurs in an environment where a failover group is set with no group resources registered. |
31 |
13.02
/ 13.00 to 13.01
|
With a cluster suspended, Cluster WebUI or the clpstat command may show the server status as stopped. |
S |
This problem occurs when both of the following services are restarted with the cluster suspended:
- EXPRESSCLUSTER Node Manager
- EXPRESSCLUSTER Information Base
|
32 |
13.02
/ 13.00 to 13.01
|
A group/monitor resource status may be incorrectly shown. |
S |
This problem occurs with something wrong in the internal processing of cluster services during OS startup. |
33 |
13.02
/ 13.00 to 13.01
|
Cluster WebUI or the clpstat command incorrectly shows the status of a server using no forced-stop resources. |
S |
This problem occurs when any of three or more servers in a cluster is configured not to use the forced-stop function. |
34 |
13.02
/ 9.00 to 13.01
|
A STOP error may occur during OS startup or OS shutdown. |
M |
This problem occurs on very rare occasions during OS startup or OS shutdown. |
35 |
13.02
/ 9.00 to 13.01
|
The vulnerabilities CVE-2022-34822 and CVE-2022-34823 may allow the following acts by third parties:
- Reading of an arbitrary file
- Execution of an arbitrary code
|
L |
These problems occur when a specific process in EXPRESSCLUSTER receives a packet crafted by a malicious third party against the internal protocol of EXPRESSCLUSTER. |
36 |
13.10
/ 12.20 to 13.02
|
The EXPRESSCLUSTER Information Base service may abend. |
S |
This problem occurs on rare occasions when a cluster shutdown is performed. |
37 |
13.10
/ 13.00 to 13.02
|
The clpnm.exe process may abend, leading to an OS restart. |
M |
This problem occurs on very rare occasions. |
38 |
13.10
/ 13.00 to 13.02
|
After a cluster service is started up, an alert may be put out due to abnormal heartbeat. |
S |
This problem occurs on rare occasions after a cluster service is started up. |
39 |
13.10
/ 12.00 to 13.02
|
A cluster may not be started up, due to a corrupted license file. |
S |
This problem occurs on rare occasions in the following case: While a cluster is being started up, its server is de-energized. |
40 |
13.10
/ 12.00 to 13.02
|
Instead of a product version license, a fixed-term license may become active despite its expiration. |
S |
This problem occurs with both an unused fixed-term license and a product version license registered, when the former expires. |
41 |
13.10
/ 13.00 to 13.02
|
The status of the BMC forced stop resource becomes abnormal. |
S |
This problem occurs with the iLO shared network port enabled. |
42 |
13.10
/ 9.00 to 13.02
|
Failure in resuming a cluster may lead to its abend. |
M |
This problem occurs when a cluster is repeatedly suspended and resumed in the following environment: Two or more monitor resources are registered and each of their names consists of only one letter. |
43 |
13.10
/ 13.00 to 13.02
|
When a server is shut down, the notification may not be sent. |
S |
This problem occurs on rare occasions during a server shutdown. |
44 |
13.10
/ 12.10 to 13.02
|
A recovery script for a monitor resource may not be run. |
S |
This problem occurs in the following case: With [Execute Script before Recovery Action] on in Cluster WebUI, the user does not edit the script or simultaneously changes the script and something else. |
45 |
13.10
/ 9.00 to 13.02
|
A monitor resource, configured to perform continuous monitoring, may not work. |
S |
This problem occurs in a monitor resource with the setting of [Monitor Timing] changed from [Active] to [Always]. |
46 |
13.10
/ 9.00 to 13.02
|
With [Service Name] of a service resource or service monitor resource set to the service display name of the service, the monitoring process may fail. |
M |
This problem occurs with a failure in obtaining the service name from the service display name. |
47 |
13.10
/ 11.10 to 13.02
|
A CIFS monitor resource considers the monitoring result as normal by mistake. |
S |
This problem occurs at the time of the first monitoring by a CIFS monitor resource. |
48 |
13.10
/ 12.10 to 13.02
|
[JVM Monitor Resource Tuning Properties] does not allow specifying a usage threshold for [Metaspace]. |
S |
This problem always occurs. |
49 |
13.10
/ 9.00 to 13.02
|
Hostname resolution may fail when HTTP monitor resources access the host. |
S |
This problem may occur when the hostname (not the IP address) is specified as a connection destination. |
50 |
13.10
/ 13.00 to 13.02
|
With more than one DISK NP resolution resource configured, cluster resumption may cause an error message to be displayed. |
S |
This problem may occur depending on the timing. |
51 |
13.10
/ 12.20 to 13.02
|
The display of the clpstat command may vary depending on the server where the command is executed. |
S |
This problem may occur when the command is executed on the server with the cluster service stopped. |
52 |
13.10
/ 12.30 to 13.02
|
After the clpcfset command is executed to create cluster configuration data, its XML attribute value may be wrong. |
S |
This problem occurs when an ID attribute node is added by executing the clpcfset command. |
53 |
13.10
/ 13.00 to 13.02
|
After the clpcfset command is executed to create cluster configuration data, its object count may be wrong. |
S |
This problem occurs when, by executing the clpcfset command, the object count is added to or deleted from the cluster configuration data including a forced stop resource. |
54 |
13.10
/ 13.00 to 13.02
|
The clpcfadm.py command may not be correctly executed. |
S |
This problem occurs in the following case: Cluster WebUI executes the clpcfadm.py command on cluster configuration data from which all failover groups were deleted. |
55 |
13.10
/ 13.00 to 13.02
|
The clpcfadm.py command may allow an invalid monitor resource to be configured. |
S |
This problem occurs in the following case: When the clpcfadm.py command is used to add a monitor resource, jra is specified as the type of monitor resource. |
56 |
13.10
/ 13.00 to 13.02
|
After the clpcfadm.py command is executed to create cluster configuration data, its resource activation/deactivation timeout value may be wrong. |
S |
This problem occurs when executing the clpcfadm.py command changes the parameter requiring the calculation of the resource activation/deactivation timeout value. |
57 |
13.10
/ 12.20 to 13.02
|
For a cluster with a RESTful API, obtaining its status may fail. |
S |
This problem may occur with the EXPRESSCLUSTER Information Base service restarted. |
58 |
13.10
/ 12.20 to 13.02
|
A RESTful API may show the status of a cluster different from its actual status. |
S |
This problem may occur in the following case: The status is obtained while communication with other servers is cut off. |
59 |
13.10
/ 12.20 to 13.02
|
A RESTful API may fail to collect information. |
S |
This problem occurs on rare occasions in the following case: An API for collecting information is executed just after an API for operation is executed. |
60 |
13.10
/ 12.22 to 13.02
|
In group information retrieval with a RESTful API, an incorrect response to an exception may occur. |
S |
This problem may occur when a cluster server encounters an internal error. |
61 |
13.10
/ 12.00 to 13.02
|
Display on Cluster WebUI may be delayed for a configuration with multiple mirror/hybrid disk resources registered. |
S |
This problem may occur when mirror recovery is performed for multiple resources. |
62 |
13.10
/ 12.00 to 13.02
|
Cluster WebUI may fail to suspend mirror recovery. |
S |
This problem occurs in the following case: Mirror recovery suspension is tried with a browser session different from that of Cluster WebUI, where the mirror recovery was started; or the browser session of Cluster WebUI is reloaded during the mirror recovery. |
63 |
13.10
/ 12.10 to 13.02
|
The cluster-creating wizard of Cluster WebUI fails to automatically register a floating IP monitor resource corresponding to [Management IP Address]. |
S |
This problem occurs with [Management IP Address] registered through the cluster-creating wizard. |
64 |
13.10
/ 12.30 to 13.02
|
Cluster WebUI may fail to obtain cloud environment information. |
S |
This problem occurs with Cluster WebUI connected via a proxy server. |
65 |
13.10
/ 12.00 to 13.02
|
After [TTL] is changed for an Azure DNS resource in the config mode of Cluster WebUI, the change is not applied to the record. |
S |
This problem always occurs. |
66 |
13.10
/ 12.10 to 13.02
|
When configuring strings like a resource name on the Cluster WebUI, consecutive spaces of two or more bytes are reduced to a single byte. |
S |
This problem occurs when the setting of cluster configuration data is changed while two or more bytes of spaces are input consecutively. |
67 |
13.10
/ 12.10 to 13.02
|
In Cluster WebUI, when a group of PING NP resolution resources is added, the group list may be incorrectly displayed. |
S |
This problem may occur with one or more groups registered in the list of PING NP resolution resource groups. |
68 |
13.11
/ 12.20 to 13.10
|
Applying cluster configuration data may fail. |
S |
This problem may occur when applying cluster configuration data repeatedly in the config mode of the Cluster WebUI. |
69 |
13.11
/ 12.30 to 13.10
|
Cluster operation may be disabled. |
S |
This problem occurs in an environment where both a CPU license and a VM node license are registered. |
70 |
13.11
/ 13.00 to 13.10
|
When the EXPRESSCLUSTER service starts, a failover group may not be started. |
M |
This problem may occur in the following case: the EXPRESSCLUSTER service is stopped on each server, one server at a time, and then the EXPRESSCLUSTER service is started. |
71 |
13.11
/ 11.30 to 13.10
|
After changing [Startup Server], the appropriate action for cluster configuration application is not required. |
S |
This problem always occurs. |
72 |
13.11
/ 11.10 to 13.10
|
A SQL Server monitor resource may not detect an error. |
S |
This problem occurs when [Monitor Level] is 0. |
73 |
13.11
/ 13.10
|
Mail reporting function may not work. |
S |
This problem occurs when the version is upgraded from X 5.0.2 or earlier to X 5.1.0 while mail reporting function is configured. |
74 |
13.11
/ 12.20 to 13.10
|
Heartbeat status may be incorrect. |
S |
This problem may occur on the following occasions: connecting to the Cluster WebUI on multiple cluster servers, or executing the clpstat command on multiple cluster servers. |
75 |
13.11
/ 13.00 to 13.10
|
Group resource status may be incorrect. |
S |
This problem may occur when restarting the EXPRESSCLUSTER service on a single node. |
76 |
13.11
/ 13.00 to 13.10
|
When a cluster is configured with ESMPRO/ARC, the process of waiting for a shared disk to power on does not work. |
S |
This problem occurs when a cluster is started. |
77 |
13.11
/ 9.00 to 13.10
|
EXPRESSCLUSTER system services may not be started, due to a failure of applying cluster configuration data. |
S |
This problem very rarely occurs when applying cluster configuration data. |
78 |
13.11
/ 13.00
|
In the config mode of the Cluster WebUI, a service monitor resource may not be registered. |
S |
This problem occurs in the following case: A service monitor resource is registered while there are no group resources registered. |
79 |
13.12
/ 13.11
|
A cluster may not start due to an incorrect cluster server status. |
M |
This problem may occur after a cluster service is stopped. |
80 |
13.12
/ 9.00 to 13.11
|
A mirror disk connection may be disconnected when a failover group moves repeatedly. |
S |
This problem may occur when a failover group is moving at short intervals. |
81 |
13.12
/ 13.10 to 13.11
|
Failover of a failover group including a hybrid disk resource may fail. |
S |
This problem occurs when a failover group fails over to a server other than the current server in the server group designated as the failover destination, with [Allow failover on mirror break for specified time] enabled. |
82 |
13.12
/ 13.00 to 13.11
|
An alert that a restart count has been reset may appear when a monitor resource executes the recovery action. |
S |
This problem occurs when a monitor resource executes one of the following recovery actions.
- Stop the cluster service and shutdown OS
- Stop the cluster service and reboot OS
- Generate an intentional stop error
|
83 |
13.12
/ 9.00 to 13.11
|
The EXPRESSCLUSTER Node Manager service (clpnm.exe) may abend when a network partition is resolved, and then it causes the STOP error. |
M |
This problem occurs rarely when all of the following conditions are met.
- In an environment where a DISK NP resolution resource is configured.
- On a server other than the master server.
- When a network partition is resolved.
|
84 |
13.12
/ 13.10 to 13.11
|
The screen may not display when connecting to Cluster WebUI via HTTPS. |
S |
This problem occurs rarely with OpenSSL 3.0 or later. |
85 |
13.12
/ 12.30 to 13.11
|
In the Cluster WebUI operation mode, the specific configuration values of some resources cannot be displayed.
Also the clpstat command fails to display these values.
|
S |
This problem occurs when one of the following items is set to the maximum length.
- A resource name of AWS secondary ip resource (31 characters)
- A resource name of AWS virtual ip resource (31 characters)
- A resource name of Google Cloud DNS resource (31 characters)
- A zone name of Google Cloud DNS resource (63 characters)
- A DNS name of Google Cloud DNS resource (253 characters)
|
86 |
13.12
/ 9.00 to 13.11
|
The vulnerabilities of CVE-2023-39544 to 39548 may cause the following acts by third parties:
- Execution of arbitrary code
- Uploading of an arbitrary file
- Skimming a cluster configuration data file
|
L |
These problems occur when a specific process in EXPRESSCLUSTER receives a packet crafted by a malicious third party against the internal protocol of EXPRESSCLUSTER. |
87 |
13.20
/ 12.20 to 13.12
|
The EXPRESSCLUSTER Information Base service may abend. |
S |
This problem may occur when cluster configuration data is uploaded with its server data deleted. |
88 |
13.20
/ 11.00 to 13.12
|
The EXPRESSCLUSTER Transaction service may abend and the OS may be restarted. |
S |
This problem occurs when starting the EXPRESSCLUSTER Transaction service leads to initialization failure. |
89 |
13.20
/ 12.20 to 13.12
|
Starting the OS may lead to outputting the error log of the EXPRESSCLUSTER API service. |
S |
This problem occurs when the OS is restarted without a cluster created. |
90 |
13.20
/ 9.00 to 13.12
|
EXPRESSCLUSTER does not work normally. |
L |
This problem occurs in the following case: EXPRESSCLUSTER was installed, the system locale was changed from Japanese to another language, and then EXPRESSCLUSTER was reinstalled. |
91 |
13.20
/ 13.00 to 13.12
|
A cluster service may fail to start up. |
S |
This problem may occur when a cluster service is starting up just after its stop. |
92 |
13.20
/ 9.00 to 13.12
|
An emergency shutdown may occur during an attempt to stop a cluster service. |
M |
This problem occurs when one hour passes while a cluster service is being stopped. |
93 |
13.20
/ 9.00 to 13.12
|
clprc.exe, a cluster service process, may abend. |
M |
This problem occurs on rare occasions with a delay in stopping a monitor resource which monitors an active target. |
94 |
13.20
/ 9.00 to 13.12
|
During an attempt to restart a resource due to a monitoring error or to perform a failover, a stopped resource is also started. |
S |
This problem occurs when starting up a resource fails, with its final action against a resource activation failure set to [No operation (not activate next resource)], and then the recovery action due to a monitoring error is taken. |
95 |
13.20
/ 12.30 to 13.12
|
A stopped resource may be started during a failover due to a server failure. |
S |
This problem occurs when the failover occurs with a resource which was set to be manually started but has never started despite the startups of the cluster. |
96 |
13.20
/ 9.00 to 13.11
|
The mirror disk connect communication is disconnected. |
M |
This problem occurs on rare occasions with a failover group moved repeatedly in a short period of time. |
97 |
13.20
/ 9.00 to 13.12
|
An application error occurs in an attempt to stop a virtual computer name resource, which may fail. |
M |
This problem occurs on rare occasions depending on the timing. |
98 |
13.20
/ 12.00 to 13.12
|
Starting a dynamic DNS resource may lead to outputting an unnecessary error message. |
S |
This problem occurs with both of the following cases true:
- [Delete the Registered IP Address] is enabled.
- The resource is configured separately for each server.
|
99 |
13.20
/ 12.00 to 13.12
|
The alert log of an Azure DNS resource may be outputted incorrectly. |
S |
This problem occurs depending on the error type. |
100 |
13.20
/ 12.10 to 13.12
|
With the monitoring timing of a monitor resource set to active, the monitor resource may perform monitoring despite the deactivation state of the target resource. |
S |
This problem may occur with the resource repeatedly restarted. |
101 |
13.20
/ 12.20 to 13.12
|
Stopping a monitor resource may lead to outputting the following invalid alert log: "Failed to stop monitor <name of the monitor resource>". |
S |
This problem occurs on rare occasions in an attempt to stop a monitor resource. |
102 |
13.20
/ 12.30 to 13.12
|
The following monitor resources may consider their normal targets to be abnormal:
- AWS Virtual IP monitor resources
- AWS Secondary IP monitor resources
- Google Cloud DNS monitor resources
|
M |
This problem occurs after the internal process becomes abnormal. |
103 |
13.20
/ 12.00 to 13.12
|
An Azure DNS monitor resource fails in the normal monitoring process. |
S |
This problem occurs when the version of the Azure CLI is 2.50.0 or higher. |
104 |
13.20
/ 11.10 to 13.12
|
An unnecessary event log may be outputted. |
S |
This problem occurs in either of the following cases:
- [Monitor NIC Link Up/Down] is set to [On] for a floating IP monitor resource.
- An NIC Link Up/Down monitor resource is configured.
|
105 |
13.20
/ 11.35 to 13.12
|
A heartbeat resource may detect a timeout by mistake. |
M |
This problem may occur with the heartbeat timeout value set to 400 seconds or more. |
106 |
13.20
/ 12.10 to 13.12
|
More than one DISK network partition resolution resource can be configured. |
S |
This problem always occurs. |
107 |
13.20
/ 13.10 to 13.12
|
The Azure forced-stop resource may not work normally. |
S |
This problem may occur with the configuration of [Servers in Use] for the Azure forced-stop changed in an environment with three or more nodes. |
108 |
13.20
/ 13.10 to 13.12
|
It may take time for the Azure forced-stop resource to reboot an instance. |
S |
This problem occurs with [Forced Stop Action] set to [reboot]. |
109 |
13.20
/ 13.10 to 13.12
|
When a timeout occurs in a forced-stop resource in a cloud environment, a regular check may fail. |
S |
This problem may occur with the system heavily loaded. |
110 |
13.20
/ 13.10 to 13.12
|
Running the Amazon CloudWatch linkage function may fail. |
S |
This problem occurs with [Send polling time metrics] set to [On] for at least two monitor resources. |
111 |
13.20
/ 13.00 to 13.12
|
When cluster configuration data is created by executing the clpcfadm.py command, either of the following may occur:
- A value is set different from the specified one.
- The specified value is not set.
|
S |
This problem occurs after a particular parameter is set. |
112 |
13.20
/ 13.10 to 13.12
|
The operation log of Cluster WebUI may fail to be collected. |
S |
This problem occurs with the path of [Log output path] including either of the following:
- A symbolic link
- "\" at the end
|
113 |
13.20
/ 9.00 to 13.12
|
When applying a setting from Cluster WebUI leads to an authentication error, necessary services may not restart. |
S |
This problem occurs with the following performed at the same time:
- Creating or changing a password on the cluster password method
- A change involving a service restart
|
114 |
13.20
/ 12.00 to 13.12
|
In Cluster WebUI, a forcible mirror recovery may fail. |
S |
This problem occurs when an unknown-status server exists in hybrid disk configuration. |
115 |
13.20
/ 9.00 to 13.12
|
In the HTTP response header of the WebManager server, no appropriate character encoding method is specified. |
S |
This problem always occurs in Cluster WebUI. |
116 |
13.20
/ 13.00 to 13.12
|
RESTful API execution may fail. |
S |
This problem may occur in RESTful API execution just after an OS startup. |
117 |
13.20
/ 13.00 to 13.12
|
In cooperation with ESMPRO/AC, the alert log may display an unnecessary error message. |
S |
This problem may occur in the following case: When a power failure occurs, ESMPRO/AC makes a cluster shutdown performed simultaneously on two or more servers. |
118 |
13.20
/ 12.00 to 13.12
|
In Alert logs of Cluster WebUI, the display may become invalid. |
S |
This problem occurs when Cluster WebUI displays a corrupted alert log. |
119 |
13.20
/ 13.00 to 13.12
|
In the config mode of Cluster WebUI, a dependency diagram may not be displayed. |
S |
This problem occurs with an extremely large number of resources. |
120 |
13.20
/ 12.20 to 13.12
|
Cluster WebUI may become uncontrollable in config mode when uploading configuration data with its server data deleted. |
S |
This problem occurs with at least two failover groups started on the server whose data was removed. |
121 |
13.20
/ 12.10 to 13.12
|
In the config mode of Cluster WebUI, the setting for [Network Partition Resolution Tuning Properties] is not saved after the [Apply] button is pressed and [Cluster properties] is closed by pressing the [Cancel] button. |
S |
This problem occurs after the setting for [Network Partition Resolution Tuning Properties] is changed and then [Cluster properties] is closed by pressing the [Cancel] button. |
122 |
13.20
/ 12.10 to 13.12
|
In the config mode of Cluster WebUI, [User Name] in the [Monitor (special)] tab for an FTP monitor resource is not a mandatory item. |
S |
This problem always occurs. |
123 |
13.20
/ 13.00 to 13.12
|
In the config mode of Cluster WebUI, going to [Group properties] -> the [Resources] tab -> [Resource Properties] wrongly displays the [Recovery Operation] tab. |
S |
This problem occurs with [Failover Count Method] set to [Cluster]. |
124 |
13.20
/ 12.00 to 13.12
|
The display of Cluster WebUI is delayed. |
S |
This problem may occur when several Hybrid disk resources are configured. |
125 |
13.21
/ 13.20
|
When the EXPRESSCLUSTER service is stopped, an application error may occur in a specific process of EXPRESSCLUSTER (either clprc.exe or clpibsv.exe). |
S |
This problem very rarely occurs when the EXPRESSCLUSTER service is stopped. |
126 |
13.21
/ 12.20 to 13.20
|
The EXPRESSCLUSTER Information Base service may take time to stop. |
S |
This problem may occur when the EXPRESSCLUSTER Information Base service is restarted after operating the cluster for an extended period of time. |
127 |
13.21
/ 9.00 to 13.12
|
In the EXPRESSCLUSTER Web Alert service, unnecessary local communication may occur. |
S |
This problem may occur when a blank is set for some servers in [Interconnect] tab for a heartbeat I/F. |
128 |
13.21
/ 12.20 to 13.20
|
In Cluster WebUI, when applying cluster configuration data, the service restart screen may not close. |
S |
This problem occurs when the application method requires the WebManager service, the Information Base service, and the API service to be restarted simultaneously. |
129 |
13.21
/ 11.30 to 13.20
|
The Azure load balance monitor resource may become abnormal in monitoring. |
S |
This problem may occur depending on the timing during the failover group shutdown process. |
130 |
13.21
/ 9.00 to 13.20
|
The DISK network partition resolution resource may take time to stop. |
S |
This problem may occur when the cluster service is stopped by a network partition resolution with [Action at NP Occurrence] set to [Stop the cluster service]. |
131 |
13.21
/ 13.00 to 13.20
|
The DISK network partition resolution resource fails to start. |
S |
This problem occurs when the cluster service is started after stopping it by a network partition resolution with [Action at NP Occurrence] set to [Stop the cluster service]. |
132 |
13.21
/ 13.00 to 13.20
|
An application error may occur in the DISK network partition resolution resources. |
M |
This problem rarely occurs when the DISK network partition resolution resource is stopped. |
133 |
13.21
/ 12.00 to 13.20
|
The activation and deactivation timeout values for Azure DNS resources created with the clpcfadm command may be incorrect. |
S |
This problem occurs when the parameters that require calculation of the active/deactive timeout value for Azure DNS resources are changed using the clpcfadm command. |
134 |
13.21
/ 13.20
|
The WebManager server internal logs may be partially lost. |
S |
This problem may occur when the WebManager server is restarted. |
135 |
13.21
/ 13.20
|
The DISK network partition resolution resource cannot be configured per server group in a hybrid disk configuration environment. |
S |
This problem always occurs. |
136 |
13.21
/ -
|
In Cluster WebUI, it may not be possible to configure group resources and monitor resources that require optional product licenses. |
S |
This problem may occur in an environment running with trial version licenses when valid licenses and expired licenses are registered together. |
137 |
13.21
/ 13.20
|
In Cluster WebUI, if a user without the operation right logs in, an authentication error message is displayed. |
S |
This problem occurs when one of the following is configured:
- Control connection by using password
- Control connection by using client IP address
|
138 |
13.21
/ 13.20
|
The following alert log is output, and the log file for investigation cannot be downloaded.
Module Type: trnsv
Event ID: 2301
|
S |
This problem occurs in an environment where the [Control connection by using client IP address] setting is enabled. |
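The version column of the list above can be checked mechanically. The following is a minimal sketch, assuming that the value before the slash is the internal version containing the fix and the value after the slash is the range of affected internal versions. The function names `parse_version` and `is_affected` are hypothetical helpers for illustration and are not part of EXPRESSCLUSTER.

```python
def parse_version(text):
    """Turn an internal version string such as "13.02" into the tuple (13, 2)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def is_affected(installed, affected_range):
    """Return True if `installed` falls inside `affected_range`.

    `affected_range` is the text after the slash in the list, for example
    "12.00 to 13.02", "13.20" (a single version), or "-" (no range given).
    """
    affected_range = affected_range.strip()
    if affected_range == "-":
        # No range is given; this sketch treats the entry as not matching.
        return False
    if " to " in affected_range:
        low, high = affected_range.split(" to ")
    else:
        low = high = affected_range
    return parse_version(low) <= parse_version(installed) <= parse_version(high)
```

For example, `is_affected("13.00", "12.00 to 13.02")` is true, while `is_affected("13.10", "12.00 to 13.02")` is false because 13.10 is outside the listed range.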
6. Notes and Restrictions¶
This chapter provides information on known problems and how to troubleshoot the problems.
6.1. Designing a system configuration¶
Hardware selection, system configuration, and shared disk configuration are introduced in this section.
6.1.1. Hardware requirements for mirror disk and hybrid disk¶
Dynamic disks cannot be used. Use basic disks.
The partitions (data and cluster partitions) for mirror disks and hybrid disks cannot be used by mounting them on an NTFS folder.
To use a mirror disk resource or a hybrid disk resource, partitions for mirroring (i.e. data partition and cluster partition) are required.
There are no specific limitations on where partitions for mirroring are located, but the data partition sizes must match one another exactly, to the byte. A cluster partition also requires 1024 MiB or more of space.
When creating data partitions as logical partitions on an extended partition, make sure to use logical partitions on both servers. Even when the same size is specified for a primary partition and a logical partition, their actual sizes may differ from each other.
It is recommended to create the cluster partition and the data partition on different disks for load distribution. (Creating them on the same disk causes no problems, but writing performance declines slightly with asynchronous mirroring or while mirroring is suspended.)
On both servers, use the same type of disk for the data partitions to be mirrored by mirror resources.
Example
Combination | server1 | server2
OK | SCSI | SCSI
OK | IDE | IDE
NG | IDE | SCSI
Partition size reserved by Disk Management is aligned by the number of blocks (units) per disk cylinder. For this reason, if disk geometries used as disks for mirroring differ between servers, the data partition sizes cannot be matched perfectly. To avoid this problem, it is recommended to use the same hardware configurations including RAID configurations for the disks that reserve data partitions on server1 and server2.
When you cannot match the disk type or geometry on both servers, make sure to check the exact size of the data partitions by using the clpvolsz command before configuring a mirror disk resource or a hybrid disk resource. If the sizes do not match, shrink the larger partition by using the clpvolsz command.
When a RAID disk is mirrored, writeback mode is recommended because writing performance decreases considerably when the disk array controller cache is set to write-thru mode. However, when writeback mode is used, it is necessary to use a disk array controller with a battery installed or to use a UPS.
A partition with the OS page file cannot be mirrored.
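The size requirements above (data partitions identical to the byte, cluster partition of 1024 MiB or larger) can be expressed as a simple check. This is a minimal sketch, assuming the sizes in bytes have already been obtained separately, for example by reading them from the clpvolsz command output by hand. The function name `check_mirror_partitions` is hypothetical and does not call any EXPRESSCLUSTER command.

```python
MIB = 1024 * 1024
MIN_CLUSTER_PARTITION_BYTES = 1024 * MIB  # 1024 MiB, from the requirement above


def check_mirror_partitions(data_bytes_server1, data_bytes_server2, cluster_bytes):
    """Return a list of problems found; an empty list means the sizes are acceptable."""
    problems = []
    if data_bytes_server1 != data_bytes_server2:
        diff = abs(data_bytes_server1 - data_bytes_server2)
        problems.append(
            f"data partition sizes differ by {diff} bytes; shrink the larger one"
        )
    if cluster_bytes < MIN_CLUSTER_PARTITION_BYTES:
        problems.append("cluster partition is smaller than 1024 MiB")
    return problems
```

Note that the comparison is exact: a difference of a single byte between the two data partitions is reported as a problem.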
6.1.2. IPv6 environment¶
The following functions cannot be used in an IPv6 environment:
AWS Elastic IP resource
AWS Virtual IP resource
AWS Secondary IP resource
AWS DNS resource
Azure probe port resource
Azure DNS resource
Google Cloud virtual IP resource
Google Cloud DNS resource
Oracle Cloud virtual IP resource
Oracle Cloud DNS resource
AWS Elastic IP monitor
AWS Virtual IP monitor
AWS Secondary IP monitor
AWS AZ monitor
AWS DNS monitor
Azure probe port monitor
Azure load balance monitor
Azure DNS monitor
Google Cloud virtual IP monitor resource
Google Cloud load balance monitor resource
Google Cloud DNS monitor resource
Oracle Cloud virtual IP monitor resource
Oracle Cloud load balance monitor resource
Oracle Cloud DNS monitor resource
The following functions cannot use link-local addresses:
Kernel mode LAN heartbeat resource
Mirror disk connect
PING network partition resolution resource
FIP resource
VIP resource
6.1.3. Network configuration¶
A cluster cannot be configured or operated in an environment, such as NAT, where the IP address that a server uses for itself differs from the IP address that a remote server uses for that server. For example, the following settings cannot be used:
Cluster settings for Server 1
Local server: 10.0.0.1
Remote server: 10.0.0.2
Cluster settings for Server 2
Local server: 192.168.0.1
Remote server: 10.0.0.1
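The mismatch in the settings above can be detected by comparing each server's local address with the address the other server uses for it. This is a minimal sketch for a two-server case. The dictionary layout and the function name `addresses_consistent` are assumptions made for illustration, not an EXPRESSCLUSTER data format.

```python
import ipaddress


def addresses_consistent(server_a, server_b):
    """Check that each server's local address is what the other server calls remote.

    Each argument is a dict with "local" and "remote" IP address strings.
    """
    a_local = ipaddress.ip_address(server_a["local"])
    a_remote = ipaddress.ip_address(server_a["remote"])
    b_local = ipaddress.ip_address(server_b["local"])
    b_remote = ipaddress.ip_address(server_b["remote"])
    return a_local == b_remote and b_local == a_remote


# The settings shown above: Server 2 sees itself as 192.168.0.1,
# but Server 1 expects it at 10.0.0.2, so the check fails.
server1 = {"local": "10.0.0.1", "remote": "10.0.0.2"}
server2 = {"local": "192.168.0.1", "remote": "10.0.0.1"}
nat_example_ok = addresses_consistent(server1, server2)
```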
6.1.5. Write function of the mirror disk and hybrid disk¶
Mirror disk resources and hybrid disk resources support two types of disk mirroring: synchronous mirroring and asynchronous mirroring.
In synchronous mirroring, every write request to the data partition to be mirrored is written to the disks of both servers, and the request does not complete until both writes finish. Because the data is written to the disk of the other server via the network, writing performance declines more significantly than with a normal local disk that is not mirrored. In a remote cluster configuration, where the network is slow and latency is high, writing performance declines drastically.
In asynchronous mirroring, data is written to the local server immediately. Data destined for the other server is first saved to a local queue and then written in the background. Because the write to the other server is not waited for, writing performance does not decline significantly even when network performance is low. However, the data to be updated is saved in the queue for every write request, so writing performance is still lower than that of a normal local disk that is not mirrored or of a shared disk. For this reason, a shared disk is recommended for systems that require high write throughput, such as database systems with many updates.
In asynchronous mirroring, the write order is guaranteed, but the most recent updates may be lost if the active server shuts down. If the data as of immediately before an error must be inherited without fail, use synchronous mirroring or a shared disk.
6.1.6. History file of asynchronous mirroring¶
For a mirror disk or hybrid disk in asynchronous mode, data that does not fit in the memory queue is recorded temporarily in the folder specified for history files. If no size limit is specified, history files are written to that folder without limitation. In this case, if the line speed is too low relative to the amount of disk updates made by the application, writing data to the other server cannot keep up with the disk updates, and the history files fill the disk.
For this reason, in a remote cluster configuration as well, reserve a communication line fast enough for the amount of disk updates made by the application.
To prepare for the case where history files would fill the disk because the communication bandwidth narrows or the disk is updated continuously, reserve enough free space on the drive specified as the destination for history files and specify a limit on the history file size. As far as possible, specify a drive other than the system drive.
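The relationship between update rate, line speed, and free space can be estimated with simple arithmetic. This is a rough sketch only: it assumes constant rates and ignores the memory queue and any protocol overhead, and the function name `seconds_until_history_full` is hypothetical.

```python
def seconds_until_history_full(update_bytes_per_sec, line_bytes_per_sec, free_bytes):
    """Estimate how long until history files consume `free_bytes`.

    Returns None when the line keeps up with the updates, in which case
    the backlog does not grow under this simplified model.
    """
    backlog_rate = update_bytes_per_sec - line_bytes_per_sec
    if backlog_rate <= 0:
        return None
    return free_bytes / backlog_rate
```

For example, with 20 MB/s of sustained disk updates, a line that carries 10 MB/s, and 100 GB of free space, the backlog grows at 10 MB/s and the estimate is 10,000 seconds, a little under three hours. A figure like this can help choose both the line speed and the history file size limit.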
6.1.7. Data consistency among multiple asynchronous mirror disks¶
For a mirror disk or hybrid disk in asynchronous mode, data is written to the data partition of the standby server in the same order as it is written to the data partition of the active server.
This write order is guaranteed except during the initial mirror disk configuration and during the recovery (copy) period after mirroring has been suspended. The data consistency among the files on the standby data partition is guaranteed.
However, the write order is not guaranteed across multiple mirror disk resources and hybrid disk resources. For example, if files that must stay consistent with one another, such that one must never be older than the other, are distributed across multiple asynchronous mirror disks, an application may not run properly when it fails over due to a server failure.
For this reason, be sure to place these files on the same asynchronous mirror disk or hybrid disk.
6.1.8. Multi boot¶
Avoid using multi boot if either a mirror disk or a shared disk is used. If an operating system is started from another boot disk, access restrictions on the mirror disk and the shared disk become ineffective. As a result, mirror disk consistency is not guaranteed and data on the shared disk is not protected.
6.1.9. JVM monitor resources¶
Up to 25 Java VMs can be monitored concurrently. The Java VMs that can be monitored concurrently are those which are uniquely identified by the Cluster WebUI (with Identifier in the Monitor(special) tab).
Connections between Java VMs and JVM monitor resources do not support SSL.
It may not be possible to detect thread deadlocks. This is a known problem in Java VM. For details, refer to "Bug ID: 6380127" in the Oracle Bug Database.
The JVM monitor resources can monitor only the Java VMs on the server on which the JVM monitor resources are running.
The Java installation path setting made by the Cluster WebUI (with Java Installation Path in the JVM monitor tab in Cluster Property) is shared by the servers in the cluster. The version and update of Java VM used for JVM monitoring must be the same on every server in the cluster.
The management port number setting made by the Cluster WebUI (with Management Port in the Connection Setting dialog box opened from the JVM monitor tab in Cluster Property) is shared by all the servers in the cluster.
Application monitoring is disabled when the IA32 version of an application to be monitored is running on an x86_64 version OS.
If a large value such as 3,000 or more is specified as the maximum Java heap size by the Cluster WebUI (by using Maximum Java Heap Size on the JVM monitor tab in Cluster Property), the JVM monitor resources will fail to start up. The maximum heap size differs depending on the environment, so be sure to specify a value based on the capacity of the mounted system memory.
6.1.10. Requirements for network warning light¶
When using "DN-1000S" or "DN-1500GL," do not set your password for the warning light.
To play an audio file as a warning, you must register the audio file to a network warning light supporting audio file playback. For details about how to register an audio file, see the manual of the network warning light you want to use.
Set up a network warning light so that a server in a cluster is permitted to execute the rsh command to that warning light.
6.1.11. Cluster WebUI¶
We recommend using HTTPS, a more secure communication method, with Cluster WebUI.
For information on how to set it, see "Reference Guide" -> "Parameter details" -> "Cluster properties" -> "WebManager tab" and "Encryption tab".
6.2. Before installing EXPRESSCLUSTER¶
This section describes considerations after installing an operating system, such as when configuring the OS and disks.
6.2.1. File system¶
Use NTFS as the file system of the partition on which the OS is installed, of a partition to be used as a disk resource of the shared disk, and of the data partition of a mirror disk resource or a hybrid disk resource.
6.2.2. Communication port number¶
In EXPRESSCLUSTER, the following port numbers are used by default. You can change the port number by using the Cluster WebUI.
Make sure not to access the following port numbers from a program other than EXPRESSCLUSTER.
When setting up a firewall on a server, allow access to the port numbers below:
After installing EXPRESSCLUSTER, you can use the clpfwctrl command to configure a firewall. For more information, see "Reference Guide" -> "EXPRESSCLUSTER command reference" -> "Adding a firewall rule (clpfwctrl command)". Ports to be set with the clpfwctrl command are marked with ✓ in the clpfwctrl column of the table below. The applicable protocols are ICMPv4 and ICMPv6.
For a cloud environment, allow access to ports numbered as below, not only in a firewall configuration at the instance side but also in a security configuration at the cloud infrastructure side.
Server to Server
From | To | Used for | clpfwctrl
Server, Automatic allocation [5] | Server, 29001/TCP | Internal communication | ✓
Server, Automatic allocation | Server, 29002/TCP | Data transfer | ✓
Server, Automatic allocation | Server, 29003/UDP | Alert synchronization | ✓
Server, Automatic allocation | Server, 29004/TCP | Communication between disk agents | ✓
Server, Automatic allocation | Server, 29005/TCP | Communication between mirror drivers | ✓
Server, Automatic allocation | Server, 29008/TCP | Cluster information management | ✓
Server, Automatic allocation | Server, 29010/TCP | Internal communication of RESTful API | ✓
Server, 29106/UDP | Server, 29106/UDP | Heartbeat | ✓
Server, icmp | Server, icmp | Duplication check for FIP/VIP resource |
[5] In automatic allocation, a port number not being used at a given time is allocated.
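After the firewall is configured, reachability of the default server-to-server TCP ports listed above can be spot-checked from another cluster server. The following is a minimal sketch using only the Python standard library. The names `DEFAULT_TCP_PORTS`, `tcp_port_open`, and `check_server` are hypothetical, the port numbers are the defaults from the table and may have been changed in Cluster WebUI, and the UDP and ICMP entries are not covered. A port also appears closed when the corresponding EXPRESSCLUSTER service is not running, so a negative result does not by itself prove a firewall problem.

```python
import socket

# Default server-to-server TCP ports taken from the table above.
DEFAULT_TCP_PORTS = {
    29001: "Internal communication",
    29002: "Data transfer",
    29004: "Communication between disk agents",
    29005: "Communication between mirror drivers",
    29008: "Cluster information management",
    29010: "Internal communication of RESTful API",
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_server(host, ports=DEFAULT_TCP_PORTS):
    """Return a dict mapping each port number to whether it accepted a connection."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling `check_server("server2")`, where `server2` stands for the host name or IP address of the other server, returns one True or False value per port.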
Client to Server
From | To | Used for | clpfwctrl
RESTful API client, Automatic allocation | Server, 29009/TCP | http communication | ✓
Cluster WebUI to Server
From | To | Used for | clpfwctrl
Cluster WebUI, Automatic allocation | Server, 29003/TCP | http communication | ✓
Others
From | To | Used for | clpfwctrl
Server, Automatic allocation | Network warning light, See the manual for each product. | Network warning light control |
Server, Automatic allocation | BMC Management LAN of the server, 623/UDP | BMC control (Forced stop) |
Server, Automatic allocation | Witness server, Communication port number specified with Cluster WebUI | Connection destination host of the Witness heartbeat resource |
Server, Automatic allocation | Monitor target, icmp | IP monitor resource |
Server, Automatic allocation | Monitor target, icmp | Monitoring target of PING method of network partition resolution resource |
Server, Automatic allocation | Monitor target, Management port number set by the Cluster WebUI | Monitoring target of HTTP method of network partition resolution resource |
Server, Automatic allocation | Server, Management port number set by the Cluster WebUI | JVM monitor resource | ✓
Server, Automatic allocation | Monitoring target, Connection port number set by the Cluster WebUI | JVM monitor resource |
Server, Automatic allocation | Server, Probe port set by the Cluster WebUI | Azure probe port resource | ✓
Server, Automatic allocation | AWS service endpoint, 443/tcp | AWS Elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, AWS DNS resource, AWS Elastic IP monitor resource, AWS virtual IP monitor resource, AWS secondary IP monitor resource, AWS AZ monitor resource, AWS DNS monitor resource, AWS forced stop resource |
Server, Automatic allocation | Azure endpoint, 443/tcp | Azure DNS resource |
Server, Automatic allocation | Azure authoritative name server, 53/udp | Azure DNS monitor resource |
Server, Automatic allocation | Server, Port number set in Cluster WebUI | Google Cloud virtual IP resource | ✓
Server, Automatic allocation | Server, Port number set in Cluster WebUI | Oracle Cloud virtual IP resource | ✓
Server, Automatic allocation | Oracle Cloud endpoint, 443/tcp | Oracle Cloud DNS resource, Oracle Cloud DNS monitor resource, OCI Forced stop resource |
For an AWS environment, modify the Security Group setting in addition to the firewall setting.
JVM monitor uses the following two port numbers:
This management port number is a port number that the JVM monitor resource uses internally. To set the port number, open the Cluster Properties window of the Cluster WebUI, select the JVM monitor tab, and then open the Connection Setting dialog box. For more information, refer to "Parameter details" in the "Reference Guide".
This connection port number is the port number used to connect to the Java VM on the monitoring target (WebLogic Server or WebOTX). To set the port number, open the Properties window for the relevant JVM monitoring resource name, and then select the Monitor(special) tab. For more information, refer to "Monitor resource details" in the "Reference Guide".
The following are port numbers used by the load balancer for the alive monitoring of each server: Probeport of an Azure probe port resource, Port Number of a Google Cloud virtual IP resource, and Port Number of an Oracle Cloud virtual IP resource.
The above port numbers are used with the AWS CLI, which is executed by the following AWS-related resources:
AWS Elastic IP resource
AWS Virtual IP resource
AWS Secondary IP resource
AWS DNS resource
AWS Elastic IP monitor resource
AWS Virtual IP monitor resource
AWS Secondary IP monitor resource
AWS AZ monitor resource
AWS DNS monitor resource
AWS Forced stop resource
The Azure DNS resource runs the Azure CLI. The above port numbers are used by the Azure CLI.
The above port numbers are used with the OCI CLI, which is executed by the following OCI-related resources:
Oracle Cloud DNS resource
Oracle Cloud DNS monitor resource
OCI Forced stop resource
6.2.3. Changing automatic allocation range of communication port numbers managed by the OS¶
The automatic allocation range of communication port numbers managed by the OS may overlap the communication port numbers used by EXPRESSCLUSTER.
Check the automatic allocation range of communication port numbers managed by the OS by using the following method. If there is any overlap, either change the port numbers used by EXPRESSCLUSTER or change the automatic allocation range managed by the OS as described below.
Display and set the automatic allocation range by using the Windows netsh command.
Checking the automatic allocation range of communication port numbers managed by the OS
netsh interface <ipv4|ipv6> show dynamicportrange <tcp|udp>
An example is shown below.
>netsh interface ipv4 show dynamicportrange tcp
Range of dynamic ports of the tcp protocol
------------------------------------------
Start port : 49152
Number of ports : 16384
This example indicates that the range in which communication port numbers are automatically allocated in the TCP protocol is 49152 to 65535 (allocation of 16384 ports beginning with port number 49152). If any of the port numbers used by EXPRESSCLUSTER fall within this range, change the port numbers used by EXPRESSCLUSTER or follow the description given in "Setting the automatic allocation range of communication port numbers managed by the OS," below.
Setting the automatic allocation range of communication port numbers managed by the OS
netsh interface <ipv4|ipv6> set dynamicportrange <tcp|udp> [startport=]<start_port_number> [numberofports=]<range_of_automatic_allocation>
An example is shown below.
>netsh interface ipv4 set dynamicportrange tcp startport=10000 numberofports=1000
This example sets the range in which communication port numbers are automatically allocated in the TCP protocol (ipv4) to between 10000 and 10999 (allocation of 1000 ports beginning with port number 10000).
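The range arithmetic used in the two examples above (last port = start port + number of ports - 1) can be sketched as follows. This is a minimal illustration rather than part of EXPRESSCLUSTER: the function names are hypothetical, the parser assumes the English-language netsh output shown above, and the port list passed to the overlap check is made up for the example.

```python
import re

# Sample text in the same shape as the netsh output shown above.
SAMPLE = """\
Range of dynamic ports of the tcp protocol
------------------------------------------
Start port      : 49152
Number of ports : 16384
"""


def parse_dynamic_port_range(netsh_output):
    """Return (first_port, last_port) parsed from 'show dynamicportrange' output."""
    start = int(re.search(r"Start\s+port\s*:\s*(\d+)", netsh_output, re.IGNORECASE).group(1))
    count = int(re.search(r"Number\s+of\s+ports\s*:\s*(\d+)", netsh_output, re.IGNORECASE).group(1))
    # The range is inclusive, so the last port is start + count - 1.
    return start, start + count - 1


def ports_in_range(ports, first, last):
    """Return the ports that fall inside the inclusive range [first, last]."""
    return sorted(p for p in ports if first <= p <= last)


first, last = parse_dynamic_port_range(SAMPLE)
print(first, last)

# Hypothetical list of port numbers configured for the cluster.
print(ports_in_range([29001, 29003, 50000], first, last))
```

Any port reported by the second call would need either to be changed in the cluster configuration or to be excluded by moving the dynamic range with the set dynamicportrange command.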
6.2.4. Avoiding insufficient ports¶
6.2.5. Clock synchronization¶
In a cluster system, it is recommended to synchronize the clocks of all servers regularly. Synchronize the server clocks by using a time server.
6.2.7. Partition for mirror disk¶
Create a raw partition with more than 1024MiB of space on the local disk of each server as a management partition for the mirror disk resource (cluster partition).
Create a partition (data partition) for mirroring on the local disk of each server and format it with NTFS. It is not necessary to recreate a partition when an existing partition is mirrored.
Set the same data partition size on both servers. Use the clpvolsz command to check and adjust the partition size accurately.
Set the same drive letters on both servers for the cluster partition and the data partition.
6.2.8. Partition for hybrid disk¶
As a partition for hybrid disk resource management (cluster partition), create a RAW partition of 1024MiB or larger in the shared disk of each server group (or in the local disk if there is one member server in the server group).
Create a partition to be mirrored (data partition) in the shared disk of each server group (or in the local disk if there is one member server in the server group) and format the partition with NTFS (it is not necessary to create a partition again when an existing partition is mirrored).
Set the same data partition size to both server groups. Use the clpvolsz command for checking and adjusting the partition size accurately.
Set the same drive letter to cluster partitions in all servers. Set the same drive letter to data partitions in all servers.
6.2.9. Access permissions of a folder or a file on the data partition¶
In a workgroup environment, you must set the access permissions of folders and files on the data partition for a user on each cluster server. For example, you must set the access permissions for the "test" user on both "server1" and "server2", which are cluster servers.
6.2.10. Adjusting OS startup time¶
It is necessary to configure the time from power-on of each node in the cluster to the server operating system startup to be longer than the following:
The time from power-on of the shared disks to the point they become available.
Heartbeat timeout time.
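The condition above can be written as a simple comparison. The following is a minimal sketch with a hypothetical function name and made-up timings; the actual values must be measured or read from the cluster configuration.

```python
def startup_time_ok(os_startup_s, disk_ready_s, heartbeat_timeout_s):
    """Return True if the time from power-on to OS startup exceeds both limits."""
    return os_startup_s > max(disk_ready_s, heartbeat_timeout_s)


# Hypothetical timings in seconds.
print(startup_time_ok(os_startup_s=120, disk_ready_s=90, heartbeat_timeout_s=30))
print(startup_time_ok(os_startup_s=60, disk_ready_s=90, heartbeat_timeout_s=30))
```

In the second call the OS would start before the shared disks become available, so the OS startup time would need to be lengthened.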
6.2.11. Verifying the network settings¶
On all servers in the cluster, verify the status of the following networks by using the ipconfig and ping commands.
Public LAN (used for communication with all the other machines)
Interconnect-dedicated LAN (used for communication between servers in EXPRESSCLUSTER)
Mirror disk connect LAN (used with interconnect)
Host name
The IP address to be used as a floating IP resource does not need to be set in the operating system.
On a server where IPv6 is specified in the EXPRESSCLUSTER configuration (such as for heartbeat and mirror disk connect), the IP address is disabled when the NIC link goes down.
In that case, EXPRESSCLUSTER may not work properly. To avoid this problem, run the following command to disable the media sense function.
netsh interface ipv6 set global dhcpmediasense=disabled
6.2.12. Coordination with ESMPRO/AutomaticRunningController¶
The following are the notes on EXPRESSCLUSTER configuration when EXPRESSCLUSTER works together with ESMPRO/AutomaticRunningController (hereafter ESMPRO/AC). If these notes are unmet, EXPRESSCLUSTER may fail to work together with ESMPRO/AC.
The function to use EXPRESSCLUSTER with ESMPRO/AC does not work on the OS of x64 Edition.
You cannot specify only the DISK-method resource as a network partition resolution resource. When you specify the DISK method, do so while combining with other network partition resolution method such as PING method.
When creating a disk TUR monitor resource, do not change the default value (No Operation) for the final action.
When creating a Disk RW monitor resource, if you specify a path on the shared disk for the value to be set for file name, do not change the default value (active) for the monitor timings.
After recovery from a power outage, the following alerts may appear on the EXPRESSCLUSTER manager. They do not affect the actual operation as long as the settings mentioned above are configured.
- ID: 18
  Module name: nm
  Message: Failed to start the resource <resource name of DiskNP>. (server name:xx)
- ID: 1509
  Module name: rm
  Message: Monitor <disk TUR monitor resource name> detected an error. (4 : device open failed. Check the disk status of the volume of monitoring target.)
For information on how to configure ESMPRO/AC and notes etc, see the chapter for ESMPRO/AC in the EXPRESSCLUSTER X for Windows PP Guide.
6.2.13. About ipmiutil¶
The following functions use IPMI Management Utilities (ipmiutil), open source software under the BSD license, to control the BMC firmware of servers. To use these functions, it is necessary to install ipmiutil on each server:
Forcibly stopping a physical machine
When you use any of the above functions, configure Baseboard Management Controller (BMC) in each server so that the IP address of the management LAN port for the BMC can communicate with the IP address which the OS uses. These functions cannot be used on a server where there is no BMC installed, or when the network for the BMC management is obstructed. For information on how to configure the settings for the BMC, see the manuals for servers.
EXPRESSCLUSTER does not come with ipmiutil. For information on how to acquire and install ipmiutil, see "Setup of BMC and ipmiutil (Required for using the forced stop function of a physical machine)" in "Settings after configuring hardware" in "Determining a system configuration" in the "Installation and Configuration Guide".
Users are responsible for making decisions and assuming responsibilities. NEC does not support or assume any responsibilities for:
Inquiries about ipmiutil itself
Operations of ipmiutil
Malfunction of ipmiutil or any error caused by such malfunction
Inquiries about whether or not ipmiutil is supported by a given server
Check if your server (hardware) supports ipmiutil in advance. Note that even if the machine complies with the IPMI standard as hardware, ipmiutil may not run when you actually try to run it.
6.2.14. Installation on Server Core¶
6.2.15. Access restriction for an HBA to which a system disk is connected¶
6.2.16. Time synchronization in the AWS environment¶
6.2.17. IAM settings in the AWS environment¶
This section describes the IAM (Identity & Access Management) settings in the AWS environment.
Some of EXPRESSCLUSTER's functions internally run the AWS CLI. To run the AWS CLI successfully, you need to set up IAM in advance.
You can give access permissions to the AWS CLI by using an IAM role or an IAM user. The IAM role method offers a higher level of security because you do not have to store the AWS access key ID and AWS secret access key in an instance. Therefore, using an IAM role is recommended in most cases.
Advantages and disadvantages of the two methods are as follows:
| | Advantages | Disadvantages |
|---|---|---|
| IAM role | - This method is more secure than using IAM user. - The procedure for maintaining key information is simple. | None |
| IAM user | You can set access permissions for each instance later. | - The risk of key information leakage is high. - The procedure for maintaining key information is complicated. |
The procedure of setting IAM is shown below.
First, create IAM policy by referring to "Creating IAM policy" explained below.
- Next, configure the instance settings.
  To use IAM role, refer to "Setting up an instance by using IAM role" described later.
  To use IAM user, refer to "Setting up an instance by using IAM user" described later.
Creating IAM policy
Create a policy that describes access permissions for the actions to the services such as EC2 and S3 of AWS. The actions required for AWS-related resources and monitor resources to execute AWS CLI are as follows:
The necessary policies are subject to change.
AWS virtual IP resources / AWS virtual IP monitor resources
Action
Description
ec2:DescribeNetworkInterfaces, ec2:DescribeVpcs, ec2:DescribeRouteTables
This is required for obtaining information of VPC, route table and network interfaces.
ec2:ReplaceRoute
This is required for updating the route table.
AWS Elastic IP resources /AWS Elastic IP monitor resource
Action
Description
ec2:DescribeNetworkInterfaces, ec2:DescribeAddresses
This is required for obtaining information of EIP and network interfaces.
ec2:AssociateAddress
This is required for associating EIP with ENI.
ec2:DisassociateAddress
This is required for disassociating EIP from ENI.
AWS secondary IP resources / AWS secondary IP monitor resources
Action
Description
ec2:DescribeNetworkInterfaces, ec2:DescribeSubnets
This is required for obtaining information on network interfaces and subnets.
ec2:AssignPrivateIpAddresses
This is required for assigning secondary IP addresses.
ec2:UnassignPrivateIpAddresses
This is required for unassigning secondary IP addresses.
AWS AZ monitor resource
Action
Description
ec2:DescribeAvailabilityZones
This is required for obtaining information of the availability zone.
AWS DNS resource / AWS DNS monitor resource
Action
Description
route53:ChangeResourceRecordSets
This is required when a resource record set is added or deleted, or when the resource record set configuration is updated.
route53:GetChange
This is required when a resource record set is added or when the resource record set configuration is updated.
route53:ListResourceRecordSets
This is required for obtaining information of a resource record set.
AWS forced stop resource
Action
Description
ec2:DescribeInstances
This is required for obtaining information on instances.
ec2:StopInstances
This is required for stopping instances.
ec2:RebootInstances
This is required for restarting instances.
ec2:DescribeInstanceAttribute
This is required for obtaining instance attributes.
Function for sending data on the monitoring process time taken by the monitor resource, to Amazon CloudWatch.
Action
Description
cloudwatch:PutMetricData
This is required for sending custom metrics.
Function for sending alert service messages to Amazon SNS
Action
Description
sns:Publish
This is required for sending messages.
The following example of a custom policy permits the actions used by all the AWS-related resources and monitor resources.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:Describe*",
                "ec2:ReplaceRoute",
                "ec2:AssociateAddress",
                "ec2:DisassociateAddress",
                "ec2:AssignPrivateIpAddresses",
                "ec2:UnassignPrivateIpAddresses",
                "ec2:StopInstances",
                "ec2:RebootInstances",
                "route53:ChangeResourceRecordSets",
                "route53:GetChange",
                "route53:ListResourceRecordSets"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}
You can create a custom policy from [Policies] - [Create Policy] in IAM Management Console.
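If only some of the AWS-related features are used, a narrower policy can be assembled from the per-feature action tables above. The following sketch shows one way to do that. The feature keys (such as "dns" and "az_monitor") and the function name are hypothetical labels chosen for this example; the action names are copied from the tables, and the policy skeleton follows the custom policy example above.

```python
import json

# Actions per feature, copied from the tables above.
# The dictionary keys are hypothetical labels used only in this sketch.
ACTIONS_BY_FEATURE = {
    "virtual_ip": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeVpcs",
        "ec2:DescribeRouteTables",
        "ec2:ReplaceRoute",
    ],
    "elastic_ip": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeAddresses",
        "ec2:AssociateAddress",
        "ec2:DisassociateAddress",
    ],
    "secondary_ip": [
        "ec2:DescribeNetworkInterfaces",
        "ec2:DescribeSubnets",
        "ec2:AssignPrivateIpAddresses",
        "ec2:UnassignPrivateIpAddresses",
    ],
    "az_monitor": ["ec2:DescribeAvailabilityZones"],
    "dns": [
        "route53:ChangeResourceRecordSets",
        "route53:GetChange",
        "route53:ListResourceRecordSets",
    ],
    "forced_stop": [
        "ec2:DescribeInstances",
        "ec2:StopInstances",
        "ec2:RebootInstances",
        "ec2:DescribeInstanceAttribute",
    ],
    "cloudwatch": ["cloudwatch:PutMetricData"],
    "sns": ["sns:Publish"],
}


def build_policy(features):
    """Build a policy document that allows only the actions of the given features."""
    actions = sorted({a for f in features for a in ACTIONS_BY_FEATURE[f]})
    return {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": actions, "Resource": "*"}],
    }


policy = build_policy(["dns", "az_monitor"])
print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the policy editor of the IAM Management Console. Because the necessary actions are subject to change, the lists should be checked against the current tables before use.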
Setting up an instance by using IAM role
In this method, you can execute the AWS CLI after creating an IAM role and associating it with an instance.
Create the IAM role and attach the IAM policy to the role.
You can create the IAM role from [Roles] - [Create New Role] in IAM Management Console.
When creating an instance, specify the IAM role you created to IAM Role.
Log on to the instance.
Install the AWS CLI.
Download and install the AWS CLI.
The installer automatically adds the path of the AWS CLI to the system environment variable PATH. If the automatic path addition fails, refer to "AWS Command Line Interface" of the AWS document to add the path.
If the AWS CLI has been installed in an environment with EXPRESSCLUSTER already installed, restart the OS before operating EXPRESSCLUSTER.
Launch the command prompt as the Administrator and execute the command as shown below.
> aws configure
Input the information required to execute the AWS CLI in response to the prompt. Do not input the AWS access key ID and AWS secret access key.
AWS Access Key ID [None]: (Just press Enter key)
AWS Secret Access Key [None]: (Just press Enter key)
Default region name [None]: <default region name>
Default output format [None]: text
For "Default output format", a format other than "text" may be specified.
If you input the wrong data, delete the files under
%SystemDrive%\Users\Administrator\.aws
and the directory itself, and then repeat the step described above.
Setting up an instance by using IAM user
In this method, you can execute the AWS CLI after creating the IAM user and storing its access key ID and secret access key in the instance. You do not have to assign the IAM role to the instance when creating the instance.
Create the IAM user and attach the IAM policy to the user.
You can create the IAM user in [Users] - [Create New Users] of IAM Management Console.
Log on to the instance.
Install the AWS CLI.
Download and install the AWS CLI.
The installer automatically adds the path of the AWS CLI to the system environment variable PATH. If the automatic path addition fails, refer to "AWS Command Line Interface" of the AWS document to add the path.
If the AWS CLI has been installed in an environment with EXPRESSCLUSTER already installed, restart the OS before operating EXPRESSCLUSTER.
Launch the command prompt as the Administrator and execute the command as shown below.
> aws configure
Input the information required to execute the AWS CLI in response to the prompt. Obtain the AWS access key ID and AWS secret access key from the IAM user detail screen and input them.
AWS Access Key ID [None]: <AWS access key>
AWS Secret Access Key [None]: <AWS secret access key>
Default region name [None]: <default region name>
Default output format [None]: text
For "Default output format", a format other than "text" may be specified.
If you input the wrong data, delete the files under
%SystemDrive%\Users\Administrator\.aws
and the directory itself, and then repeat the step described above.
6.2.18. Settings in the Azure environment¶
- Making EXPRESSCLUSTER cooperate with Microsoft Azure requires an organizational account for Microsoft Azure.
  You cannot use a non-organizational account, which requires interactive login when running the Azure CLI.
- Some of EXPRESSCLUSTER's functions internally run the Azure CLI.
  Successfully running the Azure CLI requires prior setting.
  For more information on setting the Azure CLI, see the following website:
  Azure documentation:
- Logging in to Microsoft Azure requires a service principal to be created in advance.
  For the detailed procedure, see the following web pages:
  Sign in with Azure CLI:
  Create an Azure service principal with Azure CLI:
For the procedures to install Azure CLI and create a service principal, refer to the "EXPRESSCLUSTER X HA Cluster Configuration Guide for Microsoft Azure (Windows)".
6.2.19. IAM settings in the Azure environment¶
Setting access rights
For EXPRESSCLUSTER's Azure-related functions to run the Azure CLI, the following access rights are required.
These access rights are subject to change in the future.
Azure forced stop resource
Access right
Description
Microsoft.Compute/virtualMachines/deallocate/action
This is required for stopping instances.
Microsoft.Compute/virtualMachines/stop/action
This is required for stopping instances.
Microsoft.Compute/virtualMachines/restart/action
This is required for restarting instances.
Microsoft.Compute/virtualMachines/write, Microsoft.Compute/virtualMachines/read
This is required for obtaining or updating instance attributes.
Azure DNS resource
Access right
Description
Microsoft.Network/dnsZones/A/write
This is required for adding or updating A records.
Microsoft.Network/dnsZones/A/delete
This is required for removing A records.
Microsoft.Network/dnsZones/NS/read
This is required for obtaining information on NS records.
Note
For Azure CLI 1.0, the following are also required:
Microsoft.Network/dnsZones/read
Microsoft.Network/dnsZones/A/read
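One way to grant the access rights listed above is to collect them in a custom role definition. The sketch below only assembles such a definition as JSON; it assumes the commonly used custom role layout (Name, IsCustom, Actions, NotActions, AssignableScopes), which should be verified against the current Azure documentation. The role name, description, and subscription ID are placeholders, and the access rights are copied from the tables above.

```python
import json

# Access rights copied from the tables above.
FORCED_STOP_ACTIONS = [
    "Microsoft.Compute/virtualMachines/deallocate/action",
    "Microsoft.Compute/virtualMachines/stop/action",
    "Microsoft.Compute/virtualMachines/restart/action",
    "Microsoft.Compute/virtualMachines/write",
    "Microsoft.Compute/virtualMachines/read",
]
DNS_ACTIONS = [
    "Microsoft.Network/dnsZones/A/write",
    "Microsoft.Network/dnsZones/A/delete",
    "Microsoft.Network/dnsZones/NS/read",
]


def build_role_definition(name, subscription_id, actions):
    """Assemble a custom role definition (layout is an assumption, see lead-in)."""
    return {
        "Name": name,
        "IsCustom": True,
        "Description": "Access rights for EXPRESSCLUSTER Azure-related functions",
        "Actions": list(actions),
        "NotActions": [],
        "AssignableScopes": [f"/subscriptions/{subscription_id}"],
    }


# Placeholder role name and subscription ID.
role = build_role_definition(
    "ExampleClusterRole",
    "00000000-0000-0000-0000-000000000000",
    FORCED_STOP_ACTIONS + DNS_ACTIONS,
)
print(json.dumps(role, indent=2))
```

Because the required access rights are subject to change, the lists should be checked against the current tables before the role is created.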
6.2.20. Azure DNS resources¶
Azure Private DNS is not supported.
6.2.21. Google Cloud virtual IP resources¶
Using a Google Cloud virtual IP resource with Windows Server 2019 requires the Startup type of the following services to be set to Automatic (Delayed Start):
Google Compute Engine Agent
Google OSConfig Agent
6.2.22. Google Cloud DNS resources¶
Google Cloud DNS resources use Cloud DNS by Google Cloud. For the details on Cloud DNS, refer to the following website.
Cloud DNS
Cloud SDK needs to be installed to operate Cloud DNS. For the details on Cloud SDK, refer to the following website.
Cloud SDK
Cloud SDK needs to be authorized by the account with the permissions for the API methods below:
dns.changes.create
dns.changes.get
dns.managedZones.get
dns.resourceRecordSets.create
dns.resourceRecordSets.delete
dns.resourceRecordSets.list
dns.resourceRecordSets.update
As for authorizing Cloud SDK, refer to the following website.
Authorizing Cloud SDK tools
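Before configuring Google Cloud DNS resources, it can help to compare the permissions actually granted to the account with the list above. The following is a minimal sketch; the function name and the sample set of granted permissions are hypothetical, and how the granted permissions are obtained is left to the environment.

```python
# Permissions copied from the list above.
REQUIRED_PERMISSIONS = {
    "dns.changes.create",
    "dns.changes.get",
    "dns.managedZones.get",
    "dns.resourceRecordSets.create",
    "dns.resourceRecordSets.delete",
    "dns.resourceRecordSets.list",
    "dns.resourceRecordSets.update",
}


def missing_permissions(granted):
    """Return the required permissions that are absent from the granted ones."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Hypothetical set of permissions held by the authorizing account.
granted = [
    "dns.changes.create",
    "dns.changes.get",
    "dns.managedZones.get",
    "dns.resourceRecordSets.list",
]
print(missing_permissions(granted))
```

An empty result means that every permission in the list is covered.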
6.2.23. CLI setting in the OCI environment¶
6.2.24. Policy setting in the OCI environment¶
Policy setting
For EXPRESSCLUSTER's OCI-related functions to run the OCI CLI, the following policies are required:
These policies are subject to change in the future.
For Oracle Cloud DNS resources and Oracle Cloud DNS monitor resources
Policy syntax
Description
Allow <subject> to use dns in <location>
Required to create, update, or delete an A record of Oracle Cloud DNS, or to retrieve information on it.
For OCI forced-stop resource
Policy syntax
Description
Allow <subject> to use instance-family in <location>
Required to stop or restart an instance, or to retrieve information on it.
Into each of <subject> and <location>, enter a value suitable for the environment.
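The substitution of <subject> and <location> can be sketched as follows. The templates are copied from the tables above; the function name, the feature keys, and the sample group and compartment names are hypothetical and must be replaced with values that suit the environment.

```python
# Policy templates copied from the tables above.
# The dictionary keys are hypothetical labels used only in this sketch.
POLICY_TEMPLATES = {
    "dns": "Allow {subject} to use dns in {location}",
    "forced_stop": "Allow {subject} to use instance-family in {location}",
}


def render_policies(subject, location, features):
    """Fill the subject and location into the templates of the selected features."""
    return [
        POLICY_TEMPLATES[f].format(subject=subject, location=location)
        for f in features
    ]


# Hypothetical subject and location.
statements = render_policies(
    subject="group ExampleClusterGroup",
    location="compartment ExampleCompartment",
    features=["dns", "forced_stop"],
)
for statement in statements:
    print(statement)
```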
6.3. Notes when creating the cluster configuration data¶
This section describes notes to consider when creating cluster configuration data and before configuring a cluster system.
6.3.1. Folders and files in the location pointed to by the EXPRESSCLUSTER installation path¶
6.3.2. Final action for group resource deactivation error¶
If you select No Operation as the final action when a deactivation error is detected, the group does not stop but remains in the deactivation error status. Make sure not to set No Operation in a production environment.
6.3.3. Delay warning rate¶
If the delay warning rate is set to 0 or 100, the following can be achieved:
- When 0 is set to the delay warning rate
  An alert for the delay warning is issued at every monitoring.
  By using this feature, you can calculate the polling time for the monitor resource at the time the server is heavily loaded, which will allow you to determine the time for monitoring timeout of a monitor resource.
- When 100 is set to the delay warning rate
  The delay warning will not be issued.
Be sure not to set a low value, such as 0%, except for a test operation.
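The effect of the two extreme values can be illustrated with a small model. This sketch assumes that a delay warning is issued when the polling time exceeds the monitor timeout multiplied by the delay warning rate, without reaching the timeout itself; that rule is an assumption made for illustration and is not stated in this section. The function name and the timings are hypothetical.

```python
def is_delay_warning(elapsed_s, timeout_s, rate_percent):
    """Return True if a polling time would trigger a delay warning.

    Assumed rule: warn when timeout * rate / 100 < elapsed <= timeout.
    """
    threshold = timeout_s * rate_percent / 100
    return threshold < elapsed_s <= timeout_s


# Hypothetical monitor timeout of 60 seconds.
print(is_delay_warning(50, 60, 80))   # exceeds 48 seconds
print(is_delay_warning(1, 60, 0))     # rate 0: any polling time warns
print(is_delay_warning(60, 60, 100))  # rate 100: never warns
```

Under this model, a rate of 0 produces an alert at every monitoring and a rate of 100 produces none, which matches the two cases described above.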
6.3.4. Monitoring method TUR for disk monitor resource and hybrid disk TUR monitor resource¶
You cannot use the TUR methods on a disk or disk interface (HBA) that does not support the Test Unit Ready (TUR) command of SCSI. Even if your hardware supports this command, consult the driver specifications because the driver may not support it.
TUR methods place less load on the OS and disks than Read methods.
In some cases, TUR methods may not be able to detect errors in I/O to the actual media.
6.3.5. Heartbeat resource settings¶
For an interconnect with the highest priority, configure kernel mode LAN heartbeat resources which can be exchanged between all servers.
Configuring at least two kernel mode LAN heartbeat resources is recommended unless it is difficult to add a network to an environment such as the cloud or a remote cluster.
It is recommended to register both an interconnect-dedicated LAN and a public LAN as LAN heartbeat resources.
Time for heartbeat timeout needs to be shorter than the time required for restarting the OS. If the heartbeat timeout is not configured in this way, an error may occur after reboot in some servers in the cluster because other servers cannot detect the reboot.
6.3.6. Double-byte character set that can be used in script comments¶
Scripts edited in a Windows environment are handled as Shift-JIS code, and scripts edited in a Linux environment are handled as EUC code. If other character codes are used, character corruption may occur depending on the environment.
6.3.7. The number of server groups that can be set as servers to be started in a group¶
- The number of server groups that can be set as servers to be started in one group is 2.
  If three or more server groups are set, the EXPRESSCLUSTER Disk Agent service (clpdiskagent.exe) may not operate properly.
6.3.8. Setting up JVM monitor resources¶
When the monitoring target is WebLogic, the maximum values of the following JVM monitor resource settings may be limited due to the system environment (including the amount of installed memory):
The number under Monitor the requests in Work Manager
Average under Monitor the requests in Work Manager
The number of Waiting Requests under Monitor the requests in Thread Pool
Average of Waiting Requests under Monitor the requests in Thread Pool
The number of Executing Requests under Monitor the requests in Thread Pool
Average of Executing Requests under Monitor the requests in Thread Pool
To use the Java Resource Agent, install the Java runtime environment (JRE) described in "Operation environment for JVM monitor" in "4. Installation requirements for EXPRESSCLUSTER" or a Java development kit (JDK). You can use either the same JRE or JDK as that used by the monitoring target (WebLogic Server or WebOTX) or a different one. If both JRE and JDK are installed on a server, you can use either one.
The monitor resource name must not include a blank.
6.3.9. System monitor resource settings¶
- Pattern of detection by resource monitoring
  The System Resource Agent performs detection by using thresholds and monitoring duration time as parameters.
  The System Resource Agent collects the data (used size of memory, CPU usage rate, and used size of virtual memory) on individual system resources continuously, and detects errors when data keeps exceeding a threshold for a certain time (specified as the duration time).
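The detection pattern described above (an error is detected only when the data keeps exceeding the threshold for the whole duration time) can be sketched as follows. This is an illustration, not the System Resource Agent's implementation: it assumes samples collected at a fixed interval and approximates the duration as a number of consecutive samples. The function name and sample values are hypothetical.

```python
import math


def detect_sustained_excess(samples, threshold, duration_s, interval_s):
    """Return the index of the sample at which an error is detected, or None.

    An error is detected once the values have stayed above the threshold
    for enough consecutive samples to cover the duration time.
    """
    needed = max(1, math.ceil(duration_s / interval_s))
    run = 0
    for index, value in enumerate(samples):
        # A single sample at or below the threshold resets the count.
        run = run + 1 if value > threshold else 0
        if run >= needed:
            return index
    return None


# Hypothetical CPU usage rates (%) sampled every 60 seconds.
cpu_usage = [50, 95, 96, 40, 91, 92, 93]
print(detect_sustained_excess(cpu_usage, threshold=90, duration_s=180, interval_s=60))
```

In the sample data the first two high readings do not trigger detection because the third reading drops below the threshold; only the later run of three consecutive high readings does.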
6.3.10. Setting up PostgreSQL monitor resource¶
The monitor resource name must not include a blank.
6.3.11. AWS CLI command line options¶
AWS-related features run the AWS CLI.
To specify two or more of the command line options, separate each of them with a space.
aws cloudwatch
Amazon CloudWatch linkage
aws ec2
AWS Elastic IP resource
AWS Virtual IP resource
AWS Secondary IP resource
AWS Elastic IP monitor resource
AWS Virtual IP monitor resource
AWS Secondary IP monitor resource
AWS AZ monitor resource
AWS Forced stop resource
Obtaining cloud environment information with Cluster WebUI
aws route53
AWS DNS resource
AWS DNS monitor resource
aws sns
Amazon SNS linkage
For more information on the command line options for the AWS CLI, see AWS documents.
Note
6.3.14. Setting up AWS Elastic IP resources¶
IPv6 is not supported.
In the AWS environment, floating IP resources, floating IP monitor resources, virtual IP resources, virtual IP monitor resources, virtual computer name resources, and virtual computer name monitor resources cannot be used.
- Only ASCII characters are supported. Check that the execution result of the following command does not include any non-ASCII characters.
  aws ec2 describe-addresses --allocation-ids <EIP ALLOCATION ID>
AWS elastic IP resources associate an EIP with the primary private IP address of an ENI, but not with its secondary private IP address.
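The ASCII check on the command output mentioned above can be automated along the following lines. This is a minimal sketch with hypothetical function names; check_command_output is defined but not called here because it needs a configured AWS CLI, and the sample strings are made up.

```python
import subprocess


def non_ascii_characters(text):
    """Return the distinct non-ASCII characters in text, sorted by code point."""
    return sorted({ch for ch in text if ord(ch) > 127})


def check_command_output(argv):
    """Run a command and return the non-ASCII characters found in its output."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return non_ascii_characters(result.stdout)


# Hypothetical output fragments.
print(non_ascii_characters('{"PublicIp": "203.0.113.10"}'))
print(non_ascii_characters("Name: caf\u00e9"))
```

With the AWS CLI configured, the check could be run as check_command_output(["aws", "ec2", "describe-addresses", "--allocation-ids", "<EIP ALLOCATION ID>"]) after replacing the placeholder with the actual allocation ID. An empty list means that the output contains ASCII characters only.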
6.3.15. Setting up AWS Virtual IP resources¶
IPv6 is not supported.
In the AWS environment, floating IP resources, floating IP monitor resources, virtual IP resources, virtual IP monitor resources, virtual computer name resources, and virtual computer name monitor resources cannot be used.
Only ASCII characters are supported. Check that the execution results of the following commands do not include any non-ASCII characters.
aws ec2 describe-vpcs --vpc-ids <VPC ID>
aws ec2 describe-route-tables --filters Name=vpc-id,Values=<VPC ID>
aws ec2 describe-network-interfaces --network-interface-ids <ENI ID>
AWS virtual IP resources cannot be used if access via a VPC peering connection is necessary. This is because it is assumed that an IP address to be used as a VIP is out of the VPC range and such an IP address is considered invalid in a VPC peering connection. If access via a VPC peering connection is necessary, use the AWS DNS resource, which uses Amazon Route 53.
When an AWS Virtual IP resource is set, Windows registers the physical host name and VIP record in the DNS (if the property of the corresponding network adapter for registering addresses to the DNS is set to ON). To make the physical host name resolve to the physical IP address, configure the settings as follows.
Check the setting of the network adapter to which the corresponding VIP address is assigned, by choosing Properties - Internet Protocol Version 4 - Advanced - DNS tab - Register this connection's address in DNS. If this check box is selected, clear it.
Additionally, execute one of the following in order to apply this setting:
Restart the DNS Client service.
Explicitly run the ipconfig /registerdns command.
Register the physical IP address of the network adapter to which the corresponding VIP address is assigned to the DNS server statically.
An AWS virtual IP resource starts up normally even if the route table used by instances does not include any route to the IP address used by the AWS virtual IP resource. This behavior is by design. When activated, an AWS virtual IP resource updates the content of the route tables that include an entry for the specified IP address. If no such route table is found, the resource considers that there is nothing to update and therefore treats the situation as normal. Which route table should contain the entry depends on the system configuration, so the resource does not use it as a criterion for judging normality.
An AWS virtual IP resource uses a Windows OS API to add a virtual IP address to a NIC without setting the skipassource flag. Hence this flag is disabled after the AWS virtual IP resource is activated. However, the skipassource flag can be enabled by using PowerShell after the activation of the resource.
6.3.16. Setting up AWS Secondary IP resources¶
IPv6 is not supported.
In the AWS environment, floating IP resources, floating IP monitor resources, virtual IP resources, virtual IP monitor resources, virtual computer name resources, and virtual computer name monitor resources cannot be used.
Only ASCII characters are supported. Check that the execution results of the following commands do not include any non-ASCII characters.
aws ec2 describe-network-interfaces --network-interface-ids <ENI ID>
aws ec2 describe-subnets --subnet-ids <SUBNET_ID>
AWS secondary IP resources cannot be used in a configuration that spans different subnets.
- The number of secondary IP addresses to be assigned for AWS secondary IP resources has an upper limit for each instance type.
  For more information, refer to the following:
- Statically register the physical IP addresses of network adapters to which secondary IP addresses are to be assigned for AWS secondary IP resources.
  For more information, refer to Step 1 of the following:
An AWS secondary IP resource uses the netsh command to add a secondary IP address to a NIC without setting the skipassource flag. Hence this flag is disabled after the AWS secondary IP resource is activated. However, the skipassource flag can be enabled by using PowerShell after the activation of the resource.
6.3.17. Setting up AWS DNS resources¶
IPv6 is not supported.
In the AWS environment, floating IP resources, floating IP monitor resources, virtual IP resources, virtual IP monitor resources, virtual computer name resources, and virtual computer name monitor resources cannot be used.
In the Resource Record Set Name field, enter a name without an escape code. If an escape code is included in the Resource Record Set Name, a monitor error occurs.
An AWS DNS resource is associated with a single account and cannot be used with different accounts, AWS access key IDs, or AWS secret access keys. If you need such usage, consider creating a script that executes the AWS CLI with a script resource, and setting in the script the environment variables for authenticating other accounts.
6.3.18. Setting up AWS DNS monitor resources¶
Immediately after the AWS DNS resource is activated, monitoring by the AWS DNS monitor resource may fail due to the following events. If monitoring failed, set Wait Time to Start Monitoring of the AWS DNS monitor resource longer than the time to reflect the changed DNS setting of Amazon Route 53 (https://aws.amazon.com/route53/faqs/).
When the AWS DNS resource is activated, a resource record set is added or updated.
- If the AWS DNS monitor resource starts monitoring before the changed DNS setting of Amazon Route 53 is applied, name resolution cannot be done and monitoring fails.The AWS DNS monitor resource will continue to fail monitoring while a DNS resolver cache is enabled.
The changed DNS setting of Amazon Route 53 is applied.
Name resolution succeeds after the TTL valid period of the AWS DNS resource elapses. Monitoring by the AWS DNS monitor resource then succeeds.
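The sequence above suggests a rough way to choose Wait Time to Start Monitoring. The sketch below is only an illustration: adding the propagation time, the TTL, and a safety margin is an assumption made here, not a formula from this guide. The function name and the sample figures are hypothetical.

```python
import math


def min_wait_time_to_start_monitoring(
    propagation_sec: float, ttl_sec: float, margin_sec: float = 5.0
) -> int:
    """Estimate a conservative Wait Time to Start Monitoring, in seconds.

    Assumption: the worst case is that the monitor resolves the name just
    before the new record is applied, so a cached answer stays for one TTL.
    """
    if min(propagation_sec, ttl_sec, margin_sec) < 0:
        raise ValueError("durations must not be negative")
    return math.ceil(propagation_sec + ttl_sec + margin_sec)


if __name__ == "__main__":
    # Placeholder figures: 60 s for Route 53 to apply the change, TTL of 300 s.
    print(min_wait_time_to_start_monitoring(60, 300))
```

Measure the actual propagation time in your environment before settling on a value.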
6.3.19. Setting up Azure probe port resources¶
IPv6 is not supported.
In the Microsoft Azure environment, floating IP resources, floating IP monitor resources, virtual IP resources, virtual IP monitor resources, virtual computer name resources, and virtual computer name monitor resources cannot be used.
6.3.20. Setting up Azure load balance monitor resources¶
When an Azure load balance monitor resource detects an error, the Azure load balancer may not correctly switch between the active server and the standby server. Therefore, for Final Action of Azure load balance monitor resources, selecting Stop the cluster service and shutdown OS is recommended.
6.3.21. Setting up Azure DNS resources¶
IPv6 is not supported.
In the Microsoft Azure environment, floating IP resources, floating IP monitor resources, virtual IP resources, virtual IP monitor resources, virtual computer name resources, and virtual computer name monitor resources cannot be used.
6.3.22. Setting up Google Cloud virtual IP resources¶
IPv6 is not supported.
6.3.23. Setting up Google Cloud load balance monitor resources¶
For Final Action of Google Cloud load balance monitor resources, selecting Stop the cluster service and shutdown OS is recommended. When a Google Cloud load balance monitor resource detects an error, the load balancer may not correctly switch between the active server and the standby server.
6.3.24. Setting up Google Cloud DNS resources¶
IPv6 is not supported.
In the Google Cloud environment, floating IP resources, floating IP monitor resources, virtual IP resources, and virtual IP monitor resources cannot be used.
When using multiple Google Cloud DNS resources in the cluster, configure them so that they are not activated or deactivated simultaneously, either by setting dependencies between them or by setting a wait for a group start/stop. Their simultaneous activation/deactivation may cause an error.
6.3.25. Setting up Oracle Cloud virtual IP resources¶
IPv6 is not supported.
6.3.26. Setting up Oracle Cloud load balance monitor resources¶
For Final Action of Oracle Cloud load balance monitor resources, selecting Stop the cluster service and shutdown OS is recommended. When an Oracle Cloud load balance monitor resource detects an error, the load balancer may not correctly switch between the active server and the standby server.
6.3.27. Setting up Oracle Cloud DNS resources¶
IPv6 is not supported.
In the Oracle Cloud environment, floating IP resources, floating IP monitor resources, virtual IP resources, and virtual IP monitor resources cannot be used.
6.3.29. Recovery operation on systems with Windows Server 2012 or later when a service fails¶
This applies to systems with Windows Server 2012 or later on which Restart Computer is selected as the recovery option to be exercised when a service fails (terminates abnormally). If the failure actually occurs, the OS is restarted with a STOP error, unlike on Windows Server 2008 or earlier.
The EXPRESSCLUSTER services for which Restart Computer is set as the recovery operation by default are the following:
EXPRESSCLUSTER Disk Agent service
EXPRESSCLUSTER Node Manager service
EXPRESSCLUSTER Server service
EXPRESSCLUSTER Transaction service
6.3.30. Coexistence with the Network Load Balancing function of the OS¶
6.3.31. Note on applying the HBA configuration¶
If, when creating a new cluster, you change the access control settings under the HBA tab of the Server Properties dialog box and upload the configuration data, you might not be prompted to restart the OS to apply the change. Even so, restart the OS after changing the access control settings under the HBA tab so that the configuration data is applied.
6.3.32. Resource types listed in the wizard window for adding resources¶
6.3.33. Coexistence of a mirror disk resource with a hybrid disk resource¶
A mirror disk resource and a hybrid disk resource cannot coexist in the same failover group.
6.3.34. Notes on Allow failover on mirror break for specified time¶
When using this setting, pay attention to the following:
Enabling this setting temporarily suppresses automatic mirror recovery after a mirror break.
Enabling this setting restricts the configuration of some failover attributes.
If you use this feature for a hybrid disk resource, make sure that the times of servers constituting a server group synchronize with each other.
For Timeout, it is recommended to set a value equal to or higher than a value for the heartbeat timeout.
For more information, see "Reference Guide" -> "Parameter details" -> "Cluster properties" -> "Mirror Disk tab".
6.4. After starting operating EXPRESSCLUSTER¶
This section describes notes on situations you may encounter after you start operating EXPRESSCLUSTER.
6.4.1. Limitations during the recovery operation¶
If a group resource such as a disk resource or application resource is specified as a recovery target and a monitor resource detects an error, do not perform the following operations from the Cluster WebUI or the command line while recovery processing is in transition (reactivation -> failover -> final action).
Stopping or suspending a cluster
Starting, stopping, or moving a group
6.4.2. Executable format file and script file not described in the command reference¶
Executable format files and script files which are not described in "EXPRESSCLUSTER command reference" in the "Reference Guide" exist under the installation directory. Do not run these files on any system other than EXPRESSCLUSTER. The consequences of running these files will not be supported.
6.4.3. Cluster shutdown and cluster shutdown reboot¶
When using a mirror disk, do not execute cluster shutdown or cluster shutdown reboot from the clpstdn command or the Cluster WebUI while a group is being activated. A group cannot be deactivated while it is being activated. Therefore, the OS may shut down before the mirror disk resource is properly deactivated, and a mirror break may occur.
6.4.4. Shutdown and reboot of individual server¶
With a mirror disk used, a mirror break is caused by using a command or Cluster WebUI to stop a cluster service on a server, shut down a server, or run the shutdown reboot command.
6.4.5. Recovery from network partition status¶
If a network partition occurs, the servers that constitute a cluster cannot check the status of the other servers. Therefore, if a group is operated (started, stopped, or moved) or a server is restarted in this state, the servers come to recognize the cluster status differently. If the network is recovered while servers with different recognitions of the cluster status are running, groups cannot be operated normally afterward. For this reason, while the network is partitioned, shut down the server separated from the network (the one that cannot communicate with the client) or stop its EXPRESSCLUSTER Server service. Then, after the network is recovered, start the server again and return it to the cluster. If the network is recovered while multiple servers are running, you can return to the normal status by restarting the servers whose recognition of the cluster status differs.
When a network partition resolution resource is used, an emergency shutdown of a server (or of all the servers) is performed if a network partition occurs. This prevents two or more servers that cannot communicate with one another from being started. If you manually restart a server on which an emergency shutdown took place, or if the action at emergency shutdown is set to restart, the restarted server performs an emergency shutdown again. (With the ping method or the majority method, the EXPRESSCLUSTER Server service stops.) However, if two or more disk heartbeat partitions are used by the disk method and a network partition occurs while communication through the disk is impossible due to a disk failure, both servers may continue operating in a suspended state.
6.4.6. Notes on the Cluster WebUI¶
If the Cluster WebUI is operated while it cannot communicate with the connection destination, it may take a while for control to return.
When going through a proxy server, configure the proxy server so that it can relay the port number of the Cluster WebUI.
The Cluster WebUI does not operate properly through a reverse proxy server.
When updating EXPRESSCLUSTER, close all running browsers. Clear the browser cache and restart the browser.
Cluster configuration data created using a later version of this product cannot be used with this product.
When you close the Web browser, a dialog box asking you to confirm saving may be displayed.
To continue editing, click the Stay on this page button.
When you reload the Web browser (by selecting Refresh from the menu or toolbar), a dialog box asking you to confirm saving may be displayed.
To continue editing, click the Stay on this page button.
For notes and restrictions of Cluster WebUI other than the above, see the online manual.
6.4.7. EXPRESSCLUSTER Disk Agent Service¶
Do not stop the EXPRESSCLUSTER Disk Agent Service. Once stopped, it cannot be started manually. In that case, restart the OS, and then restart the EXPRESSCLUSTER Disk Agent Service.
6.4.8. Changing the cluster configuration data during mirroring¶
Make sure not to change the cluster configuration data during the mirroring process including initial mirror configuration. The driver may malfunction if the cluster configuration is changed.
6.4.9. Returning the stand-by server to the cluster during mirror-disk activation¶
If the stand-by server is running while the cluster service (EXPRESSCLUSTER server service) is stopped and the mirror disk is activated, restart the stand-by server before starting the service and returning the stand-by server to the cluster. If the stand-by server is returned without being restarted, the information about mirror differences will be invalid and a mirror disk inconsistency will occur.
6.4.10. Changing the configuration between the mirror disk and hybrid disk¶
To change the configuration so that the disk mirrored using a mirror disk resource will be mirrored using a hybrid disk resource, first delete the existing mirror disk resource from the configuration data, and then upload the data. Next, add a hybrid disk resource to the configuration data, and then upload it again. You can change a hybrid disk to a mirror disk in a similar way.
If you upload configuration data in which the existing resource has been replaced with a new one without deleting the existing resource as described above, the disk mirroring setting might not be changed properly, potentially resulting in a malfunction.
6.4.11. chkdsk command and defragmentation¶
6.4.12. Index service¶
When you create a shared disk or mirror disk directory in the index service catalogue to index the folders on the shared disk or mirror disk, configure the index service to start manually and to be controlled from EXPRESSCLUSTER so that it starts after the shared disk or mirror disk is activated. If the index service is configured to start automatically, it opens the target volume, which causes mounting to fail at the next activation. As a result, disk access from an application or Explorer fails with a message stating that the parameter is incorrect.
6.4.13. Issues with User Account Control (UAC) in a Windows Server 2012 or later environment¶
In a Windows Server 2012 or later environment, User Account Control (UAC) is enabled by default. When UAC is enabled, the following issues occur.
- Monitor Resource
The following resource has issues with UAC.
- Oracle Monitor Resource
For the Oracle monitor resource, if you select OS Authentication for Authentication Method and then set any user other than those in the Administrators group as the monitor user, the Oracle monitoring processing will fail.
When you set OS Authentication in Authentication Method, the user to be set in Monitor User must belong to the Administrators group.
6.4.14. Environment in which the network interface card (NIC) is duplicated¶
In an environment in which the NIC is duplicated, NIC initialization at OS startup may take some time. If the cluster starts before the NIC is initialized, the starting of the kernel mode LAN heartbeat resource (lankhb) may fail. In such cases, the kernel mode LAN heartbeat resource cannot be restored to its normal status even if NIC initialization is completed. To restore the kernel mode LAN heartbeat resource, you must first suspend the cluster and then resume it.
In such an environment, we recommend delaying cluster startup by using the following setting.
- Network Initialization Complete Wait Time Setting
You can configure this setting in the Timeout tab of Cluster Properties. This setting is enabled on all cluster servers. If NIC initialization is completed before the timeout elapses, the cluster service starts up.
6.4.15. EXPRESSCLUSTER service login account¶
The EXPRESSCLUSTER service login account is set to Local System Account. If this account setting is changed, EXPRESSCLUSTER might not operate properly as a cluster.
6.4.16. Monitoring the EXPRESSCLUSTER resident process¶
The EXPRESSCLUSTER resident process can be monitored by using process-monitoring software. However, do not let that software execute recovery actions, such as restarting the process when it terminates abnormally.
6.4.17. Eternal link monitor resource settings¶
Error notification to eternal link monitor resources can be done in either of two ways: using the clprexec command, or linkage with the server management infrastructure.
To use the clprexec command, use the relevant file stored on the EXPRESSCLUSTER CD. Use this method according to the OS and architecture of the notification-source server. The notification-source server must be able to communicate with the notification-destination server.
6.4.18. JVM monitor resources¶
When restarting the monitoring-target Java VM, you must first suspend JVM monitor resources or stop the cluster.
When changing the JVM monitor resource settings, you must suspend and resume the cluster.
JVM monitor resources do not support a delay warning for monitor resources.
6.4.19. System monitor resources, Process resource monitor resource¶
To change a setting, the cluster must be suspended.
System monitor resources do not support a delay warning for monitor resources.
If the date and time of the OS is changed during operation, the timing of analysis processing being performed at 10-minute intervals will change only once immediately after the date and time is changed. This will cause the following to occur; suspend and resume the cluster as necessary.
An error is not detected even when the time to be detected as abnormal elapses.
An error is detected before the time to be detected as abnormal elapses.
Up to 26 disks can be monitored at the same time by the disk resource monitor function of System monitor resources.
6.4.20. Event log output relating to linkage between mirror statistical information collection function and OS standard function¶
The following error may be output to an application event log in the environment where the internal version is updated from 11.16 or earlier.
- Event ID: 1008
Source: Perflib
Message: The Open Procedure for service clpdiskperf in DLL <EXPRESSCLUSTER installation path>\bin\clpdiskperf.dll failed. Performance data for this service will not be available. The first four bytes (DWORD) of the Data section contains the error code.
If the linkage function for the mirror statistical information collection function and OS standard function is used, execute the following command at the Command Prompt to suppress this message.
>lodctr.exe <EXPRESSCLUSTER installation path>\perf\clpdiskperf.ini
When the linkage function is not used, even if this message is output, there is no problem in EXPRESSCLUSTER and performance monitor operations. If this message is frequently output, execute the following two commands at the Command Prompt to suppress this message.
> unlodctr.exe clpdiskperf
> reg delete HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\clpdiskperf
If the linkage function for the mirror statistical information collection function and OS standard function is enabled, the following error may be output in an application event log:
- Event ID: 4806
Source: EXPRESSCLUSTER X
Message: Cluster Disk Resource Performance Data can't be collected because a performance monitor is too numerous.
When the linkage function is not used, even if this message is output, there is no problem in EXPRESSCLUSTER and performance monitor operations. If this message is frequently output, execute the following two commands at the Command Prompt to suppress this message.
> unlodctr.exe clpdiskperf
> reg delete HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\clpdiskperf
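The choice between the commands given for Event ID 1008 can be sketched as a small helper. This is an illustrative sketch only: the function name suppression_commands and the sample installation path are hypothetical, and the helper only returns the argument lists quoted in this section without running them.

```python
from pathlib import PureWindowsPath

# Registry key named in the reg delete command of this section.
SERVICE_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\clpdiskperf"


def suppression_commands(linkage_used: bool, install_path: str) -> list[list[str]]:
    """Return the commands that suppress the Perflib message (Event ID 1008)."""
    if linkage_used:
        # Linkage function in use: load the counter definition again.
        ini = PureWindowsPath(install_path) / "perf" / "clpdiskperf.ini"
        return [["lodctr.exe", str(ini)]]
    # Linkage function not in use: unload the counters, then delete the key.
    return [
        ["unlodctr.exe", "clpdiskperf"],
        ["reg", "delete", SERVICE_KEY],
    ]


if __name__ == "__main__":
    # The installation path below is a placeholder, not a documented default.
    for argv in suppression_commands(False, r"C:\Program Files\EXPRESSCLUSTER"):
        print(" ".join(argv))
```

Run the resulting commands from an elevated Command Prompt. The reg delete command asks for confirmation unless it is given additional options.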
Refer to the following for the linkage function for the mirror statistical information collection function and OS standard function.
6.4.21. Restoration from an AMI in an AWS environment¶
6.5. Notes when changing the EXPRESSCLUSTER configuration¶
This section describes what happens when the configuration is changed after you start using EXPRESSCLUSTER in the cluster configuration.
6.5.1. Exclusive rule of group common properties¶
6.5.2. Dependency between resource properties¶
6.5.3. Setting cluster statistics information of eternal link monitor resources¶
After the cluster statistics information settings of monitor resources are changed, the new settings are not applied to eternal link monitor resources even if you suspend and resume the cluster. Reboot the OS to apply the settings to the eternal link monitor resources.
6.5.4. Changing a port number¶
If you have changed a port number with the server firewall enabled, the firewall configuration needs to be changed as well by using the clpfwctrl command. For more information, see "Reference Guide" -> "EXPRESSCLUSTER command reference" -> "Adding a firewall rule (clpfwctrl command)".
6.6. Notes on upgrading EXPRESSCLUSTER¶
This section describes notes on upgrading or updating EXPRESSCLUSTER after starting the cluster operation.
6.6.1. Changed functions¶
The following describes the functions changed for each of the versions.
Internal version 12.00
Management tool
The default management tool has been changed to Cluster WebUI. If you want to use the conventional WebManager as the management tool, specify "http://management IP address of management group or actual IP address:port number of the server in which EXPRESSCLUSTER Server is installed/main.htm" in the address bar of a web browser.
Mirror/Hybrid disk resource
The minimum size of a cluster partition has been increased to 1 GiB. Prepare a cluster partition of sufficient size before upgrading EXPRESSCLUSTER.
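The WebManager address format given under Management tool for internal version 12.00 can be illustrated with a short helper. This is a sketch only: the function name webmanager_url, the address 192.0.2.10, and the port 29003 are placeholders chosen here, so substitute the values configured for your cluster.

```python
def webmanager_url(host: str, port: int) -> str:
    """Build the conventional WebManager address: http://<host>:<port>/main.htm."""
    if not host:
        raise ValueError("host must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"http://{host}:{port}/main.htm"


if __name__ == "__main__":
    # Placeholder management IP address and port number.
    print(webmanager_url("192.0.2.10", 29003))
```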
Internal Version 12.10
Configuration tool
The default configuration tool has been changed to Cluster WebUI, which allows you to manage and configure clusters with Cluster WebUI.
Cluster statistical information collection function
By default, the cluster statistical information collection function saves statistics information files under the installation path. To avoid saving the files for such reasons as insufficient disk capacity, disable the cluster statistical information collection function. For more information on settings for this function, see "Parameter details" in the Reference Guide.
System monitor resource
The System Resource Agent process settings part of the system monitor resource has been separated to become a new monitor resource. Therefore, the conventional monitor settings of the System Resource Agent process settings are no longer valid. To continue the conventional monitoring, configure it by registering a new process resource monitor resource after upgrading EXPRESSCLUSTER. For more information on monitor settings for process resource monitor resources, see "Understanding process resource monitor resources" in "Monitor resource details" in the "Reference Guide".
BMC linkage
The ipmiutil parameters have been changed as follows.
Before the change (12.01 or earlier)
Forced Stop Action
Forced Stop Action | Parameters
BMC Power Off | ireset.cmd -d -J 0 -N ip_address -U username -P password
BMC Reset | ireset.cmd -r -J 0 -N ip_address -U username -P password
BMC Power Cycle | ireset.cmd -c -J 0 -N ip_address -U username -P password
BMC NMI | ireset.cmd -n -J 0 -N ip_address -U username -P password

Chassis Identify
Chassis Identify | Parameters
Blinking | ialarms.cmd -i250 -J 0 -N ip_address -U username -P password
Off | ialarms.cmd -i0 -J 0 -N ip_address -U username -P password

After the change
Forced Stop Action
Forced Stop Action | Parameters
BMC Power Off | ireset.cmd -d -N ip_address -U username -P password
BMC Reset | ireset.cmd -r -N ip_address -U username -P password
BMC Power Cycle | ireset.cmd -c -N ip_address -U username -P password
BMC NMI | ireset.cmd -n -N ip_address -U username -P password

Chassis Identify
Chassis Identify | Parameters
Blinking | ialarms.cmd -i250 -N ip_address -U username -P password
Off | ialarms.cmd -i0 -N ip_address -U username -P password
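Comparing the parameters before and after the change, the only difference is that the "-J 0" option was dropped. The sketch below shows one way to rewrite an old command line accordingly. The function name drop_j_option is hypothetical. The naive whitespace split assumes that no argument (such as the password) contains spaces or is itself "-J".

```python
def drop_j_option(command_line: str) -> str:
    """Remove the "-J <value>" option pair from an ipmiutil command line."""
    tokens = command_line.split()
    kept = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "-J" and i + 1 < len(tokens):
            i += 2  # skip "-J" and the value that follows it
            continue
        kept.append(tokens[i])
        i += 1
    return " ".join(kept)


if __name__ == "__main__":
    old = "ireset.cmd -d -J 0 -N ip_address -U username -P password"
    print(drop_j_option(old))
```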
Internal Version 12.20
AWS AZ monitor resource
The way of evaluating the AZ status obtained through the AWS CLI has been changed: available as normal, information or impaired as warning, and unavailable as warning. (Previously, any AZ status other than available was evaluated as abnormal.)
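The evaluation rule described above can be written as a small lookup. This is a sketch of the rule as stated in this section, not the product's implementation. The function name evaluate_az_status is hypothetical, and raising an error for statuses not listed in the new rule is an assumption.

```python
def evaluate_az_status(state: str, legacy: bool = False) -> str:
    """Map an AZ state string to the monitor result described in this guide.

    legacy=True models the behavior before internal version 12.20.
    """
    state = state.strip().lower()
    if state == "available":
        return "normal"
    if legacy:
        # Previously, any status other than available was abnormal.
        return "abnormal"
    if state in {"information", "impaired", "unavailable"}:
        return "warning"
    # Assumption: states not mentioned in the guide are rejected.
    raise ValueError(f"unexpected AZ state: {state!r}")


if __name__ == "__main__":
    for s in ("available", "information", "impaired", "unavailable"):
        print(s, "->", evaluate_az_status(s), "/ legacy:", evaluate_az_status(s, legacy=True))
```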
Internal Version 12.30
WebLogic monitor resource
REST API has been added as a new monitoring method. From this version, REST API is the default value for the monitoring method. When upgrading, reconfigure the monitoring method.
The default value of the password has been changed. If you use weblogic, which is the previous default value, set the password again.
Internal Version 13.00
Forced stop function and scripts
These have been redesigned as individual forced stop resources adapted to environment types.
Since the forced stop function and scripts configured before the upgrade are no longer effective, set them up again as forced stop resources.
Internal Version 13.10
AWS Virtual IP resources
Some of the parameters have been changed because Python is no longer used.
Internal Version 13.20
Supported browsers for Cluster WebUI
With internal version 13.20 or later, Cluster WebUI does not support Internet Explorer. For information on supported browsers, refer to "4.3.1. Supported operating systems and browsers".
6.6.2. Removed Functions¶
The following describes the functions removed for each of the versions.
Internal Version 12.00
WebManager Mobile
OfficeScan CL monitor resource
OfficeScan SV monitor resource
OracleAS monitor resource
Important
Internal Version 13.00
Function
Action
WebManager/Builder
COM network partition resolution resources
Open Cluster Properties -> NP resolution tab, then remove each NP resolution resource whose type is unknown.
NAS resources
NAS monitor resources
If NAS resources are individually set in group resources' dependency, remove the dependency settings first.
For the group resources, open Resource Properties -> the Dependency tab, select the NAS resources, and then click the delete button to exclude them from the dependency.
Delete NAS resources, and NAS monitor resources will also be deleted.
Print spooler resources
Print spooler monitor resources
If print spooler resources are individually set in group resources' dependency, remove the dependency settings first.
For the group resources, open Resource Properties -> the Dependency tab, select the print spooler resources, and then click the delete button to exclude them from the dependency.
Delete print spooler resources, and print spooler monitor resources will also be deleted.
Virtual machine groups
Virtual machine resources
Virtual machine monitor resources
You cannot move configuration data (for a host cluster) which involves virtual machine groups.
BMC linkage
Delete relevant eternal link monitor resources.
Compatible commands
Script resources
Custom monitor resources
Scripts before final action
Scripts before and after activation/deactivation
Recovery scripts
Pre-recovery action scripts
Forced-stop scripts
Other scripts configured with EXPRESSCLUSTER
If any of these scripts includes a compatible command, modify the script by excluding the command.
Example
To start or stop services controlled with the armload command, use the sc command instead.
To monitor services, use service monitor resources instead.
If you used armdelay to specify a delay time for starting EXPRESSCLUSTER services, open the Cluster properties Timeout tab, then specify the value in Service Startup Delay Time instead.
Controlling CPU frequency command (clpcpufreq command)
-
Estimating the amount of resource usage command (clpprer command)
-
Controlling chassis identify lamp command (clpledctrl command)
-
Processing inter-cluster linkage command (clptrnreq command)
-
Changing BMC information command (clpbmccnf command)
-
Broadcast for kernel mode LAN heartbeat resources
The Broadcast option (see Heartbeat I/F -> Cast Method) has been removed.
If you use cluster configuration data created with an old version, Unicast is applied for the heartbeat transmission.
EXPRESSCLUSTER Task Manager
-
EXPRESSCLUSTER clients
-
Linking with the load balancer (JVM monitor resource)
-
The forced stop function using the System Center Virtual Machine Manager (SCVMM)
-
Mirror connect monitor resource (Integrated into mirror disk monitor resource)
Delete Mirror connect monitor resources.
6.6.3. Removed Parameters¶
The following tables show the parameters configurable with Cluster WebUI but removed for each of the versions.
Internal Version 12.00
Cluster
Parameters | Default values
Cluster Properties
WebManager Tab
Enable WebManager Mobile Connection | Off
WebManager Mobile Password
Password for Operation | -
Password for Reference | -

JVM monitor resource
Parameters | Default values
JVM Monitor Resource Properties
Monitor (special) Tab
Memory Tab (when Oracle Java is selected for JVM type)
Monitor Virtual Memory Usage | 2048 MB
Memory Tab (when Oracle Java(usage monitoring) is selected for JVM Type)
Monitor Virtual Memory Usage | 2048 MB

User mode monitor resource
Parameters | Default values
User mode Monitor Resource Properties
Monitor (special) Tab
Use Heartbeat Interval/Timeout | On
Internal Version 12.10
Cluster
Parameters | Default values
Cluster Properties
WebManager Tab
WebManager Tuning Properties
Behavior Tab
Max. Number of Alert Records on the Viewer | 300
Client Data Update Method | Real Time

Virtual Computer Name resource
Parameters | Default values
Virtual Computer Name Resource Properties
Details Tab
Virtual Computer Name Resource Tuning Properties
Parameter Tab
IP address to be associated [7] | FIP
[7] From the IP address to be associated group box, the Public option has been removed. When using configuration data with the Public option selected, you do not need to change it. To change the IP address, select Any Address and specify the desired address.
Internal Version 13.00
Cluster
Parameters | Default values
Cluster Properties
Interconnect Tab
Broadcast/Unicast | Unicast
Extension Tab
Virtual Machine Forced Stop Setting - Virtual Machine Management Tool | vCenter
Virtual Machine Forced Stop Setting - Command | C:\Program Files (x86)\VMware\VMware vSphere CLI\Perl\apps\vm\vmcontrol.pl
Execute Script for Forced Stop | Off
Server Properties
Info Tab
Virtual Machine | Off
Type | vSphere
BMC Tab
Forced Stop Command Line | -
Chassis Identify - Flash / Turn off | -
Internal Version 13.10
Virtual IP resource
Parameters | Default values
Virtual IP Resource Properties
Details Tab
Virtual IP Resource Tuning Properties
RIP Tab
Next Hop IP Address | -
Internal Version 13.20
Cluster
Parameters | Default values
Cluster Properties
RIP (Legacy) tab
Network Address | -

Group
Parameters | Default values
Group Properties
Logical Service tab
Logical Service Name | -

Application resource
Parameters | Default values
Application Resource Properties
Details Tab
Application Resource Tuning Properties
Parameter Tab
Allow to Interact with Desktop | Off

Script resource
Parameters | Default values
Script Resource Properties
Details Tab
Script Resource Tuning Properties
Allow to Interact with Desktop | Off
6.6.4. Changed Default Values¶
The following tables show the parameters which are configurable with Cluster WebUI but whose defaults have been changed for each of the versions.
To continue using a "Default value before update" after the upgrade, change the corresponding "Default value after update" to the desired one.
Any setting other than a "Default value before update" is inherited to the upgraded version and therefore does not need to be restored.
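The two rules above can be read as a simple decision per parameter. The sketch below illustrates that reading. The function name value_after_upgrade is hypothetical, and it assumes that a value equal to the old default is always treated as "left at the default".

```python
def value_after_upgrade(configured, default_before, default_after):
    """Return the value a parameter is expected to have after the upgrade.

    A value equal to the old default follows the new default; any other
    value is inherited unchanged.
    """
    return default_after if configured == default_before else configured


def needs_manual_restore(configured, default_before, default_after) -> bool:
    """True when the old default was in use and the default itself changed."""
    return configured == default_before and default_before != default_after


if __name__ == "__main__":
    # Example figures: a Timeout whose default changed from 60 sec to 180 sec.
    print(value_after_upgrade(60, 60, 180))   # old default was in use
    print(value_after_upgrade(90, 60, 180))   # customized value is inherited
    print(needs_manual_restore(60, 60, 180))
```

When needs_manual_restore returns True and you want the old behavior, set the parameter back to the "Default value before update" in Cluster WebUI after upgrading.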
Internal Version 12.00
Cluster
Parameters | Default value before update | Default value after update | Remarks
Cluster Properties
JVM monitor Tab
Maximum Java Heap Size | 7 MB | 16 MB
Extension Tab
Failover Count Method | Cluster | Server

Group Resource (Common)
Parameters | Default value before update | Default value after update | Remarks
Resource Common Properties
Recovery Operation Tab
Failover Threshold | Set as much as the number of the servers | 1 time | This was also changed because the default value of Cluster Properties > Expand tab > Unit for Counting Failover Occurrences was changed.
Application resource
Parameters | Default value before update | Default value after update | Remarks
Application Resource Properties
Dependency Tab
Follow the default dependence | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, Azure probe port resource | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource

Registry synchronization resource
Parameters | Default value before update | Default value after update | Remarks
Registry Synchronization Resource Properties
Dependency Tab
Follow the default dependence | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, Azure probe port resource | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource

Script resource
Parameters | Default value before update | Default value after update | Remarks
Script Resource Properties
Dependency Tab
Follow the default dependence | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, Azure probe port resource | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource

Service resource
Parameters | Default value before update | Default value after update | Remarks
Service Resource Properties
Dependency Tab
Follow the default dependence | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, Azure probe port resource | On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource

Monitor resource (common)
Parameters
Default value before update
Default value after update
Remarks
Monitor Resource Common Properties
Recovery Operation Tab
Maximum Failover Count
Set as much as the number of the servers
1 time
This default was also changed because the default value of Cluster Properties > Extension tab > Unit for Counting Failover Occurrences was changed.
Application monitor resource
Parameters
Default value before update
Default value after update
Remarks
Application Monitor Resource Properties
Monitor (common) Tab
Wait Time to Start Monitoring
0 sec
3 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Floating IP monitor resource
Parameters
Default value before update
Default value after update
Remarks
Floating IP Monitor Resource Properties
Monitor (common) Tab
Timeout
60 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
NIC Link Up/Down monitor resource
Parameters
Default value before update
Default value after update
Remarks
NIC Link Up/Down Monitor Resource Properties
Monitor (common) Tab
Timeout
60 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Registry synchronization monitor resource
Parameters
Default value before update
Default value after update
Remarks
Registry Synchronization Monitor Resource Properties
Monitor (common) Tab
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Service monitor resource
Parameters
Default value before update
Default value after update
Remarks
Service Monitor Resource Properties
Monitor (common) Tab
Wait Time to Start Monitoring
0 sec
3 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Print spooler monitor resource
Parameters
Default value before update
Default value after update
Remarks
Print Spooler Monitor Resource Properties
Monitor (common) Tab
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Virtual computer name monitor resource
Parameters
Default value before update
Default value after update
Remarks
Virtual Computer Name Monitor Resource Properties
Monitor (common) Tab
Timeout
60 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Virtual IP monitor resource
Parameters
Default value before update
Default value after update
Remarks
Virtual IP Monitor Resource Properties
Monitor (common) Tab
Timeout
60 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Custom monitor resource
Parameters
Default value before update
Default value after update
Remarks
Custom Monitor Resource Properties
Monitor (common) Tab
Wait Time to Start Monitoring
0 sec
3 sec
Process name monitor resource
Parameters
Default value before update
Default value after update
Remarks
Process Name Monitor Resource Properties
Monitor (common) Tab
Wait Time to Start Monitoring
0 sec
3 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
SQL Server monitor resource
Parameters
Default value before update
Default value after update
Remarks
SQL Server Monitor Resource Properties
Monitor (special) Tab
ODBC Driver Name
SQL Native Client
ODBC Driver 13 for SQL Server
WebLogic monitor resource
Parameters
Default value before update
Default value after update
Remarks
WebLogic Monitor Resource Properties
Monitor (special) Tab
Install Path
C:\bea\weblogic92
C:\Oracle\Middleware\Oracle_Home\wlserver
JVM monitor resource
Parameters
Default value before update
Default value after update
Remarks
JVM Monitor Resource Properties
Monitor (common) Tab
Timeout
120 sec
180 sec
Dynamic DNS monitor resource
Parameters
Default value before update
Default value after update
Remarks
Dynamic DNS Monitor Resource Properties
Monitor (common) Tab
Timeout
120 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
AWS Elastic IP monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS Elastic IP Monitor Resource Properties
Monitor (common) Tab
Timeout
100 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
AWS Virtual IP monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS Virtual IP Monitor Resource Properties
Monitor (common) Tab
Timeout
100 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
AWS AZ monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS AZ Monitor Resource Properties
Monitor (common) Tab
Timeout
100 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Azure probe port monitor resource
Parameters
Default value before update
Default value after update
Remarks
Azure probe port Monitor Resource Properties
Monitor (common) Tab
Timeout
100 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Azure load balance monitor resource
Parameters
Default value before update
Default value after update
Remarks
Azure load balance Monitor Resource Properties
Monitor (common) Tab
Timeout
100 sec
180 sec
Do Not Retry at Timeout Occurrence
Off
On
Do not Execute Recovery Action at Timeout Occurrence
Off
On
Internal Version 12.10
Script resource
Parameters
Default value before update
Default value after update
Remarks
Script Resource Properties
Details Tab
Script Resource Tuning Properties
Parameter Tab
Allow to Interact with Desktop
On
Off
The setting cannot be changed in internal version 12.00 or earlier. It can be changed in internal version 12.10 or later.
Internal Version 12.20
Service resource
Parameters
Default value before update
Default value after update
Remarks
Service Resource Properties
Recovery Operation Tab
Retry Count
0 times
1 time
AWS Elastic IP monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS Elastic IP Monitor Resource Properties
Monitor (special) Tab
Action when AWS CLI command failed to receive response
Disable recovery action(Display warning)
Disable recovery action(Do nothing)
AWS Virtual IP monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS Virtual IP Monitor Resource Properties
Monitor (special) Tab
Action when AWS CLI command failed to receive response
Disable recovery action(Display warning)
Disable recovery action(Do nothing)
AWS AZ monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS AZ Monitor Resource Properties
Monitor (special) Tab
Action when AWS CLI command failed to receive response
Disable recovery action(Display warning)
Disable recovery action(Do nothing)
AWS DNS monitor resource
Parameters
Default value before update
Default value after update
Remarks
AWS DNS Monitor Resource Properties
Monitor (special) Tab
Action when AWS CLI command failed to receive response
Disable recovery action(Display warning)
Disable recovery action(Do nothing)
Internal Version 12.30
Cluster
Parameters
Default value before update
Default value after update
Remarks
Cluster Properties
Extension Tab
Max Reboot Count
0 times
3 times
Max Reboot Count Reset Time
0 min
60 min
API tab
Communication Method
HTTP
HTTPS
Internal Version 12.32
AWS DNS resource
Parameters
Default value before update
Default value after update
Remarks
AWS DNS Resource Properties
Details Tab
Delete a resource record set at deactivation
On
Off
Internal Version 13.00
Application resource
Parameters
Default value before update
Default value after update
Remarks
Application Resource Properties
Dependency Tab
Follow the default dependence
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
Registry synchronization resource
Parameters
Default value before update
Default value after update
Remarks
Registry Synchronization Resource Properties
Dependency Tab
Follow the default dependence
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
Script resource
Parameters
Default value before update
Default value after update
Remarks
Script Resource Properties
Dependency Tab
Follow the default dependence
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
Service resource
Parameters
Default value before update
Default value after update
Remarks
Service Resource Properties
Dependency Tab
Follow the default dependence
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: CIFS resource, Disk resource, Floating IP resource, Hybrid disk resource, Mirror disk resource, Print spooler resource, Registry synchronization resource, Virtual computer name resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
Virtual computer name resource
Parameters
Default value before update
Default value after update
Remarks
Virtual computer name resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, Azure probe port resource
On: Floating IP resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, Azure probe port resource
CIFS resource
Parameters
Default value before update
Default value after update
Remarks
CIFS Resource Properties
Details Tab
Errors in restoring file share setting are treated as activity failure
On
Off
Dynamic DNS resource
Parameters
Default value before update
Default value after update
Remarks
Dynamic DNS resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, Azure probe port resource
On: Floating IP resource, Virtual IP resource, AWS elastic IP resource, AWS virtual IP resource, AWS secondary IP resource, Azure probe port resource
Internal Version 13.10
Cluster
Parameters
Default value before update
Default value after update
Remarks
Cluster Properties
WebManager Tab
Output Cluster WebUI Operation Log
Off
On
Internal Version 13.20
Cluster
Parameters
Default value before update
Default value after update
Remarks
Statistics tab
System Resource Statistics
Collect Statistics
Off
On
Azure probe port resource
Parameters
Default value before update
Default value after update
Remarks
Azure probe port resource Properties
Dependency Tab
Follow the default dependence
On: No Dependent Resources
On: Application resource, Script resource, Service resource
Google Cloud Virtual IP resource
Parameters
Default value before update
Default value after update
Remarks
Google Cloud Virtual IP resource Properties
Dependency Tab
Follow the default dependence
On: No Dependent Resources
On: Application resource, Script resource, Service resource
Oracle Cloud Virtual IP resource
Parameters
Default value before update
Default value after update
Remarks
Oracle Cloud Virtual IP resource Properties
Dependency Tab
Follow the default dependence
On: No Dependent Resources
On: Application resource, Script resource, Service resource
Application resource
Parameters
Default value before update
Default value after update
Remarks
Application resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, Registry synchronization resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, Registry synchronization resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure DNS resource
Script resource
Parameters
Default value before update
Default value after update
Remarks
Script resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, Registry synchronization resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, Registry synchronization resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure DNS resource
Service resource
Parameters
Default value before update
Default value after update
Remarks
Service resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, Registry synchronization resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, Registry synchronization resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure DNS resource
Registry synchronization resource
Parameters
Default value before update
Default value after update
Remarks
Registry synchronization resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure probe port resource, Azure DNS resource
On: Floating IP resource, Virtual IP resource, Virtual computer name resource, Disk resource, Mirror disk resource, Hybrid disk resource, CIFS resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, AWS DNS resource, Azure DNS resource
Dynamic DNS resource
Parameters
Default value before update
Default value after update
Remarks
Dynamic DNS resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource, Azure probe port resource
On: Floating IP resource, Virtual IP resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource
Virtual computer name resource
Parameters
Default value before update
Default value after update
Remarks
Virtual computer name resource Properties
Dependency Tab
Follow the default dependence
On: Floating IP resource, Virtual IP resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource
On: Floating IP resource, Virtual IP resource, AWS Elastic IP resource, AWS Virtual IP resource, AWS Secondary IP resource
6.6.5. Moved Parameters¶
The following tables list, for each internal version, the parameters that are configurable with Cluster WebUI but whose controls have been moved to a different location.
Internal Version 12.00
Parameter location Before the change
Parameter location After the change
[Cluster Properties]-[Recovery Tab]-[Max Reboot Count]
[Cluster Properties]-[Extension Tab]-[Max Reboot Count]
[Cluster Properties]-[Recovery Tab]-[Max Reboot Count Reset Time]
[Cluster Properties]-[Extension Tab]-[Max Reboot Count Reset Time]
[Cluster Properties]-[Recovery Tab]-[Use Forced Stop]
[Cluster Properties]-[Extension Tab]-[Use Forced Stop]
[Cluster Properties]-[Recovery Tab]-[Forced Stop Action]
[Cluster Properties]-[Extension Tab]-[Forced Stop Action]
[Cluster Properties]-[Recovery Tab]-[Forced Stop Timeout]
[Cluster Properties]-[Extension Tab]-[Forced Stop Timeout]
[Cluster Properties]-[Recovery Tab]-[Virtual Machine Forced Stop Setting]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting]
[Cluster Properties]-[Recovery Tab]-[Execute Script for Forced Stop]
[Cluster Properties]-[Extension Tab]-[Execute Script for Forced Stop]
[Cluster Properties]-[Auto Recovery Tab]-[Auto Return]
[Cluster Properties]-[Extension Tab]-[Auto Return]
[Cluster Properties]-[Recovery Tab]-[Disable Recovery Action Caused by Monitor Resource Error]
[Cluster Properties]-[Extension Tab]-[Disable cluster operation]-[Recovery Action when Monitor Resource Failure Detected]
[Group Properties]-[Attribute Tab]-[Failover Exclusive Attribute]
[Group Common Properties]-[Exclusion Tab]
Internal Version 13.00
Parameter location Before the change
Parameter location After the change
[Cluster Properties]-[Extension Tab]-[Use Forced Stop]
[Cluster Properties]-[Fencing Tab]-[Forced Stop] - [Type]
[Cluster Properties]-[Extension Tab]-[Forced Stop Action]
[BMC Forced Stop Properties]-[Forced Stop Tab]-[Forced Stop Action]
[Cluster Properties]-[Extension Tab]-[Forced Stop Timeout]
[BMC Forced Stop Properties]-[Forced Stop Tab]-[Forced Stop Timeout]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting] - [Action]
[vCenter Forced Stop Properties]-[Forced Stop Tab]-[Forced Stop Action]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting] - [Timeout]
[vCenter Forced Stop Properties]-[Forced Stop Tab]-[Forced Stop Timeout]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting] - [Host Name]
[vCenter Forced Stop Properties]-[vCenter Tab]-[Host Name]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting] - [User Name]
[vCenter Forced Stop Properties]-[vCenter Tab]-[User Name]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting] - [Password]
[vCenter Forced Stop Properties]-[vCenter Tab]-[Password]
[Cluster Properties]-[Extension Tab]-[Virtual Machine Forced Stop Setting] - [Perl Path]
[vCenter Forced Stop Properties]-[vCenter Tab]-[Perl Path]
[Server Properties]-[BMC Tab]-[IP Address]
[BMC Forced Stop Properties]-[Server List Tab]-[BMC Settings]-[IP Address]
[Server Properties]-[BMC Tab]-[User Name]
[BMC Forced Stop Properties]-[Server List Tab]-[BMC Settings]-[User Name]
[Server Properties]-[BMC Tab]-[Password]
[BMC Forced Stop Properties]-[Server List Tab]-[BMC Settings]-[Password]
Internal Version 13.10
Parameter location Before the change
Parameter location After the change
[Cluster Properties]-[Monitor Tab]
[Cluster Properties]-[Statistics Tab]
[Cluster Properties]-[Mirror Disk Tab]-[Collect Mirror Statistics]
[Cluster Properties]-[Statistics Tab]-[Mirror Statistics]
[Cluster Properties]-[Extension Tab]-[Cluster Statistics]
[Cluster Properties]-[Statistics Tab]-[Cluster Statistics]
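Because a parameter can be moved more than once (for example, [Use Forced Stop] moved in both internal version 12.00 and 13.00), the tables above have to be read cumulatively. The following Python sketch illustrates one way to follow such a chain of moves. The `MOVES` list and the `location_history` function are hypothetical names introduced only for this illustration and are not part of EXPRESSCLUSTER. Only three rows from the tables are included.

```python
# A minimal sketch, not an EXPRESSCLUSTER tool: follow a parameter through
# the "moved parameters" tables. Entries are (internal version, before, after)
# and must be listed in ascending version order.
MOVES = [
    ("12.00",
     "[Cluster Properties]-[Recovery Tab]-[Use Forced Stop]",
     "[Cluster Properties]-[Extension Tab]-[Use Forced Stop]"),
    ("13.00",
     "[Cluster Properties]-[Extension Tab]-[Use Forced Stop]",
     "[Cluster Properties]-[Fencing Tab]-[Forced Stop] - [Type]"),
    ("13.10",
     "[Cluster Properties]-[Extension Tab]-[Cluster Statistics]",
     "[Cluster Properties]-[Statistics Tab]-[Cluster Statistics]"),
]


def location_history(location):
    """Return the list of locations a parameter has had, oldest first."""
    history = [location]
    for _version, before, after in MOVES:
        # Apply a move only if it starts from the most recent known location.
        if history[-1] == before:
            history.append(after)
    return history


if __name__ == "__main__":
    start = "[Cluster Properties]-[Recovery Tab]-[Use Forced Stop]"
    for step in location_history(start):
        print(step)
```

The last element of the returned list is the location after all listed moves have been applied. A location that never moved is returned unchanged as a one-element list.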
6.7. Compatibility with old versions¶
6.7.1. Compatibility with EXPRESSCLUSTER X 1.0/2.0/2.1/3.0/3.1/3.2/3.3/4.0/4.1/4.2/4.3/5.0/5.1¶
6.7.2. Script files¶
When you port a script file that was used with EXPRESSCLUSTER Ver8.0 or earlier, change the leading "ARMS_" of each environment variable name to "CLP_".
Example) IF "%ARMS_EVENT%" == "START" GOTO NORMAL
↓
IF "%CLP_EVENT%" == "START" GOTO NORMAL
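For scripts with many variable references, the renaming described above could be done mechanically. The following Python sketch is one possible approach under a stated assumption: every environment variable to be ported appears as an identifier that begins with "ARMS_". The function name `port_script_text` is hypothetical and is not an EXPRESSCLUSTER utility. Review the converted script manually before using it.

```python
import re

# Match "ARMS_" only where it begins an identifier, so that only the first
# "ARMS_" of a variable name is replaced (assumption: names start with ARMS_).
_ARMS_PREFIX = re.compile(r"(?<![A-Za-z0-9_])ARMS_")


def port_script_text(text):
    """Return the script text with leading ARMS_ prefixes changed to CLP_."""
    return _ARMS_PREFIX.sub("CLP_", text)


if __name__ == "__main__":
    old_line = 'IF "%ARMS_EVENT%" == "START" GOTO NORMAL'
    print(port_script_text(old_line))
```

The lookbehind keeps an "ARMS_" that occurs in the middle of a longer name untouched, which matches the instruction to change only the first "ARMS_" of the variable name.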
7. Glossary¶
- Active server
A server that is running for an application set. (Related term: Standby server)
- Cluster partition
A partition on a mirror disk. Used for managing mirror disks. (Related term: Disk heartbeat partition)
- Cluster shutdown
To shut down an entire cluster system (all servers that configure a cluster system).
- Cluster system
Multiple computers that are connected via a LAN (or other network) and behave as if they were a single system.
- Data partition
A local disk that can be used as a shared disk for a switchable partition. Data partition for mirror disks. (Related term: Cluster partition)
- Disk heartbeat partition
A partition used for heartbeat communication in a shared disk type cluster.
- Failover
The process of a standby server taking over the group of resources that the active server was previously handling, triggered by error detection.
- Failback
The process of returning an application to an active server after the application has failed over to another server.
- Failover group
A group of cluster resources and attributes required to execute an application.
- Failover policy
A priority list of servers that a group can fail over to.
- Floating IP address
An IP address that lets clients switch transparently from one server to another when a failover occurs. Any unassigned IP address that has the same network address as the network that a cluster server belongs to can be used as a floating IP address.
- Heartbeat
Signals that servers in a cluster send to each other to detect a failure in a cluster. (Related terms: Interconnect, Network partition)
- Interconnect
A dedicated communication path for server-to-server communication in a cluster. (Related terms: Private LAN, Public LAN)
- Management client
Any machine that uses the Cluster WebUI to access and manage a cluster system.
- Master server
The server displayed at the top of Master Server in Server Common Properties of the Cluster WebUI.
- Mirror disk connect
LAN used for data mirroring in a data mirror type cluster. Mirror disk connect can be used with primary interconnect.
- Mirror disk type cluster
A cluster system that does not use a shared disk. Local disks of the servers are mirrored.
- Moving failover group
Moving an application from an active server to a standby server by a user.
- Network partition
A state in which all heartbeats are lost and the network between servers is partitioned. (Related terms: Interconnect, Heartbeat)
- Node
A server that is part of a cluster in a cluster system. In networking terminology, it refers to devices, including computers and routers, that can transmit, receive, or process signals.
- Private LAN
LAN in which only servers configured in a clustered system are connected. (Related terms: Interconnect, Public LAN)
- Primary (server)
A server that is the main server for a failover group. (Related term: Secondary server)
- Public LAN
A communication channel between clients and servers. (Related terms: Interconnect, Private LAN)
- Startup attribute
A failover group attribute that determines whether a failover group should be started up automatically or manually when a cluster is started.
- Shared disk
A disk that multiple servers can access.
- Shared disk type cluster
A cluster system that uses one or more shared disks.
- Switchable partition
A disk partition that is connected to multiple computers and can be switched among them. (Related term: Disk heartbeat partition)
- Secondary server
A server to which a failover group fails over during normal operations. (Related term: Primary server)
- Server Group
A group of servers connected to the same network or the same shared disk device.
- Standby server
A server that is not an active server. (Related term: Active server)
- Virtual IP address
IP address used to configure a remote cluster.
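As an illustration of the "Floating IP address" entry above, the following Python sketch checks whether a candidate address is in the same network as a cluster server and is not already assigned. The function name `can_be_floating_ip`, the sample addresses, and the `assigned` list are assumptions made for this example. The sketch cannot discover which addresses are in use, so the administrator must supply that list.

```python
import ipaddress


def can_be_floating_ip(candidate, server_interface, assigned):
    """Check a candidate floating IP address.

    candidate:        address to test, e.g. "192.168.0.50"
    server_interface: a cluster server address with prefix, e.g. "192.168.0.11/24"
    assigned:         addresses already in use on that network (supplied by the caller)
    """
    network = ipaddress.ip_interface(server_interface).network
    address = ipaddress.ip_address(candidate)
    in_use = {ipaddress.ip_address(a) for a in assigned}

    if address not in network:
        return False  # different network address from the cluster server
    if address in (network.network_address, network.broadcast_address):
        return False  # reserved addresses are not usable for hosts
    return address not in in_use


if __name__ == "__main__":
    used = ["192.168.0.11", "192.168.0.12"]
    print(can_be_floating_ip("192.168.0.50", "192.168.0.11/24", used))
    print(can_be_floating_ip("192.168.1.50", "192.168.0.11/24", used))
```

Excluding the network and broadcast addresses is a general IPv4 practice added here as an extra safeguard. It is not a rule stated in this guide.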