Working with high availability clusters

This topic lists best practices for working with high availability clusters. Following them helps prevent issues when a node joins or leaves the cluster, such as missing database, storage, or search nodes after the cluster starts.

Starting a high availability cluster

Before starting a high availability cluster, ensure that a multi-server environment is up and running with only a single instance available for each group. (For example, on AWS, set the Minimum, Desired, and Maximum parameters of the Auto Scaling Group (ASG) to "1".)
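
As an illustration of the AWS example above, the following is a minimal sketch that sets all three ASG parameters to the same value. It assumes a boto3 Auto Scaling client is passed in; the function name pin_group_size and the group name in the usage note are hypothetical and not part of the Platform.

```python
def pin_group_size(autoscaling_client, group_name, size=1):
    """Set Min, Desired, and Max of one Auto Scaling Group to the same value.

    autoscaling_client is assumed to be a boto3 Auto Scaling client,
    for example boto3.client("autoscaling").
    """
    autoscaling_client.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        MinSize=size,
        MaxSize=size,
        DesiredCapacity=size,
    )
```

A call such as pin_group_size(boto3.client("autoscaling"), "my-nginx-asg") would hold that group at a single instance, where "my-nginx-asg" stands in for your own ASG name.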

To start a high availability cluster, do the following:

  1. Start the NGINX group with just one instance. (For example, on AWS, set the NGINX ASG parameters (Min, Desired, and Max) to "1".)

  2. Start the MASTER group, SEARCH & STORAGE group, Prod 1 group, and Prod 2 group in sequence, each with just one instance.

    During this operation, do not start the MASTER, SEARCH & STORAGE, Prod 1, or Prod 2 group with two or more instances at a time. (For example, keep the ASG parameters (Min, Desired, and Max) set to "1".)
  3. Log in to the Platform as a Master Admin and check that all the components are available.

    Note: In a high availability cluster environment, if one of the groups leaves the cluster because of an issue or a scale-down process, you may encounter a non-fatal exception for a fraction of a second. Clearing the browser cache usually resolves it.
  4. You can now scale up each group to the desired number of instances. (For example, set the ASG parameters (Min, Desired, and Max) to "2".)
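
The start-up steps above can be sketched as a small orchestration routine. This is only an outline under stated assumptions: the three callbacks (set_group_size, wait_until_ready, components_ok) are hypothetical hooks that you would implement for your own environment, and components_ok stands in for the manual Master Admin check in step 3.

```python
# Start order taken from the procedure above.
START_ORDER = ["NGINX", "MASTER", "SEARCH & STORAGE", "Prod 1", "Prod 2"]

def start_cluster(set_group_size, wait_until_ready, components_ok, target_size=2):
    """Start every group with one instance, in order, then scale up.

    set_group_size(group, size), wait_until_ready(group), and components_ok()
    are hypothetical callbacks supplied by the caller.
    """
    # Steps 1 and 2: one instance per group, one group at a time.
    for group in START_ORDER:
        set_group_size(group, 1)
        wait_until_ready(group)

    # Step 3: do not scale up until all components are confirmed available.
    if not components_ok():
        raise RuntimeError("Some components are unavailable; not scaling up.")

    # Step 4: scale each group to the desired number of instances.
    for group in START_ORDER:
        set_group_size(group, target_size)
```

Passing the operations in as callbacks keeps the ordering logic independent of any particular cloud provider or tooling.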

Shutting down a high availability cluster

You can shut down a high availability cluster at any time. You can stop all the servers at once or stop them in sequence.

Restarting a high availability cluster

This procedure assumes that the high availability cluster is already running with one or more instances for each group. To restart a high availability cluster, do the following:

  1. Shut down all the nodes except the NGINX group. You do not need to shut down NGINX, even if the cluster is running with two NGINX instances. However, no harm is done if the NGINX group is shut down for any reason.

  2. If the NGINX group is not already running, start it with just one instance. Skip this step if you did not shut down the NGINX group in step 1.

  3. Start the MASTER group, SEARCH & STORAGE group, Prod 1 group, and Prod 2 group in sequence, each with just one instance.

    During this operation, do not start the MASTER, SEARCH & STORAGE, Prod 1, or Prod 2 group with two or more instances at a time.
  4. Log in to the Platform as a Master Admin and check that all the components are available.

  5. You can now scale up each group to the desired number of instances.
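
The restart steps above can also be written out as an ordered plan, which makes the conditional NGINX step explicit. The sketch below is illustrative only: the function restart_plan, the action labels, and the use of size 0 to represent a stopped group are assumptions made for this example, not Platform features.

```python
# Groups that are always stopped and restarted; NGINX is handled separately.
OTHER_GROUPS = ["MASTER", "SEARCH & STORAGE", "Prod 1", "Prod 2"]

def restart_plan(nginx_running=True, target_size=2):
    """Return the restart procedure as a list of (action, group, size) tuples."""
    # Step 1: shut down everything except NGINX (size 0 means stopped here).
    plan = [("stop", group, 0) for group in OTHER_GROUPS]

    # Step 2: start NGINX with one instance only if it is not running.
    if not nginx_running:
        plan.append(("start", "NGINX", 1))

    # Step 3: start the remaining groups in order, one instance each.
    plan.extend(("start", group, 1) for group in OTHER_GROUPS)

    # Step 4: manual check as Master Admin before any scale-up.
    plan.append(("verify", "all components", None))

    # Step 5: scale every group to the desired number of instances.
    plan.extend(("scale", group, target_size) for group in ["NGINX"] + OTHER_GROUPS)
    return plan
```

Printing the result of restart_plan(nginx_running=False) gives a checklist in which the NGINX start appears immediately after the shutdown entries.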