Placing HA cluster nodes in different physical locations

Last modified on 15 Apr, 2021. Revision 6
Sometimes the HA cluster nodes need to be placed in physically different locations, for example for security reasons or for site redundancy in case of a fire.
Up to date for
cOS Core 13.00.10
Supported since
cOS Core 9.xx and up
Status OK
Author
Peter Nilsson

Question

Can we create a second data center and “split” the HA cluster between the two physical locations? Basically, we want to place the HA cluster nodes in two different buildings or different parts of town to achieve full physical redundancy in case of, e.g., a fire.

Answer

Placing one cluster node on each side of a city should not be a problem. We do, however, recommend keeping the cluster nodes as close to each other as possible, as there is a fairly strict latency requirement on the sync interface. If latency is higher than 20 milliseconds, the sending cluster node will initiate a retransmit of the synchronization packets, making the cluster less redundant than intended. Also, the cluster heartbeats and synchronization traffic must not be “routed” (TTL=255 on departure as well as on arrival). More details about cluster mechanics can be found in KB article <Broken link, to be added>
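As an illustration of the two requirements above, the following minimal Python sketch evaluates measurements that you have collected yourself on the intended sync path. It is not a Clavister tool: the function name, the input format, and the way the latency samples and arrival TTL values are obtained are all assumptions made for the example. Only the 20 ms limit and the TTL value of 255 come from the text above.

```python
from statistics import mean

LATENCY_LIMIT_MS = 20.0  # latency threshold quoted in this article
EXPECTED_TTL = 255       # TTL expected on departure and on arrival


def assess_sync_link(latency_samples_ms, arrival_ttls):
    """Summarise whether collected measurements fit the sync requirements.

    latency_samples_ms: latency measurements in milliseconds (hypothetical input).
    arrival_ttls: TTL values observed on packets arriving at the peer
                  (hypothetical input, e.g. taken from a packet capture).
    """
    if not latency_samples_ms or not arrival_ttls:
        raise ValueError("need at least one latency sample and one TTL value")

    # Count every sample above the limit, since each one may cause a retransmit.
    over_limit = sum(1 for s in latency_samples_ms if s > LATENCY_LIMIT_MS)

    # Any TTL other than 255 on arrival means a hop decremented it,
    # i.e. the traffic was routed.
    routed = any(ttl != EXPECTED_TTL for ttl in arrival_ttls)

    return {
        "mean_ms": mean(latency_samples_ms),
        "worst_ms": max(latency_samples_ms),
        "samples_over_limit": over_limit,
        "latency_ok": over_limit == 0,
        "ttl_ok": not routed,
    }
```

Calling `assess_sync_link([1.2, 0.9, 3.4], [255, 255])` reports both checks as passed, while a single sample above 20 ms or a single arrival TTL of 254 makes the corresponding check fail. Looking at the worst sample rather than only the mean is a deliberate choice here, since occasional latency spikes are enough to trigger retransmits.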

A common solution is to use a dedicated fiber connection for the sync interface (e.g. a dark fiber connection). It can also be solved in other ways, but the connection must behave like a “direct cable” between the HA cluster nodes.

The bandwidth requirement on the synchronization interface is highly dependent on the number of connections and function states that are synchronized between the nodes. Not all features are synchronized; the release notes contain a list of features/functions that are not state synchronized. When a cluster node boots up and links up with the active cluster node, there is a temporary burst of data as the nodes link up with each other.

Usually the synchronization state packets are small but can be sent at a high rate, putting some strain on the synchronization interface and intermediate equipment (although in today’s gigabit environments this is rarely a problem). In short, a dedicated 1 Gbps link is recommended, but the actual requirement will vary with the traffic patterns and the amount of traffic being synchronized between the cluster nodes.
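To get a rough feeling for how packet rate and packet size translate into load on the sync link, the arithmetic can be sketched as below. This is a back-of-the-envelope estimate only: the function name is hypothetical, and the packet rate and average packet size are values you would have to measure in your own installation. They are not figures published for cOS Core. Only the 1 Gbps capacity comes from the recommendation above.

```python
LINK_CAPACITY_BPS = 1_000_000_000  # the dedicated 1 Gbps link recommended above


def sync_link_utilisation(packets_per_second, avg_packet_bytes,
                          capacity_bps=LINK_CAPACITY_BPS):
    """Return the fraction of link capacity used by synchronization traffic.

    packets_per_second and avg_packet_bytes are measured or assumed inputs;
    protocol overhead and boot-time bursts are not modelled.
    """
    if packets_per_second < 0 or avg_packet_bytes < 0:
        raise ValueError("rate and packet size must not be negative")
    if capacity_bps <= 0:
        raise ValueError("link capacity must be positive")

    # bytes per second -> bits per second -> share of the link
    bits_per_second = packets_per_second * avg_packet_bytes * 8
    return bits_per_second / capacity_bps
```

With purely illustrative inputs of 50,000 packets per second at 200 bytes each, the estimate is 80 Mbit/s, or 8 % of a 1 Gbps link. The temporary burst when a node boots up and links up with its peer would sit on top of such a steady-state figure.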

Some care should also be taken when designing the connection from each interface of the peers to its “common switch”, so that it mimics the default way of setting up an HA cluster (that is, as if the nodes were physically close to each other).

Summary:

It is possible to place the cluster nodes in different physical locations, but it is not recommended, as high-latency synchronization (20 ms+) is not actively tested in Clavister QA, so there may be unknown behaviours that could be difficult to troubleshoot. If low latency can be guaranteed, it should work just fine.

It is also recommended to read through the following article about cluster setting adjustments: https://kb.clavister.com/329090437/adjusting-advanced-cluster-settings-on-larger-installations

