Adjusting advanced cluster settings on larger installations

Last modified on 23 Aug, 2022. Revision 7

Up to date for	cOS Core 13.00.08
Supported since	cOS Core 10.xx.xx
Status	OK

Question:

My High Availability cluster is not synchronizing properly, and I have also seen incidents where the cluster changes role for no apparent reason.

Answer:

Synchronization problems and unexpected cluster role changes can have many causes, such as a hardware problem on the sync interface, a bad cable or an incorrect configuration. However, some of the Advanced Settings for High Availability may also need to be adjusted. For most installations these settings never need to be changed, but for larger installations it is recommended to modify them to accommodate large amounts of synchronization data and (depending on the scenario) lessen the chance that the cluster performs a failover due to a lack of heartbeats from its peer.

The settings that we want to adjust are the following and can be found under System->High Availability->Advanced:

Sync Buffer Size, default value: 4096
Recommended value: 4096

This setting controls how much synchronization data (in KB) can be buffered before waiting for acknowledgement from the cluster peer. Today’s appliance models (E80 and above) have plenty of spare memory, so allocating 4 MB instead of 1 MB should be no problem, and a little extra buffer for synchronization does no harm.

Note: The default value in older versions was 1024. Existing configurations are not updated automatically. Only new configurations created from around 2017 onward use the new default value.

Sync Packet Max Burst, default value: 100
Recommended value: 100

This setting controls how many packets the active cluster peer can send to the inactive node in one synchronization state burst. For larger installations (100+ users) it is highly recommended to increase this value if the configuration still uses the older, lower default. With a value that is too low, the active node may be unable to synchronize data fast enough, meaning the inactive node may not be fully synchronized with the active node.

Note: The default value in older versions was 20. Existing configurations are not updated automatically. Only new configurations created from around 2017 onward use the new default value.
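Because older configurations keep their original values, it can help to compare what is configured against the current defaults. The following is a minimal Python sketch of that comparison, not a cOS Core tool: the dictionary keys are hypothetical labels chosen for illustration (not cOS Core property names), and the configured values are assumed to have been read manually from System->High Availability->Advanced.

```python
# Current defaults as stated in this article. The keys are hypothetical
# labels for illustration, not cOS Core property names.
CURRENT_DEFAULTS = {
    "sync_buffer_size_kb": 4096,
    "sync_packet_max_burst": 100,
}


def find_legacy_settings(configured):
    """Return {name: (configured_value, current_default)} for every
    setting that is configured below the current default."""
    outdated = {}
    for name, current in CURRENT_DEFAULTS.items():
        value = configured.get(name)
        if value is not None and value < current:
            outdated[name] = (value, current)
    return outdated


# Example: a configuration created before the defaults changed, where
# only the burst setting has been raised by hand.
example = {"sync_buffer_size_kb": 1024, "sync_packet_max_burst": 100}
for name, (value, current) in find_legacy_settings(example).items():
    print(f"{name}: configured {value}, current default {current}")
```

In this example only the buffer size is reported, since the burst value already matches the current default.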

HA Failover Time, default value: 750 ms
Recommended value: 1500-2500 ms

This setting controls how long the inactive node will “wait” before going active if it has not received sufficient heartbeats from its peer within this time. Simply put, if the inactive node has not “seen” the active node for 750 ms, it will go active.

Depending on the scenario, size and network structure, 750 ms can be a bit low. If the system encounters network packet bursts, the inactive node could conclude that the active node is no longer active and go active itself. The cluster could then enter an active/active state, after which the nodes start to negotiate which of them should be the active node. This in turn could cause disruptions in the network.

One way to make the cluster less sensitive to minor network “hiccups” is to increase this value.

Note: The higher the value, the longer it takes for the inactive node to take over if something happens to the active node. The configured value has to be based on what is reasonably acceptable: is 1.5-2.5 seconds of total network outage acceptable if something happens to the active node? It is up to the administrator to decide.
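The trade-off can be illustrated with a small Python sketch. It is a deliberately simplified model, not the actual cOS Core logic: it assumes a failover is triggered whenever the inactive node sees no heartbeats at all for at least the configured failover time, and the gap durations are made-up example values.

```python
def would_fail_over(heartbeat_gaps_ms, failover_time_ms):
    """Return True if any period without heartbeats lasts at least as
    long as the failover time (simplified model, see the text above)."""
    return any(gap >= failover_time_ms for gap in heartbeat_gaps_ms)


# Hypothetical silent periods (in ms) observed on the sync interface
# during short network bursts.
gaps = [40, 900, 120, 1300]

for failover_time in (750, 1500, 2500):
    triggered = would_fail_over(gaps, failover_time)
    print(f"failover time {failover_time} ms -> failover triggered: {triggered}")
```

With these example gaps, only the 750 ms setting triggers a failover. The longer values ride out the short gaps, at the cost of a slower takeover when the active node really does fail.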
