Part 10 - vROps Continuous Availability

Disabling HA is a prerequisite so we have done that in the previous blog, and we will now start enabling continuous availability on our vRops now that our environment is massive. While HA offers what we need in terms of losing a node, the CA ensure that I ever there was a fault domain failure, the vROps cluster will still be operational however in a degraded mode. Both HA and CA can’t be enabled at the same time. The vRops Cluster will need to go offline during the CA configuration so plan accordingly.

In case of a Fault Domain failure, the remaining active Fault Domain and the Witness Node will keep the cluster alive by promoting the Replica Node as the new Master Node. The cluster will run in degraded mode.

In this environment, I’ve got my nodes under a single vCenter however, data Nodes must be split into different vSphere clusters to ensure they are under two different fault domains.

The v`Rops CA Configuration

To implement vRealize Operations’ continuous availability, we require the following prerequisites in place

Master Node – Assign the master node to fault domain 1.
Replica Master Node – The replica node is a copy of the master node and this will need to be moved to fault domain 2.
Witness Node – The witness node should be deployed in a 3rd site ideally and should be working. This act as a quorum node just to prevent a split-brain scenario
Data Node – In order for data to be divided evenly across the fault domain, we will need to ensure there is an equal number of data nodes on both fault domains.
Remote Collectors – they don’t need to be part of fault domains

In my lab, I have these nodes deployed – 2 of my data nodes will go into Fault domain 1 & the other two will go into Fault domain 2. The witness can be anywhere.

Role	Type	Fault Domain
vRops-1	Master	*Fault domain 1*
vRops-4	Data Node	*Fault domain 1*
vRops-6	Witness
vRops-7/ftp	Data Node	*Fault domain 2*
vRops-8	Data Node	*Fault domain 2*

Once logged in to the admin UI, click Enable CA button to proceed with the Continuous Availability configuration.

The master node by default is placed in fault domain 1 & we do have options to spread our data collectors as pairs. I’ve placed 2 of my data nodes in London vCenter 1 and the other two in London vCenter 2 showing these nodes are deployed in separate sites. Click OK to proceed.