Set up Percona Server for MongoDB cross-site replication¶
Cross-site replication involves configuring one MongoDB site as Main and another MongoDB site as Replica to allow replication between them.
This feature can be useful in several cases:
- simplify the migration of the MongoDB cluster to and from Kubernetes
- add remote nodes to the replica set for disaster recovery
- keep the replica set of the database cluster in different data centers to get a fault-tolerant system.
Prerequisites¶
- Every node in the Main and Replica clusters needs to be reachable over the network.
- User credentials should be the same in each cluster.
- TLS certificates should be the same in each cluster.
Glossary¶
- Main cluster: The cluster in which the primary node runs and accepts write traffic. It’s the managed cluster if it’s running on Kubernetes.
- Replica cluster: The cluster which is configured to replicate from the Main cluster. It’s the unmanaged cluster if it’s running on Kubernetes.
- Managed cluster: The cluster controlled by the Operator. The Operator controls everything from Replica Set configuration to user credentials. It’s the default deployment of the Operator.
- Unmanaged cluster: The cluster controlled by the Operator, except that the Operator isn’t responsible for managing the Replica Set configuration.
Topologies¶
The Operator automates configuration of Main and Replica MongoDB sites, but the feature itself is not bound to Kubernetes. Either Main or Replica can run outside of Kubernetes, be regular MongoDB and be out of the Operator’s control.
You need to have a single Main cluster, but you can have multiple Replica clusters as long as the Replica Set has no more than 50 members in total. This limitation comes from MongoDB itself. For more information, check the MongoDB docs.
Main and Replica clusters on Kubernetes¶
If you want both the Main and Replica clusters to run on Kubernetes, the overall steps look like this:
- Deploy the Main cluster on a Kubernetes cluster (or use an existing one)
- Get TLS secrets from the Main cluster and apply them to the namespace in the Kubernetes cluster to which you’ll deploy the Replica cluster
- Deploy the Replica cluster on a Kubernetes cluster
- Add nodes from the Replica cluster to the Main cluster as external nodes
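As a rough illustration of the third step, the Replica cluster's custom resource might look like the fragment below. This is a minimal sketch, not a complete manifest. The cluster name and the Secret names are hypothetical placeholders, and it assumes that the unmanaged option is what marks a cluster as unmanaged (see the Glossary) and that the Replica Set name matches the one used by the Main cluster.

```yaml
# Sketch of a Replica (unmanaged) cluster; all names are placeholders.
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDB
metadata:
  name: replica-cluster                     # hypothetical name
spec:
  unmanaged: true                           # Operator does not manage the Replica Set configuration
  secrets:
    users: main-cluster-users               # same user credentials as the Main cluster
    ssl: main-cluster-ssl                   # TLS Secrets copied from the Main cluster
    sslInternal: main-cluster-ssl-internal
  replsets:
  - name: rs0                               # assumed to match the Main cluster's Replica Set name
    size: 3
    expose:
      enabled: true
      type: LoadBalancer
```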
Main cluster on Kubernetes and Replica cluster outside of Kubernetes¶
If you want the Main cluster to run on Kubernetes but the Replica cluster to run outside of Kubernetes, the overall steps look like this:
- Deploy the Main cluster on a Kubernetes cluster (or use an existing one)
- Get TLS secrets from the Main cluster to configure the Replica cluster
- Deploy the Replica cluster wherever you want
- Add nodes from the Replica cluster to the Main cluster as external nodes
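The last step, adding external nodes, can be sketched as an externalNodes list in the Main cluster's deploy/cr.yaml. The hostname below is hypothetical, and setting votes and priority to 0 is an assumption for nodes that should replicate data without taking part in elections. Check the field names against the custom resource reference for your Operator version.

```yaml
spec:
  replsets:
  - name: rs0
    externalNodes:
    - host: replica-rs0-0.example.com   # hypothetical address of a Replica cluster node
      port: 27017
      votes: 0                          # non-voting member (assumption)
      priority: 0                       # never elected primary; required when votes is 0
```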
Main cluster outside of Kubernetes and Replica cluster on Kubernetes¶
If you want the Main cluster to run outside of Kubernetes but the Replica cluster to run on Kubernetes, the overall steps look like this:
- Deploy the Main cluster wherever you want (or use an existing one)
- Get TLS certificates and create a Kubernetes Secret with them
- Get user credentials and create a Kubernetes Secret with them
- Deploy the Replica cluster on a Kubernetes cluster
- Add nodes from the Replica cluster to the Main cluster using Mongo client
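The two Secret-creation steps above could be sketched with manifests like the following. Everything here is illustrative: the Secret names and the angle-bracket values are placeholders, and the key names are assumed to follow the layout the Operator uses for its own TLS and user Secrets, so compare them with the sample Secrets shipped with your Operator version before applying.

```yaml
# TLS material exported from the Main cluster (placeholder values).
apiVersion: v1
kind: Secret
metadata:
  name: main-cluster-ssl              # hypothetical name
type: kubernetes.io/tls
stringData:
  ca.crt: <PEM-encoded CA certificate>
  tls.crt: <PEM-encoded certificate>
  tls.key: <PEM-encoded private key>
---
# User credentials matching the Main cluster (placeholder values).
apiVersion: v1
kind: Secret
metadata:
  name: main-cluster-users            # hypothetical name
type: Opaque
stringData:
  MONGODB_CLUSTER_ADMIN_USER: <user>          # key names are an assumption
  MONGODB_CLUSTER_ADMIN_PASSWORD: <password>
  # add the remaining system users defined on the Main cluster
```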
Exposing instances of the MongoDB cluster¶
You need to expose all Replica Set nodes (including Config Servers) through a dedicated Service to ensure that both the Main and the Replica can reach each other, as in a full mesh.
This is done through the replsets.expose, sharding.configsvrReplSet.expose, and sharding.mongos.expose sections in the deploy/cr.yaml configuration file as follows.
spec:
  replsets:
  - name: rs0
    expose:
      enabled: true
      type: LoadBalancer
    ...
  sharding:
    configsvrReplSet:
      expose:
        enabled: true
        type: LoadBalancer
      ...
The above example uses the LoadBalancer Kubernetes Service object, and the result will be one LoadBalancer per Replica Set Pod. In most cases, this Load Balancer should be Internet-facing for cross-region replication to work. Other Service types besides LoadBalancer are also available (ClusterIP, NodePort, etc.).
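The example above does not show the sharding.mongos.expose section. A sketch of it follows, assuming it accepts the same type key as the other expose sections. Verify this against the custom resource reference for your Operator version.

```yaml
spec:
  sharding:
    enabled: true
    mongos:
      expose:
        type: LoadBalancer   # assumed to mirror the replsets expose type
```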
Note
Starting from v1.14, the Operator configures the replset using local DNS hostnames even if the replset is exposed. If you want to have IP addresses in the replset configuration to achieve a multi-cluster deployment, you need to set clusterServiceDNSMode to External.
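For illustration, the setting described in this note might be placed in deploy/cr.yaml as shown below. Its position directly under spec is an assumption, so confirm it against the custom resource reference.

```yaml
spec:
  clusterServiceDNSMode: External   # use external addresses instead of local DNS hostnames
```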
To list the endpoints assigned to Pods, list the Kubernetes Service objects by executing the kubectl get services -l "app.kubernetes.io/instance=CLUSTER_NAME" command.