
Percona Server for MongoDB Sharding

About sharding

Sharding provides horizontal database scaling by distributing data across multiple MongoDB Pods. It is useful for large data sets when a single machine's processing speed or storage capacity is not enough. Sharding splits data across several machines and routes each request to the subset of data (the so-called shard) that it needs.

MongoDB sharding involves the following components:

  • shard - a replica set which contains a subset of data stored in the database (similar to a traditional MongoDB replica set),
  • mongos - a query router, which acts as an entry point for client applications,
  • config servers - a replica set to store metadata and configuration settings for the sharded database cluster.

Note

Percona Operator for MongoDB 1.6.0 supported only one shard per MongoDB cluster. Even this limited sharding support allowed using mongos as an entry point instead of provisioning a load balancer per replica set node. Multiple shards are supported starting from Operator 1.7.0. Also, before Operator 1.12.0 mongos instances were deployed by a Deployment object; starting from 1.12.0 they are deployed by a StatefulSet.

Turning sharding on and off

Sharding is controlled by the sharding section of the deploy/cr.yaml configuration file and is turned on by default.

To enable sharding, set the sharding.enabled key to true (this will turn existing MongoDB replica set nodes into sharded ones). To disable sharding, set the sharding.enabled key to false.

When sharding is turned on, the Operator runs a replica set of config servers and a set of mongos instances. Their sizes are controlled by the configsvrReplSet.size and mongos.size keys, respectively.
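The keys described above can also be changed on a running cluster instead of editing deploy/cr.yaml. The following is a minimal sketch, not the Operator's prescribed procedure. It assumes that configsvrReplSet and mongos are nested under spec.sharding of the Custom Resource, that the cluster is named my-cluster-name, and that psmdb is available as the resource short name. The sharding_patch helper and the size values are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: build a JSON merge patch for the sharding keys.
# Arguments: enabled (true|false), config server replica set size, mongos size.
sharding_patch() {
  printf '{"spec":{"sharding":{"enabled":%s,"configsvrReplSet":{"size":%s},"mongos":{"size":%s}}}}' "$1" "$2" "$3"
}

# Print the patch so it can be reviewed before it is applied.
sharding_patch true 3 3
echo

# Assumption: the Custom Resource is named my-cluster-name and `psmdb`
# resolves to it as a short name. To apply the patch:
#   kubectl patch psmdb my-cluster-name --type=merge -p "$(sharding_patch true 3 3)"
```

Printing the patch first keeps the change reviewable; the kubectl patch line is left as a comment because it modifies a live cluster.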

Note

Config servers currently work properly only with the WiredTiger engine, while sharded MongoDB nodes can use either the WiredTiger or the InMemory engine.

By default, the replsets section of the deploy/cr.yaml configuration file contains only one replica set, rs0. You can add more replica sets with different names to the replsets section in the same way. Note that having more than one replica set is possible only with sharding turned on.

Checking connectivity to sharded and non-sharded clusters

With sharding turned on, the mongos service is the entry point for accessing your database. If you do not use sharding, you have to access the mongod processes of your replica set directly.

  1. You will need the login and password of the admin user to access the cluster. Use the kubectl get secrets command to see the list of Secrets objects (by default, the Secrets object you are interested in is named my-cluster-name-secrets). Then the kubectl get secret my-cluster-name-secrets -o yaml command will return the YAML file with the generated Secrets, including the MONGODB_DATABASE_ADMIN_USER and MONGODB_DATABASE_ADMIN_PASSWORD strings, which should look as follows:

    ...
    data:
      ...
      MONGODB_DATABASE_ADMIN_PASSWORD: aDAzQ0pCY3NSWEZ2ZUIzS1I=
      MONGODB_DATABASE_ADMIN_USER: ZGF0YWJhc2VBZG1pbg==
    

    Here the actual login name and password are base64-encoded. Use the echo 'aDAzQ0pCY3NSWEZ2ZUIzS1I=' | base64 --decode command to convert a value back to human-readable form.

  2. Run a container with a MongoDB client and connect its console output to your terminal. The following command will do this, naming the new Pod percona-client:

    $ kubectl run -i --rm --tty percona-client --image=percona/percona-server-mongodb:4.4.16-16 --restart=Never -- bash -il
    

    Executing it may require some time to deploy the corresponding Pod.

  3. Now run the mongo tool in the percona-client command shell using the login (normally databaseAdmin), the password obtained from the Secret, and your namespace name in place of the <namespace name> placeholder. The command differs depending on whether sharding is on (the default behavior) or off. With sharding on, connect to the mongos service (the first command below); with sharding off, connect to the rs0 replica set (the second command below):

    $ mongo "mongodb://databaseAdmin:databaseAdminPassword@my-cluster-name-mongos.<namespace name>.svc.cluster.local/admin?ssl=false"
    
    $ mongo "mongodb+srv://databaseAdmin:databaseAdminPassword@my-cluster-name-rs0.<namespace name>.svc.cluster.local/admin?replicaSet=rs0&ssl=false"
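The decoding and connection-string steps above can be sketched as a small script. This is an illustration under stated assumptions, not part of the Operator: decode_field and mongo_uri are hypothetical helper names, NAMESPACE is a hypothetical environment variable, and the encoded values are the sample ones shown in step 1 (the user name decodes to databaseAdmin). It also assumes the password contains no characters that need URL-encoding.

```shell
#!/bin/sh
# Hypothetical helper: decode one base64-encoded Secret field.
decode_field() {
  printf '%s' "$1" | base64 --decode
}

# Hypothetical helper: build the connection URI used in step 3.
# Arguments: user, password, namespace, sharding ("on" or "off").
mongo_uri() {
  if [ "$4" = "on" ]; then
    # Sharded cluster: connect through the mongos service.
    printf 'mongodb://%s:%s@my-cluster-name-mongos.%s.svc.cluster.local/admin?ssl=false' "$1" "$2" "$3"
  else
    # Non-sharded cluster: connect to the rs0 replica set.
    printf 'mongodb+srv://%s:%s@my-cluster-name-rs0.%s.svc.cluster.local/admin?replicaSet=rs0&ssl=false' "$1" "$2" "$3"
  fi
}

# Sample encoded values copied from the Secret output in step 1.
user=$(decode_field 'ZGF0YWJhc2VBZG1pbg==')
pass=$(decode_field 'aDAzQ0pCY3NSWEZ2ZUIzS1I=')

# Assumption: NAMESPACE holds your namespace; "default" is only a fallback.
namespace=${NAMESPACE:-default}

# Print the URI for a sharded cluster; pass "off" for a non-sharded one.
mongo_uri "$user" "$pass" "$namespace" on
echo
```

The printed URI can then be passed to the mongo tool inside the percona-client Pod. On a real cluster, a single field can be read with kubectl get secret my-cluster-name-secrets -o jsonpath='{.data.MONGODB_DATABASE_ADMIN_USER}' and piped to base64 --decode.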
    

Last update: 2022-11-03