Initial troubleshooting

Percona Operator for MongoDB uses Custom Resources to manage options for the various components of the cluster.

The PerconaServerMongoDB Custom Resource contains the Percona Server for MongoDB options (it also has the handy psmdb short name).
The PerconaServerMongoDBBackup and PerconaServerMongoDBRestore Custom Resources contain the options for Percona Backup for MongoDB, which is used to back up Percona Server for MongoDB and to restore it from backups (the psmdb-backup and psmdb-restore short names are available for them).

The first thing to check is the state of the Custom Resource. Query it with the kubectl get command:

$ kubectl get psmdb

Expected output

NAME              ENDPOINT                                           STATUS   AGE
my-cluster-name   my-cluster-name-mongos.default.svc.cluster.local   ready    5m26s

The Custom Resource should report the ready status.
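For scripted checks, the STATUS column can be tested mechanically. The following is a minimal sketch and not part of the Operator: not_ready_clusters is a hypothetical helper name, and it assumes the column layout shown above, with STATUS as the second-to-last column.

```shell
# Hypothetical helper: reads `kubectl get psmdb --no-headers` output on stdin
# and prints the name and status of every cluster whose STATUS is not "ready".
# Assumption: STATUS is the second-to-last column, as in the output above
# (ENDPOINT may still be empty while the cluster is starting).
not_ready_clusters() {
  awk '
    NF == 0 { next }                        # skip blank lines
    NF < 3  { print $1, "unknown"; next }   # no STATUS reported yet
    $(NF-1) != "ready" { print $1, $(NF-1) }
  '
}

# Usage against a live cluster (commented out so the snippet has no side effects):
# kubectl get psmdb --no-headers | not_ready_clusters
```

An empty result means that every listed cluster reports ready. Any printed line names a cluster that deserves a closer look.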

Note

You can check which Percona Custom Resources are present in the cluster and get basic information about them as follows:

$ kubectl api-resources | grep -i percona

Expected output

perconaservermongodbbackups       psmdb-backup    psmdb.percona.com/v1                   true         PerconaServerMongoDBBackup
perconaservermongodbrestores      psmdb-restore   psmdb.percona.com/v1                   true         PerconaServerMongoDBRestore
perconaservermongodbs             psmdb           psmdb.percona.com/v1                   true         PerconaServerMongoDB

Check the Pods

If the Custom Resource does not reach the ready status, check the individual Pods. You can do it as follows:

$ kubectl get pods

Expected output

NAME                                               READY   STATUS    RESTARTS   AGE
my-cluster-name-cfg-0                              2/2     Running   0          11m
my-cluster-name-cfg-1                              2/2     Running   1          10m
my-cluster-name-cfg-2                              2/2     Running   1          9m
my-cluster-name-mongos-0                           1/1     Running   0          11m
my-cluster-name-mongos-1                           1/1     Running   0          11m
my-cluster-name-mongos-2                           1/1     Running   0          11m
my-cluster-name-rs0-0                              2/2     Running   0          11m
my-cluster-name-rs0-1                              2/2     Running   0          10m
my-cluster-name-rs0-2                              2/2     Running   0          9m
percona-server-mongodb-operator-665cd69f9b-xg5dl   1/1     Running   0          37m

The above command provides the following insights:

READY indicates how many containers in the Pod are ready to serve traffic. In the above example, the my-cluster-name-rs0-0 Pod has both of its containers ready (2/2). For an application to work properly, all containers of the Pod should be ready.
STATUS indicates the current status of the Pod. The Pod should be in the Running state to confirm that the application is working as expected. Other possible states are described in the official Kubernetes documentation.
RESTARTS indicates how many times the containers of the Pod were restarted. This is affected by the Container Restart Policy. Ideally, the restart count is zero, meaning there have been no issues since the Pod started. If the restart count is greater than zero, it is worth checking why the restarts happened.
AGE indicates how long the Pod has been running. Any abnormality in this value needs to be checked.
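The READY, STATUS and RESTARTS checks described above can be applied to the whole Pod list at once. The following is a minimal sketch under stated assumptions: flag_unhealthy_pods is a hypothetical helper name, and it relies on the default column order of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints one line per Pod that deviates from the healthy pattern:
# all containers ready, STATUS equal to Running, and zero restarts.
# Assumption: default column order NAME READY STATUS RESTARTS ... AGE.
flag_unhealthy_pods() {
  awk '
    NF < 4 { next }                 # skip blank or malformed lines
    {
      split($2, ready, "/")         # READY looks like "2/2"
      problems = ""
      if (ready[1] != ready[2]) problems = problems " ready=" $2
      if ($3 != "Running")      problems = problems " status=" $3
      if ($4 + 0 > 0)           problems = problems " restarts=" $4
      if (problems != "")       print $1 ":" problems
    }
  '
}

# Usage against a live cluster (commented out so the snippet has no side effects):
# kubectl get pods --no-headers | flag_unhealthy_pods
```

The sketch treats any status other than Running as worth a look, so Pods that finished normally are listed too. A non-zero restart count is only a hint: a single restart during startup, as for my-cluster-name-cfg-1 in the example output, may be harmless.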

You can find more details about a specific Pod using the kubectl describe pods <pod-name> command.

$ kubectl describe pods my-cluster-name-rs0-0

Expected output

...
Name:         my-cluster-name-rs0-0
Namespace:    default
...
Controlled By:  StatefulSet/my-cluster-name-rs0
Init Containers:
 mongo-init:
...
Containers:
 mongod:
...
   Restart Count:  0
   Limits:
     cpu:     300m
     memory:  500M
   Requests:
     cpu:      300m
     memory:   500M
   Liveness:   exec [/opt/percona/mongodb-healthcheck k8s liveness --ssl --sslInsecure --sslCAFile /etc/mongodb-ssl/ca.crt --sslPEMKeyFile /tmp/tls.pem --startupDelaySeconds 7200] delay=60s timeout=10s period=30s #success=1 #failure=4
   Readiness:  tcp-socket :27017 delay=10s timeout=2s period=3s #success=1 #failure=8
   Environment Variables from:
     internal-my-cluster-name-users  Secret  Optional: false
   Environment:
...
   Mounts:
...
Volumes:
...
Events:                      <none>

The output contains detailed information about the containers, their resources and status, and the Pod events. Check the describe output for any abnormalities.
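Because the describe output is long, it can help to reduce it to the lines that most often explain a failing Pod. The following is a minimal sketch: describe_highlights is a hypothetical helper name, and the selection of fields (State, Last State, Reason, Exit Code, Restart Count, Liveness, Readiness, plus the Events section) is an assumption about what is usually relevant, not an official list.

```shell
# Hypothetical helper: reads `kubectl describe pods <pod-name>` output on stdin
# and keeps the container state, restart and probe lines, followed by
# everything from the "Events:" section to the end of the output.
describe_highlights() {
  awk '
    /^Events:/ { in_events = 1 }
    in_events  { print; next }
    /^[ \t]*(State|Last State|Reason|Exit Code|Restart Count|Liveness|Readiness):/ { print }
  '
}

# Usage against a live cluster (commented out so the snippet has no side effects):
# kubectl describe pods my-cluster-name-rs0-0 | describe_highlights
```

If the shortened view shows something unusual, read the full describe output for that container to get the surrounding context.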

Last update: 2025-07-28