High availability and scaling

One of the great advantages brought by Kubernetes and the OpenShift platform is the ease of application scaling. Scaling an application means adding resources or Pods and scheduling them on the available Kubernetes nodes.

Scaling can be vertical or horizontal. Vertical scaling adds more compute or storage resources to PostgreSQL nodes; horizontal scaling adds more nodes to the cluster. High availability looks technically similar, because it also involves additional nodes, but its purpose is to keep the system alive in case of server or network failures.

Vertical scaling

Scale compute

The Operator deploys and manages multiple components: PostgreSQL instances, the pgBouncer connection pooler, etc. To add or reduce CPU or memory, edit the corresponding sections in the Custom Resource. We follow the structure for requests and limits that Kubernetes provides.

To add more resources to your PostgreSQL instances, edit the following section in the Custom Resource:

spec:
...
  instances:
  - name: instance1
    replicas: 3
    resources:
      limits:
        cpu: 2.0
        memory: 4Gi

Refer to the Custom Resource options reference documentation for details about other components.
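
For example, resources for the pgBouncer connection pooler live under the proxy subsection of the Custom Resource. The following is a minimal sketch (it assumes pgBouncer is enabled in your cluster; the values are placeholders to adjust for your workload):

spec:
...
  proxy:
    pgBouncer:
      resources:
        limits:
          cpu: 500m
          memory: 256Mi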

Scale storage

Kubernetes manages storage with a PersistentVolume (PV), a segment of storage supplied by the administrator, and a PersistentVolumeClaim (PVC), a request for storage from a user. Kubernetes v1.11 added the ability to increase the size of an existing PVC object (the feature is considered stable since Kubernetes v1.24). The user cannot shrink the size of an existing PVC object.

Scaling with Volume Expansion capability

Certain volume types support PVC expansion (exact details about PVCs and the supported volume types can be found in the Kubernetes documentation).

You can run the following command to check if your storage supports the expansion capability:

$ kubectl describe sc <storage class name> | grep AllowVolumeExpansion
Expected output
AllowVolumeExpansion: true
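
If you are not sure which storage class your cluster PVCs use, you can list them together with their storage classes first (the namespace is a placeholder):

$ kubectl get pvc -n <namespace> -o custom-columns=NAME:.metadata.name,STORAGECLASS:.spec.storageClassName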

Operator versions 2.5.0 and higher automatically expand such storage for you when you change the appropriate options in the Custom Resource.

For example, you can do it by editing and applying the deploy/cr.yaml file:

spec:
  ...
  instances:
    ...
    dataVolumeClaimSpec:
      resources:
        requests:
          storage: <NEW STORAGE SIZE>

Apply changes as usual:

$ kubectl apply -f cr.yaml
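
Once the resize is finished, the new size is reflected in the PVC capacity. A quick way to verify it (the PVC name and namespace are placeholders):

$ kubectl get pvc <pvc name> -n <namespace> -o jsonpath='{.status.capacity.storage}'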

Automated scaling with auto-growable disk

The Operator versions 2.5.0 and newer can detect when storage usage on the PVC reaches a certain threshold and trigger a PVC resize. Such autoscaling requires the upstream “auto-growable disk” feature to be turned on when deploying the Operator. This is done via the PGO_FEATURE_GATES environment variable set in the deploy/operator.yaml manifest (or in the appropriate part of deploy/bundle.yaml):

...
subjects:
- kind: ServiceAccount
  name: percona-postgresql-operator
  namespace: pg-operator
...
spec:
  containers:
  - env:
    - name: PGO_FEATURE_GATES
      value: "AutoGrowVolumes=true"
...
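
If the Operator is already running, you can set the same variable without editing the manifest. The sketch below assumes the default Deployment name percona-postgresql-operator in the pg-operator namespace; adjust both to your installation (note that this triggers a rolling restart of the Operator Pod):

$ kubectl set env deployment/percona-postgresql-operator -n pg-operator PGO_FEATURE_GATES="AutoGrowVolumes=true"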

When support for auto-growable disks is turned on, automatic growth works as long as you set the maximum size the Operator is allowed to scale up to in the spec.instances[].dataVolumeClaimSpec.resources.limits.storage Custom Resource option:

spec:
  ...
  instances:
    ...
    dataVolumeClaimSpec:
      resources:
        requests:
          storage: 1Gi
        limits:
          storage: 5Gi
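
You can observe the Operator growing the volumes within these bounds by watching the PVC sizes over time (the namespace is a placeholder):

$ kubectl get pvc -n <namespace> -w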

High availability

Percona Operator allows you to deploy highly-available PostgreSQL clusters. The high-availability implementation is based on the Patroni template, which uses PostgreSQL streaming replication. The cluster includes a number of replicas, one of which is the primary PostgreSQL instance: it accepts writes and streams changes to the other replicas (standby servers in PostgreSQL terms).

Streaming replication in this configuration is asynchronous by default, which means data is transferred to another instance without waiting for confirmation that it was received. If the primary server crashes, some committed transactions may not have been replicated to the standby server, causing data loss (the amount of data loss is proportional to the replication delay at the time of failover). Alternatively, synchronous replication can be used, where the data transfer waits for confirmation of its successful processing on the standby. Synchronous replication is slower, but it minimizes the possibility of data loss in case of a primary server crash.

There are two ways to control the number of replicas in your HA cluster:

  1. By changing the spec.instances.replicas value
  2. By adding a new entry to spec.instances

Using spec.instances.replicas

For example, you have the following Custom Resource manifest:

spec:
...
  instances:
    - name: instance1
      replicas: 2

This provisions a cluster with two nodes: one primary and one replica. Add a node by changing the manifest…

spec:
...
  instances:
    - name: instance1
      replicas: 3

…and applying the Custom Resource:

$ kubectl apply -f deploy/cr.yaml

The Operator will provision a new replica node. It will be ready and available once the data is synchronized from the primary.
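
You can check that the new replica Pod has appeared and become ready by listing the database Pods. The label selector below is an assumption based on the labels the Operator applies to instance Pods; a plain kubectl get pods in the cluster namespace works as well:

$ kubectl get pods -n <namespace> --selector=postgres-operator.crunchydata.com/instance-set=instance1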

Using spec.instances

Each instance entry has its own set of parameters: resources, storage configuration, sidecars, etc. Adding a new entry to instances creates additional replica PostgreSQL nodes with a new set of parameters. This can be useful in various cases:

  • Test or migrate to new hardware
  • Blue-green deployment of a new configuration
  • Try out new versions of your sidecar containers

For example, you have the following Custom Resource manifest:

spec:
...
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: old-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi

Suppose your goal is to migrate to new disks that come with the new-ssd storage class. You can add a new instance entry. This instructs the Operator to create additional nodes with the new configuration while keeping your existing nodes intact.

spec:
...
  instances:
    - name: instance1
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: old-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
    - name: instance2
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: new-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi
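
Once the instance2 nodes are up and have caught up with the primary, you can typically complete the migration by removing the original instance1 entry and applying the manifest again, leaving only the new storage class in place. This is a sketch of the end state; check the replication status of the new nodes before removing the old set:

spec:
...
  instances:
    - name: instance2
      replicas: 2
      dataVolumeClaimSpec:
        storageClassName: new-ssd
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 100Gi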

Using synchronous replication

Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one or more synchronous standby servers. When requesting synchronous replication, each commit of a write transaction will wait until confirmation is received that the commit has been written to the write-ahead log on disk of both the primary and standby server. The drawbacks of synchronous replication are increased latency and reduced throughput on writes.

You can turn on synchronous replication by customizing the patroni.dynamicConfiguration Custom Resource option.

  • Enable synchronous replication by setting the synchronous_mode option to on.
  • Use the synchronous_node_count option to set the number of replicas (PostgreSQL standby servers) which should operate in synchronous mode (the default value is 1).

The result in your deploy/cr.yaml manifest may look as follows:

...
  patroni:
    dynamicConfiguration:
      synchronous_mode: "on"
      synchronous_node_count: 2
      ...

The desired number of replicas will switch to synchronous replication after you apply the changes as usual, with the kubectl apply -f deploy/cr.yaml command.
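
You can verify which standby servers actually run in synchronous mode from inside any database Pod with the patronictl tool. The Pod and container names below are assumptions based on a typical deployment; adjust them to yours:

$ kubectl exec -it <cluster name>-instance1-<suffix>-0 -n <namespace> -c database -- patronictl list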

You can find more options for tuning how your database cluster operates in synchronous mode in the official Patroni documentation.
