Configure storage for backups¶
You can configure storage for backups in the `backup.storages` subsection of the Custom Resource, using the `deploy/cr.yaml` configuration file.
Warning
Remote storage for backups has technical preview status.
You should also create a Kubernetes Secret object with the credentials needed to access the storage.
Amazon S3 or S3-compatible storage¶
-   To store backups on Amazon S3, you need to create a Secret with the following values:

    - the `metadata.name` key is the name which you will further use to refer to your Kubernetes Secret,
    - the `data.AWS_ACCESS_KEY_ID` and `data.AWS_SECRET_ACCESS_KEY` keys are base64-encoded credentials used to access the storage (obviously, these keys should contain proper values to make the access possible).

    Create the Secrets file with these base64-encoded keys following the `deploy/backup-s3.yaml` example:

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: my-cluster-name-backup-s3
    type: Opaque
    data:
      AWS_ACCESS_KEY_ID: UkVQTEFDRS1XSVRILUFXUy1BQ0NFU1MtS0VZ
      AWS_SECRET_ACCESS_KEY: UkVQTEFDRS1XSVRILUFXUy1TRUNSRVQtS0VZ
    ```
    Note

    You can use the following command to get a base64-encoded string from a plain text one:

    in Linux:

    ```bash
    $ echo -n 'plain-text-string' | base64 --wrap=0
    ```

    in macOS:

    ```bash
    $ echo -n 'plain-text-string' | base64
    ```

    Once the editing is over, create the Kubernetes Secret object as follows:

    ```bash
    $ kubectl apply -f deploy/backup-s3.yaml
    ```
-   Put the data needed to access the S3-compatible cloud into the `backup.storages` subsection of the Custom Resource:

    - `storages.<NAME>.type` should be set to `s3` (substitute the `<NAME>` part with some arbitrary name you will later use to refer to this storage when making backups and restores).
    - The `storages.<NAME>.s3.credentialsSecret` key should be set to the name used to refer to your Kubernetes Secret (`my-cluster-name-backup-s3` in the last example).
    - `storages.<NAME>.s3.bucket` and `storages.<NAME>.s3.region` should contain the S3 bucket and region. Also, you can use the `storages.<NAME>.s3.prefix` option to specify the path (sub-folder) to the backups inside the S3 bucket. If prefix is not set, backups are stored in the root directory.
    - If you use some S3-compatible storage instead of the original Amazon S3, add the `endpointUrl` key in the `s3` subsection, which should point to the actual cloud used for backups. This value is specific to the cloud provider. For example, using Google Cloud involves the following `endpointUrl` (a combined sketch with `prefix` and `endpointUrl` follows this list):

        ```yaml
        endpointUrl: https://storage.googleapis.com
        ```

    The options within the `storages.<NAME>.s3` subsection are further explained in the Operator Custom Resource options.

    Here is an example of the `deploy/cr.yaml` configuration file which configures Amazon S3 storage for backups:

    ```yaml
    ...
    backup:
      ...
      storages:
        s3-us-west:
          type: s3
          s3:
            bucket: S3-BACKUP-BUCKET-NAME-HERE
            region: us-west-2
            credentialsSecret: my-cluster-name-backup-s3
      ...
    ```
-   Finally, make sure that your storage has enough resources to store backups, which is especially important in the case of large databases. Obviously, you need enough free space on the storage. Besides that, S3 upload limitations include a maximum of 10000 parts, so backing up large data will result in larger chunk sizes, which in turn may cause the S3 server to run out of RAM, especially within the default memory limits.
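For illustration, here is a hedged sketch of a storage entry for an S3-compatible service that combines the `prefix` and `endpointUrl` options described above. The storage name, bucket, prefix, and region are placeholders, and the endpoint shown is the Google Cloud one from the example above; adjust all values to your provider:

```yaml
...
backup:
  ...
  storages:
    s3-compatible:
      type: s3
      s3:
        bucket: MY-BACKUP-BUCKET
        prefix: psmdb/my-cluster
        region: us-east-1
        credentialsSecret: my-cluster-name-backup-s3
        endpointUrl: https://storage.googleapis.com
  ...
```

With such a configuration, backups would be uploaded under the `psmdb/my-cluster` path inside the bucket instead of its root directory.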
Automating access to Amazon S3 based on IAM roles¶
Using AWS EC2 instances for backups makes it possible to automate access to AWS S3 buckets based on Identity and Access Management (IAM) roles for Service Accounts, with no need to specify the S3 credentials explicitly.

You can either create and use an IAM instance profile, or configure IAM Roles for Service Accounts (IRSA); both ways heavily rely on AWS specifics and require following the official Amazon documentation to be configured.

The following steps are needed to turn this feature on using an IAM instance profile:
- Create the IAM instance profile and the permission policy within it, where you specify the access level that grants access to S3 buckets.
- Attach the IAM profile to an EC2 instance.
- Configure an S3 storage bucket in the Custom Resource and verify the connection from the EC2 instance to it.
- Do not provide `s3.credentialsSecret` for the storage in `deploy/cr.yaml` (a minimal sketch of such a storage definition follows this list).
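For illustration, here is a minimal sketch of such a storage definition in `deploy/cr.yaml`, assuming the IAM instance profile from the steps above is attached to the instance. The storage name, bucket, and region are placeholders, and there is intentionally no `credentialsSecret`:

```yaml
...
backup:
  ...
  storages:
    s3-iam-profile:
      type: s3
      s3:
        bucket: MY-BACKUP-BUCKET
        region: us-west-2
  ...
```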
Another approach, IAM Roles for Service Accounts (IRSA), is the native way for a cluster running on Amazon Elastic Kubernetes Service (AWS EKS) to access the AWS API using permissions configured in AWS IAM roles.

Assuming that you have deployed the MongoDB Operator and the database cluster on EKS following our installation steps, and your EKS cluster has the OpenID Connect (OIDC) issuer URL enabled, the high-level steps to configure it are the following:
-   Create an IAM role for your OIDC provider, and attach to the created role the policy that defines access to an S3 bucket. See the official Amazon documentation for details.
-   Find out the service accounts used for the Operator and for the database cluster. The service account for the Operator is `percona-server-mongodb-operator` (it is set by the `serviceAccountName` key in the `deploy/operator.yaml` or `deploy/bundle.yaml` manifest). The cluster's default account is `default` (it can be set with the `serviceAccountName` Custom Resource option in the `replsets`, `sharding.configsvrReplSet`, and `sharding.mongos` subsections of the `deploy/cr.yaml` manifest).

-   Annotate both service accounts with the needed IAM roles. The commands should look as follows:
    ```bash
    $ kubectl -n <cluster namespace> annotate serviceaccount default eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/my-role --overwrite
    $ kubectl -n <operator namespace> annotate serviceaccount percona-server-mongodb-operator eks.amazonaws.com/role-arn=arn:aws:iam::111122223333:role/my-role --overwrite
    ```
    Don’t forget to substitute the `<operator namespace>` and `<cluster namespace>` placeholders with the real namespaces, and use your IAM role instead of the `arn:aws:iam::111122223333:role/my-role` example (you can verify the result with the sketch after this list).

-   Configure an S3 storage bucket in the Custom Resource and verify the connection from the EC2 instance to it. Do not provide `s3.credentialsSecret` for the storage in `deploy/cr.yaml`.
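To double-check that the annotations were applied, you can inspect both service accounts; this is only a sketch using the same placeholder namespaces as above, and the `eks.amazonaws.com/role-arn` annotation should appear under `metadata.annotations` in the output:

```bash
$ kubectl -n <cluster namespace> get serviceaccount default -o yaml
$ kubectl -n <operator namespace> get serviceaccount percona-server-mongodb-operator -o yaml
```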
Note
If IRSA-related credentials are defined, they take priority over any IAM instance profile. S3 credentials stored in a Secret, if present, override any IRSA or IAM instance profile credentials and are used for authentication instead.
Microsoft Azure Blob storage¶
-   To store backups on Azure Blob storage, you need to create a Secret with the following values:

    - the `metadata.name` key is the name which you will further use to refer to your Kubernetes Secret,
    - the `data.AZURE_STORAGE_ACCOUNT_NAME` and `data.AZURE_STORAGE_ACCOUNT_KEY` keys are base64-encoded credentials used to access the storage (obviously, these keys should contain proper values to make the access possible).

    Create the Secrets file with these base64-encoded keys following the `deploy/backup-azure.yaml` example:

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: my-cluster-azure-secret
    type: Opaque
    data:
      AZURE_STORAGE_ACCOUNT_NAME: UkVQTEFDRS1XSVRILUFXUy1BQ0NFU1MtS0VZ
      AZURE_STORAGE_ACCOUNT_KEY: UkVQTEFDRS1XSVRILUFXUy1TRUNSRVQtS0VZ
    ```

    (An alternative way to create this Secret without manually encoding the values is sketched after this list.)
    Note

    You can use the following command to get a base64-encoded string from a plain text one:

    in Linux:

    ```bash
    $ echo -n 'plain-text-string' | base64 --wrap=0
    ```

    in macOS:

    ```bash
    $ echo -n 'plain-text-string' | base64
    ```

    Once the editing is over, create the Kubernetes Secret object as follows:

    ```bash
    $ kubectl apply -f deploy/backup-azure.yaml
    ```
-   Put the data needed to access the Azure Blob storage into the `backup.storages` subsection of the Custom Resource:

    - `storages.<NAME>.type` should be set to `azure` (substitute the `<NAME>` part with some arbitrary name you will later use to refer to this storage when making backups and restores).
    - The `storages.<NAME>.azure.credentialsSecret` key should be set to the name used to refer to your Kubernetes Secret (`my-cluster-azure-secret` in the last example).
    - The `storages.<NAME>.azure.container` option should contain the name of the Azure container. Also, you can use the `storages.<NAME>.azure.prefix` option to specify the path (sub-folder) to the backups inside the container. If prefix is not set, backups are stored in the root directory of the container.

    These and other options within the `storages.<NAME>.azure` subsection are further described in the Operator Custom Resource options.

    Here is an example of the `deploy/cr.yaml` configuration file which configures Azure Blob storage for backups:

    ```yaml
    ...
    backup:
      ...
      storages:
        azure-blob:
          type: azure
          azure:
            container: <your-container-name>
            prefix: psmdb
            credentialsSecret: my-cluster-azure-secret
      ...
    ```
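As an alternative to base64-encoding the credentials by hand, you can let kubectl encode them for you. Here is a sketch for the Azure Secret from the example above, with placeholder plain-text values; the same approach works for the Amazon S3 Secret with its `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` keys:

```bash
$ kubectl create secret generic my-cluster-azure-secret \
    --from-literal=AZURE_STORAGE_ACCOUNT_NAME='REPLACE-WITH-ACCOUNT-NAME' \
    --from-literal=AZURE_STORAGE_ACCOUNT_KEY='REPLACE-WITH-ACCOUNT-KEY'
```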
Remote file server¶
You can use the `filesystem` backup storage type to mount a remote file server to a local directory as a sidecar volume, and make Percona Backup for MongoDB use this directory as a storage for backups.
The approach is based on the common Network File System (NFS) protocol. In particular, this storage type is useful in network-restricted environments without S3-compatible storage, or in cases with a non-standard storage service that still supports NFS access.
-   Add the remote storage as a sidecar volume in the `replsets` section of the Custom Resource (and also in `sharding.configsvrReplSet` in case of a sharded cluster). You will need to specify the server hostname and some directory on it, as in the following example:

    ```yaml
    replsets:
    - name: rs0
      ...
      sidecarVolumes:
      - name: backup-nfs-vol
        nfs:
          server: "nfs-service.storage.svc.cluster.local"
          path: "/psmdb-my-cluster-name-rs0"
      ...
    ```

    The `backup-nfs-vol` name specified above will be used to refer to this sidecar volume in the backup section.
-   Now put the mount point (the local directory path to which the remote storage will be mounted) and the name of your sidecar volume into the `backup.volumeMounts` subsection of the Custom Resource:

    ```yaml
    backup:
      ...
      volumeMounts:
      - mountPath: /mnt/nfs/
        name: backup-nfs-vol
      ...
    ```
-   Finally, storage of the `filesystem` type needs to be configured in the `backup.storages` subsection. It needs only the mount point:

    ```yaml
    backup:
      enabled: true
      ...
      storages:
        backup-nfs:
          type: filesystem
          filesystem:
            path: /mnt/nfs/
    ```