- Amazon S3
Amazon S3 (Simple Storage Service) is an object storage service that Amazon Web Services offers through a web service interface.
- Atomicity
Atomicity means that database operations are applied following an “all or nothing” rule. A transaction is either fully applied or not applied at all.
- Blob
A blob (Binary Large Object) is an object such as an image or a multimedia file. In other words, blobs are the data files that you store in Microsoft’s data storage platform. Blobs are organized in containers, which are kept in Azure Blob storage under your storage account.
- Bucket
A bucket is a container on the S3 remote storage that stores backups.
- Collection
A collection is the way data is organized in MongoDB. It is analogous to a table in relational databases.
- Completion time
The completion time is the time to which the sharded cluster or non-sharded replica set will be returned after the restore. It is reflected in the “complete” section of the pbm status command output.
In logical backups, the completion time almost coincides with the backup finish time. To define the completion time, Percona Backup for MongoDB waits for the backup snapshot to finish on all cluster nodes. Then it captures the oplog from the backup start time up to that time.
In physical backups, the completion time is only a few seconds after the backup start time. Holding the $backupCursor open guarantees that the checkpoint data won’t change during the backup, so Percona Backup for MongoDB can define the completion time in advance.
- Consistency
In the context of backup and restore, consistency means that the restored data will be consistent as of a given point in time. Partial or incomplete writes to disk of atomic operations (for example, to table and index data structures separately) won’t be served to the client after the restore. The same applies to multi-document transactions that started but didn’t complete by the time the backup finished.
- Container
A container is like a directory in Azure Blob storage that contains a set of blobs.
- Durability
Once a transaction is committed, it will remain so.
- GCP
GCP (Google Cloud Platform) is the set of services, including the storage service, that run on Google Cloud infrastructure.
- Isolation
The isolation requirement means that no transaction can interfere with another.
- Jenkins
Jenkins is a continuous integration system that we use to help ensure the continued quality of the software we produce. It helps us achieve the following aims:
no failed tests in trunk on any platform,
developers get help ensuring that merge requests build and test on all platforms,
no known performance regressions (without a very good explanation).
- MinIO
MinIO is a cloud storage server compatible with Amazon S3, released under Apache License v2.
- Oplog
Oplog (operations log) is a fixed-size collection that keeps a rolling record of all operations that modify data in the database.
- Oplog slice
A compressed bundle of oplog entries stored in the Oplog Store database in MongoDB. An oplog slice captures approximately a 10-minute frame. For a snapshot, the oplog slice size is defined by the time that the slowest replica set member requires to perform mongodump.
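The idea of cutting the oplog into fixed time frames can be illustrated with a minimal sketch. It is a toy model, not Percona Backup for MongoDB's implementation: the `slice_oplog` function, the integer-second timestamps, and the sample entries are all hypothetical, and compression is left out.

```python
from collections import defaultdict

SLICE_SECONDS = 10 * 60  # the approximately 10-minute frame described above


def slice_oplog(entries, slice_seconds=SLICE_SECONDS):
    """Group (timestamp, operation) pairs into consecutive fixed-length frames.

    Hypothetical helper: timestamps are plain integers (seconds), and each
    frame is keyed by the timestamp at which it starts.
    """
    slices = defaultdict(list)
    for ts, op in entries:
        frame_start = ts - (ts % slice_seconds)  # round down to the frame start
        slices[frame_start].append((ts, op))
    # Return the frames in chronological order.
    return dict(sorted(slices.items()))


# Made-up sample entries for illustration only.
entries = [(0, "insert a"), (599, "update a"), (600, "insert b"), (1250, "delete a")]
slices = slice_oplog(entries)
```

In this sketch, the four sample entries fall into three frames that start at 0, 600, and 1200 seconds.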
- OpID
A unique identifier of an operation such as backup, restore, or resync. When a pbm-agent starts processing an operation, it acquires a lock and an opID. This prevents processing the same operation twice (for example, if there are network issues in distributed systems). Using the opID as a log filter allows viewing logs for an operation in progress.
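The lock-plus-identifier pattern described above can be sketched in a few lines. This is an in-process illustration under stated assumptions, not how pbm-agent is implemented: the `OperationLocks` class and its methods are hypothetical, and a real distributed system would keep the lock in shared storage rather than in memory.

```python
import threading
import uuid


class OperationLocks:
    """Hypothetical in-memory lock table: one holder per operation name."""

    def __init__(self):
        self._guard = threading.Lock()
        self._held = {}  # operation name -> opID of the current holder

    def acquire(self, operation):
        """Return a new opID, or None if the operation is already in progress."""
        with self._guard:
            if operation in self._held:
                return None  # a duplicate request is rejected
            op_id = uuid.uuid4().hex
            self._held[operation] = op_id
            return op_id

    def release(self, operation, op_id):
        """Release the lock, but only for the holder that owns this opID."""
        with self._guard:
            if self._held.get(operation) == op_id:
                del self._held[operation]


locks = OperationLocks()
first = locks.acquire("backup")
second = locks.acquire("backup")  # same operation requested again while running
locks.release("backup", first)
third = locks.acquire("backup")   # allowed again after the release
```

Here the second request returns `None` because the first one still holds the lock, which is the "do not process the same operation twice" behavior in miniature.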
- pbm-agent
pbm-agent is a PBM process running on the mongod node for backup and restore operations. A pbm-agent instance is required for every mongod node (including replica set secondary members and config server replica set nodes).
- pbm CLI
Command-line interface for controlling the backup system. PBM CLI can connect to several clusters so that a user can manage backups on many clusters.
- PBM Control collections
PBM Control collections are collections with config, authentication data and backup states. They are stored in the admin database in the cluster or non-sharded replica set and serve as the communication channel between pbm-agent and pbm CLI. pbm CLI creates a new pbmCmd document for a new operation. pbm-agents monitor it and update it as they process the operation.
- Percona Backup for MongoDB
Percona Backup for MongoDB (PBM) is a low-impact backup solution for MongoDB non-sharded replica sets and clusters. It supports both Percona Server for MongoDB and MongoDB Community Edition.
- Percona Server for MongoDB
Percona Server for MongoDB is a drop-in replacement for MongoDB Community Edition with enterprise-grade features.
- Point-in-Time Recovery
Point-in-Time Recovery is restoring the database up to a specific moment in time. The data is restored from the backup snapshot and then events that occurred to the data are replayed from oplog.
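The "snapshot plus oplog replay" idea can be sketched as follows. This is a simplified model, not the actual restore procedure: the `restore_to_point_in_time` function, the `(timestamp, op, key, value)` entry format, and the sample data are hypothetical, and the sketch assumes that every oplog entry was recorded after the snapshot was taken.

```python
def restore_to_point_in_time(snapshot, oplog, target_ts):
    """Rebuild the data as of target_ts from a snapshot and later oplog entries.

    Hypothetical model: the snapshot is a dict, and each oplog entry is a
    (timestamp, op, key, value) tuple where op is "set" or "delete".
    """
    data = dict(snapshot)  # start from a copy of the backup snapshot
    for ts, op, key, value in sorted(oplog, key=lambda entry: entry[0]):
        if ts > target_ts:
            break  # stop replaying at the requested moment
        if op == "set":
            data[key] = value
        elif op == "delete":
            data.pop(key, None)
    return data


# Made-up sample data for illustration only.
snapshot = {"a": 1}
oplog = [(10, "set", "b", 2), (20, "set", "a", 5), (30, "delete", "b", None)]
restored = restore_to_point_in_time(snapshot, oplog, target_ts=20)
```

With a target time of 20, the first two entries are replayed and the delete at time 30 is not, so the restored data contains both `a` and `b`.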
- Replica set
A replica set is a group of mongod nodes that host the same data set.
- S3 compatible storage
S3 compatible storage is storage that is built on the S3 API.
- Server-side encryption
Server-side encryption is the encryption of data by the remote storage server as it receives it. The data is encrypted when it is written to an S3 bucket and decrypted when you access the data.