This feature was implemented in Percona XtraBackup 8.0.26-18.0 in the xbcloud binary.
Exponential backoff increases the chances for the completion of a backup or a restore operation. For example, a chunk upload or download may fail if you have an unstable network connection or other network issues. This feature adds an exponential backoff, or sleep, time and then retries the upload or download.
When a chunk upload or download operation fails, xbcloud checks the reason
for the failure. This failure can be a CURL error, an HTTP error, or a
client-specific error. If the error is listed in the Retriable errors list,
xbcloud pauses for a calculated time before retrying the operation. The pause
grows with each retry until it reaches the --max-backoff value.
The operation is retried until the
--max-retries value is reached. If the
chunk operation fails on the last retry, xbcloud aborts the process.
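The retry flow described above can be sketched roughly as follows. This is an illustrative Python sketch, not xbcloud's implementation: `RetriableError`, `run_with_retries`, and the `delay_ms` callback are hypothetical names, and the sketch assumes that --max-retries counts the retries made after the first failed attempt.

```python
import time


class RetriableError(Exception):
    """Hypothetical stand-in for an error found on the retriable errors list."""


def run_with_retries(operation, delay_ms, max_retries=10, sleep=time.sleep):
    """Run `operation`, retrying on RetriableError up to `max_retries` times.

    `delay_ms(retry)` is supplied by the caller and returns the pause, in
    milliseconds, before retry number `retry` (1-based). Any other exception
    is treated as non-retriable and propagates immediately.
    """
    retries = 0
    while True:
        try:
            return operation()
        except RetriableError:
            if retries >= max_retries:
                # The last retry also failed: give up and abort.
                raise
            retries += 1
            sleep(delay_ms(retries) / 1000.0)
```

Passing the delay calculation in as a callback keeps the retry loop separate from the backoff formula, so either one can be changed or tested on its own.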
The default values are the following:
--max-backoff = 300000 (5 minutes)
--max-retries = 10
You can adjust the number of retries by adding the
--max-retries parameter and adjust the maximum length of time between retries
by adding the --max-backoff parameter to an xbcloud command.
Since xbcloud does multiple asynchronous requests in parallel, a calculated
value, measured in milliseconds, is used for
--max-backoff. This algorithm
calculates how many milliseconds to sleep before the next retry. The number
combines a power of two (2), based on the number of retries already
attempted, with a random number between 1 and 1000. The random
component avoids network congestion if multiple chunks have the same backoff
value. If the default values are used, the final retry attempt is
approximately 17 minutes after the first try. The number is no longer
calculated when the milliseconds reach the
--max-backoff setting. At that
point, the retries continue by using the
--max-backoff setting until the
--max-retries parameter is reached.
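As a rough illustration of the calculation, the sketch below computes a capped delay in Python. The exact formula is an assumption: the text does not spell it out, and the scaling of the power of two to whole seconds (multiplying by 1000) is inferred from the sample sleep times of 2384, 4387, and 8691 ms shown later on this page. The function name `backoff_ms` and its parameters are hypothetical.

```python
import random


def backoff_ms(retry, max_backoff_ms=300000, rng=random):
    """Return the sleep time in milliseconds before retry number `retry` (1-based)."""
    # Assumed formula: 2**retry seconds, expressed in milliseconds,
    # plus 1 to 1000 ms of random jitter so that parallel chunks do not
    # all retry at the same instant.
    delay = (2 ** retry) * 1000 + rng.randint(1, 1000)
    # Once the computed value reaches the cap, the cap itself is used.
    return min(delay, max_backoff_ms)


if __name__ == "__main__":
    # Print one possible schedule for --max-backoff=10000.
    for retry in range(1, 6):
        print(retry, backoff_ms(retry, max_backoff_ms=10000))
```

Under this assumed formula and a cap of 10000 ms, the first three delays fall in the 2 to 3, 4 to 5, and 8 to 9 second ranges, and every later delay is exactly 10000 ms.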
We retry for the following CURL operations:
We retry for the following HTTP operation status codes:
Each cloud provider may return a different CURL error or an HTTP error,
depending on the issue. Add new errors by setting
--curl-retriable-errors or --http-retriable-errors on the
command line, in
my.cnf, or in a custom configuration file under
the [xbcloud] section. You can add multiple errors using a comma-separated list of codes.
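To illustrate how a comma-separated list of codes extends the retriable set, here is a minimal Python sketch. It does not reflect how xbcloud parses its options. `parse_error_codes` and `is_retriable` are hypothetical helpers, and the built-in code sets are passed in by the caller rather than reproduced here.

```python
def parse_error_codes(value):
    """Parse a comma-separated string such as "52,56" into a set of integer codes."""
    return {int(part) for part in value.split(",") if part.strip()}


def is_retriable(code, built_in_codes, extra_codes):
    """Return True if `code` is in the built-in set or in the user-supplied set."""
    return code in built_in_codes or code in extra_codes


if __name__ == "__main__":
    # Hypothetical usage: the user allowed CURL error 52 on the command line.
    extra = parse_error_codes("52")
    print(is_retriable(52, built_in_codes=set(), extra_codes=extra))
```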
The error handling is enhanced when using the
--verbose output. This
output specifies which error caused xbcloud to fail and what parameter a
user must add to retry on this error.
The following is an example of a verbose output:
210701 14:34:23 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210701 14:34:23 /work/pxb/ins/8.0/bin/xbcloud: Curl error (52) Server returned nothing (no headers, no data) is not configured as retriable. You can allow it by adding --curl-retriable-errors=52 parameter
The following example adjusts the maximum number of retries and the maximum time between retries.
xbcloud [options] --max-retries=5 --max-backoff=10000
The following list describes the process using --max-backoff=10000:
The chunk xtrabackup_logfile.00000000000000000006 fails to upload the first time and sleeps for 2384 milliseconds.
The same chunk fails for the second time and the sleep time is increased to 4387 milliseconds.
The same chunk fails for the third time and the sleep time is increased to 8691 milliseconds.
The same chunk fails for the fourth time. The --max-backoff parameter has been reached. All retries sleep the same amount of time after reaching the parameter.
The same chunk is successfully uploaded.
An example of the output for this setting:
210702 10:07:05 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210702 10:07:05 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 2384 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:07:23 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210702 10:07:23 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 4387 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:07:52 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Failed sending data to the peer
210702 10:07:52 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 8691 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:08:47 /work/pxb/ins/8.0/bin/xbcloud: Operation failed. Error: Failed sending data to the peer
210702 10:08:47 /work/pxb/ins/8.0/bin/xbcloud: Sleeping for 10000 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:10:12 /work/pxb/ins/8.0/bin/xbcloud: successfully uploaded chunk: backup3/xtrabackup_logfile.00000000000000000006, size: 8388660