
Exponential backoff

Exponential backoff increases the chance that a backup or restore operation completes. For example, a chunk upload or download may fail because of an unstable network connection or another network issue. With this feature, xbcloud sleeps for an exponentially increasing backoff time and then retries the upload or download.

When a chunk upload or download operation fails, xbcloud checks the reason for the failure, which can be a CURL error, an HTTP error, or a client-specific error. If the error is listed in the Retriable errors list, xbcloud pauses for a calculated time before retrying the operation. The calculated time grows with each retry until it reaches the --max-backoff value.

The operation is retried until the --max-retries value is reached. If the chunk operation fails on the last retry, xbcloud aborts the process.
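
The retry rule above can be sketched as a small loop. This is an illustrative sketch only, not xbcloud's implementation: `ChunkError`, `run_with_retries`, and the `delay_ms` callback are hypothetical names, and the sleep function is injectable so the example runs instantly.

```python
import time


class ChunkError(Exception):
    """Hypothetical failure of a single chunk upload or download."""

    def __init__(self, message, retriable):
        super().__init__(message)
        self.retriable = retriable


def run_with_retries(operation, max_retries=10, delay_ms=lambda retry: 0,
                     sleep=time.sleep):
    """Call operation() until it succeeds or the retry budget is spent."""
    retries = 0
    while True:
        try:
            return operation()
        except ChunkError as err:
            # A non-retriable error or an exhausted budget aborts the process.
            if not err.retriable or retries >= max_retries:
                raise
            retries += 1
            sleep(delay_ms(retries) / 1000.0)


# Usage: a fake upload that fails twice and then succeeds.
attempts = []


def flaky_upload():
    attempts.append(1)
    if len(attempts) < 3:
        raise ChunkError("Server returned nothing", retriable=True)
    return "uploaded"


result = run_with_retries(flaky_upload, max_retries=5, sleep=lambda s: None)
print(result, len(attempts))
```

In this sketch the first attempt does not count against the budget, so a chunk is tried at most `max_retries + 1` times before the error propagates.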

The default values are the following:

  • --max-backoff = 300000 milliseconds (5 minutes)

  • --max-retries = 10

You can adjust the number of retries by adding the --max-retries parameter and adjust the maximum length of time between retries by adding the --max-backoff parameter to an xbcloud command.

Because xbcloud makes multiple asynchronous requests in parallel, the sleep time before each retry is a calculated value, measured in milliseconds. The value combines a power of two (2), raised to the number of retries already attempted, with a random number between 1 and 1000. The random component avoids network congestion when multiple chunks would otherwise have the same backoff value. With the default values, the final retry attempt occurs approximately 17 minutes after the first try. Once the calculated value reaches the --max-backoff setting, it is no longer recalculated. From that point, the retries continue with the --max-backoff time until the --max-retries value is reached.
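
The calculation described above might look like the following sketch. The exact formula is not given in this document; the code assumes a base of 2 to the power of the retry number, in seconds, plus 1 to 1000 milliseconds of random jitter, which matches the sleep times in the example output later on this page. The function name `backoff_ms` and its parameters are hypothetical.

```python
import random


def backoff_ms(retry, max_backoff_ms=300000, rng=random):
    """Sleep time in milliseconds before the given retry (1-based)."""
    # Assumption: the exponential part is 2**retry seconds, expressed in ms.
    base = (2 ** retry) * 1000
    # Random jitter between 1 and 1000 ms, both ends inclusive.
    jitter = rng.randint(1, 1000)
    # Once the value reaches the cap, every later retry sleeps for the cap.
    return min(base + jitter, max_backoff_ms)


# Usage: the sleep schedule for the default settings (10 retries).
rng = random.Random(0)
schedule = [backoff_ms(n, rng=rng) for n in range(1, 11)]
print(schedule)
```

Under this assumption, the first eight sleeps grow from about 2 seconds to about 256 seconds, and the last two are held at the 300000 millisecond cap.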

Retriable errors

xbcloud retries on the following CURL errors:

  • CURLE_GOT_NOTHING

  • CURLE_OPERATION_TIMEDOUT

  • CURLE_RECV_ERROR

  • CURLE_SEND_ERROR

  • CURLE_SEND_FAIL_REWIND

  • CURLE_PARTIAL_FILE

  • CURLE_SSL_CONNECT_ERROR

xbcloud retries on the following HTTP status codes:

  • 503

  • 500

  • 504

  • 408

Each cloud provider may return a different CURL error or HTTP error, depending on the issue. To make additional errors retriable, set --curl-retriable-errors or --http-retriable-errors on the command line, in my.cnf, or in a custom configuration file under the [xbcloud] section. Each option accepts a comma-separated list of codes.

The --verbose output provides more detail about error handling. It states which error caused xbcloud to fail and which parameter to add so that xbcloud retries on this error.
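
As a rough illustration of how a comma-separated list extends the default set, the following sketch checks HTTP status codes. It is not xbcloud's code: `parse_codes` and `is_retriable` are hypothetical helpers, and 429 and 502 are arbitrary example codes, not values recommended by this document.

```python
# Default retriable HTTP status codes, as listed above.
DEFAULT_HTTP_RETRIABLE = {503, 500, 504, 408}


def parse_codes(value):
    """Parse a comma-separated list such as "429,502" into a set of ints."""
    return {int(part) for part in value.split(",") if part.strip()}


def is_retriable(status, extra=""):
    """Return True if status is a default or user-added retriable code."""
    return status in DEFAULT_HTTP_RETRIABLE | parse_codes(extra)


# Usage: 429 is not retriable by default, but is once the user adds it.
print(is_retriable(503))
print(is_retriable(429))
print(is_retriable(429, extra="429,502"))
```

The sketch treats user-supplied codes as additions to the defaults, following the wording "add new errors" in this section.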

The following is an example of a verbose output:

Expected output
210701 14:34:23 /work/pxb/ins/8.3/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210701 14:34:23 /work/pxb/ins/8.3/bin/xbcloud: Curl error (52) Server returned nothing (no headers, no data) is not configured as retriable. You can allow it by adding --curl-retriable-errors=52 parameter

Example

The following example adjusts the maximum number of retries and the maximum time between retries.

xbcloud [options] --max-retries=5 --max-backoff=10000

The following list describes the process using --max-backoff=10000:

  • The chunk xtrabackup_logfile.00000000000000000006 fails to upload the first time and sleeps for 2384 milliseconds.

  • The same chunk fails for the second time and the sleep time is increased to 4387 milliseconds.

  • The same chunk fails for the third time and the sleep time is increased to 8691 milliseconds.

  • The same chunk fails for the fourth time. The calculated sleep time has reached the --max-backoff value, so this retry and any later retries sleep for 10000 milliseconds.

  • The same chunk is successfully uploaded.

An example of the output for this setting
210702 10:07:05 /work/pxb/ins/8.3/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210702 10:07:05 /work/pxb/ins/8.3/bin/xbcloud: Sleeping for 2384 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:07:23 /work/pxb/ins/8.3/bin/xbcloud: Operation failed. Error: Server returned nothing (no headers, no data)
210702 10:07:23 /work/pxb/ins/8.3/bin/xbcloud: Sleeping for 4387 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:07:52 /work/pxb/ins/8.3/bin/xbcloud: Operation failed. Error: Failed sending data to the peer
210702 10:07:52 /work/pxb/ins/8.3/bin/xbcloud: Sleeping for 8691 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:08:47 /work/pxb/ins/8.3/bin/xbcloud: Operation failed. Error: Failed sending data to the peer
210702 10:08:47 /work/pxb/ins/8.3/bin/xbcloud: Sleeping for 10000 ms before retrying backup3/xtrabackup_logfile.00000000000000000006
. . .
210702 10:10:12 /work/pxb/ins/8.3/bin/xbcloud: successfully uploaded chunk: backup3/xtrabackup_logfile.00000000000000000006, size: 8388660
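
The sleep times in this output can be checked against the backoff description with a few lines of arithmetic. The sketch below assumes the same formula inferred earlier (2 to the power of the retry number, in seconds, plus 1 to 1000 milliseconds of jitter, capped at --max-backoff); the `explain` helper is hypothetical and only decomposes the values copied from the log.

```python
def explain(retry, sleep_ms, max_backoff_ms):
    """Decompose an observed sleep into base + jitter, or mark it as capped."""
    base = (2 ** retry) * 1000  # assumed exponential part, in ms
    jitter = sleep_ms - base
    if 1 <= jitter <= 1000 and sleep_ms <= max_backoff_ms:
        return f"{base} + {jitter}"
    if sleep_ms == max_backoff_ms and base + 1000 >= max_backoff_ms:
        return "capped"
    return "unexplained"


# Sleep times copied from the example output, in retry order.
observed = [2384, 4387, 8691, 10000]
rows = [explain(n, ms, 10000) for n, ms in enumerate(observed, start=1)]
print(rows)  # ['2000 + 384', '4000 + 387', '8000 + 691', 'capped']
```

The fourth value is capped because the assumed base for that retry (16000 milliseconds) already exceeds the 10000 millisecond limit.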



Last update: 2024-09-04