    Analyze table statistics

    Note

    As of Percona XtraBackup 8.0.30, the --stats option is deprecated and may be removed in a future release. Using the --stats option with Percona XtraBackup 8.0.29 and earlier versions is no longer supported.

    The xtrabackup binary can analyze InnoDB data files in read-only mode and report statistics about them. To do this, use the --stats option. You can combine it with the --tables option to limit the files to examine, and with the --use-memory option to control how much memory is allocated for the analysis.

    You can perform the analysis on a running server, with some chance of errors due to the data changing during the analysis, or you can analyze a backup copy of the database. Either way, the statistics feature needs a clean copy of the database, including correctly sized log files, so on a backup you must run xtrabackup with --prepare twice before gathering statistics.
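
    For example, to gather statistics from a backup, the backup could be prepared twice and then analyzed. The target directory, table filter, and memory limit shown here are placeholders:

    $ xtrabackup --prepare --target-dir=/data/backups/full
    $ xtrabackup --prepare --target-dir=/data/backups/full
    $ xtrabackup --stats --datadir=/data/backups/full \
        --tables='^test[.]table1$' --use-memory=1G

    To analyze a running server instead, point --datadir at the server's data directory and skip the --prepare steps.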

    The result of running on a backup might look like the following:

    Expected output
    <INDEX STATISTICS>
      table: test/table1, index: PRIMARY, space id: 12, root page 3
      estimated statistics in dictionary:
        key vals: 25265338, leaf pages 497839, size pages 498304
      real statistics:
         level 2 pages: pages=1, data=5395 bytes, data/pages=32%
         level 1 pages: pages=415, data=6471907 bytes, data/pages=95%
            leaf pages: recs=25958413, pages=497839, data=7492026403 bytes, data/pages=91%
    

    This can be interpreted as follows:

    • The first line shows the table and index names and their internal identifiers. If you see an index named GEN_CLUST_INDEX, it is the table’s clustered index, created automatically because you did not explicitly define a PRIMARY KEY.

    • The estimated statistics in dictionary information is similar to the data gathered by ANALYZE TABLE inside InnoDB, which is stored as estimated cardinality statistics and passed to the query optimizer.

    • The real statistics information is the result of scanning the data pages and computing exact information about the index.

    • The level <X> pages: output means that the line shows information about pages at that level in the index tree. The larger <X> is, the farther it is from the leaf pages, which are level 0. The first line is the root page.

    • The leaf pages output shows the leaf pages, which is where the table’s data is stored.

    • The external pages: output (not shown) shows large external pages that hold values too long to fit in the row itself, such as long BLOB and TEXT values.

    • recs is the real number of records (rows) in the leaf pages.

    • pages is the number of pages at that level.

    • data is the total size of the data in those pages, in bytes.

    • The data/pages figure is calculated as (data / (pages * PAGE_SIZE)) * 100%. It never reaches 100% because some space is reserved for page headers and footers; see the worked check after this list.
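
    As a quick check of the data/pages calculation, take the leaf pages line from the sample above (data=7492026403 bytes, pages=497839, default 16 KB page size):

    $ perl -e 'printf("%.2f\n", 7492026403 * 100 / (497839 * 16384))'
    91.85

    which corresponds to the truncated data/pages=91% shown in the sample output.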

    A more detailed example is posted as a MySQL Performance Blog post.

    Script to format output

    The following script can be used to summarize and tabulate the statistics output:

    tabulate-xtrabackup-stats.pl
    
    #!/usr/bin/env perl
    use strict;
    use warnings FATAL => 'all';
    my $script_version = "0.1";
    
    my $PG_SIZE = 16_384; # InnoDB defaults to 16k pages, change if needed.
    my ($cur_idx, $cur_tbl);
    my (%idx_stats, %tbl_stats);
    my ($max_tbl_len, $max_idx_len) = (0, 0);
    while ( my $line = <> ) {
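       # Table/index header line: remember which table and index the following statistics
       # belong to, and track name lengths so the output columns can be sized to fit.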
       if ( my ($t, $i) = $line =~ m/table: (.*), index: (.*), space id:/ ) {
          $t =~ s!/!.!;
          $cur_tbl = $t;
          $cur_idx = $i;
          if ( length($i) > $max_idx_len ) {
             $max_idx_len = length($i);
          }
          if ( length($t) > $max_tbl_len ) {
             $max_tbl_len = length($t);
          }
       }
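       # "key vals" line: record the estimated statistics from the InnoDB data dictionary,
       # per index and summed per table.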
       elsif ( my ($kv, $lp, $sp) = $line =~ m/key vals: (\d+), \D*(\d+), \D*(\d+)/ ) {
          @{$idx_stats{$cur_tbl}->{$cur_idx}}{qw(est_kv est_lp est_sp)} = ($kv, $lp, $sp);
          $tbl_stats{$cur_tbl}->{est_kv} += $kv;
          $tbl_stats{$cur_tbl}->{est_lp} += $lp;
          $tbl_stats{$cur_tbl}->{est_sp} += $sp;
       }
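       # "level N pages" / "leaf pages" lines: accumulate the real page and byte counts.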
        elsif ( my ($l, $pages, $bytes) = $line =~ m/(?:level (\d+)|leaf) pages:.*pages=(\d+), data=(\d+) bytes/ ) {
          $l ||= 0;
          $idx_stats{$cur_tbl}->{$cur_idx}->{real_pages} += $pages;
          $idx_stats{$cur_tbl}->{$cur_idx}->{real_bytes} += $bytes;
          $tbl_stats{$cur_tbl}->{real_pages} += $pages;
          $tbl_stats{$cur_tbl}->{real_bytes} += $bytes;
       }
    }
    
    my $hdr_fmt = "%${max_tbl_len}s %${max_idx_len}s %9s %10s %10s\n";
    my @headers = qw(TABLE INDEX TOT_PAGES FREE_PAGES PCT_FULL);
    printf $hdr_fmt, @headers;
    
    my $row_fmt = "%${max_tbl_len}s %${max_idx_len}s %9d %10d %9.1f%%\n";
    foreach my $t ( sort keys %tbl_stats ) {
      my $tbl = $tbl_stats{$t};
      printf $row_fmt, $t, "", $tbl->{est_sp}, $tbl->{est_sp} - $tbl->{real_pages},
          $tbl->{real_bytes} / ($tbl->{real_pages} * $PG_SIZE) * 100;
       foreach my $i ( sort keys %{$idx_stats{$t}} ) {
          my $idx = $idx_stats{$t}->{$i};
          printf $row_fmt, $t, $i, $idx->{est_sp}, $idx->{est_sp} - $idx->{real_pages},
             $idx->{real_bytes} / ($idx->{real_pages} * $PG_SIZE) * 100;
       }
    }
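
    Assuming the script is saved as tabulate-xtrabackup-stats.pl, the statistics output can be piped straight into it; the backup path below is a placeholder:

    $ xtrabackup --stats --datadir=/data/backups/full | perl tabulate-xtrabackup-stats.pl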
    

    Sample script output

    The output of the above Perl script, when run against the sample shown in the previously mentioned blog post, will appear as follows:

    Expected output
              TABLE           INDEX TOT_PAGES FREE_PAGES   PCT_FULL
    art.link_out104                    832383      38561      86.8%
    art.link_out104         PRIMARY    498304         49      91.9%
    art.link_out104       domain_id     49600       6230      76.9%
    art.link_out104     domain_id_2     26495       3339      89.1%
    art.link_out104 from_message_id     28160        142      96.3%
    art.link_out104    from_site_id     38848       4874      79.4%
    art.link_out104   revert_domain    153984      19276      71.4%
    art.link_out104    site_message     36992       4651      83.4%
    

    The columns are the table and index names, followed by the total number of pages in that index, the number of pages not actually occupied by data, and the number of bytes of real data as a percentage of the total size of the pages. The first line in the above output, in which the INDEX column is empty, is a summary of the entire table.


