Skip to content

Data Masking

This feature was implemented in Percona Server for MySQL version Percona Server for MySQL 5.7.32-35.

The Percona Data Masking plugin is a free and Open Source implementation of the MySQL’s data masking plugin. Data Masking provides a set of functions to hide sensitive data with modified content.

Data masking can have either of the characteristics:

  • Generation of random data, such as an email address

  • De-identify data by transforming the data to hide content

Installing the plugin

The following command installs the plugin:

mysql> INSTALL PLUGIN data_masking SONAME 'data_masking.so';

Data Masking functions

The data masking functions have the following categories:

  • General purpose

  • Special purpose

  • Generating Random Data with Defined characteristics

  • Using Dictionaries to Generate Random Data

General Purpose

The general purpose data masking functions are the following:

Parameter

Description

Sample

mask_inner(string, margin1, margin2 [, character])

Returns a result where only the inner part of a string is masked. An optional masking character can be specified.

mysql> SELECT mask_inner('123456789', 1, 2);

+-----------------------------------+
| mask_inner("123456789", 1, 2)     |
+-----------------------------------+
|1XXXXXX89                          |
+-----------------------------------+

mask_outer(string, margin1, margin2 [, character])

Masks the outer part of the string. The inner section is not masked.

mysql> SELECT mask_outer('123456789', 2, 2);

+------------------------------------+
| mask_outer("123456789", 2, 2).     |
+------------------------------------+
| XX34567XX                          |
+------------------------------------+

Special Purpose

The special purpose data masking functions are as follows:

Generating Random Data for Specific Requirements

The following functions generate random values for specific requirements:

Parameter

Description

Sample

gen_range(lower, upper)

Generates a random number based on a selected range and supports negative numbers.

mysql> SELECT gen_range(10, 100) AS result;

+--------------------------------------+
| result                               |
+--------------------------------------+
| 56                                   |
+--------------------------------------+

mysql> SELECT gen_range(100,80);

+--------------------------------------+
| gen_range(100,80)                    |
+--------------------------------------+
| 91                                   |
+--------------------------------------+

gen_rnd_email()

Generates a random email address. The domain is example.com.

mysql> SELECT gen_rnd_email();

+---------------------------------------+
| gen_rnd_email()                       |
+---------------------------------------+
| sma.jrts@example.com                  |
+---------------------------------------+

gen_rnd_pan([size in integer])

Generates a random primary account number. This function should only be used for test purposes.

mysql> SELECT mask_pan(gen_rnd_pan());

+-------------------------------------+
| mask_pan(gen_rnd_pan())             |
+-------------------------------------+
| XXXXXXXXXXXX4444                    |
+-------------------------------------+

gen_rnd_us_phone()

Generates a random U.S. phone number. The generated number adds the 1 dialing code and is in the 555 area code. The 555 area code is not valid for any U.S. phone number.

mysql> SELECT gen_rnd_us_phone();

+-------------------------------+
| gen_rnd_us_phone()            |
+-------------------------------+
| 1-555-635-5709                |
+-------------------------------+

gen_rnd_ssn()

Generates a random, non-legitimate US Social Security Number in an AAA-BBB-CCCC format. This function should only be used for test purposes.

mysql> SELECT gen_rnd_ssn()

+-----------------------------+
| gen_rnd_ssn()               |
+-----------------------------+
| 995-33-5656                 |
+-----------------------------+

Using Dictionaries to Generate Random Terms

Data masking returns a value from a range. To use a predefined file as the range to select a string value, load and use a dictionary. A dictionary supports only strings and is loaded from a file with the following characteristics:

  • Plain text

  • One term per line

  • Must contain at least one entry

An example of a dictionary, which is a list of trees, located in /usr/local/mysql/dict-files/testdict

  • Black Ash

  • White Ash

  • Bigtooth Aspen

  • Quaking Aspen

The following table displays the commands for using dictionaries to generate random terms:

Parameter

Description

Sample

gen_range(lower, upper)

Generates a random number based on a selected range and supports negative numbers.

mysql> SELECT gen_range(10, 100) AS result;

+--------------------------------------+
| result                               |
+--------------------------------------+
| 56                                   |
+--------------------------------------+

mysql> SELECT gen_range(100,80);

+--------------------------------------+
| gen_range(100,80)                    |
+--------------------------------------+
| 91                                   |
+--------------------------------------+

gen_rnd_email()

Generates a random email address. The domain is example.com.

mysql> SELECT gen_rnd_email();

+---------------------------------------+
| gen_rnd_email()                       |
+---------------------------------------+
| sma.jrts@example.com                  |
+---------------------------------------+

gen_rnd_pan([size in integer])

Generates a random primary account number. This function should only be used for test purposes.

mysql> SELECT mask_pan(gen_rnd_pan());

+-------------------------------------+
| mask_pan(gen_rnd_pan())             |
+-------------------------------------+
| XXXXXXXXXXXX4444                    |
+-------------------------------------+

gen_rnd_us_phone()

Generates a random U.S. phone number. The generated number adds the 1 dialing code and is in the 555 area code. The 555 area code is not valid for any U.S. phone number.

mysql> SELECT gen_rnd_us_phone();

+-------------------------------+
| gen_rnd_us_phone()            |
+-------------------------------+
| 1-555-635-5709                |
+-------------------------------+

gen_rnd_ssn()

Generates a random, non-legitimate US Social Security Number in an AAA-BBB-CCCC format. This function should only be used for test purposes.

mysql> SELECT gen_rnd_ssn()

+-----------------------------+
| gen_rnd_ssn()               |
+-----------------------------+
| 995-33-5656                 |
+-----------------------------+

Uninstalling the plugin

The UNINSTALL PLUGIN statement disables and uninstalls the plugin.


Last update: 2022-09-27