CSV engine mode for standard-compliant quote and comma parsing¶
MySQL CSV Storage Engine is non-standard with respect to embedded "
and ,
character parsing. Fixing this issue unconditionally would break MySQL CSV format compatibility for any pre-existing user tables and for data exchange with other MySQL instances, but it would improve compatibility with other CSV producing/consuming tools.
To keep both MySQL and other tool compatibility, a new dynamic, global/session server variable csv_mode has been implemented. This variable allows an empty value (the default), and IETF_QUOTES
.
If IETF_QUOTES
is set, then embedded commas are accepted in quoted fields as-is, and a quote character is quoted by doubling it. In legacy mode embedded commas terminate the field, and quotes are quoted with a backslash.
Example¶
Table:
> CREATE TABLE albums (
`artist` text NOT NULL,
`album` text NOT NULL
) ENGINE=CSV DEFAULT CHARSET=utf8
;
Following example shows the difference in parsing for default and IETF_QUOTES
csv_quotes.
> INSERT INTO albums VALUES ("Great Artist", "Old Album"),
("Great Artist", "Old Album \"Limited Edition\"");
If the variable csv_mode is set to empty value (default) parsed data will look like:
"Great Artist","Old Album"
"Great Artist","\"Limited Edition\",Old Album"
If the variable csv_mode is set to IETF_QUOTES
parsed data will look like as described in CSV rules:
"Great Artist","Old Album"
"Great Artist","""Limited Edition"",Old Album"
Parsing the CSV file which has the proper quotes (shown in the previous example) can show different results:
With csv_mode set to empty value, parsed data will look like:
> SELECT * FROM albums;
+--------------+--------------------+
| artist | album |
+--------------+--------------------+
| Great Artist | Old Album |
| Great Artist | ""Limited Edition" |
+--------------+--------------------+
2 rows in set (0.02 sec)
With csv_mode set to IETF_QUOTES
parsed data will look like:
mysql> SET csv_mode = 'IETF_QUOTES';
Query OK, 0 rows affected (0.00 sec)
> SELECT * FROM albums;
+--------------+-----------------------------+
| artist | album |
+--------------+-----------------------------+
| Great Artist | Old Album |
| Great Artist | "Limited Edition",Old Album |
+--------------+-----------------------------+
Version Specific Information¶
- Percona Server for MySQL 5.7.10-1: Feature ported from Percona Server for MySQL 5.6
System Variables¶
csv_mode¶
Option | Description |
---|---|
Command-line | Yes |
Config file | Yes |
Scope | Global, Session |
Dynamic | Yes |
Data type | SET |
Default | (empty string) |
Range | (empty string) , IETF_QUOTES |
Setting this variable is to IETF_QUOTES
will enable the standard-compliant quote parsing: commas are accepted in quoted fields as-is, and quoting of "
is changed from \\"
to ""
. If the variable is set to empty value (the default), then the old parsing behavior is kept.