On MySQL InnoDB Row Formats and Compression
9/Jun 2019I have spent some time researching on InnoDB row formats and InnoDB compression. This article is about things that I have learned.
InnoDB file format and row format
There are 2 file formats, Antelope and Barracuda, with the latter being the newer. Antelope supports row formats Redundant
and Compact
, whereas Barracuda supports 2 more, Dynamic
and Compressed
.
Dynamic vs Compact
Compact is the default row format in MySQL 5.6. The difference between Compact and Dynamic is the handling of BLOBs (including BLOB, TEXT, VARCHAR, etc).
In Compact, even when a field is stored off-page (a.k.a. overflow page), the first 768 bytes and a 20-byte pointer is stored inline in the primary index (a.k.a. clustered index). This bloats the main index and not useful because the access to the overflow page is needed to reteive the full value anyway. Dynamic only stores the 20-byte pointer inline. More rows can fit inside a page if the prefix of a long value is not stored inline.
Misconception about off-page / overflow page
There is a common misconception that any field exceeding 768 bytes are automatically stored off-page. In fact, storing off-page only happens if the total length of the row exceeds certain number of bytes.
Given the default page size of 16KB and the InnoDB restriction that a page must fit at least 2 rows, the length limit of a row is ~8K. If a row exceeds 8K, the largest column is moved off-page until the row fits into the limit.
See Peter Zaitsev’s article and Mark Callaghan’s explanation and experiment.
Innodb table compression disk write
Redo log by default logs images of re-compressed pages. To tune this behavior, see innodb_log_compressed_pages.
Migrating to Barracuda and strict mode
To convert table to Dynamic or Compressed, use ALTER TABLE table_name ROW_FORMAT=DYNAMIC
. Make sure innodb_file_format=Barracuda
instead of Antelope
, otherwise the alter table will implicitly rebuild table with row format Compact, with warnings. To avoid this undesired behavior, set innodb_strict_mode=ON
.
MySQL 5.6 Online DDL & pt-online-schema-change
MySQL 5.6 has Online DDL. e.g. ALTER TABLE db.table ROW_FORMAT=COMPRESSED, ALGORITHM=INPLACE, LOCK=NONE;
. Without online DDL, to perform online schema change, pt-online-schema-change is needed, but it has issues with foreign keys.
InnoDB table compression vs page compression
InnoDB table compression is ROW_FORMAT=COMPRESSED
, which has been around for a few years. InnoDB page compression (a.k.a. Transparent Page Compression) is the new compression available in MySQL 5.7. It utilizes sparse file and hole punching in file systems. There are reports of nasty bugs that make page compression look non-production-ready.
Conclusion
InnoDB table compression remains a mature compression method available out of the box. It does not require application changes and is reasonable in terms of space-effiency and performance. Other solutions such as TokuDB and MyRocks are promising. See Pinterest’s slides for comparison of MySQL compression options.
References
- https://dev.mysql.com/doc/refman/5.6/en/innodb-row-format.html
- https://dev.mysql.com/doc/refman/5.6/en/innodb-compression.html
- http://mysqlentomologist.blogspot.com/2018/06/on-innodb-data-compression-in-mysql.html
- https://dba.stackexchange.com/questions/69795/benefits-of-barracuda-and-compression
- https://www.percona.com/blog/2010/02/09/blob-storage-in-innodb/
- https://www.facebook.com/notes/mysql-at-facebook/lob-storage-in-mysql-and-postgres/466714895932/
- https://www.facebook.com/notes/mysql-at-facebook/how-many-pages-does-innodb-for-tables-with-large-columns/10150481019790933/
- https://www.slideshare.net/nizameddin/getting-innodb-compressionreadyforfacebookscale
- https://www.slideshare.net/denshikarasu/less-is-more-novel-approaches-to-mysql-compression-for-modern-data-sets