On MySQL InnoDB Row Formats and Compression

I have spent some time researching on InnoDB row formats and InnoDB compression. This article is about things that I have learned.

InnoDB file format and row format

There are 2 file formats, Antelope and Barracuda, with the latter being the newer. Antelope supports row formats Redundant and Compact, whereas Barracuda supports 2 more, Dynamic and Compressed.

Dynamic vs Compact

Compact is the default row format in MySQL 5.6. The difference between Compact and Dynamic is the handling of BLOBs (including BLOB, TEXT, VARCHAR, etc).

In Compact, even when a field is stored off-page (a.k.a. overflow page), the first 768 bytes and a 20-byte pointer is stored inline in the primary index (a.k.a. clustered index). This bloats the main index and not useful because the access to the overflow page is needed to reteive the full value anyway. Dynamic only stores the 20-byte pointer inline. More rows can fit inside a page if the prefix of a long value is not stored inline.

Misconception about off-page / overflow page

There is a common misconception that any field exceeding 768 bytes are automatically stored off-page. In fact, storing off-page only happens if the total length of the row exceeds certain number of bytes.

Given the default page size of 16KB and the InnoDB restriction that a page must fit at least 2 rows, the length limit of a row is ~8K. If a row exceeds 8K, the largest column is moved off-page until the row fits into the limit.

See Peter Zaitsev’s article and Mark Callaghan’s explanation and experiment.

Innodb table compression disk write

Redo log by default logs images of re-compressed pages. To tune this behavior, see innodb_log_compressed_pages.

Migrating to Barracuda and strict mode

To convert table to Dynamic or Compressed, use ALTER TABLE table_name ROW_FORMAT=DYNAMIC. Make sure innodb_file_format=Barracuda instead of Antelope, otherwise the alter table will implicitly rebuild table with row format Compact, with warnings. To avoid this undesired behavior, set innodb_strict_mode=ON.

MySQL 5.6 Online DDL & pt-online-schema-change

MySQL 5.6 has Online DDL. e.g. ALTER TABLE db.table ROW_FORMAT=COMPRESSED, ALGORITHM=INPLACE, LOCK=NONE;. Without online DDL, to perform online schema change, pt-online-schema-change is needed, but it has issues with foreign keys.

InnoDB table compression vs page compression

InnoDB table compression is ROW_FORMAT=COMPRESSED, which has been around for a few years. InnoDB page compression (a.k.a. Transparent Page Compression) is the new compression available in MySQL 5.7. It utilizes sparse file and hole punching in file systems. There are reports of nasty bugs that make page compression look non-production-ready.

Conclusion

InnoDB table compression remains a mature compression method available out of the box. It does not require application changes and is reasonable in terms of space-effiency and performance. Other solutions such as TokuDB and MyRocks are promising. See Pinterest’s slides for comparison of MySQL compression options.

References

  1. https://dev.mysql.com/doc/refman/5.6/en/innodb-row-format.html
  2. https://dev.mysql.com/doc/refman/5.6/en/innodb-compression.html
  3. http://mysqlentomologist.blogspot.com/2018/06/on-innodb-data-compression-in-mysql.html
  4. https://dba.stackexchange.com/questions/69795/benefits-of-barracuda-and-compression
  5. https://www.percona.com/blog/2010/02/09/blob-storage-in-innodb/
  6. https://www.facebook.com/notes/mysql-at-facebook/lob-storage-in-mysql-and-postgres/466714895932/
  7. https://www.facebook.com/notes/mysql-at-facebook/how-many-pages-does-innodb-for-tables-with-large-columns/10150481019790933/
  8. https://www.slideshare.net/nizameddin/getting-innodb-compressionreadyforfacebookscale
  9. https://www.slideshare.net/denshikarasu/less-is-more-novel-approaches-to-mysql-compression-for-modern-data-sets
Tags// , ,