Monday 21 May 2018

Advanced Compression no good for Data Warehousing

This post is about seeing "table fetch continued row" reads on compressed data warehouse tables that receive some level of updates to existing (old) data, and how the Advanced Compression option of Oracle Enterprise Edition not only fails to avoid this but can actually introduce severe performance penalties into your daily load jobs.

Background

Basic Compression is a free part of Oracle Enterprise Edition, allowing you to compress the data within a table, and is useful for large databases such as a Data Warehouse. Not only will compression reduce the total disk storage used by the data in the table, it can also improve query performance in various ways - full table scans will read fewer disk blocks, and potentially more rows can be cached in the buffer cache in the SGA within the same number of blocks. One downside to Basic Compression is that it can only be used to compress the existing data in a table or partition (done by "moving" that table or partition within the same tablespace). Newly inserted or updated rows are not compressed when added to a data block; they are added in their normal uncompressed form. This can mean that over time a compressed table or partition becomes less compressed due to updates and newly inserted rows.
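
To make that concrete, here is a minimal sketch of applying Basic Compression to the existing data in a table - the table and index names are made up for the example:

    -- Rebuild the segment in place, compressing the existing rows.
    -- The table is locked for the duration of the move, and its indexes
    -- are left UNUSABLE afterwards, so they need rebuilding.
    ALTER TABLE sales MOVE COMPRESS;
    ALTER INDEX sales_pk REBUILD;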

Oracle's official solution to this is the extra cost Advanced Compression Option with its OLTP Compression feature. OLTP Compression will compress the data of newly inserted rows into a table where it has been configured. It basically does this by inserting the rows in the normal uncompressed format (no performance penalty for these inserts) until the block becomes full, whereupon it compresses that one block (a performance penalty for that particular insert), which frees up unused space in that block for yet more rows to be inserted. The assumption is that for an average / normal OLTP application the frequency of block compressions will be low, and the impact on most transactions will be negligible (assuming they SELECT and INSERT on many other tables within the same business transaction).
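
For comparison, OLTP Compression is just an attribute on the table or partition, after which inserts through the buffer cache behave as described above. A sketch, again with a made-up table name (and note that on 12c the attribute is spelled ROW STORE COMPRESS ADVANCED rather than COMPRESS FOR OLTP):

    -- 11g syntax; requires the separately licensed Advanced Compression Option
    CREATE TABLE sales_oltp
    ( sale_id   NUMBER
    , sale_date DATE
    , amount    NUMBER
    ) COMPRESS FOR OLTP;

    -- 12c equivalent for an existing table (affects newly inserted rows only)
    -- ALTER TABLE sales_oltp ROW STORE COMPRESS ADVANCED;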

A "perfect ideal" Data Warehouse would only ever load new data into the current partition of a partitioned table. Existing data would never be updated, and new data would never be inserted into a previous, old partition. In such a scenario you have two potentially straightforward strategies for compressing your data using only Basic Compression:
  • Wait until the end of a period when the database will switch to the next partition, and then compress the previous partition
  • As above plus re-compressing the current partition at a set frequency during the period in which it is the current partition
The first strategy means that all old partitions have the data in them compressed, and so take up minimal disk storage. However, the data in the current partition is not compressed at all. The old partitions only ever need to be compressed once, because this is an ideal scenario and that old data is never updated or extra data inserted into it.

The second strategy is the same as the first for old partitions, but also compresses the current partition at scheduled intervals to gain some of the benefits from compression on the most recently loaded data. How frequently you compress the current partition depends on various factors, such as the time to do this compression and the time available each night for the daily data load into the data warehouse. As the partition grows in size with more data being added to it, so the compression will take longer each time.
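
As a rough sketch of the first strategy, the end-of-period step is simply a partition move using Basic Compression - the table and partition names here are invented for the example, and UPDATE INDEXES asks Oracle to maintain indexes as part of the move (depending on version and index type some index partitions may still need an explicit rebuild):

    -- Run once the daily loads have switched to the next period's partition
    ALTER TABLE sales MOVE PARTITION p_2018_04 COMPRESS UPDATE INDEXES;

    -- Confirm which partitions are now stored compressed
    SELECT partition_name, compression, compress_for
    FROM   user_tab_partitions
    WHERE  table_name = 'SALES'
    ORDER  BY partition_position;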

The Compressed Row Update Problem

Both strategies are straightforward and only use the features of Basic Compression. However, real world data warehouses are not "perfect" or "ideal" - old data can arrive after a period has ended and must be loaded into existing, old partitions, and corrections can come through that require updates to existing data. If your data volumes are very high, both within each partition and in the number of rows of late or corrected data, then your compressed partitions can become less compressed as a result and can grow in size.

As well as an increase in the disk storage used, there is also a potential performance penalty to pay for such data changes in old partitions, due to the way the update physically happens within the data block. When a compressed row is updated it will not fit back into its original slot in the data block, because it will have grown in size one way or another. Oracle will try to move that row to the end of the data block, into the free, unused space reserved by the "PCTFREE" setting for that table. However, Basic Compression sets "PCTFREE" to zero by default, so there is nowhere for such an updated row to go within its current data block. What Oracle does instead is "insert" that row into a new data block (as used for normal inserts), and leave behind a forward pointer in the block where the row used to be. This is termed "row migration" - the row's physical storage has migrated to another block within the table.

This row migration has a negative impact on index based reads, because the index structure holds the ROWID of each data row in its leaf blocks, and a ROWID is really a map to the physical location of the data row - it encodes the data file, the block, and the row's slot within that block. When a query using such an index goes to read a migrated data row, it first reads the original data block indicated by the ROWID and, rather than the expected data row, finds a forward pointer (another ROWID) to where the row is now stored. So the query must issue a second read against that second data block in order to get the data row. Thus migrated updated rows cause extra disk reads when accessed through an index. Such "second, additional disk reads" are visible in the Oracle statistic "table fetch continued row" i.e. you can see how many table row fetches caused an additional fetch due to migrated data rows.
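
A quick way to see whether this is happening on your system is to watch that statistic, either instance-wide or for your own session before and after running a query:

    -- Instance-wide count since startup
    SELECT name, value
    FROM   v$sysstat
    WHERE  name = 'table fetch continued row';

    -- Current session only
    SELECT sn.name, ms.value
    FROM   v$mystat ms
    JOIN   v$statname sn ON sn.statistic# = ms.statistic#
    WHERE  sn.name = 'table fetch continued row';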

Advanced Compression

In such scenarios you might assume that using the OLTP Compression feature of Advanced Compression would solve this. You would hope that it would keep the data in the old partitions compressed for you, without any explicit action needed on your part, and avoid or minimize the number of migrated rows. Unfortunately this is not true, and in fact using OLTP Compression will probably both impose a severe performance penalty on the daily data load and have no positive impact on updated rows being migrated. Which ultimately means that there is little to no point in using OLTP Compression on most data warehouses. Of course, there may be some scenarios where it is of some benefit, but for the vast majority I believe it will only have a negative impact overall.

The first performance penalty is a direct slowdown of the INSERTs executed by the daily data load jobs, because blocks are compressed as they fill up. This is to be expected of course, and you would hope that the penalty is relatively small and at least ensures that all blocks in the table end up compressed.

Unfortunately there is a second and more substantial performance penalty arising from this, because each block is compressed multiple times during the data load. This repeated compression of each block multiplies up the total CPU time used during the daily data load, to the point where it can more than double the elapsed time of the data load jobs. It is a consequence of the way OLTP Compression works. Compression is triggered each time a block "becomes full", which happens multiple times during the data load process. After a block is compressed for the first time it has free, unused space in it, into which subsequent rows can be inserted. So the same block will repeatedly fill up, compress, fill up again, compress, fill up again, compress, until eventually it is full of only compressed rows and another row cannot fit in the remaining free space.

And the third performance penalty is that the frequency of such block compressions is much higher with longer, larger data rows, such as are typically found in a data warehouse. Longer rows fill up a block sooner, triggering compressions sooner and more often, because fewer rows are needed each time to consume the remaining free space i.e. the average number of rows inserted between compressions gets lower as the row size increases. And the more compression operations that happen during a data load, the more CPU time is consumed for a given number of rows. See the appendix at the end of this post for the detail of the maths behind this.

This triple whammy combination - every INSERT slows down, blocks are compressed multiple times during the load, longer rows lead to more frequent block compressions - can cause data load times to double or triple in elapsed time. The precise level of the performance impact depends on various factors such as the row sizes, the level of compression achievable, and the CPUs in your system.

Conversely Basic Compression only ever compresses each block once, so the total CPU used by it is far lower than for Advanced Compression. And because the compression operation takes place separately, there is no direct impact on the daily data loads themselves i.e. they continue to take the same elapsed time as normal.

But that is not the end of the negative impacts of using OLTP Compression in a data warehouse. The other big negative is that it does not actually solve the original problem you were experiencing - updated rows being migrated. Under OLTP Compression only an INSERT will ever trigger the compression of a block, never an UPDATE. This has been demonstrated and written up by various people who have tested it.

So OLTP Compression doesn't help at all with the original scenario - UPDATEs causing row migration with additional "table fetch continued row" block reads, and increased storage. Your data warehouse will still have migrated rows from updates, and an increase in storage and disk reads as a result. The only viable way to force migrated rows to be moved back and compressed is to use Basic Compression.
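
If you want to see how bad the problem is and then deal with it, one approach is to list the affected rows and re-compress the partition with Basic Compression. A sketch with made-up object names - the CHAINED_ROWS table is created by the utlchain.sql script, and LIST CHAINED ROWS reports migrated rows as well as genuinely chained ones:

    -- Record the migrated / chained rows in an old partition
    ANALYZE TABLE sales PARTITION (p_2018_01) LIST CHAINED ROWS INTO chained_rows;

    SELECT COUNT(*) FROM chained_rows WHERE table_name = 'SALES';

    -- Re-compress the partition, which removes the migrated rows
    ALTER TABLE sales MOVE PARTITION p_2018_01 COMPRESS UPDATE INDEXES;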

And finally, for completeness, OLTP Compression typically achieves a slightly lower compression ratio than Basic Compression. This is because they both use the same compression mechanism, and OLTP Compression leaves the value of PCTFREE at the table / partition level setting unchanged (10% by default) whereas Basic Compression sets PCTFREE to 0 by default during the compression, so achieving more compressed rows per block. The difference in compression level will be very small and minor, but it is there nevertheless.
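
You can see this difference in the data dictionary, where both the free space reservation and the compression attribute are visible - the table names below are the made-up ones from the earlier examples:

    SELECT table_name, pct_free, compression, compress_for
    FROM   user_tables
    WHERE  table_name IN ('SALES', 'SALES_OLTP');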

Conclusion

Although Basic Compression is not "perfect", for a data warehouse it is by far and away a better choice than trying to use OLTP Compression from the Advanced Compression option. The performance penalties from the significant increase in CPU usage during data loads when using OLTP Compression are very severe. And ultimately it does not even address the original problem we were concerned about - UPDATEs causing row migration into different data blocks, resulting in additional "table fetch continued row" block reads for queries needing those rows. Which is why sticking with the free Basic Compression, used on a scheduled basis after a data load and combined with a well designed partitioning strategy, can be much more efficient for a data warehouse.

Appendix - Why Longer Rows Lead To More Frequent Compressions

I stated earlier that, when using Advanced Compression's OLTP Compression, longer rows will both trigger block compressions sooner and more frequently i.e. more compressions on average per row (or conversely, fewer rows inserted into a block on average between compressions). I will assume a simple 2:1 compression ratio between an uncompressed row and its compressed form, as it keeps the maths a bit simpler but still shows what is going on.

If you can fit 50 uncompressed rows into a block then you can fit 100 compressed rows into the same block. Because the compression halves the size of the uncompressed rows, the block is compressed multiple times as data is inserted into it. First 50 rows are inserted, which triggers a compression leaving half the block free, so 25 more rows can be inserted, triggering another compression, then 13 rows can be inserted, the block is compressed again, and so on, until eventually 100 rows fit in the block but another one will not.

This series of halvings of how many rows can be loaded at a time is really a sum over the series of powers of 2. The last load will be 1 row, the one before that 2 rows, 4 before that, and so on. This means that the block is "full" when 2 raised to a certain power exceeds the number of rows that can fit in that block i.e. after that many compressions another row will not fit in the block.
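
Putting that into a general form (a sketch of the arithmetic, using the idealized 2:1 ratio), if C is the number of compressed rows that fit in a block, then the number of compressions needed to fill it, and the average number of rows inserted per compression, are roughly:

    k = \min\{\, n : 2^{n} > C \,\} \approx \lceil \log_{2} C \rceil
    \text{average rows per compression} = C / k

With C = 100 this gives k = 7 and 100 / 7 = 14.29 rows per compression; with C = 20 it gives k = 5 and 20 / 5 = 4 rows, matching the two worked examples that follow.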

For 100 rows in a block that power is 2^7 = 128, which means 7 compressions. If each block is compressed 7 times as rows are inserted into it to reach 100 rows in the block, then on average there is a compression every 100 / 7 = 14.29 rows.

For a much longer row where only 10 uncompressed rows fit per block, 20 compressed rows will fit in that block. And 2^5 = 32 is the first power of 2 larger than 20, so 5 compressions are needed to fill each block. As 20 rows will have been loaded into the block, a compression occurs on average every 20 / 5 = 4 rows.

This is a lot more frequent than the previous case i.e. over 3 times worse on average! Thus OLTP Compression on a data warehouse with larger rows leads to a much larger CPU penalty from all of the repeated block compressions occurring during data loads. Which is why the alternative of just doing a normal load followed by a one-off compression using Basic Compression can be much more efficient.