Friday, 8 May 2009

Oracle Optimizer Plan Costing - Full Table Scans

Hopefully this is the start of a series on how the Oracle Cost Based Optimizer determines the cost of a particular execution plan for a SQL statement. Knowing how it arrives at the cost of each possible access path to a data set, we can better understand why a particular access path was chosen, and why an alternative access path had a higher cost. This information was kick started by A Look Under The Hood Of CBO: The 10053 Event by Wolfgang Breitling, and also information in Troubleshooting Oracle Performance by Christian Antognini, and refined by various tests I did.The Under the Hood paper is very good, and has a good description of how the Optimizer goes about costing different access paths to data sets. Unfortunately it seems to be based on Oracle 9i, and things have changed slightly in Oracle 10g. The principles are the same, but the trace information output by the optimizer has changed slightly.

For this post I will focus on how the optimizer works generally, and specifically how a Full Table Scan is costed. These will not be full, exhaustive explanations of the inner workings of the Oracle Cost Based Optimiser, but short and to the point descriptions of the key costings.

Plan Costing

The Optimizer itself costs all execution plans in terms of the number of disk I/Os involved, because these are generally the largest component of the execution plan cost. Other costs such as CPU costs are also included in the total plan cost, after being converted to an I/O count via a conversion value. When the lowest cost plan has been produced, this can be easily converted to an elapsed time value by simply multiplying the cost by an average I/O time value.

When producing an execution plan for a SQL query (queries are the most common and also the most complicated things that the Optimizer has to deal with), the Optimizer simply tries to produce all possible execution plans and cost each of them. From these it chooses the plan with the lowest cost. Clearly this could be quite an exhaustive way of doing things. So what the Optimizer actually does is to fully cost one execution plan, and then only partially cost the subsequent execution plans.

Each subsequent plan is costed as each step of the plan is determined by the Optimizer. If the partial cost of such a plan so far is greater than the current best plan, then it is abandoned. This early pruning avoids the optimizer wasting too much time on plans that are too costly. Likewise if a new plan is produced with a total cost less than the current best plan, then the new plan becomes the current best plan.

After considering all possible execution plans the optimizer will have the lowest cost execution plan for the SQL query.

The execution plans themselves are formed in an iterative fashion. First, for each table referenced in the SQL query the optimizer costs each potential access path to the data in that table. Indexes are considered if they can be used directly to implement a predicate - a filter condition within the SQL query restricting the rows returned. Then the optimizer considers each possible join combination between the tables in the SQL query, using each possible join method - Nested Loop, Sort Merge or Hash Join. These joins use some of the costs for accessing each of the tables determined eariler, combined according to how that join method works. Thus each possible join path (A to B to C, A to C to B, B to A to C, etc.) is costed in turn using each of the 3 possible join methods, to arrive at the lowest cost execution plan for the SQL query.

Full Table Scan

This is one access method to data in a table that is always available to the optimizer. Logically the cost is the number of disk I/Os required to read every data block in the table. The number of data blocks in a table is obtained from the BLOCKS column in USER_TABLES. It is as accurate as the last time you gathered statistics on that table. Oracle 10g has an automatic job that runs every night by default to update the statistics on tables that have changed significantly, to minimise the drift between actual table sizes and the statistics stored about them in the database.

An enhancement Oracle has is the ability to issue multi-block reads that read more than one block from disk in a single read request. This reduces the total number of disk I/Os needed. Thus the number of disk I/Os needed is reduced by the multi-block read count (MBRC), according to the formula:
  • Blocks / MBRC

The Optimizer wants all costs to be in units of equal disk I/Os i.e. in units of single block reads. However, the time for a multi-block read is likely to be different – generally slightly longer given that more data is transferred. To convert a multi-block read cost into units of a single block read, the multi-block read cost is divided by the time for a multi-block read and multiplied by the time for a single block read. This adjustment means that the “cost” is now in units of cost of single block reads, as used by other access paths such as indexes.

Therefore the cost of a Full Table Scan is actually:
  • (Blocks * MREADTIM) / (MBRC * SREADTIM)

MREADTIM and SREADTIM, as well as MBRC, are part of a set of statistics Oracle has about the relative performance of the computer system it is running on. These are termed the System Statistics, and are visible in sys.aux_stats$ or via the dbms_stats.get_system_stats procedure. Out of the box following installation, Oracle only has a minimal set of default values for some of these statistics - IOSEEKTIM, IOTFRSPEED and CPUSPEEDNW. All the other system statistics have no value. They can be set in a number of ways - the intended way being to get Oracle to measure these values from the system itself while it is executing some representative workload (these are termed Workload Statistics) - but they can be set manually if desired, or left unset.

Oracle uses those System Statistics that it has values for. For those with no value Oracle uses various substitution formulae to calculate values to use instead.

If there is a value for MBRC in the System Statistics then the Optimizer uses that value. Otherwise, it uses the value of the db_file_multiblock_read_count (DBFMBRC) initialization parameter.

If MREADTIM and SREADTIM do not have any values in the System Statistics, then the following substitutions are made:

This is probably the general case on most systems - no Workload System Statistics are present (MBRC, SREADTIM, MREADTIM and others) - and only default values for IOSEEKTIM (10 milliseconds) and IOTFRSPEED (4 KB / millisecond). For an 8 KB block size and DBFMBRC of 8 this gives values for SREADTIM of 12 ms and for MREADTIM of 26 ms, and the ratio of MREADTIM to SREADTIM of 2.167.

A simple test shows this to be the case - a SELECT on a table with no indexes. The table I used had 1252 blocks of 8 KB each, and DBFMBRC was set to 8. No System Statistics had been collected, so only the defaults exist for IOSEEKTIM and IOTFRSPEED. Using the substituted formulae for MREADTIM and SREADTIM the calculated cost should be:
  • (1252 * 26) / (8 * 12) = 32552 / 96 = 339.08

The reported cost by Oracle was 341, which is very close. The difference is due to various factors, such as rounding within the calculation, the optimizer probably costing an extra read for the table header (+1), and the optimizer adding in a CPU cost for processing the data.

Using these formulae and the values of the System Statistics, we can calculate for ourselves the expected cost of a Full Table Scan of any table.

1 comment:

Shailesh said...

clap clap......... for you.
Does oracle consider long table or short table parameter in determining access path. if yes, how?