This post is about how a slightly de-normalized database design
involving redundant foreign keys to other tables can end up producing
sub-optimal execution plans for queries that use those extra joins as
additional filter conditions.
By default the Oracle Optimizer
assumes that different columns of data in a table are independent of
each other, and that their data values are not correlated with each
other in any way. When a query has filter conditions on multiple
columns in a table, the Optimizer will combine their filter factors
together, assuming they are independent of each other, unless it has explicit information to the contrary. So if each filter factor was estimated to match 1% of the rows in the table, then their combined filter factor would match 0.01% of the rows in the table (1% of 1%).
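As a minimal sketch of that arithmetic (the table and column names here are hypothetical, not from any real example), a query with two 1% filters would be estimated like this:

select count(*)
from some_table        -- hypothetical table of num_rows rows
where col_a = :value_a -- filter factor 0.01 (1%)
and col_b = :value_b;  -- filter factor 0.01 (1%)

-- assumed combined filter factor = 0.01 * 0.01 = 0.0001 (0.01%)
-- estimated matching rows        = num_rows * 0.0001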
However, if the data in these columns is in fact correlated
then this filter factor estimate will be wrong. A classic example is
"month of birth" and "zodiac sign". Each has only 12 possible values,
and the Optimizer will assume that there are 144 possible combinations.
But in fact there are only 24 possible value combinations because each
zodiac sign straddles only two months. Assuming a uniform distribution
amongst these 24 possible values within the data rows in the table, then
each pair of values would match 1 / 24 or 4.17% of the rows in the
table. However, the Optimizer will assume that a pair of values would
match 1 / 144 or only 0.69% of the rows in the table. Such misestimates can make the Optimizer choose a relatively inefficient access path, such as using an index instead of a full table scan, because it estimates too few rows to match the filters and consequently too low a cost for the data access.
Often
this kind of correlation between data values in columns in a table
occurs naturally, and there is little you can directly do about it. The "month of birth" and "zodiac sign" pair is one such example. However, this
kind of correlation between columns can also occur as a result of how
the tables have been designed, and can result in sub-optimal execution
plans being produced.
One scenario is adding a redundant column
to a table as a foreign key to an indirectly related table, such as a
"grandparent" table that is the parent of the table's direct parent. A
specific example would be putting the "customer / account identifier" into the "order line item" table, as a redundant copy of the one in the "order" table, alongside the "order identifier". Or
putting both the "product main category id" and "product sub-category
id" into the "product" table, when there is a hierarchy of categories.
Again, in these cases, the Optimizer will assume that such columns are independent of each other, unless it has explicit information otherwise. The danger now is that the Optimizer will produce different execution plans for what are essentially the same queries, depending on whether the redundant joins are included or not.
Let's look at a concrete example. I created three tables, each with a parent / child relationship between them, and more rows in the child tables than in the parent tables. I won't post the SQL DDL for these as it would make
the post too long, but their structure and contents should be
straightforward enough, and all values are uniformly distributed with an
equal number of child records per parent record. There are indexes on
the primary key columns, and the individual foreign key columns, but
nothing else. Primary key and foreign key constraints are explicitly
specified.
The tables are Grandparent (1,000 rows), Parent
(100,000 rows), and Child (10,000,000 rows) i.e. 100:1 ratios, and each
has a primary key column named "pkid". The Parent table has a foreign
key column (gp_id) to Grandparent, while the Child table has a foreign
key column to Parent (p_id) and also a redundant foreign key column to Grandparent (gp_id).
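For reference, here is a minimal sketch of what the DDL might look like, reconstructed from this description (the exact data types, the pct05 column definition, and any index names other than those visible in the execution plans below are my assumptions):

create table grandparent
( pkid  number primary key
, pct05 number -- filter column: 5 of the 1,000 rows have pct05 = 1
);

create table parent
( pkid  number primary key
, gp_id number not null references grandparent
);

create table child
( pkid  number primary key
, p_id  number not null references parent
, gp_id number not null references grandparent -- redundant grandparent foreign key
, one   number -- constant value 1, summed by the test queries
);

create index ix_parent_gpid on parent (gp_id);
create index ix_child_pid   on child (p_id);
create index ix_child_gpid  on child (gp_id);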
I will now execute three different but equivalent
queries against these tables using different combinations of the
possible foreign key joins:
- Only direct joins from child to parent, and then parent to grandparent
- As previous query plus additional (redundant) join from child to grandparent
- Direct from child to grandparent only, and parent table omitted from query
Each query has a filter condition on another column (pct05) in the Grandparent table, restricting the matching rows to just 5 of the 1,000 rows in that table.
I'm using Oracle 11gR2 on Oracle Linux:
SQL> select * from v$version;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
PL/SQL Release 11.2.0.4.0 - Production
CORE 11.2.0.4.0 Production
TNS for Linux: Version 11.2.0.4.0 - Production
NLSRTL Version 11.2.0.4.0 - Production
First query - direct joins between parent / child tables:
SQL> @test1
ROW_COUNT
----------
50000
SQL_ID fddwv6gp5m26h, child number 0
-------------------------------------
select sum (c.one) row_count
from child c, parent p, grandparent gp
where c.p_id = p.pkid
and p.gp_id = gp.pkid
and gp.pct05 = 1
Plan hash value: 4174412392
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 25668 (100)| |
| 1 | SORT AGGREGATE | | 1 | 25 | | |
|* 2 | HASH JOIN | | 49591 | 1210K| 25668 (1)| 00:05:09 |
|* 3 | HASH JOIN | | 500 | 8500 | 244 (1)| 00:00:03 |
|* 4 | TABLE ACCESS FULL| GRANDPARENT | 5 | 40 | 5 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL| PARENT | 100K| 878K| 238 (1)| 00:00:03 |
| 6 | TABLE ACCESS FULL | CHILD | 10M| 76M| 25398 (1)| 00:05:05 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C"."P_ID"="P"."PKID")
3 - access("P"."GP_ID"="GP"."PKID")
4 - filter("GP"."PCT05"=1)
We can see that it is estimating 5 rows to be retrieved from the
Grandparent table, 500 from the join to Parent, and about 50,000 from
the join to Child - operations 4, 3, and 2. This is correct given the 100:1 ratios between the rows in the tables. The cost is estimated at
just over 25,000, which is really just the sum of the costs of the full
table scans involved.
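The arithmetic behind these estimates can be verified by hand (my reconstruction from the plan, not from the original tests):

-- cardinalities: 5 grandparent rows x 100 = 500 parent rows x 100 = 50,000 child rows
-- cost: 5 (grandparent scan) + 238 (parent scan) + 25,398 (child scan) = 25,641
-- close to the reported 25,668, the small remainder presumably being join costs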
Second query - redundant join added between Child and Grandparent:
SQL> @test2
ROW_COUNT
----------
50000
SQL_ID 0a703yvda9z4h, child number 0
-------------------------------------
select sum (c.one) row_count
from child c, parent p, grandparent gp
where c.p_id = p.pkid
and p.gp_id = gp.pkid
and gp.pct05 = 1
and c.gp_id = gp.pkid
Plan hash value: 4140991566
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 12271 (100)| |
| 1 | SORT AGGREGATE | | 1 | 29 | | |
| 2 | NESTED LOOPS | | 50 | 1450 | 12271 (1)| 00:02:28 |
| 3 | NESTED LOOPS | | 49500 | 1450 | 12271 (1)| 00:02:28 |
|* 4 | HASH JOIN | | 500 | 8500 | 244 (1)| 00:00:03 |
|* 5 | TABLE ACCESS FULL | GRANDPARENT | 5 | 40 | 5 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | PARENT | 100K| 878K| 238 (1)| 00:00:03 |
| 7 | BITMAP CONVERSION TO ROWIDS | | | | | |
| 8 | BITMAP AND | | | | | |
| 9 | BITMAP CONVERSION FROM ROWIDS| | | | | |
|* 10 | INDEX RANGE SCAN | IX_CHILD_PID | 99 | | 2 (0)| 00:00:01 |
| 11 | BITMAP CONVERSION FROM ROWIDS| | | | | |
|* 12 | INDEX RANGE SCAN | IX_CHILD_GPID | 99 | | 21 (0)| 00:00:01 |
| 13 | TABLE ACCESS BY INDEX ROWID | CHILD | 1 | 12 | 12271 (1)| 00:02:28 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("P"."GP_ID"="GP"."PKID")
5 - filter("GP"."PCT05"=1)
10 - access("C"."P_ID"="P"."PKID")
12 - access("C"."GP_ID"="GP"."PKID")
The only difference from the previous query is the addition of the extra filter condition "c.gp_id = gp.pkid", using the redundant join between Child and Grandparent. The row count is the same - 50,000 - because the extra filter condition is redundant and doesn't change the number of matching rows in any way. But the execution plan is completely different, because the Optimizer has assumed that the two filters on Child are independent of each other, which is not true.
The execution plan starts the same, filtering on Grandparent to 5 estimated rows, then joining to Parent to produce 500 estimated rows (operation 4, the Hash Join). But now it does two Nested Loops - the inner one (operation 3) to get the ROWIDs of matching rows in Child, and the outer one to fetch the data rows themselves for the "one" column used in the output. And this outermost Nested Loop (operation 2) is estimated to produce only 50 rows, which is clearly incorrect.
What has happened is that because the "gp_id" column in Child has 1,000 distinct values in it, the Optimizer has reduced the final row estimate by this ratio, i.e. it is estimating 50 matching rows rather than 50,000. Based on this it has costed the index based access to Child at 12,271, which is lower than the 25,000+ cost of the full table scan in the first execution plan.
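The arithmetic behind that 50 row estimate seems to be as follows (my reconstruction, assuming the standard independence model):

-- rows from the Hash Join of Grandparent and Parent = 500
-- Child rows per Parent row (10,000,000 / 100,000)  = 100
-- "independent" filter factor for c.gp_id = gp.pkid = 1 / 1,000
-- estimated rows = 500 * 100 * (1 / 1,000)          = 50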
This cost seems to be arrived at as the cost of the index accesses for one pair of values (21 + 2 = 23), plus one more for the access to the data row in the table by ROWID, i.e. a cost of 24 per probe. For the 500 estimated rows from the Hash Join driving the Nested Loops this gives a total cost of 12,000 (500 x 24), which together with the cost of 244 up to the Hash Join is very close to the reported total cost of 12,271.
In fact this execution plan
will take longer to execute because it will actually be processing
50,000 Child records and not just 50, and the execution plan of the
first query would actually be the better one. But it is the assumption
that these two foreign key columns are independent of each other that
has led the Optimizer to produce this sub-optimal plan.
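One way to see the misestimate directly (not something shown in the original tests, but a standard technique) is to run the query with the gather_plan_statistics hint and then compare the estimated and actual row counts for each operation:

select /*+ gather_plan_statistics */ sum (c.one) row_count
from child c, parent p, grandparent gp
where c.p_id = p.pkid
and p.gp_id = gp.pkid
and gp.pct05 = 1
and c.gp_id = gp.pkid;

select * from table (dbms_xplan.display_cursor (null, null, 'ALLSTATS LAST'));

-- the E-Rows and A-Rows columns should show about 50 estimated versus
-- roughly 50,000 actual rows for the Nested Loops operations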
Third query - remove Parent table completely from query, and join directly from Child to Grandparent:
SQL> @test4
ROW_COUNT
----------
50000
SQL_ID 76tptvjkbx08x, child number 0
-------------------------------------
select sum (c.one) row_count
from child c, grandparent gp
where c.gp_id = gp.pkid
and gp.pct05 = 1
Plan hash value: 2796906588
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 25435 (100)| |
| 1 | SORT AGGREGATE | | 1 | 15 | | |
|* 2 | HASH JOIN | | 50000 | 732K| 25435 (1)| 00:05:06 |
|* 3 | TABLE ACCESS FULL| GRANDPARENT | 5 | 40 | 5 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| CHILD | 10M| 66M| 25403 (1)| 00:05:05 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("C"."GP_ID"="GP"."PKID")
3 - filter("GP"."PCT05"=1)
The result is the same as before - 50,000 rows - and the execution
plan is basically the first one with Parent removed - two full table
scans and a Hash Join. With only one filter condition on the Child
table the Optimizer has correctly estimated the number of matching rows -
50,000 - and produced the appropriate execution plan.
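Again the estimate can be reproduced by hand (my reconstruction):

-- filter factor for gp.pct05 = 1: 5 / 1,000 = 0.005
-- estimated Child rows = 10,000,000 * 0.005 = 50,000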
Conclusion
Beware
of database designs that include redundant foreign key columns between
child and grandparent tables, or across even more levels of a
relationship hierarchy, and be careful how you write queries involving such tables.
While it might seem better to include all known joins between
all the tables in a query, this can cause the Optimizer to misestimate
the number of matching rows from a table, and to choose non-optimal
access methods as a result. Rather than helping the Optimizer, adding
such redundant join conditions to queries actually makes things worse.
As a general rule you are better off only having joins in a query
between directly related tables, and not to indirectly related ones.
Alternatively, eliminate the intermediate tables where possible and join directly between the relevant tables, which should produce the same results.
I saw this happen recently at a client, and removing
such a redundant join condition led to the Optimizer producing more
realistic row estimates and a much better execution plan, with correspondingly faster execution given the data volumes involved.
The
other approach to dealing with this kind of scenario is to provide
extra information to the Optimizer so that it knows when two or more
columns are correlated. This can be done simply by creating an index on
just those columns in the table, or by creating extended statistics on
the column group. As long as this extra information is present, the Optimizer will produce a better row estimate and a better execution plan. When I tested this with such an index on the Child table on both "p_id" and "gp_id" together, the execution plan for the second query with the redundant join went back to the original one - three full table scans and two hash joins.
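For reference, the two variants look like this (the composite index name is my own choice; dbms_stats.create_extended_stats is the standard call for column group statistics in 11g):

-- option 1: a composite index on the correlated columns
create index ix_child_pid_gpid on child (p_id, gp_id);

-- option 2: extended statistics on the column group
select dbms_stats.create_extended_stats (user, 'CHILD', '(P_ID, GP_ID)') from dual;

-- then re-gather statistics so the column group statistics are populated
exec dbms_stats.gather_table_stats (user, 'CHILD')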
Also
be aware that Oracle is always trying to make the Optimizer more
"intelligent" so the behaviour and execution plans produced can change
between versions of Oracle for the same query on the same database
tables. I briefly tested these same queries on Oracle 12c and got
completely different execution plans. In 12c I actually got an Adaptive
Execution Plan which combined both plan variations under a parent
Statistics Collector operation. Depending on how many matching rows
were initially retrieved it could either do a full table scan on Parent
and Child with Hash Joins, or use the indexes on the foreign key columns
to do direct row lookups in Nested Loops. Combined with other feedback
mechanisms in the 12c Optimizer, it is possible for the Optimizer to
now detect such non-optimal execution plans and produce better ones on
the next execution of the same query.