- Nested Loop Join
- Sort Merge Join
- Hash Join
A Nested Loop Join reads data from one data set - the Outer data set - and for each data row retrieves corresponding rows from the other data set - the Inner data set. Clearly the Inner data set is accessed separately for each data row in the Outer data set. And the number of rows in a data set is termed its Cardinality. Hence the cost formula for a Nested Loop Join is as follows, where "Outer" and "Inner" refer to their respective data sets, and square brackets used to enclose each individual element:
- [Cost of Outer Access] + ([Cardinality of Outer] * [Cost of Inner Access])
Example
Consider the following query using a table with 100,000 rows in it over 1252 * 8 KB blocks:
select it1.i1
from insert_test_1 it1, insert_test_1 it2
where it1.i1 = it2.i4
and it1.i3 = 99
I am using the same table twice in the query, with an alias for each occurrence of it.
There are indexes on each column i.e. on i1, i2 (not used here), i3 and i4. The i1 index is unique, but the others are not.
Remember that the Optimizer starts with the data set with the lowest cardinality (smallest number of data rows). In this case this is "it1", because the "i3" column has 100 rows for each possible value. This gives a cardinality of 100 for "it1" as the Outer data set. The lowest cost access to this is using the index on this, which has a cost of 102 (costed using index range scan described in an earlier post).
To join to "it2" in this case, the index on "i4" can be used. In this case no other data columns are needed from the "it2" table, and thus the cost is really that of reading the index leaf block with that particular value. For this index all entries for one value fit within one leaf block (leaf blocks per key value is 1), so the access cost is just 1 to read that particular leaf block.
The Nested Loop Cost is therefore:-
- 102 + (100 * 1) = 102 + 100 = 202
The 10053 trace file also showed that the Optimizer also costed other variations of the Nested Loop Join. This agrees with the statement that the Optimizer costs all possible access and join plans to determine the cheapest one.
Also considered are a full table scan of it2, at a cost of 243.1. The Nested Loop cost of this should be:
- 102.1 + (100.2 * 243.1) = 102.1 + 24358.2 = 244460.3
No comments:
Post a Comment