This is just a summary of the key results from my findings on how Oracle is costing a Hash Join when it overflows to disk, pulling all the conclusions together into one place.
In the first post I arrived at the following formula:
- ( ((Build Columns Size + 12) * Build Row Count) + ((Probe Columns Size + 12) * Probe Row Count) ) * 0.0475
In the second post I showed that the costing was unaffected by the value of the Multi-Block Read Count parameter i.e. the costing was done only using single block disk I/O's.
In the third post I proposed a simpler formula using only values shown in an execution plan of:
- ( (Build KBytes + (12 * Build Rows / 1000) ) + (Probe KBytes + (12 * Probe Rows / 1000) ) ) * 0.0485
In the fourth post I showed that the reported size of the Temporary Space (TempSpc) in the execution plan is actually the size of the hash table itself, being dependent only on the size of the Build data set. And that knowing this hash table size will help you in deciding how much larger it is than the available memory in the PGA used to store such hash tables.