Temporary Space
It seems that this is simply the size of the hash table itself, which is in turn the size of the Build data set plus an additional 12 bytes per row. This means we can use part of the cost formula posted earlier to estimate the size in kilobytes:

Build KBytes + (12 * Build Rows / 1000)
- The "TempSpc" value only changes when the Build size changes, never when the Probe size changes
- When additional columns are added to the Build data set in the SELECT, the "TempSpc" value increases by the same amount as the Build size increases (for the same number of rows)
- When the number of rows in the Build data set increases, the "TempSpc" value increases by an additional 11.76 bytes per row, which is almost the 12 bytes per row mentioned before as overhead in the hash table
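The estimate above can be written as a small calculation. This is only an illustrative sketch of the formula as stated; the function name and the sample figures are mine, not Oracle's:

```python
def estimate_tempspc_kb(build_kbytes, build_rows, overhead_bytes=12):
    """Estimate the hash join "TempSpc" value in KB: the Build data set
    size plus roughly 12 bytes of hash table overhead per Build row."""
    return build_kbytes + overhead_bytes * build_rows / 1000

# Hypothetical example: a 50,000 KB Build data set of 400,000 rows
print(estimate_tempspc_kb(50000, 400000))  # 50000 + 4800 = 54800.0
```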
Hash Table Size vs PGA Size
I mentioned in the first post that the Optimizer uses the value of the internal initialization parameter "_smm_min_size" for the memory it can use as the work area for the hash table within the PGA. Thus if the size of the hash table is larger than this, the Optimizer will cost the Hash Join as an overflow to disk operation.
Using the formula posted above for the Temporary Space we can now estimate the size of the hash table needed and compare it to this minimum work area size value. The notes I have on "_smm_min_size" say that it will be 0.1% of the PGA when the size of the PGA is greater than 128 MB, otherwise it will be 128 KB as a minimum. Assuming your PGA is larger than 128 MB, you can calculate what this is yourself. Note also that its value is reported in 10053 trace files along with all other initialization parameters, so you can directly verify its value if you want to.

At run time, though, the PGA may not be heavily used and may have a lot of unused memory available. An executing query can allocate and use more than the minimum work area value at run time for execution plan operations, up to a limit specified by the "_smm_max_size" initialization parameter. For PGAs larger than 1 GB this is set to 10% of the PGA size. For smaller PGAs it is 100 MB, or 20% when the PGA is less than 500 MB.

Knowing how large the hash table needs to be lets you compare it to the maximum possible work area size in your PGA, and so judge the likelihood of the hash table fitting within the PGA at run time. As the ratio between the minimum and maximum possible work area sizes is 100 (0.1% to 10%), this is a large range, and a hash table may have a reasonable chance of fitting in the PGA at run time, depending on how active your system is and the other queries running.
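The minimum and maximum work area rules described above can be sketched as a quick calculation. This is a sketch based on the figures quoted here; the function names are mine, and the actual parameter values should be verified in a 10053 trace file:

```python
def smm_min_size_kb(pga_mb):
    """Minimum work area: 0.1% of the PGA when the PGA is over 128 MB,
    otherwise a floor of 128 KB."""
    if pga_mb > 128:
        return pga_mb * 1024 * 0.001  # 0.1% of the PGA, in KB
    return 128

def smm_max_size_kb(pga_mb):
    """Maximum work area: 10% of the PGA when it is over 1 GB;
    20% of the PGA when it is under 500 MB; otherwise 100 MB."""
    if pga_mb > 1024:
        return pga_mb * 1024 * 0.10
    if pga_mb < 500:
        return pga_mb * 1024 * 0.20
    return 100 * 1024

# Hypothetical example: a 2 GB PGA
pga = 2048
print(smm_min_size_kb(pga))  # ~2,097 KB (about 2 MB) minimum
print(smm_max_size_kb(pga))  # ~209,715 KB (about 200 MB) maximum
```

Comparing the estimated "TempSpc" value against these two figures shows whether a hash table is certain to overflow, certain to fit, or somewhere in between depending on run-time PGA usage.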
Equally, it lets you consider what the benefit might be of increasing the size of the PGA significantly if you have a critical query involving a large hash join operation that is overflowing to disk. That said, there are some ways of rewriting queries that can minimise the size of the data being pulled through into the hash table used by the hash join, and so potentially get a smaller hash table that fits into memory in the PGA. But this depends on multiple factors, mainly the number of columns coming through in the Build data set. If this is only a few columns then there won't be anything that can be saved by such a query rewrite.