awrrpt
)
only provides detailed information on the activity that happened in one
period of time between 2 snapshots. There is no summary report of
activity across all of the snapshots in the AWR. Which means that to
use the standard AWR report you first have to know which period of time
is the important one you should be looking at. So how do you identify
the "most active" time period?The lack of any standard reports or analysis that works on the whole of this data set and summarises it seems to be something of a missed opportunity to me. And something that is vital when analysing historical activity and trying to identify hot spots. So I decided to write my own reports summarising activity between each snapshot over a number of snapshots i.e. over a number of days.
For each snapshot the report lists the change in value of the main activity statistics since the previous snapshot, all on one output line per snapshot, ordered by snapshot time. I can then use this output to determine the most active period (e.g. high SQL executions, high DB time) or the highest wait events (wait time) or something else. Generally I do this by loading the output into Excel (fixed width format data fields) and either glancing down some critical columns of data, or graphing them and looking for spikes. And within Excel I can also enter formulae and derive other statistics e.g. Average Active Sessions from DB Time and Elapsed Time.
Your mileage will vary with my report - you may not want some of the statistics I report, you may want some other ones, or you may want them formatted differently. But you can always edit the SQL and tweak it to your satisfaction. The main benefit is that you get all of the activity history captured in the AWR summarised into key measurements per snapshot. And this can be used to easily identify the busiest period, and also examine bottlenecks that give rise to high wait events.
I won't give all the raw SQL, as it is very long with multiple joins between many AWR tables. What I will do is build up the components of the SQL in stages, so you can build your own customised AWR query. I did create this particular SQL query myself, rather than simply copying someone else's, but I have seen the same concept used by others - for instance Karl Aaro, and it is interesting to see both how other people have pulled out this data (the SQL they use), and which statistics they deem most important and why.
Core Query - List of Snapshots
The core query is one that provides the list of snapshots, and the elapsed time of each one. I use the "WITH
" sub-query clause of the "SELECT
" query to keep each sub-query separate and make the final query simpler and easier to read.This provides a list of snapshot identifiers, which are a foreign key that appears in all other AWR tables, and identifies each snapshot of activity data itself. It also provides some other key data items, which can be important when summarising the data:with snaps as (select csnaps.snap_id , csnaps.dbid , min (csnaps.end_interval_time) end_snap_time , min (csnaps.end_interval_time) - min (csnaps.begin_interval_time) snap_interval from dba_hist_snapshot csnaps group by csnaps.snap_id, csnaps.dbid ) -- snaps
- Database Identifier (dbid). This should be the same for all snapshots, but may not be under some circumstances. It is important to use in joins to other tables to ensure you are only getting activity data for the same database.
- When the snapshot period finished (end_interval_time)
- Snapshot duration (end time minus start time)
min
" aggregate function.Database Time Model
This sub-query provides the set of Database Time Model data statistics per snapshot e.g. DB Time, sql execute elapsed time.As before I group by the snapshot identifier, database identifier, and also statistic name. As the statistics values are cumulative, I am interested in the change in the value between two consecutive snapshots. So I join the Sys Time Model table to itself on the snapshot identifier being one more than the previous snapshot, and subtract the two data values (, systimes as -- One row per Database Time Model with change in value between snapshots (select systime.snap_id , systime.dbid , systime.stat_name , sum (systime.value - psystime.value) value from dba_hist_sys_time_model systime, dba_hist_sys_time_model psystime where systime.snap_id = psystime.snap_id + 1 and systime.dbid = psystime.dbid and systime.instance_number = psystime.instance_number and systime.stat_id = psystime.stat_id -- Assume if stat_id the same so is the stat_name group by systime.snap_id, systime.dbid, systime.stat_name ) -- systimes
value
column in this particular case) to
get the change in value between the two snapshots. In case there is
more than one data value per snapshot, which there will be on a RAC
database, I also join by the instance number and add up the values
together (sum
aggregate function).AWR Summary #1
Combining the two sub-queries together with suitable join conditions gives the following SQL to summarise database activity times between all snapshots:Notes:with snaps as ... , systimes as ... select to_char (snaps.end_snap_time, 'DD/MM/YY HH24:MI') snap_time , extract (second from snaps.snap_interval) + (extract (minute from snaps.snap_interval) + (extract (hour from snaps.snap_interval) + (extract (day from snaps.snap_interval) * 24) ) * 60 ) * 60 snap_duration , dbtime.value / 1000000 db_time , sqlexectime.value / 1000000 sql_exec_time from snaps join (select * from systimes where stat_name = 'DB time') dbtime on snaps.snap_id = dbtime.snap_id and snaps.dbid = dbtime.dbid join (select * from systimes where stat_name = 'sql execute elapsed time') sqlexectime on snaps.snap_id = sqlexectime.snap_id and snaps.dbid = sqlexectime.dbid order by snaps.end_snap_time /
- The "snap_interval" is an
INTERVAL
data type, so to convert to just seconds I need to extract each time field component, multiply it up and add on to the other time components. - The times in the Sys Time Model are in microseconds, so I divide by one million to get the value in seconds.
- The Sys Time Model data is stored as name / value pairs, so I need to query the same data set twice restricting to one named statistic each time. I do this in an inline view within the query.
- The joins between all data sets are on the Snapshot Identifier and the Database Identifier.
- From these data items you can derive other useful numbers e.g. Average Active Sessions is DB Time divided by Snapshot Duration.
WHERE
clause. Here is one I use in SQL*Plus that refers to 2 numbers passed in as arguments when you invoke the script.You can also format the columns nicely in SQL*Plus by pre-defining column formats:where snaps.end_snap_time between (trunc (sysdate) - &1) and (trunc (sysdate) - &1 + &2)
In a future post I'll show some other sub-queries against the AWR tables that can be added in to pull out SQL statement execution statistics, and wait events, and other statistics.col snap_duration format 999,990.9 col db_time format 999,990.99 col sql_exec_time format 999,990.99
No comments:
Post a Comment