Tuesday 15 October 2013

AWR Summary Reporting #4 - More Waits

Given the previous sub-query for total wait times, including class level waits, from the AWR snapshot data, we are going to extend that sub-query to add some extra wait event data items to it. This will give us a further breakdown of what the major wait events were, and how much time was spent waiting versus doing real work. The upside of these changes is that we will be able to pull some other wait event times out of the AWR snapshots in the same query, for little extra cost or increase in complexity. The downside is that the number of data items being selected for output increases a lot, and this may start to look like the "mother of all database monitoring queries". However, the query itself is not more complex, and the intention is to get as much data as we can from each AWR snapshot in one hit, so we can analyse it later outside Oracle (typically in a spreadsheet).

But before we do that I want to refactor that sub-query to make some of the later additions to it easier and simpler. Remember that we had just one sub-query, with the wait time always being calculated by subtracting the previous snapshot's cumulative wait time from the current snapshot's value. I want to refactor this by pushing this subtraction down into a nested sub-query, so that the rest of the wait sub-query becomes simpler.

So before we had elements like:
, sum (sysevent.time_waited_micro - psysevent.time_waited_micro) all_wait_time
, sum (decode (sysevent.wait_class, 'Commit', (sysevent.time_waited_micro - psysevent.time_waited_micro), 0)) commit_time
The refactor changes this to:
, syswaits as 
-- One row for total wait time, plus break down into major wait classes, and events
(select syswaitevents.snap_id
      , syswaitevents.dbid
      , sum (syswaitevents.time_waited_micro) all_wait_time
      , sum (decode (syswaitevents.wait_class, 'Commit', syswaitevents.time_waited_micro, 0)) commit_time
...
   from 
        (select sysevent.snap_id
              , sysevent.dbid
              , sysevent.wait_class
              , sysevent.event_name
              , sum (sysevent.time_waited_micro - psysevent.time_waited_micro) time_waited_micro
              , sum (sysevent.total_waits - psysevent.total_waits) wait_count
           from dba_hist_system_event sysevent, dba_hist_system_event psysevent
          where sysevent.snap_id = psysevent.snap_id + 1
...
            group by sysevent.snap_id, sysevent.dbid, sysevent.wait_class, sysevent.event_name
        ) syswaitevents
  group by syswaitevents.snap_id
         , syswaitevents.dbid
) -- syswaits
The key points are:
  • The inner sub-query groups by wait class and individual event name, as well as by snapshot and database id, and sums the wait time to this granularity.
  • It also sums the total occurrences of each wait event, for further use in the outer query.
  • The outer sub-query can now just refer to "syswaitevents.time_waited_micro" as the elapsed wait time, as the subtraction between the current and previous snapshot values is done in the inner sub-query.
  • The outer sub-query still does the "decode" of the class name to extract individual sums for each class desired.
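(As an aside, the "..." lines in the sub-query mark where I have omitted the remaining conditions for brevity. For a self-join of "dba_hist_system_event" between consecutive snapshots they would normally include matching predicates along the lines of the sketch below - treat this as illustrative of their shape rather than the exact text of the full script:

            and sysevent.dbid            = psysevent.dbid
            and sysevent.instance_number = psysevent.instance_number
            and sysevent.event_id        = psysevent.event_id

Without predicates of this kind the join would pair up unrelated rows from the two snapshots.)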
Previously I said that there were too many individual wait events to report them all - around 100 of them. However, experienced DBAs tend to run across the same common set of wait events occurring most often on "slow" systems. That is a strong enough argument for extracting and reporting the wait times for those individual wait events that you consider to be the "most common" ones.

The benefit is that if any one of these particular wait event times is very high, then we have already extracted and reported it for analysis and can see it in the output. If all of these wait event times are low, then there was almost no extra cost involved in retrieving them at the same time as the other wait event times (there is no extra disk I/O, only some extra CPU to process the wait time data). It avoids the need to run another query afterwards to drill down into the wait event data, and if the extra wait data turns out to be irrelevant then we can simply ignore it. The mechanism is the same one as for the wait classes, using "decode" against the event name.

The top wait events that I am going to include are listed below. It's not a large list, as I want to focus on the query rather than the specific wait events. You may disagree with some of these, or want to add others of your own; you can do so just by modifying the query I give below and changing the event names used in the "decode" functions.
  • db file sequential read
  • db file scattered read
  • direct path read
  • log file sync
  • log file parallel write
  • gc buffer busy
  • gc cr block busy
As well as the total wait time for each event, we also want to know the number of waits that occurred, so we can calculate the average time per wait. This is important, as it might not be the number of waits that is the problem but the average time per wait. We can achieve this count with the same "sum" and "decode" mechanism, using the wait count from the inner sub-query ("syswaitevents.wait_count") when the wait event name matches, and zero (0) otherwise. This keeps it uniform with the wait time per event.

The only changes to the refactored sub-query given earlier are the addition of these extra data items to the "select" output list:
      , sum (decode (syswaitevents.event_name, 'db file sequential read', syswaitevents.time_waited_micro, 0)) db_file_sqrd_time
      , sum (decode (syswaitevents.event_name, 'db file sequential read', syswaitevents.wait_count, 0)) db_file_sqrd_count
      , sum (decode (syswaitevents.event_name, 'db file scattered read',  syswaitevents.time_waited_micro, 0)) db_file_sctrd_time
      , sum (decode (syswaitevents.event_name, 'db file scattered read',  syswaitevents.wait_count, 0)) db_file_sctrd_count
      , sum (decode (syswaitevents.event_name, 'direct path read',        syswaitevents.time_waited_micro, 0)) direct_path_read_time
      , sum (decode (syswaitevents.event_name, 'direct path read',        syswaitevents.wait_count, 0)) direct_path_read_count
      , sum (decode (syswaitevents.event_name, 'log file sync',           syswaitevents.time_waited_micro, 0)) log_file_sync_time
      , sum (decode (syswaitevents.event_name, 'log file sync',           syswaitevents.wait_count, 0)) log_file_sync_count
      , sum (decode (syswaitevents.event_name, 'log file parallel write', syswaitevents.time_waited_micro, 0)) log_file_prl_wr_time
      , sum (decode (syswaitevents.event_name, 'log file parallel write', syswaitevents.wait_count, 0)) log_file_prl_wr_count
      , sum (decode (syswaitevents.event_name, 'gc buffer busy',          syswaitevents.time_waited_micro, 0)) gc_buff_busy_time
      , sum (decode (syswaitevents.event_name, 'gc buffer busy',          syswaitevents.wait_count, 0)) gc_buff_busy_count
      , sum (decode (syswaitevents.event_name, 'gc cr block busy',        syswaitevents.time_waited_micro, 0)) gc_cr_blk_busy_time
      , sum (decode (syswaitevents.event_name, 'gc cr block busy',        syswaitevents.wait_count, 0)) gc_cr_blk_busy_count
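Tracking an extra event of your own is just more of the same pattern. For instance, to also capture "log file switch completion" (purely an illustration - it is not one of the events in my list above), you would add another pair of lines:

      , sum (decode (syswaitevents.event_name, 'log file switch completion', syswaitevents.time_waited_micro, 0)) log_file_swtch_time
      , sum (decode (syswaitevents.event_name, 'log file switch completion', syswaitevents.wait_count, 0)) log_file_swtch_count

together with a matching pair of columns in the main "select" output list, which is described next.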
Then we need to add these extra columns to the output of the main "select", remembering to divide times in microseconds by one million:
     , syswaits.db_file_sqrd_time     / 1000000 db_file_sqrd_time
     , syswaits.db_file_sctrd_time    / 1000000 db_file_sctrd_time
     , syswaits.direct_path_read_time / 1000000 direct_path_read_time
     , syswaits.log_file_sync_time    / 1000000 log_file_sync_time
     , syswaits.log_file_prl_wr_time  / 1000000 log_file_prl_wr_time
     , syswaits.gc_buff_busy_time     / 1000000 gc_buff_busy_time
     , syswaits.gc_cr_blk_busy_time   / 1000000 gc_cr_blk_busy_time
     , syswaits.db_file_sqrd_count     db_file_sqrd_count
     , syswaits.db_file_sctrd_count    db_file_sctrd_count
     , syswaits.direct_path_read_count direct_path_read_count
     , syswaits.log_file_sync_count    log_file_sync_count
     , syswaits.log_file_prl_wr_count  log_file_prl_wr_count
     , syswaits.gc_buff_busy_count     gc_buff_busy_count
     , syswaits.gc_cr_blk_busy_count   gc_cr_blk_busy_count
Note that I have chosen to keep the times together and the counts together, rather than alternate between them for each event in turn. This particular ordering makes it easier to copy an "average wait time" formula from one cell to another in a spreadsheet, as the referenced time and count columns each shift by the same one-column offset.
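(If you would rather have the query calculate the averages for you instead of deriving them in the spreadsheet, you could add derived columns to the main "select" along these lines - my own sketch, not part of the report as given, with the "case" guarding against division by zero in idle periods and the result converted to milliseconds:

     , case when syswaits.db_file_sqrd_count > 0
            then (syswaits.db_file_sqrd_time / syswaits.db_file_sqrd_count) / 1000
            else 0 end db_file_sqrd_avg_ms

I prefer to keep the query itself lean and leave such derived values to the spreadsheet.)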

With these columns added, the output from my AWR summary report becomes:
SNAP                SNAP       DB SQL EXEC ALL WAIT  BG WAIT   COMMIT  CLUSTER CONCRNCY  NETWORK   SYS IO  USER IO  DBF SEQ DF SCTRD DRCT PTH  LOG FILE LG FL PR  GC BUFF GC CRBLK    DBF SEQ   DF SCTRD   DRCT PTH   LOG FILE   LG FL PR    GC BUFF   GC CRBLK
TIME            DURATION     TIME     TIME     TIME     TIME  WAIT TM  WAIT TM  WAIT TM  WAIT TM  WAIT TM  WAIT TM  RD TIME  RD TIME  RD TIME SYNC TIME  WR TIME BSY TIME BSY TIME     RD CNT     RD CNT     RD CNT   SYNC CNT     WR CNT    BSY CNT    BSY CNT
--------------- -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- -------- --------- -------- -------- -------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
11/09/13 06:00     3,615    1,594    1,292      894      154      104       61        6      482       52      114       85        3       24       104       22        0       18     20,096        880     31,531     45,162     47,067         18      1,390
11/09/13 07:00     3,591    1,638    1,352      907      190       71       38        8      491       98      119       53        9       53        71       49        0        2     13,934      2,584     51,890     47,190     49,327         30        232
11/09/13 08:00     3,589    2,149    1,869    1,324      188       64       70        8      503       86      518      388       14       98        64       41        0        2    199,512      5,722     74,116     49,184     51,518         28        849
11/09/13 09:00     3,640    5,516    5,062    3,344      223       83      382       24      582       94    2,081    1,681       38      218        83       50      146        3    635,825     10,174    147,400     80,669     85,350     65,098      2,136 
11/09/13 10:00     3,558   11,193   10,150    6,625      326      264    1,040      146      670      161    4,007    2,980      227      710       264       87      106       34    744,366     90,168    415,035    129,443    136,572    116,076      8,471
11/09/13 11:00     3,630   16,981   15,578   10,260      486      371    1,294      116      865      255    7,116    5,804      368      495       371      144       14       18  1,346,338    162,856    274,090    191,730    201,510        834      4,778
11/09/13 12:00     3,609   15,954   14,349    9,516      497      508    1,456      143      876      231    5,887    4,979      185      579       508      125       70       68  1,230,115     92,290    317,980    186,878    197,001      3,207      8,931
11/09/13 13:00     3,602   10,800    9,651    6,013      383      278      774       81      747      181    3,665    3,191       87      323       278      102       18       21    954,932     32,045    200,258    173,877    184,167        661      9,443
11/09/13 14:00     3,562    8,532    7,562    4,359      319      203      488       55      698      158    2,566    2,364       25      149       203       92       22       12    553,064      6,607    133,380    159,407    169,369        388      6,391
11/09/13 15:00     3,601   11,486   10,210    6,721      402      374      899      111      751      189    4,102    3,485      157      304       374      109       42       50    814,112     79,733    214,011    172,358    181,824      1,944      7,554
As I said at the beginning, this is a lot more columns of data than before. But the whole point is that we can now cut and paste this into a spreadsheet and do further analysis to identify high values of one form or another. Not only do we have a breakdown of wait time by wait class, we also have the times and counts for the most common wait events.

Previously we saw that from 9am to 10am the major wait time was User I/O, at 4,007 seconds out of 6,625 seconds for all waits. Now we can see that the single wait event of "db file sequential read" accounted for almost 3,000 seconds (2,980) of the 4,000 for User I/O. We can also see that the next longest wait is "direct path read" at 710 seconds. So we now know which wait event contributed the most wait time, and how many times it occurred. This lets us calculate the average "db file sequential read" for that period as 2,980 seconds over 744,366 waits, which is about 0.004 seconds i.e. 4 milliseconds per wait. Depending on your particular situation you may view this as a high average time, or as normal and acceptable. If the average wait time is acceptable, then you would look at the number of such reads and try to find which SQL statements are causing them, and tune those.
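To identify those SQL statements you could follow up with a drill-down query against "dba_hist_sqlstat" for the snapshot of interest. This is only a sketch of the kind of thing I mean (it assumes the standard delta columns in "dba_hist_sqlstat", with ":snap_id" and ":dbid" as bind variables for the snapshot being investigated), and is separate from the summary report itself:

select sql_id, disk_reads
  from (select sqlstat.sql_id
             , sum (sqlstat.disk_reads_delta) disk_reads
          from dba_hist_sqlstat sqlstat
         where sqlstat.snap_id = :snap_id
           and sqlstat.dbid    = :dbid
         group by sqlstat.sql_id
         order by 2 desc)
 where rownum <= 10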
