Haibin's blog

significant unaccounted-for time

leave a comment »

User reported one of their reports ran very slow.  Multipass sort was observered from AWR report, and the sort operation in execution plan took most of time, so the issue seemed related w/the workarea.

Database version: 10.2.0.3.

 

Automatic pga memory management was used for this database, and the pga_aggregate_target was set to 24M only while sga had 2G. Obviously this paramater was set too low, which  could be confirmed from pga advisor too. The performance of this query got improved a lot by increasing the PGA.

However, I was more insterested in the ET unaccounted-for time since lots of ET unacounted-for time was found for this query. From below report, ET unaccounted-for time was two times of  CPU time, and Non_idle wait time was 0, so what led to this significant ET unaccount-for time? Why non idle wait time was 0 if we had to sort to disk?  

Moreover, when this query was running, I didn’t see lots of “direct path read temp” from EM console, which was one of my expectations when Oracle was short of work area and had to sort to disk.

From top activity, I only saw some CPU usage, but no “direct path read temp” at all, so where was the “direct path read temp”?  Did “direct patch read temp” never happen? If it did happen, why it didn’t show up here? 

So I collected the session event statisitcs before and after runing. According to result below,this “direct path read temp” did occur, and the session waited on this event for more than 1 million. But if you checked total waited time, how could Oracle do million reads in less than 2 seconds?  

SQL> select c.name event,
       b.TOTAL_WAITS - a.TOTAL_WAITS TOTAL_WAITS,
       b.TOTAL_TIMEOUTS - a.TOTAL_TIMEOUTS   TOTAL_TIMEOUTS,
       b.TIME_WAITED_MICRO -a .TIME_WAITED_MICRO TIME_WAITED_MICRO
  from run_events a, run_events b, V$EVENT_NAME c
 where a.EVENT_ID = b.EVENT_ID
   and a.runid = 'before'
   and b.runid = 'after'
   and a.event_id=c.event_id
   and b.TOTAL_WAITS - a.TOTAL_WAITS > 0
order by TIME_WAITED_MICRO desc;

EVENT                          TOTAL_WAITS TOTAL_TIMEOUTS TIME_WAITED_MICRO
------------------------------ ----------- -------------- -----------------
SQL*Net message from client              2              0        1740927244
direct path read temp              1216782              0           1842493
direct path write temp               33136              0            112112
SQL*Net more data from client            1              0                23
SQL*Net message to client                2              0                 6

The answer comes from Oracle document for this event(http://docs.oracle.com/cd/B16240_01/doc/doc.102/e16282/oracle_database_help/oracle_database_wait_bottlenecks_direct_path_read_pct.html)

Although it describes the “direct path read”,”direct path read temp” is  similiar. 

For this database, synchronous I/O was used, which also surprised me.

SQL> show parameter filesystemio

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
filesystemio_options                 string      NONE

Thus when the session performs “direct path read temp”, it is blocked, but the session doesn’t wait on “direct path read temp” event at this moement. It shows a a wait on “direct path read temp” when it picks up the data when I/O request is completed,  and this wait will return immediately since no physical I/O is required and the data are ready. Therefore, although Oracle session waits on I/O to be completed, but this time doesn’t get counted as “direct path read temp”. Actually it doesn’t get counted either in any wait event, or in CPU time, so this is the reason we saw significant unaccounted-for time here, and why the total waited time  is less than 2 seconds although we got more than 1 milion “direct path read temp”.

This behaviour can be confirmed from ASH and AWR report also.   

From ASH, the most elapsed time of this sql was spent on CPU+Wait for CPU,but the AWR report for the same time period reported CPU time was only 35% of elapsed time .In ASH the sessions get counted on CPU if they are not known to be on a WAIT event(http://oracledoug.com/serendipity/index.php?/archives/1432-Time-Matters-DB-CPU.html), so even if a proccess is sleeping or in running queue, the time would be calcualted as CPU+Wait for CPU.  However,  AWR gets CPU time from OS level, so only the time actually spending on CPU is counted. This significant unaccounted-for time was the time the process was in sleeping or in run queue. 

About these ads

Written by Haibin Sun

February 6, 2012 at 4:13 PM

Posted in performance

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: