Oracle PL/SQL – Quantify Code Performance For Comparison Of Different Approaches – Tom Kyte’s “runstats”

Is my PL/SQL code going to perform better?

I have rewritten my Oracle code. Now, how do I tell if my coding approach A is better than coding approach B?

If someone were to ask you the above question, you would typically give one of the following answers:

  • If one approach is faster than the other, that is the better approach
  • If the “cost” of the SQL statements is lower in one approach than in the other, it is better
  • If the explain plan uses indexes and has no full-table scans, use that approach!
  • ….

The above answers are not necessarily correct, and for various reasons they can be plain wrong. A SQL statement can be faster on its second and subsequent executions because the data may be cached, resulting in logical reads (from memory) rather than physical reads (from disk). The “cost” is a number that does not necessarily translate to actual resource usage. One approach may be faster than another yet cause excessive locking. And full-table scans are good under many circumstances.

What is the right way to measure performance?

Although there are various approaches, the concept is the same: if one approach produces more waits and uses more latches (lightweight locks) than the other, it is worse. It is that simple.

What are counters/waits/latches?

To do most things, a session needs CPU, memory, and I/O. Not every session can have all of these all the time; to keep things orderly, sessions get a slice of each based on Oracle’s internal algorithms. The time a session spends waiting for a resource (an I/O completion, a lock, a latch) is a “wait”. Oracle keeps summary statistics of these at both the system level (v$sysstat) and the session level (v$sesstat). Latches are lightweight locks used for synchronization. Counters (mostly ever-increasing values) record how many times an activity has happened since instance (or session) startup. In combination, wait-event statistics, latches, and counters help us identify where resources are being consumed the most.

Oracle keeps track of all of the above activity in virtual tables known as “Dynamic Performance Views”, which fall into three categories:

  • Current state views – show the current internal state
  • Counter/accumulator views – views tracking activity count since instance/session startup
  • Information views – less volatile views that still provide reasonably current information
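As a quick illustration of the counter views (view and column names are from the standard Oracle data dictionary), the current session’s counters can be read by joining v$mystat, which holds the raw values, to v$statname, which maps each statistic# to a readable name:

-- Show the current session's values for a few key counters.
SELECT sn.name, ms.value
  FROM v$mystat   ms
  JOIN v$statname sn ON sn.statistic# = ms.statistic#
 WHERE sn.name IN ('session logical reads',
                   'physical reads',
                   'recursive calls')
 ORDER BY sn.name;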

Follow this link for good coverage of this subject – Dynamic Performance Views for Tuning

There are over a thousand types of “wait events” that Oracle keeps track of. Knowing what the system or a session is waiting on gives clues as to where the bottleneck is. To get a master list of all tracked wait events, run this SQL:

SELECT name, wait_class
  FROM v$event_name
 ORDER BY name;

To get information about activity between two given timestamps, one would capture the counter values before and after the activity and find the differences in numbers. This is exactly the approach used below.
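A minimal sketch of that before/after technique, using the session-level counter views shown earlier (the table name stats_before is illustrative):

-- Capture the counters before running the code under test:
CREATE TABLE stats_before AS
  SELECT sn.name, ms.value
    FROM v$mystat   ms
    JOIN v$statname sn ON sn.statistic# = ms.statistic#;

-- ... run the code being measured ...

-- Then report only the counters that changed:
SELECT sn.name, ms.value - b.value AS delta
  FROM v$mystat     ms
  JOIN v$statname   sn ON sn.statistic# = ms.statistic#
  JOIN stats_before b  ON b.name = sn.name
 WHERE ms.value - b.value > 0
 ORDER BY delta DESC;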

How to do analysis using waits?

Oracle guru Tom Kyte (AskTom is now run by a team of other people since Tom retired) wrote a package named “runstats” (the original download is no longer available) that measures performance as shown below. It uses the view v$mystat (statistics for the current session) to measure waits and counters, and v$latch to measure latches.

a. Run runstats_pkg.rs_start – saves the current counter values for wait stats/latches
b. Run “Approach A”
c. Run runstats_pkg.rs_middle – saves the counter values again, marking the end of Approach A
d. Run “Approach B”
e. Run runstats_pkg.rs_stop – reports the difference between “Approach A” and “Approach B” in terms of wait events and latches
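Put together, a typical SQL*Plus test harness looks like the sketch below. The procedure names insert_row_by_row and insert_bulk are hypothetical stand-ins for the two approaches being compared:

exec runstats_pkg.rs_start;        -- snapshot counters/latches

exec insert_row_by_row;            -- "Approach A" (hypothetical procedure)

exec runstats_pkg.rs_middle;       -- snapshot again

exec insert_bulk;                  -- "Approach B" (hypothetical procedure)

exec runstats_pkg.rs_stop(1000);   -- report only differences of 1000 or more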

At this point, the approach with fewer waits and latches is the better one. I have used the output from a question on AskTom to illustrate the difference.

ops$tkyte@ORA9IR2> exec runstats_pkg.rs_stop(1000); /* only show those things that differ
by at least 1000 */

Run1 ran in 114 hsecs
Run2 ran in 71 hsecs
run 1 ran in 160.56% of the time

Name                                  Run1        Run2        Diff
LATCH.simulator hash latch           4,992       2,816      -2,176
STAT...recursive calls              27,953         284     -27,669
STAT...session logical reads       151,886     115,460     -36,426
STAT...consistent gets             151,876     115,445     -36,431
STAT...no work - consistent re      86,583      50,149     -36,434
STAT...buffer is pinned count       49,119      85,760      36,641
STAT...buffer is not pinned co     129,219      65,117     -64,102
LATCH.cache buffers chains         301,724     228,847     -72,877
STAT...session pga memory                0     131,072     131,072

Run1 latches total versus runs -- difference and pct
Run1        Run2        Diff       Pct
308,200     233,149     -75,051    132.19%

PL/SQL procedure successfully completed.

In the above example, it is clear that Run2 is not only faster but also waits less and uses far fewer latches.

My goal was to introduce you to a good way of quantifying the performance difference between two approaches, and I have shown you Tom Kyte’s way of doing it, as he is the expert whose advice I look up to and trust. Follow the runstats documentation to get it set up and running.


When to use the above approach?

The above approach is a good way to measure performance for code in one or more blocks that may involve lots of DML and potentially DDL. For a single SQL statement, a good approach is still to use SQL*Plus autotrace, possibly following up with additional research if there are no obvious explain-plan/statistics differences.

Turn on autotrace with:

set autotrace traceonly;

Use the explain plan and statistics output to determine whether a SQL is better before or after changes.

SQL> set autotrace traceonly
SQL> select count(1) from comp_dataset;

Execution Plan
Plan hash value: 762842054

----------------------------------------------------------------------------
| Id  | Operation        | Name            | Rows  | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                 |     1 |     7   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE  |                 |     1 |            |          |
|   2 |   INDEX FULL SCAN| COMP_DATASET_PK | 70207 |     7   (0)| 00:00:01 |
----------------------------------------------------------------------------

          0  recursive calls
          0  db block gets
         72  consistent gets
         72  physical reads
          0  redo size
        348  bytes sent via SQL*Net to client
        363  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed


For example, if “Approach A” produced the above statistics and “Approach B” produced statistics whose “physical reads” are much larger than A, then the choice is clear. Alternatively, if the explain plan showed considerable differences in “rows”, that would also be a key indicator.

References: Dynamic Performance Views for Tuning

