Oracle PL/SQL – Fastest Way To Execute Code Is? To, Not Execute It! – Function Result Caches – Part I

Before you read on, please check out my related post on “Query Result Cache”. You may find it valuable in this context.

What is the fastest way to execute a PL/SQL function?

If you paid attention to the title, your answer would be “to not execute it at all”. In fact, that is the correct answer and Oracle provides the means to do it. The magic is possible because of an Oracle feature called result cache which was introduced in 10g and enhanced in 11g. The result cache feature extends beyond PL/SQL functions to tables (using RESULT_CACHE annotation) and queries too (using hints) but today, we will get a glimpse of just the function result cache.

What are we talking about?

Before your mind races to ask questions, let me try to put it at ease. In 11g R2 and forward:

  • Yes, the function can be small or big. It just has to be deterministic. Understand right candidates though.
  • Yes, if the result of the function changed due to data change, the cache is invalidated and function reevaluated.
  • No, all the data itself for the underlying tables used by the function is not cached (for this purpose). It is just the results.
  • No, you don’t have to do anything special except mark the function with RESULT_CACHE.
  • Yes, the cached results are global in scope and are available across sessions/users.
  • Yes, it is better than any homegrown solution (including varrays/collections).
  • Yes, most, if not all aspects of caching can be monitored/controlled by the DBA before/during/after caching.
  • No, it does not do real magic. It does have to execute at least once for a distinct set of parameter values (and cache it).

How does the function Result Cache work?

Quite simply, the PL/SQL function Result Cache works by caching the results of functions upon first execution keying them by the parameter values that were passed. For example if the function was passed 100 unique parameter combinations for its 3 parameters, the function would be executed exactly 100 times and the cache will have 100 rows. The function itself may be called a million times, but once cached, the values are fetched from the cache to return the caller. In 11gR2 and above, there is no need to specify dependent objects as was the case in 10g. If the underlying data that the function queries changes, resulting in a return value change, the cache is invalidated and rebuilt.

Let us look at a simple example:

Create a simple table named DIM_SEX_CODES

This table holds the dimension values for SEX (Male/Female etc.!)

CREATE TABLE DIM_SEX_CODES
(
  DIM_SEX_CODE_ID       NUMBER                  NOT NULL,
  CREATE_UPDATE_DATE    DATE                    NOT NULL,
  SEX_CODE              VARCHAR2(10 BYTE)       NOT NULL,
  SEX_CODE_DESCRIPTION  VARCHAR2(100 BYTE)
);

The table holds the dimension values of tracked SEX_CODE’s

Insert into DIM_SEX_CODES
(DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION)
Values
(1, TO_DATE('03/11/2002 16:42:45', 'MM/DD/YYYY HH24:MI:SS'), 'UNKNOWN', 'VALUE IS EITHER NULL OR INVALID');
Insert into DIM_SEX_CODES
(DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION)
Values
(2, TO_DATE('06/10/2002 17:37:43', 'MM/DD/YYYY HH24:MI:SS'), 'C', 'COMBINED(UNISEX)');
Insert into DIM_SEX_CODES
(DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION)
Values
(3, TO_DATE('06/10/2002 17:37:43', 'MM/DD/YYYY HH24:MI:SS'), 'F', 'FEMALE');
Insert into DIM_SEX_CODES
(DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION)
Values
(4, TO_DATE('06/10/2002 17:37:43', 'MM/DD/YYYY HH24:MI:SS'), 'M', 'MALE');
COMMIT;

The table contents look like this now

SELECT *
FROM DIM_SEX_CODES;
DIM_SEX_CODE_ID CREATE_UPDATE_DATE SEX_CODE SEX_CODE_DESCRIPTION
1 3/11/2002 4:42:45 PM UNKNOWN VALUE IS EITHER NULL OR INVALID
2 6/10/2002 5:37:43 PM C COMBINED(UNISEX)
3 6/10/2002 5:37:43 PM F FEMALE
4 6/10/2002 5:37:43 PM M MALE

Now, let us say that our incoming data looks like this:

In the table STAGE_TABLE

  • Each set of data called a dataset with thousands of rows per dataset
  • Sex_Code is a column indicating the SEX of the person represented by each row
  • There are numerous other columns (not shown in this example)

Picking one of the many datasets and group by the dataset_id and sex yields the following results

SELECT dataset_id, sex, count(1)
FROM STAGE_TABLE
WHERE dataset_id = 121436
GROUP BY dataset_id, sex;
DATASET_ID SEX COUNT(1)
121436 F 1273
121436 M 1792

When loading data from the staging table to a data warehouse

  • We need to transform the SEX value into its equivalent dimension
  • This would also be done for the other columns of the above table (not shown in this example)

The function that transforms the SEX value to dimension value is

CREATE OR REPLACE FUNCTION fnc_Get_Dim_Sex_Code_Id
    (
        p_Sex_Code IN VARCHAR2
    )
    RETURN NUMBER
AS
    v_Sex_Code_Id NUMBER := NULL;
BEGIN

    DBMS_OUTPUT.PUT_LINE('Called for ' || p_Sex_Code);

    SELECT MAX(DIM_SEX_CODE_ID)
    INTO v_Sex_Code_Id
    FROM DIM_SEX_CODES
    WHERE sex_code = p_Sex_Code;

    RETURN v_Sex_Code_Id;

END fnc_Get_Dim_Sex_Code_Id;

The SQL to transform values to dimensions would involve

SELECT dataset_id,
    sex,
    fnc_Get_Dim_Sex_Code_Id(sex)
FROM stage_table
WHERE dataset_id = 121436;

Question 1:

  • When the above SELECT is run, how many lines will be output in DBMS_OUTPUT for highlighted PUT_LINE?

The DBMS_OUTPUT will have a total of lines that equal the # of rows returned by the SQL itself!

Called for M
Called for F
Called for M
Called for F

….lot more lines of the same

This clearly tells us that the function is called for every qualifying rows of the SELECT

Optimizing through function result cache:

Now, we are going to request Oracle to cache the results of the function and re-use when by using the RESULT_CACHE keyword (highlighted below)

CREATE OR REPLACE FUNCTION fnc_Get_Dim_Sex_Code_Id
    (
        p_Sex_Code IN VARCHAR2
    )
RETURN NUMBER
RESULT_CACHE
AS
    v_Sex_Code_Id NUMBER := NULL;
BEGIN

    DBMS_OUTPUT.PUT_LINE('Called for ' || p_Sex_Code);

    SELECT MAX(DIM_SEX_CODE_ID)
    INTO v_Sex_Code_Id
    FROM DIM_SEX_CODES
    WHERE sex_code = p_Sex_Code;

    RETURN v_Sex_Code_Id;

END fnc_Get_Dim_Sex_Code_Id;

Question 2:

Clear DBMS_OUPUT before proceeding.

Now, if we run the same SQL, how many lines will be output in the DBMS_OUTPUT?

SELECT dataset_id, sex, fnc_Get_Dim_Sex_Code_Id(sex)
FROM stage_table
WHERE dataset_id = 121436;

Results:

We see the DBMS_OUTPUT exactly twice – once for the value of M and then for F

Called for M
Called for F

What do we infer?

From the output, we infer the following that the function content is executed only once for a specific parameter value

  • Since there are two unique values M & F that will be passed it, it is executed twice and cached

Although we do not infer this, if we query v$result_cache_objects.row_count, we will see two rows with row_count=1 for the function indicating that it is caching just two rows of data

SELECT type, row_count, name, namespace
FROM v$result_cache_objects
ORDER BY creation_timestamp DESC;
TYPE ROW_COUNT NAME NAMESPACE
Result 1 [owner]“.”FNC_GET_DIM_SEX_CODE_ID”::8.”FNC_GET_DIM_SEX_CODE_ID”#8440831613f0f5d3 #1 PLSQL
Result 1 [owner]“.”FNC_GET_DIM_SEX_CODE_ID”::8.”FNC_GET_DIM_SEX_CODE_ID”#8440831613f0f5d3 #1 PLSQL

There are numerous other system views to get meta-data about the result cache. In fact, there is even a package called dbms_result_cache with various procedures and functions to manipulate the cache.

Monitoring the Result Cache

Begin: Copy from Oracle Docs

15.4 Monitoring the Result Cache

To view information about the server and client result caches, query the relevant database views and tables.

Table 15-5 describes the most useful views and tables for monitoring the result cache.

Table 15-5 Views and Tables with Information About the Result Cache

View/Table Description
V$RESULT_CACHE_STATISTICS Lists various server result cache settings and memory usage statistics.
V$RESULT_CACHE_MEMORY Lists all the memory blocks in the server result cache and their corresponding statistics.
V$RESULT_CACHE_OBJECTS Lists all the objects whose results are in the server result cache along with their attributes.
V$RESULT_CACHE_DEPENDENCY Lists the dependency details between the results in the server result cache and dependencies among these results.
CLIENT_RESULT_CACHE_STATS$ Stores cache settings and memory usage statistics for the client result caches obtained from the OCI client processes. This statistics table contains entries for each client process that uses result caching. After the client processes terminate, the database removes their entries from this table. The client table contains information similar to V$RESULT_CACHE_STATISTICS.
DBA_TABLES, USER_TABLES, ALL_TABLES Contains a RESULT_CACHE column that shows the result cache mode annotation for the table. If the table is not annotated, then this column shows DEFAULT. This column applies to both server and client result caches.

End: Copy from Oracle Docs

Summary: 

This was meant to be a quick introduction to the result cache feature. Please understand the feature before using it everywhere. Not all tables/functions/queries may be good candidates. If the results returned will vary widely across the different calls, then that is a good sign that they are not a candidate for Result Caching.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s