Before you read on, please check out my related post on “Query Result Cache”. You may find it valuable in this context.
What is the fastest way to execute a PL/SQL function?
If you paid attention to the title, your answer would be “to not execute it at all”. In fact, that is the correct answer and Oracle provides the means to do it. The magic is possible because of an Oracle feature called result cache which was introduced in 10g and enhanced in 11g. The result cache feature extends beyond PL/SQL functions to tables (using RESULT_CACHE annotation) and queries too (using hints) but today, we will get a glimpse of just the function result cache.
What are we talking about?
Before your mind races to ask questions, let me try to put it at ease. In 11g R2 and forward:
- Yes, the function can be small or big. It just has to be deterministic. Understand right candidates though.
- Yes, if the result of the function changed due to data change, the cache is invalidated and function reevaluated.
- No, all the data itself for the underlying tables used by the function is not cached (for this purpose). It is just the results.
- No, you don’t have to do anything special except mark the function with RESULT_CACHE.
- Yes, the cached results are global in scope and are available across sessions/users.
- Yes, it is better than any homegrown solution (including varrays/collections).
- Yes, most, if not all aspects of caching can be monitored/controlled by the DBA before/during/after caching.
- No, it does not do real magic. It does have to execute at least once for a distinct set of parameter values (and cache it).
How does the function Result Cache work?
Quite simply, the PL/SQL function Result Cache works by caching the results of functions upon first execution keying them by the parameter values that were passed. For example if the function was passed 100 unique parameter combinations for its 3 parameters, the function would be executed exactly 100 times and the cache will have 100 rows. The function itself may be called a million times, but once cached, the values are fetched from the cache to return the caller. In 11gR2 and above, there is no need to specify dependent objects as was the case in 10g. If the underlying data that the function queries changes, resulting in a return value change, the cache is invalidated and rebuilt.
Let us look at a simple example:
Create a simple table named DIM_SEX_CODES
This table holds the dimension values for SEX (Male/Female etc.!)
CREATE TABLE DIM_SEX_CODES ( DIM_SEX_CODE_ID NUMBER NOT NULL, CREATE_UPDATE_DATE DATE NOT NULL, SEX_CODE VARCHAR2(10 BYTE) NOT NULL, SEX_CODE_DESCRIPTION VARCHAR2(100 BYTE) );
The table holds the dimension values of tracked SEX_CODE’s
Insert into DIM_SEX_CODES (DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION) Values (1, TO_DATE('03/11/2002 16:42:45', 'MM/DD/YYYY HH24:MI:SS'), 'UNKNOWN', 'VALUE IS EITHER NULL OR INVALID'); Insert into DIM_SEX_CODES (DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION) Values (2, TO_DATE('06/10/2002 17:37:43', 'MM/DD/YYYY HH24:MI:SS'), 'C', 'COMBINED(UNISEX)'); Insert into DIM_SEX_CODES (DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION) Values (3, TO_DATE('06/10/2002 17:37:43', 'MM/DD/YYYY HH24:MI:SS'), 'F', 'FEMALE'); Insert into DIM_SEX_CODES (DIM_SEX_CODE_ID, CREATE_UPDATE_DATE, SEX_CODE, SEX_CODE_DESCRIPTION) Values (4, TO_DATE('06/10/2002 17:37:43', 'MM/DD/YYYY HH24:MI:SS'), 'M', 'MALE'); COMMIT;
The table contents look like this now
SELECT * FROM DIM_SEX_CODES;
DIM_SEX_CODE_ID | CREATE_UPDATE_DATE | SEX_CODE | SEX_CODE_DESCRIPTION |
1 | 3/11/2002 4:42:45 PM | UNKNOWN | VALUE IS EITHER NULL OR INVALID |
2 | 6/10/2002 5:37:43 PM | C | COMBINED(UNISEX) |
3 | 6/10/2002 5:37:43 PM | F | FEMALE |
4 | 6/10/2002 5:37:43 PM | M | MALE |
Now, let us say that our incoming data looks like this:
In the table STAGE_TABLE
- Each set of data called a dataset with thousands of rows per dataset
- Sex_Code is a column indicating the SEX of the person represented by each row
- There are numerous other columns (not shown in this example)
Picking one of the many datasets and group by the dataset_id and sex yields the following results
SELECT dataset_id, sex, count(1) FROM STAGE_TABLE WHERE dataset_id = 121436 GROUP BY dataset_id, sex;
DATASET_ID | SEX | COUNT(1) |
121436 | F | 1273 |
121436 | M | 1792 |
When loading data from the staging table to a data warehouse
- We need to transform the SEX value into its equivalent dimension
- This would also be done for the other columns of the above table (not shown in this example)
The function that transforms the SEX value to dimension value is
CREATE OR REPLACE FUNCTION fnc_Get_Dim_Sex_Code_Id ( p_Sex_Code IN VARCHAR2 ) RETURN NUMBER AS v_Sex_Code_Id NUMBER := NULL; BEGIN DBMS_OUTPUT.PUT_LINE('Called for ' || p_Sex_Code); SELECT MAX(DIM_SEX_CODE_ID) INTO v_Sex_Code_Id FROM DIM_SEX_CODES WHERE sex_code = p_Sex_Code; RETURN v_Sex_Code_Id; END fnc_Get_Dim_Sex_Code_Id;
The SQL to transform values to dimensions would involve
SELECT dataset_id, sex, fnc_Get_Dim_Sex_Code_Id(sex) FROM stage_table WHERE dataset_id = 121436;
Question 1:
- When the above SELECT is run, how many lines will be output in DBMS_OUTPUT for highlighted PUT_LINE?
The DBMS_OUTPUT will have a total of lines that equal the # of rows returned by the SQL itself!
Called for M Called for F Called for M Called for F ….lot more lines of the same
This clearly tells us that the function is called for every qualifying rows of the SELECT
Optimizing through function result cache:
Now, we are going to request Oracle to cache the results of the function and re-use when by using the RESULT_CACHE keyword (highlighted below)
CREATE OR REPLACE FUNCTION fnc_Get_Dim_Sex_Code_Id ( p_Sex_Code IN VARCHAR2 ) RETURN NUMBER RESULT_CACHE AS v_Sex_Code_Id NUMBER := NULL; BEGIN DBMS_OUTPUT.PUT_LINE('Called for ' || p_Sex_Code); SELECT MAX(DIM_SEX_CODE_ID) INTO v_Sex_Code_Id FROM DIM_SEX_CODES WHERE sex_code = p_Sex_Code; RETURN v_Sex_Code_Id; END fnc_Get_Dim_Sex_Code_Id;
Question 2:
Clear DBMS_OUPUT before proceeding.
Now, if we run the same SQL, how many lines will be output in the DBMS_OUTPUT?
SELECT dataset_id, sex, fnc_Get_Dim_Sex_Code_Id(sex) FROM stage_table WHERE dataset_id = 121436;
Results:
We see the DBMS_OUTPUT exactly twice – once for the value of M and then for F
Called for M Called for F
What do we infer?
From the output, we infer the following that the function content is executed only once for a specific parameter value
- Since there are two unique values M & F that will be passed it, it is executed twice and cached
Although we do not infer this, if we query v$result_cache_objects.row_count, we will see two rows with row_count=1 for the function indicating that it is caching just two rows of data
SELECT type, row_count, name, namespace FROM v$result_cache_objects ORDER BY creation_timestamp DESC;
TYPE | ROW_COUNT | NAME | NAMESPACE |
Result | 1 | “[owner]“.”FNC_GET_DIM_SEX_CODE_ID”::8.”FNC_GET_DIM_SEX_CODE_ID”#8440831613f0f5d3 #1 | PLSQL |
Result | 1 | “[owner]“.”FNC_GET_DIM_SEX_CODE_ID”::8.”FNC_GET_DIM_SEX_CODE_ID”#8440831613f0f5d3 #1 | PLSQL |
There are numerous other system views to get meta-data about the result cache. In fact, there is even a package called dbms_result_cache with various procedures and functions to manipulate the cache.
Monitoring the Result Cache
Begin: Copy from Oracle Docs
15.4 Monitoring the Result CacheTo view information about the server and client result caches, query the relevant database views and tables. Table 15-5 describes the most useful views and tables for monitoring the result cache. Table 15-5 Views and Tables with Information About the Result Cache
|
End: Copy from Oracle Docs
Summary:
This was meant to be a quick introduction to the result cache feature. Please understand the feature before using it everywhere. Not all tables/functions/queries may be good candidates. If the results returned will vary widely across the different calls, then that is a good sign that they are not a candidate for Result Caching.