Oracle – PL/SQL – Using ROW/TABLE Locks To Implement Application/Job Semaphore To Handle Race Conditions

What is a Semaphore?

Instead of a clinical illustration of exclusive row lock and exclusive table lock usage in Oracle, here we will try to use them to highlight a higher level concept – Implement a semaphore mechanism.

Wikipedia defines semaphore as below:

In computer science, a semaphore is a variable or abstract data type that provides a simple but useful abstraction for controlling access by multiple processes to a common resource in a parallel programming or multi user environment.

A sample requirement

An example is worth a thousand definitions. Let us consider a not-so-typical scenario where there are two jobs that could run at the same time

Job 1 Job 2
Step 1 – Initialize cache if existing data is older than a day Step 1 – Initialize cache if existing data is older than a day
Step 2 – Use cache Step 2 – Use cache
Step 3 – Do some work Step 3 – Do work unique to job 2
Step 4 – more processing Step 4 – More processing unique to job 2

In the illustration above, here are the reasons a semaphore is needed since Job 1 and Job 2 share the cache and access to it needs to be controlled.

These are the requirements:

  • Job 1 and Job 2 should not refresh the cache at the same time
  • If one job is initializing the cache, the other job should wait for the cache to finish initializing
  • If one job is actively using the data in the cache, the other job should wait
  • All of the above should be guaranteed even under race conditions

This solution sounds simple but it will not work due to reasons stated:

  • Keep detailed statuses in a table to drive cache logic/waiting around
  • Unless locks are involved, this will not work due to race conditions
  • Will require your own implementation of a timer to wait around
  • Can lead to inefficient “idling” logic when waiting
  • Too complex implement logic that ensures that statuses don’t get intertwined

A proposed solution:

Without delving too much into the solutions that will not work, let us consider a solution that will work.

Let us consider two key concepts that we will use to build our semaphore

  • Exclusive row lock using “FOR UPDATE
    • We will use this to make sure two jobs do not initialize cache at the same time
    • COMMIT or ROLLBACK with release the lock
    • SELECT is permitted on locked rows (by other sessions)
  • Exclusive table lock using “LOCK TABLE
    • We will use this to make sure that cache that is being read is not cleared by another job
    • EXCLUSIVE lock mode permits queries on the locked table but prohibits any other activity on it
    • COMMIT or ROLLBACK with release the lock

There are several other lock modes that could be used but in our scenario, these are the two we need to guarantee that our requirements are met. The other lock modes may be used depending on your scenario.

The two tables we will use for status maintenance and for data caching are

Status Table Cache Data Table
CREATE TABLE status
(
  SUBJECT                VARCHAR2(50 BYTE),
  STATUS                  VARCHAR2(50 BYTE),
  BEGIN_DATE              DATE,
  END_DATE                DATE,
  DESCRIPTION             VARCHAR2(256 BYTE),
  CREATE_UPDATE_USER      VARCHAR2(50 BYTE),
  CREATE_UPDATE_DATETIME  DATE
);
CREATE TABLE data_cache
(
  OBJECT_NAME             VARCHAR2(50 BYTE),
  CREATED                 DATETIME
);

The solution will look like this:

  1. Exclusively lock row in status table for our “subject” which is “Initialize cache” in step 1
  2. Initialize cache
  3. Release row lock on status table
  4. Exclusively lock cache table
  5. Use cache table for processing
  6. Release table lock on cache table

With this solution, the status table acts more as informational table than a controlling table. The row and table level locks drive our requirement. Even if the status logic is a bit incomplete/inaccurate, that will not affect the logic behind the requirement.

Sample Implementation

Here is a sample implementation of “Step 1” with comments. We built an artificial delay in the population and usage of cache data to simulate real world conditions. You do need to “GRANT EXECUTE ON sys.dbms_lock TO [yourself]” before using it.


--------------------------------------------------------------------------------
CREATE TABLE status
(
  SUBJECT                 VARCHAR2(50 BYTE),
  STATUS                  VARCHAR2(50 BYTE),
  BEGIN_DATE              DATE,
  END_DATE                DATE,
  DESCRIPTION             VARCHAR2(256 BYTE),
  CREATE_UPDATE_USER      VARCHAR2(50 BYTE),
  CREATE_UPDATE_DATETIME  DATE
);

--------------------------------------------------------------------------------
CREATE TABLE data_cache
(
  OBJECT_NAME             VARCHAR2(50 BYTE),
  CREATED                 DATE
);

--------------------------------------------------------------------------------

--Populate the status table
INSERT INTO STATUS (
   SUBJECT, STATUS, BEGIN_DATE,
   END_DATE, DESCRIPTION, CREATE_UPDATE_USER,
   CREATE_UPDATE_DATETIME)
VALUES (
    'INIT_DATA_CACHE',
    NULL,
    NULL,
    NULL,
    'Initialize the data cache with data from source',
    USER,
    SYSDATE);

COMMIT;

--------------------------------------------------------------------------------

--Populate the cache to begin with
INSERT INTO data_cache
SELECT object_name, created
FROM dba_objects
WHERE rownum < 10;

COMMIT;
--------------------------------------------------------------------------------

CREATE OR REPLACE PROCEDURE prc_Init_Cache(p_Subject IN VARCHAR2, p_cache_was_updated OUT VARCHAR2)
AS
    CURSOR c_status IS
    SELECT *
    FROM status s
    WHERE subject = 'INIT_DATA_CACHE'
        AND create_update_datetime < (sysdate - 1)
    FOR UPDATE;
BEGIN
    p_cache_was_updated  := 'N';

    FOR v_rec IN c_status LOOP

        dbms_output.put_line('About to populate cache');
        UPDATE status
        SET status='PROCESSING',
            create_update_datetime=SYSDATE,
            begin_date=SYSDATE
        WHERE CURRENT OF c_status;

        sys.DBMS_LOCK.SLEEP(10);

        dbms_output.put_line('Delete old data from cache');
        DELETE FROM data_cache;

        dbms_output.put_line('Populate cache with new data');
        INSERT INTO data_cache
        SELECT object_name, created
        FROM dba_objects
        WHERE rownum < 10;         dbms_output.put_line('Update status table to indicate cache was updated');         UPDATE status          SET status='COMPLETE',              create_update_datetime=SYSDATE,              end_date=SYSDATE            WHERE CURRENT OF c_status;                  p_cache_was_updated := 'Y';         COMMIT;         dbms_output.put_line('completed refreshing cache.');                              EXIT;     END LOOP;          IF p_cache_was_updated = 'N' THEN         dbms_output.put_line('Skipping cache table refresh since the current cache is less than one day old.');     END IF; EXCEPTION WHEN OTHERS THEN     dbms_output.put_line('Failed in update_caches at ' || to_char(sysdate,'mm/dd/yyyy hh24:mi:ss'));         dbms_output.put_line('Err: ' || sqlerrm);     ROLLBACK;     UPDATE status      SET status='ERROR',          create_update_datetime = SYSDATE,          begin_date = null,         end_date = null               WHERE subject = p_Subject;         COMMIT;              RAISE; END; -------------------------------------------------------------------------------- CREATE OR REPLACE PROCEDURE prc_Use_Cache  AS     BEGIN     dbms_output.put_line('About to use cache');          LOCK TABLE data_cache IN EXCLUSIVE MODE;      sys.DBMS_LOCK.SLEEP(10);          dbms_output.put_line('Completed using cache');          --Commit or rollback is what releases the table lock     COMMIT; END; -------------------------------------------------------------------------------- CREATE OR REPLACE PROCEDURE prc_Age_Out_Cache  AS     BEGIN     --Set the last updated date to be a day old so that we can test the cache initialization     UPDATE status      SET status='COMPLETE',          create_update_datetime=SYSDATE-2,         begin_date = null,          end_date=null               WHERE subject = 'INIT_DATA_CACHE';     COMMIT;  END; -------------------------------------------------------------------------------- CREATE OR REPLACE PROCEDURE prc_Run_Cache_Init AS   v_Subject             VARCHAR2(20) := 'INIT_DATA_CACHE';   v_Cache_Was_Updated   VARCHAR2(1); BEGIN    prc_Init_Cache(p_Subject => v_Subject,
                p_Cache_Was_Updated => v_Cache_Was_Updated);

  COMMIT;

  dbms_output.put_line('Cache was updated = ' ||  v_Cache_Was_Updated);
END;

--------------------------------------------------------------------------------

SET SERVEROUTPUT ON
SET TIME ON
SET TIMING ON

--------------------------------------------------------------------------------

Simultaneous two-session test results:

The results when running the commands shown in two sessions simultaneously are:

Session 1 Session 2 Observation
EXEC prc_Age_Out_Cache;

EXEC prc_Run_Cache_Init;

About to populate cache

Delete old data from cache

Populate cache with new data

Update status table to indicate cache was updated

completed refreshing cache.

Cache was updated = Y

EXEC prc_Run_Cache_Init;

Skipping cache table refresh since the current cache is less than one day old.

Cache was updated = N

When cache was initialized in two sessions simultaneously,  the second session waited while only the first session re-populated the cache
EXEC prc_Use_Cache;

About to use cache

Completed using cache

EXEC prc_Age_Out_Cache;

EXEC prc_Run_Cache_Init;

About to populate cache

Delete old data from cache

Populate cache with new data

Update status table to indicate cache was updated

completed refreshing cache.

Cache was updated = Y

When cache was in use by one session and a second session tried to run the cache initialization, it waited until the first session finished using the cache.
EXEC prc_Use_Cache;

About to use cache

Completed using cache

EXEC prc_Use_Cache;

About to use cache

Completed using cache

When both sessions tried to use the cache simultaneously, one of the session actually waited till the other session released the table lock
EXEC prc_Use_Cache; DELETE FROM data_cache; Session 2 waits till Session 1 finishes and the DELETE completes successfully
EXEC prc_Use_Cache; TRUNCATE TABLE data_cache; Session 2 fails immediately

truncate table data_cache

*

ERROR at line 1:

ORA-00054: resource busy and acquire with NOWAIT specified

EXEC prc_Use_Cache; SELECT * FROM data_cache; SELECT works fine immediately
EXEC prc_Age_Out_Cache;

EXEC prc_Run_Cache_Init;

EXEC prc_Use_Cache; Depending on which one was started first, the other session waits until completion of the one that started first

Building Timeouts in your application:

If you want to build timeouts into your routines, you could do something like this to wait 10 seconds or fail if the row cannot be locked within that time.

CURSOR c_status IS
    SELECT *
    FROM status s
    WHERE subject = 'INIT_DATA_CACHE'
        AND create_update_datetime < (sysdate - 1)
    FOR UPDATE WAIT 10;

However, when you try to replace the hard-coded number 10 (for 10 seconds) with a variable, you would get the error:

CURSOR c_status IS
    SELECT *
    FROM status s
    WHERE subject = 'INIT_DATA_CACHE'
        AND create_update_datetime < (sysdate - 1)
    FOR UPDATE WAIT v_Timeout_Seconds;

PL/SQL: ORA-30005: missing or invalid WAIT interval

The PL/SQL language support for variable based timeout value is lacking. Some innovation is required as shown in this post which uses a dynamic cursor:

CREATE OR REPLACE PROCEDURE Update_test_t( p_c1 in number, p_willing_maxwait in number ) as
     cur_c1 sys_refcursor;
     lv_c1 number;
    begin
      open cur_c1 for 'select * from test_t where c1='||p_c1 ||' for update wait
'||p_willing_maxwait;
      loop
       exit when cur_c1%notfound;
       fetch cur_c1 into lv_c1;
       update test_t set c2=systimestamp where c1=lv_c1;
    end loop;
  exception
    when others then
     dbms_output.put_line('Error '||SQLERRM(SQLCODE));
 end;

A word of caution:

Extensive testing was not done. Please test thoroughly before you use this concept. Also, COMMIT’s at the right places are key to keeping the semaphore mechanism working. If COMMIT’s or ROLLBACK’s are done too soon, the results will not be the same. In Oracle, a COMMIT or ROLLBACK commits or rollsback all pending transactions (unless it is an commit inside an autonomous transaction). These transactions may have been initiated in another procedure altogether but that does not matter.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s