Oracle ROWNUM vs. SQL Server TOP – Differences & How To Do Pagination In Both

Sometimes we carry knowledge around that can hurt when doing similar things especially when dealing with the different programming languages and constructs. Today, let us see how Oracle’s ROWNUM is different from (as in, almost unrelated) SQL Server’s TOP.

SQL Server – TOP: Basics

Let us take the TOP function in SQL Server. It is very straight-forward. There is no confusion about how it works. When used with the SELECT statement, it returns the TOP x rows based after ordering by what is in “ORDER BY”.

WARNING: SQL Server allows TOP without an ORDER BY. The results are non-deterministic. i.e., for the same query and the same data, the results “could” be different from one run to the next. Always use TOP with ORDER BY.

Simple TOP with ORDER BY

SELECT TOP 10
      object_id, name
FROM sys.columns
ORDER BY name
object_id name
69 addr
56 affinity
95 algorithm
57 alias
7 auid
56 authrealm
56 authtype
76 backuplsn
1019150676 binary_message_body
1051150790 binary_message_body

Simple TOP with ORDER BY DESC

SELECT TOP 10
      object_id, name
FROM sys.columns
ORDER BY name DESC
object_id Name
51 Xtype
50 Xtype
41 Xtype
13 Xtype
149575571 xserver_name
133575514 xserver_name
117575457 xserver_name
91 Xsdid
41 Xmlns
149575571 xfallback_vstart

TOP with ORDER BY multiple columns

SELECT TOP 3
   object_id, name
FROM sys.columns
ORDER BY name DESC, object_id
object_id name
13 xtype
41 xtype
50 xtype

Notice that of the four records with name=”xtype”, the one with object_id=51 is no longer displayed when we said “TOP 3”

TOP PERCENT with ORDER BY multiple columns

SELECT TOP 2 PERCENT
      *
FROM sys.columns
ORDER BY name DESC, object_id
object_id name
13 xtype
41 xtype
50 xtype
51 xtype
117575457 xserver_name
133575514 xserver_name
149575571 xserver_name
91 xsdid
41 xmlns
149575571 xfallback_vstart
133575514 xfallback_low
133575514 xfallback_drive

By the way, SQL Server allows TOP’s usage in DELETE and UPDATES too.

Oracle – ROWNUM: Basics

To understand and use ROWNUM, one has to forget SQL Server’s TOP function. Trying to compare and contrast will only add even more confusion.

ROWNUM in Oracle is a pseudo column that is part of every query result and is not displayed by default.

Simple SELECT with ORDER BY

SELECT column_id, column_name
FROM dba_tab_columns
ORDER BY column_name, column_id
COLUMN_ID COLUMN_NAME
1 A
48 AAD
5 ABA_OS_PID
6 ABA_START_TIME
3 ABA_STATE
4 ABA_STATE_TIME
38 ABIDE_ISS_IND
16 ABORTED_REQUESTS
17 ABORTED_REQUESTS
15 ABORTED_REQUEST_THRESHOLD
…more rows

SELECT with ORDER BY and with ROWNUM column included

SELECT ROWNUM, column_id, column_name
FROM dba_tab_columns
ORDER BY column_name, column_id
ROWNUM COLUMN_ID COLUMN_NAME
654 1 A
33563 48 AAD
48869 5 ABA_OS_PID
48870 6 ABA_START_TIME
48867 3 ABA_STATE
48868 4 ABA_STATE_TIME
32490 38 ABIDE_ISS_IND
3312 16 ABORTED_REQUESTS
5585 17 ABORTED_REQUESTS
3311 15 ABORTED_REQUEST_THRESHOLD
5584 16 ABORTED_REQUEST_THRESHOLD
65082 81 ABORT_STEP
…more rows

ROWNUM seems sort of random in the first few rows displayed because of our ORDER BY but surely the results have ROWNUM values from 1 to the number of rows returned by the SELECT

SELECT with ORDER BY and with ROWNUM column based restriction

SELECT ROWNUM, column_id, column_name
FROM dba_tab_columns
WHERE ROWNUM < 10
ORDER BY column_name, column_id
ROWNUM COLUMN_ID COLUMN_NAME
8 7 CTIME
6 5 DATATS#
3 2 NAME
1 1 OBJ#
5 4 PASSWORD
9 8 PTIME
7 6 TEMPTS#
4 3 TYPE#
2 1 USER#

What happened here? We said “WHERE ROWNUM < 10” but the “WHERE ROWNUM < 10” did not get just the TOP 10 rows as we expected! The column names with “A”, “AAD” are not even in the results.

Oracle’s order of evaluation

To understand why the last query did not get what we wanted, it pays to understand how Oracle comes up with the value for ROWNUM. I am going to copy and paste the sequencing from Tom Kyte’s article which I recommend reading. Below is the excerpt copied from AskTom

select [COLUMNS], ROWNUM
from t
where [WHERE CLAUSE]
group by [COLUMNS]
having [HAVING CLAUSE]
order by [COLUMNS];

Think of it as being processed in this order:
1. The FROM/WHERE clause goes first.
2. ROWNUM is assigned and incremented to each output row from the FROM/WHERE clause.
3. SELECT is applied.
4. GROUP BY is applied.
5. HAVING is applied.
6. ORDER BY is applied.

Did you notice that ROWNUM was assigned to rows before Oracle got to the ORDER BY? Even the HAVING and GROUP BY is being done after the ROWNUM is assigned. Who would have guessed that? What a revelation?

Right way to SELECT with ROWNUM limitation

SELECT * FROM
(
    SELECT ROWNUM, column_id, column_name
    FROM dba_tab_columns
    ORDER BY column_name, column_id
)
WHERE ROWNUM < 10
ROWNUM COLUMN_ID COLUMN_NAME
654 1 A
33563 48 AAD
48869 5 ABA_OS_PID
48870 6 ABA_START_TIME
48867 3 ABA_STATE
48868 4 ABA_STATE_TIME
32490 38 ABIDE_ISS_IND
3312 16 ABORTED_REQUESTS
5585 17 ABORTED_REQUESTS

Now we get the top 10 columns correctly.

If you just wanted any 10 rows after the WHERE condition was satisfied but did not care which 10 rows, then, the ORDER BY does not matter.

Pagination queries in both Oracle and SQL Server:

The above is great to understand but the examples are hardly of practical use. Typical real-life situations call for returning results in pages (web pages) where each page will display x records. It would be a waste of resources to fetch a million rows to only display the first 10 rows and links to the other 999,99 pages corresponding to the remaining 999,990 rows.

To accomplish this, we take the responsibility of assigning each ordered row a number which we will then use to filter down the results.

We use what Oracle calls “analytic functions” and SQL Server calls “Common Table Expressions” or CTE’s for short.

Assigning a row number to each row in ordered result

SELECT
      ROW_NUMBER() OVER (ORDER BY name) as row
      , name
      , object_id
FROM sys.columns
row name object_id
1 addr 69
2 affinity 56
3 algorithm 95
4 alias 57
5 auid 7
6 authrealm 56
7 authtype 56
8 backuplsn 76
9 binary_message_body 1019150676
10 binary_message_body 1051150790
11 binary_message_body 1083150904
12 bitlength 95
13 bitposint 13
14 bitposleaf 13
15 brkrinst 69
16 bstat 56

The ROW_NUMBER() OVER is the key here. The rows are assigned a number after the are ORDERed BY name.

Getting “Page 2 – Records 11 through 20”

If a page is 10 rows, getting the 2nd page is a matter of getting rows 11 through 20

SQL Server sample:

SELECT * FROM (
  SELECT
      ROW_NUMBER() OVER (ORDER BY name) as row1
      , name
      , object_id
  FROM sys.columns
) a
WHERE row BETWEEN 11 AND 20
row1 name object_id
11 binary_message_body 1083150904
12 bitlength 95
13 bitposint 13
14 bitposleaf 13
15 brkrinst 69
16 bstat 56
17 catalog 43
18 category 28
19 cert 46
20 chk 41

Oracle Sample:

SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY column_id, column_name) AS row1
    , column_id
    , column_name
  FROM dba_tab_columns a
) a
WHERE row1 BETWEEN 11 AND 20

Did you notice how we only fetched the second page and that the newly created row column value starts from 11 and ends with 20?

If the table had millions of rows, we successfully managed to keep the I/O and network traffic to a minimum by fetching just what we wanted. After all, the user may have never bothered to see what was beyond the first web-page with the 10 rows!

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s