Sometimes we carry knowledge around that can hurt when doing similar things especially when dealing with the different programming languages and constructs. Today, let us see how Oracle’s ROWNUM is different from (as in, almost unrelated) SQL Server’s TOP.
SQL Server – TOP: Basics
Let us take the TOP function in SQL Server. It is very straight-forward. There is no confusion about how it works. When used with the SELECT statement, it returns the TOP x rows based after ordering by what is in “ORDER BY”.
WARNING: SQL Server allows TOP without an ORDER BY. The results are non-deterministic. i.e., for the same query and the same data, the results “could” be different from one run to the next. Always use TOP with ORDER BY.
Simple TOP with ORDER BY
SELECT TOP 10 object_id, name FROM sys.columns ORDER BY name
Simple TOP with ORDER BY DESC
SELECT TOP 10 object_id, name FROM sys.columns ORDER BY name DESC
TOP with ORDER BY multiple columns
SELECT TOP 3 object_id, name FROM sys.columns ORDER BY name DESC, object_id
Notice that of the four records with name=”xtype”, the one with object_id=51 is no longer displayed when we said “TOP 3”
TOP PERCENT with ORDER BY multiple columns
SELECT TOP 2 PERCENT * FROM sys.columns ORDER BY name DESC, object_id
By the way, SQL Server allows TOP’s usage in DELETE and UPDATES too.
Oracle – ROWNUM: Basics
To understand and use ROWNUM, one has to forget SQL Server’s TOP function. Trying to compare and contrast will only add even more confusion.
ROWNUM in Oracle is a pseudo column that is part of every query result and is not displayed by default.
Simple SELECT with ORDER BY
SELECT column_id, column_name FROM dba_tab_columns ORDER BY column_name, column_id
SELECT with ORDER BY and with ROWNUM column included
SELECT ROWNUM, column_id, column_name FROM dba_tab_columns ORDER BY column_name, column_id
ROWNUM seems sort of random in the first few rows displayed because of our ORDER BY but surely the results have ROWNUM values from 1 to the number of rows returned by the SELECT
SELECT with ORDER BY and with ROWNUM column based restriction
SELECT ROWNUM, column_id, column_name FROM dba_tab_columns WHERE ROWNUM < 10 ORDER BY column_name, column_id
What happened here? We said “WHERE ROWNUM < 10” but the “WHERE ROWNUM < 10” did not get just the TOP 10 rows as we expected! The column names with “A”, “AAD” are not even in the results.
Oracle’s order of evaluation
To understand why the last query did not get what we wanted, it pays to understand how Oracle comes up with the value for ROWNUM. I am going to copy and paste the sequencing from Tom Kyte’s article which I recommend reading. Below is the excerpt copied from AskTom
select [COLUMNS], ROWNUM
where [WHERE CLAUSE]
group by [COLUMNS]
having [HAVING CLAUSE]
order by [COLUMNS];
Think of it as being processed in this order:
1. The FROM/WHERE clause goes first.
2. ROWNUM is assigned and incremented to each output row from the FROM/WHERE clause.
3. SELECT is applied.
4. GROUP BY is applied.
5. HAVING is applied.
6. ORDER BY is applied.
Did you notice that ROWNUM was assigned to rows before Oracle got to the ORDER BY? Even the HAVING and GROUP BY is being done after the ROWNUM is assigned. Who would have guessed that? What a revelation?
Right way to SELECT with ROWNUM limitation
SELECT * FROM ( SELECT ROWNUM, column_id, column_name FROM dba_tab_columns ORDER BY column_name, column_id ) WHERE ROWNUM < 10
Now we get the top 10 columns correctly.
If you just wanted any 10 rows after the WHERE condition was satisfied but did not care which 10 rows, then, the ORDER BY does not matter.
Pagination queries in both Oracle and SQL Server:
The above is great to understand but the examples are hardly of practical use. Typical real-life situations call for returning results in pages (web pages) where each page will display x records. It would be a waste of resources to fetch a million rows to only display the first 10 rows and links to the other 999,99 pages corresponding to the remaining 999,990 rows.
To accomplish this, we take the responsibility of assigning each ordered row a number which we will then use to filter down the results.
We use what Oracle calls “analytic functions” and SQL Server calls “Common Table Expressions” or CTE’s for short.
Assigning a row number to each row in ordered result
SELECT ROW_NUMBER() OVER (ORDER BY name) as row , name , object_id FROM sys.columns
The ROW_NUMBER() OVER is the key here. The rows are assigned a number after the are ORDERed BY name.
Getting “Page 2 – Records 11 through 20”
If a page is 10 rows, getting the 2nd page is a matter of getting rows 11 through 20
SQL Server sample:
SELECT * FROM ( SELECT ROW_NUMBER() OVER (ORDER BY name) as row1 , name , object_id FROM sys.columns ) a WHERE row BETWEEN 11 AND 20
SELECT * FROM ( SELECT ROW_NUMBER() OVER (ORDER BY column_id, column_name) AS row1 , column_id , column_name FROM dba_tab_columns a ) a WHERE row1 BETWEEN 11 AND 20
Did you notice how we only fetched the second page and that the newly created row column value starts from 11 and ends with 20?
If the table had millions of rows, we successfully managed to keep the I/O and network traffic to a minimum by fetching just what we wanted. After all, the user may have never bothered to see what was beyond the first web-page with the 10 rows!