Query Paging in 12c

While writing the last post about paging queries (can be found here) I checked the 12c new top-n feature.
The idea behind this feature is to tell Oracle, in native SQL syntax, how many rows we would like to get from the query, and which row to start with. So, if I want to get only part of the result set, I can ask for the first N rows starting with row M. I can also define a percentage instead of row number.

Here is pretty much the full syntax:

SELECT … OFFSET n ROW[S] FETCH {NEXT|FIRST} m [PERCENT] ROW[S] {ONLY|WITH TIES}

So we can offset any number of rows (or eliminate this part altogether) and fetch specific number of rows or percentage of the table while getting exactly the number I wrote or more if rows have the same values (for example, if I wish to get a list of the 5 top salaries and my company has 7 people who earn the same top salary, with the “only” keyword I’ll get 5 of them, with “with ties” keywords I will get all 7).

Here are few examples just to make it clear:

  • 5 newest employees in the company (show only 5):
    select * from emp order by hiredate desc fetch first 5 rows only;
  • 5 newest employees in the company (but of there are few employees that were hired on the same dates, show all of them):
    select * from emp order by hiredate desc fetch first 5 rows with ties;
  • 5 newest employees in the company but skip the newest one:
    select * from emp order by hiredate desc offset 1 rows fetch next 5 rows only;
  • This query returns the 1st decile of the company (10% of the company with the highest salary)
    select * from emp order by sal desc fetch first 10 percent rows only;

I hope you got it.
Now, I’d like to dive into this new syntax. While I was writing the previous post, I noticed that I might have a performance problem. If you haven’t read the post, I got the best performance to get the second 5 employees by their salary using this query:

select *
from (select ename,sal, rownum r
      from (select ename,sal
            from large_emp order by sal desc
           )
      where rownum<=10 ) where r>=6;

When I used it, the optimizer knew that I cut 10 top rows, so the final cardinality is 10 rows, and it also used the index I have on the SAL column. However, it didn’t realize I take only the last 5 of these 10 rows. When I used the new 12c syntax, the optimizer didn’t know anything about the cardinality (and also didn’t use the index, obviously):

SQL> select ename,sal
  2       from large_emp
  3       where sal is not null
  4       order by sal desc
  5       offset 5 rows fetch next 5 rows only;

Execution Plan
----------------------------------------------------------
Plan hash value: 3391400152

----------------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |           | 50000 |  2246K|       |   324   (1)| 00:00:01 |
|*  1 |  VIEW                    |           | 50000 |  2246K|       |   324   (1)| 00:00:01 |
|*  2 |   WINDOW SORT PUSHED RANK|           | 50000 |   537K|   992K|   324   (1)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL     | LARGE_EMP | 50000 |   537K|       |   103   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("from$_subquery$_002"."rowlimit_$$_rownumber"<=CASE WHEN (5>=0) THEN 5
              ELSE 0 END +5 AND "from$_subquery$_002"."rowlimit_$$_rownumber">5)
   2 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("SAL") DESC )<=CASE WHEN (5>=0) THEN 5 ELSE 0 END +5)
   3 - filter("SAL" IS NOT NULL)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        374  consistent gets
          0  physical reads
          0  redo size
        723  bytes sent via SQL*Net to client
        544  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          5  rows processed

As you can see from the predicate information, what the optimizer does, is rewrite of the new syntax into analytic function and filters. To simplify it, the first filter gets all the relevant rows up to the upper limit (first 10 rows in this case, the 5 for offset plus the 5 that I want to get) and the second filter gets only the last 5 out of these 10 rows. This is logically similar to the query above (where I used the rownum), but in this case, the analytic functions seem to make it hard for the optimizer to estimate the correct (or even close) cardinality.

When the optimizer performs the rewrite for a query containing the “with ties” keyword it uses the analytic function RANK instead of ROW_NUMBER, which make sense, since RANK gives the same rank for the same values (salary in my examples). When it rewrites the query containing the “percent” keyword instead of number of rows, it also uses the COUNT analytic function so it can know the number of total rows out of which it needs to calculate the percentage.

In these last two cases, it would be much more difficult for the optimizer to estimate the cardinality, And in any case, it seems that it is difficult for it to estimate the cardinality when we add predicate on the output of analytic functions.

I’m going to the Oracle Open World at the end of this month, I guess I’ll try to talk to the optimizer people about this.

Wait for updates…

Advertisements

3 thoughts on “Query Paging in 12c

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s