Call (800) 766-1884 for Oracle support & training
Free Oracle Tips


Oracle Consulting Support
Oracle Upgrades
Use New Oracle Features
Oracle Replication Support
Oracle Training
Remote Oracle DBA
System Documentation
Oracle Tips
Oracle Performance
 

Free Oracle Tips


 

HTML Text

BC Oracle tuning

Oracle training

Oracle support

Remote Oracle

 

Oracle Tips by Burleson

Oracle SQL Cost-based Optimizer Index Column Order

I have heard it said that because Oracle will reorder the WHERE clause statements to ensure that the leading edge of an index will be used if possible and, that because the new skip-scan index search method is available, it is not required to try to properly order the columns in a concatenated index. In tests against an Oracle database (9.0.1), I found this assertion, that the order of columns in a concatenated index is not important, to be false and can result in poorly performing queries. This chapter shows the results from some basic tests to try to disprove this statement.

Index Order and Structure

In a standard concatenated B-tree index, the first level will be based on the initial column values, and subsequent levels on the following columns. If the index is properly built, then the mapping from these levels into the source table will be fairly linear; that is, the index clustering factor will be close to the number of dirty blocks in the table. Exhibit 1 shows a properly ordered index in relation to its table.

 

Exhibit 1 - A Well-Ordered Index with a Low Clustering Factor

 

If, on the other hand, an improper order is selected for the columns in the index, then the index will not be a linear match to the source table and a large clustering factor that is closer to the number of rows in the table will result. A large clustering factor results in more table reads to process the same amount of data as a small clustering factor. A poorly ordered index is shown in Exhibit 2.

 

Exhibit 2 - A Poorly Ordered Index with a High Clustering Factor

 

A clustering factor can be equated to the number of table block reads that would be required to perform a full table scan based on reads from the index. A large clustering factor indicates that table blocks would be required to be accessed more than once for a particular index scan operation.

 

The order in which a table’s columns are specified in a concatenated B-tree index can make a significant difference in the indexes ordering, and hence in its efficiency. Let’s examine some examples.

Examples Using Various Column Orders in an Index

When I need a test table, I use the DBA series of views to provide input to a CTAS (create table as select). For this example I will use the dba_objectsview because it has an easy structure to use for indexing. Exhibit 3 shows the CTAS and various statistics for the table TEST created as an image of the dba_objects view.

 

CTAS to Create TEST Table

SQL> create table test as select * from dba_objects;

Table created.

SQL> select count(*) from test;

 

COUNT(*)

----------

   31472

 

SQL> SELECT 'test',COUNT( DISTINCT( SUBSTR(

       dbms_rowid.rowid_to_restricted(ROWID,1),1,8))) blocks

  2  FROM    test;

 

TEST BLOCKS

---- ------

test    434

 

SQL> desc test

 

Name           Null?  Type

-------------- ------ --------------

OWNER                 VARCHAR2(30)

OBJECT_NAME           VARCHAR2(128)

SUBOBJECT_NAME        VARCHAR2(30)

OBJECT_ID             NUMBER

DATA_OBJECT_ID        NUMBER

OBJECT_TYPE           VARCHAR2(18)

CREATED               DATE

LAST_DDL_TIME         DATE

TIMESTAMP             VARCHAR2(19)

STATUS                VARCHAR2(7)

TEMPORARY             VARCHAR2(1)

GENERATED             VARCHAR2(1)

SECONDARY             VARCHAR2(1)

Exhibit 3 CTAS to Create TEST Table

 

Notice the number of blocks and the number of rows (434 and 31472), these will be important once we have our index clustering factors. Now that we have our table, let’s create several indexes using the same columns but placing them in different orders, the columns we will use are the OBJECT_TYPE, OWNER, and OBJECT_NAME. The possible orders for these columns are:

n         OBJECT_TYPE, OWNER, OBJECT_NAME

n         OBJECT_TYPE, OBJECT_NAME, OWNER

n         OBJECT_NAME, OWNER, OBJECT_TYPE

n         OBJECT_NAME, OBJECT_TYPE, OWNER

n         OWNER, OBJECT_NAME, OBJECT_TYPE

n         OWNER, OBJECT_TYPE, OBJECT_NAME

Let’s select a couple of these orders and create some example indexes. Exhibit 4 shows the create statements for these indexes. Also shown in Exhibit 4 are the various clustering factors for the indexes, but let’s pull them all together in one spot and compare them; see Exhibit 5.

 

SQL> create index test_ord1 on

     test(object_type,owner,object_name)

SQL> /

Index created.

 

SQL> analyze index test_ord1 compute statistics;

Index analyzed.

 

SQL> select clustering_factor from user_indexes where

     index_name='TEST_ORD1';

CLUSTERING_FACTOR

-----------------

              925

 

SQL> create index test_ord2 on

     test(object_name,owner,object_type);

Index created.

 

SQL> analyze index test_ord2 compute statistics;

Index analyzed.

 

SQL> select clustering_factor from user_indexes where index_name='TEST_ORD2';

CLUSTERING_FACTOR

-----------------

            25289

 

SQL> create index test_ord3 on

     test(owner,object_name,object_type);

Index created.

 

SQL> analyze index test_ord3 compute statistics;

Index analyzed.

 

SQL> select clustering_factor from user_indexes where

     index_name='TEST_ORD3'

SQL> /

CLUSTERING_FACTOR

-----------------

              451

 

SQL> create index test_ord4 on

     test(owner,object_type,object_name);

Index created.

 

SQL> analyze index test_ord4 compute statistics;

Index analyzed.

 

SQL> select clustering_factor from user_indexes where

     index_name='TEST_ORD4';

CLUSTERING_FACTOR

-----------------

              908

Exhibit 4.Creation of Various Indexes with Different Column Orders

 

SQL> select index_name,

            clustering_factor

     from   user_indexes

     where  index_name like'TEST%';

 

INDEX_NAME              CLUSTERING_FACTOR

----------------------- -----------------

TEST_ORD1                             925

TEST_ORD2                           25289

TEST_ORD3                             451

TEST_ORD4                             908

Exhibit 5.The Clustering Factors for the Indexes

 

As you can see, the clustering factors range from a low of 451 for TEST_ORD3 to a high of 25289 for TEST_ORD2. So, based on the clustering factor TEST_ORD3 is the best ordered index and TEST_ORD2 is the worst ordered index. Let's do some example queries and see how the various indexes fare as far as cost in the cost-based optimizer (CBO).

 

SQL> analyze table test compute statistics;

Table analyzed.

Elapsed: 00:00:04.05

 

Now let’s issue a SELECT that will use one of the indexes, the one chosen by the CBO (see Exhibit 6). As you can see, the CBO used the index with the higher clustering factor based on the right-to-left read of the WHERE clause leading column rule. The overall cost of the operation was 2. Let’s force the optimizer to use a different index and see what the cost becomes. First, let’s use the lowest clustering factor index, TEST_ORD3. We will force the query to use the index through use of the index hint (Exhibit 7).

 

SQL> select count(*) from test

  2  where

  3  owner='DBAUTIL' and object_type='TABLE';

 

COUNT(*)

--------

     338

 

Elapsed: 00:00:00.02

 

Execution Plan

---------------------------------------------

0   SELECT STATEMENT Optimizer=

    CHOOSE (Cost=2 Card=1 Bytes=13)

1 0   SORT (AGGREGATE)

2 1     INDEX (RANGE SCAN) OF 'TEST_

        ORD1' (NON-UNIQUE) (Cost=2

    Card=65 Bytes=845)

 


Statistics

---------------------------------------------

      0 recursive calls

      0 db block gets

      4 consistent gets

      0 physical reads

      0 redo size

    384 bytes sent via SQL*Net to client

    503 bytes received via SQL*Net from client

      2 SQL*Net roundtrips to/from client

      0 sorts (memory)

      0 sorts (disk)

      1 rows processed

Exhibit 6.Issuing a SELECT Using One of the Indexes

 

SQL> select /*+ index(test test_ord3) */

     count(*) from test

  2  where

  3  owner='DBAUTIL' and object_type='TABLE';

 

COUNT(*)

--------

     338

 

Elapsed: 00:00:00.02

 

Execution Plan

---------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

   (Cost=14 Card=1 Bytes=13)

1 0   SORT (AGGREGATE)

2 1     INDEX (RANGE SCAN) OF 'TEST_

        ORD3' (NON-UNIQUE) (Cost=14

    Card=65 Bytes=845)

 


Statistics

---------------------------------------------

     0 recursive calls

     0 db block gets

     8 consistent gets

     0 physical reads

     0 redo size

   384 bytes sent via SQL*Net to client

   503 bytes received via SQL*Net from client

     2 SQL*Net roundtrips to/from client

     0 sorts (memory)

     0 sorts (disk)

     1 rows processed

Exhibit 7.Forcing the Query to Use the Index through Use of the INDEX Hint

 

Note that the cost (14) was seven times the cost of our first query (2). The use of the TEST_ORD3 index resulted in a higher cost because the query had to traverse more of the index since the OWNER column was not the leading column.

 

Let’s examine the results form using the highest clus- tering cost index, TEST_ORD2 (Exhibit 8). Wow! A whopping 110 times higher cost (221) over the lowest cost (2) so far. Have you noticed a quasi-correlation between the consistent gets statistics and the cost factors?

 

SQL>  select /*+ test(test_ord2) */ count(*)

      from test

  2   where

  3*  owner='DBAUTIL' and object_type='TABLE'

SQL> /

 

 

COUNT(*)

--------

     338

 

Elapsed: 00:00:00.02

 


Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

   (Cost=221 Card=1 Bytes=13)

1 0   SORT (AGGREGATE)

2 1     INDEX (FULL SCAN) OF 'TEST_

        ORD2' (NON-UNIQUE) (Cost=221

Card=65 Bytes=845)

 

Statistics

--------------------------------------------

    0 recursive calls

    0 db block gets

  221 consistent gets

    0 physical reads

    0 redo size

  384 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 8. Using TEST_ORD2 to View Results

 

Now look at TEST_ORD4 (Exhibit 9). TEST_ORD4 performs as well as the other two low-cost indexes. Let’s try the query with no index by using the full hint (Exhibit 10) and see what happens.

 

aultdb1>select /*+ index(test test_ord4) */

count(*) from test

  2 where

  3 owner='DBAUTIL' and object_type='TABLE';

 

COUNT(*)

--------

     338

 

Elapsed: 00:00:00.02

 


Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=2 Card=1 Bytes=13)

1 0   SORT (AGGREGATE)

2 1     INDEX (RANGE SCAN) OF 'TEST_

        ORD1' (NON-UNIQUE) (Cost=2

Card=65 Bytes=845)

 

Statistics

--------------------------------------------

    0 recursive calls

    0 db block gets

    4 consistent gets

    0 physical reads

    0 redo size

  384 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 9. TEST_ORD4

 

SQL> select /*+ full(test) */ count(*) from test

  2  where

  3* owner='DBAUTIL' and object_type='TABLE'

SQL> /

 

COUNT(*)

--------

     338

 

Elapsed: 00:00:00.02

 

Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=67 Card=1 Bytes=13)

1 0   SORT (AGGREGATE)

2 1     TABLE ACCESS (FULL) OF 'TEST'

        (Cost=67 Card=65 Bytes=845)

 


Statistics

--------------------------------------------

    0 recursive calls

   36 db block gets

  446 consistent gets

    0 physical reads

    0 redo size

  384 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 10. Using the FULL Hint

 

As you can see, the indexes are making a difference as far as cost. However, if you improperly arranged the index columns as in TEST_ORD2, and you had no other column orders to compare to, from these statistics you would choose to use a full table scan, increasing your cost by up to a factor of 30 based on the best performing index.

 

But what about a situation that requires an index scan? Lets change our query to use a like operation (Exhibit 11) and see what the CBO does with it.

 

SQL> select count(*) from test

  2  where

  3* owner='DBAUTIL' and object_name like 'DBA%'

SQL> /

 

COUNT(*)

--------

       6

 

Elapsed: 00:00:00.02

 


Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=2 Card=1 Bytes=29)

1 0   SORT (AGGREGATE)

2 1     INDEX (RANGE SCAN) OF 'TEST_

        ORD2' (NON-UNIQUE) (Cost=2

        Card=1 Bytes=29)

 

Statistics

--------------------------------------------

    0 recursive calls

    0 db block gets

    5 consistent gets

    0 physical reads

    0 redo size

  383 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 11. Using a LIKE Operation

 

Not what we expected was it? In this case it looks like the CBO chose TEST_ORD2 although it has the highest clustering factor.

 

Let’s look at some of the other indexes and see why. First, our lowest clustering factor index, TEST_ORD3 (Exhibit 12). As you can see, this index also produced a cost of 2, so the optimizer probably chose the TEST_ORD2 index based on column order. Now look at our previous queries’ high performer, TEST_ORD1 (Exhibit 13). There is our 221 cost again, and 221 consistent gets. Obviously, the TEST_ORD1 index would not be a good candidate for this query. How about TEST_ORD4? (See Exhibit .14.)  Again, this demonstrates your columns you can reduce the cost of processing. For TEST_ORD4, we would have increased our cost by a factor of 7 over TEST_ORD2 or TEST_ORD3 by using the column order in TEST_ORD4 and by a factor of 110 using the order from TEST_ORD1. What would the cost be for a full table scan in this case? See Exhibit 15.

 

SQL> select /*+ index(test test_ord3) */

     count(*) from test

  2  where

  3* owner='DBAUTIL' and object_name like 'DBA%'

SQL> /

 

COUNT(*)

--------

       6

 

Elapsed: 00:00:00.02

 

Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=2 Card=1 Bytes=29)

1 0   SORT (AGGREGATE)

2 1     INDEX (RANGE SCAN) OF 'TEST_

        ORD3' (NON-UNIQUE) (Cost=2

        Card=1 Bytes=29)

 

Statistics

--------------------------------------------

    0 recursive calls

    0 db block gets

    2 consistent gets

    0 physical reads

    0 redo size

  383 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 12. TEST_ORD3

 

SQL> select /*+ index(test test_ord1) */

     count(*) from test

  2  where

  3* owner='DBAUTIL' and object_name like 'DBA%'

SQL> /

 

COUNT(*)

--------

       6

 


Elapsed: 00:00:00.02

 

Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=221 Card=1 Bytes=29)

1 0   SORT (AGGREGATE)

2 1     INDEX (FULL SCAN) OF 'TEST_

        ORD1' (NON-UNIQUE) (Cost=221

        Card=1 Bytes=29)

 

Statistics

--------------------------------------------

    0 recursive calls

    0 db block gets

  221 consistent gets

    0 physical reads

    0 redo size

  383 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 13. TEST_ORD1

 

SQL> select /*+ index(test test_ord4) */

     count(*) from test

  2  where

  3* owner='DBAUTIL' and object_name like 'DBA%'

SQL> /

 

COUNT(*)

--------

       6

 

Elapsed: 00:00:00.02

 

Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=14 Card=1 Bytes=29)

1 0   SORT (AGGREGATE)

2 1     INDEX (RANGE SCAN) OF 'TEST_

        ORD4' (NON-UNIQUE) (Cost=14

        Card=1 Bytes=29)

 


Statistics

--------------------------------------------

    0 recursive calls

    0 db block gets

    8 consistent gets

    0 physical reads

    0 redo size

  383 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 14. TEST_ORD4

 

SQL> select /*+ full(test) */ count(*) from test

  2  where

  3* owner='DBAUTIL' and object_name like 'DBA%'

SQL> /

 

COUNT(*)

--------

       6

 

Elapsed: 00:00:00.02

 

Execution Plan

--------------------------------------------

0   SELECT STATEMENT Optimizer=CHOOSE

    (Cost=67 Card=1 Bytes=29)

1 0   SORT (AGGREGATE)

2 1     TABLE ACCESS (FULL) OF 'TEST'

        (Cost=67 Card=1 Bytes=29)

 


Statistics

--------------------------------------------

    0 recursive calls

   36 db block gets

  446 consistent gets

    0 physical reads

    0 redo size

  383 bytes sent via SQL*Net to client

  503 bytes received via SQL*Net from client

    2 SQL*Net roundtrips to/from client

    0 sorts (memory)

    0 sorts (disk)

    1 rows processed

Exhibit 15. Full Table Scan

 

As we could have anticipated, the same as with the other query, but, had we used the column order in TEST_ORD1 for our index, based solely on the results from our analysis of the first query, we would choose a full table scan for this query and again increased our cost by up to a factor of 30 over the best performing indexes.

 

These examples show that the index column order should be analyzed for all of the major queries and a matrix showing cost prepared. Using the matrix in Exhibit 16, the index with the lowest overall average cost for all queries should be used instead of just blindly choosing a column order.

 

Query   TEST_ORD1   TEST_ORD2   TEST_ORD3   TEST_ORD4FTS

------- ---------   ---------   ---------   ------------

Q1          2          221         14           267

Q2        221            2          2          1467

Average   111.5        111.5        8           867

Exhibit 16. Matrix Showing Cost Prepared

 

So, for our indexes, the TEST_ORD3 or TEST_ORD4 column orders are the best choices, assuming that the queries have equal weight in our applications. However, if Q1 was performed thousands of times a day and Q2 only a few, then TEST_ORD1 would be a better choice; while if Q2 were the predominate query, then TEST_ORD2 would be a better choice.

 

All of this goes to show that you must take into consideration the ordering of index columns and the predominant queries that will access those indexes in order to make a logical and rational choice of index column order.

 


The above text is an excerpt from:

Oracle SQL Tuning & CBO Internals
ISBN 0-9745993-3-6

by Kimberly Floss


Download your Oracle scripts now:

www.oracle-script.com

The definitive Oracle Script collection for every Oracle professional DBA

 

  
 

Oracle performance tuning software 
 
 
 
 

Oracle performance tuning book

 

 
 
 
Oracle performance Tuning 10g reference poster
 
 
 
Oracle training in Linux commands
 
Oracle training Excel
 
Oracle training & performance tuning books
 
   

Copyright © 1996 -  2014 by Burleson. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation.