Collecting Statistics


Oracle Tips by Burleson
 

You can collect statistics about the physical storage characteristics and data distribution of an index, table, column, or cluster, and store them as histograms in the data dictionary. Statistics can be either computed or estimated: computation always provides exact values but can take longer than estimation and requires a large amount of temporary tablespace (up to four times the size of your largest table). Estimation is often much faster, and the results are usually nearly exact. You cannot compute or estimate histogram statistics for the following column types:

* REFs

* varrays 

* nested tables

* LOBs

* LONGs

* object types

Use estimation, rather than computation, unless you feel you need exact values. Some statistics are always computed exactly, regardless of whether you specify computation or estimation. If you choose estimation, and the time saved by estimating a statistic is negligible, Oracle computes the statistic exactly.

If the data dictionary already contains statistics for the analyzed object, Oracle updates the existing statistics with the new ones.
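As a concrete illustration, the two collection modes (and a column histogram) look like the following; the table name emp and column sal are placeholders, not objects from the text:

```sql
-- emp and sal are hypothetical names used for illustration only.
ANALYZE TABLE emp COMPUTE STATISTICS;                     -- exact, reads every row
ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 30 PERCENT;  -- sampled, much faster

-- Build a 10-bucket histogram on a single column:
ANALYZE TABLE emp COMPUTE STATISTICS FOR COLUMNS sal SIZE 10;
```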

ANALYZE Myths

Since ANALYZE was introduced, many myths about its use have been circulated. Some of the more harmful ones are:

* You need to ANALYZE the entire schema.

* ANALYZE does a full row count no matter what.

* You can sample 5 to 10 percent and get good results.

* You can sample 50 rows and get good results.

Let's take a quick look at these to determine if any are valid.

First, the claim that you need to analyze the entire schema to get good results is not true. You only need to analyze the tables that have changed. Oracle provides the MONITORING clause for the CREATE TABLE and ALTER TABLE commands, which places a table into monitored mode; changes to the table will result in its being analyzed the next time the DBMS_STATS gathering procedures are run with the GATHER STALE option. However, this means that a table that undergoes only a few INSERT, UPDATE, and DELETE operations will still be analyzed. If the table has several million rows and you change only 10, there is no need to reanalyze it. Source 4.4 shows a procedure that can be used to analyze tables based on a percent change in row count.
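A minimal sketch of the monitoring approach, assuming a hypothetical table emp in the SCOTT schema:

```sql
-- Place a table into monitored mode (emp is a placeholder name):
ALTER TABLE emp MONITORING;

-- Later, gather statistics only for tables flagged as stale:
BEGIN
   DBMS_STATS.GATHER_SCHEMA_STATS(
      ownname => 'SCOTT',
      options => 'GATHER STALE');
END;
/
```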

You may need to comment out the write_out procedure and the subsequent calls to it; I like to track which tables need analysis using a dba_running_stats table.

CREATE OR REPLACE PROCEDURE check_tables (
   owner_name in varchar2,
   pchng IN NUMBER, 
   lim_rows IN NUMBER) AS

CURSOR get_tab_count (own varchar2) IS
        SELECT table_name, nvl(num_rows,1)
        FROM dba_tables
        WHERE owner = upper(own);

tab_name   VARCHAR2(64);
rows       NUMBER;
string     VARCHAR2(255);
cur        INTEGER;
ret        INTEGER;
row_count  NUMBER;
com_string VARCHAR2(255);

PROCEDURE write_out(
  par_name  IN VARCHAR2,


  par_value IN NUMBER,
  rep_ord   IN NUMBER,
  m_date    IN DATE,
  par_delta IN NUMBER) IS
 BEGIN
  INSERT INTO dba_running_stats VALUES(
   par_name,par_value,rep_ord,m_date,par_delta
  );
 END;

BEGIN

-- The next call is needed for schemas with many tables: if you don't close
-- the cached cursors you can exceed the OPEN_CURSORS limit and flood the shared pool.

DBMS_SESSION.SET_CLOSE_CACHED_OPEN_CURSORS(TRUE);
OPEN get_tab_count (owner_name);
LOOP
BEGIN
        FETCH get_tab_count INTO tab_name, rows;
        EXIT WHEN get_tab_count%NOTFOUND;
        tab_name:=owner_name||'.'||tab_name;
        IF rows=0 THEN
          rows:=1;
        END IF;
DBMS_OUTPUT.PUT_LINE('Table name: '||tab_name||' rows: '||to_char(rows));


-- The get_count procedure must already exist in the same schema.

GET_COUNT(tab_name,row_count);
   IF row_count=0 THEN
        row_count:=1;
   END IF;
DBMS_OUTPUT.PUT_LINE('Row count for '||tab_name||': '||to_char(row_count));
DBMS_OUTPUT.PUT_LINE('Ratio: '||to_char(row_count/rows));
        IF (row_count/rows)>1+(pchng/100) OR (rows/row_count)>1+(pchng/100) THEN
               BEGIN
     IF (row_count<lim_rows) THEN
           string :=
'ANALYZE TABLE '||tab_name||' COMPUTE STATISTICS ';
     ELSE
          string :=
'ANALYZE TABLE '||tab_name||' ESTIMATE STATISTICS SAMPLE 30 PERCENT';
     END IF;
           cur := DBMS_SQL.OPEN_CURSOR;
DBMS_OUTPUT.PUT_LINE('Beginning analysis');
           DBMS_SQL.PARSE(cur,string,dbms_sql.v7);
           ret := DBMS_SQL.EXECUTE(cur)  ;
           DBMS_SQL.CLOSE_CURSOR(cur);
           DBMS_OUTPUT.PUT_LINE(' Table: '||tab_name||' had to be analyzed.');
           write_out(' Table: '||tab_name||' had to be analyzed.',
row_count/rows,33,sysdate,0);
           EXCEPTION
           WHEN OTHERS THEN
            write_out(' Table: '||tab_name||' error during analyze. '||
                      to_char(sqlcode), row_count/rows,33,sysdate,0);
            IF dbms_sql.is_open(cur) THEN
              dbms_sql.close_cursor(cur);
            END IF;
            raise_application_error(-20002,'Error in analyze: '||
              to_char(sqlcode)||' on '||tab_name,TRUE);
           END;
        END IF;
EXCEPTION
WHEN others THEN
null;
END;
COMMIT;
END LOOP;
CLOSE get_tab_count;
END;

Source 4.4 Procedure to conditionally ANALYZE tables.
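The listing depends on a GET_COUNT procedure that returns the actual row count of a table; it is not shown in this excerpt. A minimal sketch, assuming it simply counts the rows with dynamic SQL, would be:

```sql
-- Hypothetical helper assumed by check_tables: returns the current
-- row count of the named table via native dynamic SQL.
CREATE OR REPLACE PROCEDURE get_count (
   tab_name  IN  VARCHAR2,
   row_count OUT NUMBER) AS
BEGIN
   EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM '||tab_name INTO row_count;
END;
/
```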

Notice that in the procedure in Source 4.4 you specify the schema (owner) whose tables you wish to analyze, the percent change (I usually use 10), and the number of rows at which to switch from a COMPUTE to an ESTIMATE SAMPLE 30 PERCENT type of ANALYZE.
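A call might look like this; the schema name and thresholds are examples only:

```sql
-- Reanalyze SCOTT's tables whose row counts changed by more than 10 percent,
-- using COMPUTE below 100,000 rows and ESTIMATE SAMPLE 30 PERCENT above it:
EXECUTE check_tables('SCOTT', 10, 100000);
```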

In one environment, a procedure similar to the one in Source 4.4 reduced analysis times from over four hours to less than an hour.

To test the next myth, that ANALYZE does a full row count no matter whether you use COMPUTE or ESTIMATE, I ran various sample sizes. The tests were run first with a fixed row size and then with a varying row size. Figure 4.3 is a chart of my results on an Oracle8i, 8.1.7 database, and the actual numeric results are shown in Tables 4.1 and 4.2.

Figure 4.3 Chart of statistics from various sample sizes.

As you can see, the row counts did change with sample size, as did other key statistics. In general, a sample size of 20 to 30 percent gives the best results on large tables. I don't suggest sampling based on row counts at all.

Table 4.1 Results of Various Sample Sizes for Fixed-Size Row

Table 4.2 Results of Various Sample Sizes for Variable-Size Row

To understand sampling based on number of rows, let's look at a graph that shows the results as the row count increases from 0 to 5,000 rows for the same table. The graph is shown in Figure 4.4.

Figure 4.4 Graph showing row count for increasing values of rows sampled.

As you can see, the results vary widely until a substantial number of rows have been sampled (the graph begins to stabilize at around 2,800 rows).

Based on this research, I suggest using COMPUTE or ESTIMATE SAMPLE 20-30 PERCENT to ANALYZE your database tables.


See Code Depot for Full Scripts


This is an excerpt from a book by Mike Ault, bestselling author of "Oracle 10g Grid and Real Application Clusters".

Copyright © 1996 -  2014 by Burleson. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation.
