Frequency of Statistics Gathering
Oracle Tips by Burleson
 

There is a great debate within the Oracle community about how frequently statistics should be reanalyzed and about the sample size that should be used when gathering them (Figure 14-2). Some DBAs subscribe to the notion that the CBO statistics should be reanalyzed very frequently to ensure optimal statistics and better CBO execution plans. Others subscribe to the notion that statistics should be recomputed only when the basic nature of the table data changes.

Figure 14-2: Statistics gathering frequency versus sample size

Along the other dimension, we see debate regarding the sample size that should be used when computing statistics. We have two options available within the analyze table command: the compute statistics option and the estimate statistics syntax. When we issue a compute statistics command, the entire table is examined via a full-table scan, and very accurate statistics are placed in the data dictionary for use by the cost-based optimizer. With the estimate statistics syntax, samples are taken from the table, and the sampled statistics are stored in the data dictionary. At the heart of this issue is the time and database resource consumption required to reanalyze all of the tables.
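To make the two options concrete, here is what each form of the command looks like. The customer table name is only an illustration; substitute your own table:

analyze table customer compute statistics;
analyze table customer estimate statistics sample 5000 rows;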

As you might remember from your college statistics classes, a sample size greater than 40 gives averages that are within two standard deviations of the mean for the population being sampled. Within Oracle, however, there still exists a great deal of debate about the trade-off between taking larger row samples and the additional quality of the resulting statistics (Figure 14-3). Some people argue that a sample of 50 rows from each table gives excellent statistics for driving the cost-based optimizer, while other DBAs insist that a sample of at least 5,000 rows is required for many tables in order to get accurate statistics. Remember, the decisions of the cost-based optimizer are only as good as the statistics that are fed to it.

Figure 14-3: The sample size as a function of the quality of the estimate

Essentially, we see the debate regarding cost-based optimization falling along two dimensions: the frequency of analysis and the depth of the analysis. We also have to remember that if we plan to use optimizer plan stability on all of our queries, the statistics are meaningless to the cost-based optimizer, because the execution plan has been predetermined and saved in a stored outline.
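For reference, a stored outline is created with the create outline statement. The sketch below is illustrative only; the customer table, the query, and the prod_queries category are assumptions, not part of this example database:

create or replace outline cust_outline
for category prod_queries
on
select customer_name from customer where customer_id = 123;

Once outlines exist, they can be activated with alter session set use_stored_outlines = prod_queries, and from that point the saved execution plan, not the current statistics, drives the query.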

Computing statistics for tables and indexes can be a very time-consuming operation, especially for data warehouses and other systems that contain many gigabytes or terabytes of information. In the real world, most Oracle professionals use the estimate statistics clause, sample a meaningful percentage of their data, and choose a reanalysis frequency that coincides with any scheduled changes to the database that might affect the distribution of values. For example, if a shop purges its transaction tables each month, the period immediately following the purge would be a good time to reanalyze the CBO statistics.
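Note that the estimate statistics clause also accepts a percentage rather than a fixed row count, which is often the easier way to express a meaningful sample across tables of widely varying sizes. Again, the table name is only an illustration:

analyze table customer estimate statistics sample 20 percent;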

Regardless of your philosophy, let’s take a look at how you can automate the reanalysis of your CBO statistics.

Gathering Statistics for the CBO

It is important that the statistics are refreshed periodically, especially when the distribution of data changes frequently. For this reason, the following SQL may be used to generate and run the analyze statements for all of the tables and indexes. It is not always a good idea to use Oracle's dbms_utility.analyze_schema or dbms_ddl.analyze_object packages to perform this task, since a failure on one of the statements can affect the results of subsequent analyze statements. When working with databases that do not employ stored outlines, I use the following script to generate and execute the proper SQL analyze syntax.

analyze.ksh

#!/bin/ksh

# First, we must set the environment . . . .
ORACLE_SID=$ORACLE_SID
export ORACLE_SID
# The line below is for Solaris databases.  Otherwise, use /etc/oratab
ORACLE_HOME=`cat /var/opt/oracle/oratab|grep ^$ORACLE_SID:|cut -f2 -d':'`
export ORACLE_HOME
PATH=$ORACLE_HOME/bin:$PATH
export PATH

$ORACLE_HOME/bin/sqlplus /<<!

set pages 999
set heading off
set echo off
set feedback off

connect internal;

spool /export/home/oracle/analyze.sql;

-- Generate an analyze command for every application table
select
'analyze table '||owner||'.'||table_name||' estimate statistics sample 5000 rows;'
from dba_tables
where owner not in ('SYS','SYSTEM','PERFSTAT');

-- Generate an analyze command for every application index
select
'analyze index '||owner||'.'||index_name||' compute statistics;'
from dba_indexes
where owner not in ('SYS','SYSTEM','PERFSTAT');

spool off;

set echo on
set feedback on

@/export/home/oracle/analyze

exit
!

Most shops schedule a script like this to run weekly, or whenever there have been significant changes to the table data. However, it is not necessary to reanalyze tables and indexes that remain relatively constant. For example, a database in which the tables have a constant number of rows and the distribution of index values remains constant will not benefit from frequent reanalysis. Let's continue with an overview of the important issues of SQL tuning with the cost-based optimizer.
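For completeness, a weekly run is normally scheduled through cron. The entry below is only a sketch; the script path and log destination are assumptions that must match your own installation:

# Run the analyze script every Sunday at 6:00 a.m., logging all output
00 6 * * 0 /export/home/oracle/analyze.ksh > /export/home/oracle/analyze.log 2>&1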


This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald K. Burleson, published by Oracle Press.

  
 
