You can collect statistics about the physical
storage characteristics and data distribution of an index, table,
column, or cluster, and store them in the data
dictionary; column data distributions can be stored as histograms.
Computation always provides exact values, but it can take longer than
estimation and requires large amounts of temporary tablespace (up to
four times the size of your largest table). Estimation is often much
faster than computation, and the results are usually nearly exact. You
cannot compute or estimate histogram statistics for the following
column types:
* nested tables
* object types
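For example, a histogram on a skewed column can be gathered as shown
below (the table name, column name, and bucket count are illustrative):

   -- Collect a 10-bucket histogram on one column:
   ANALYZE TABLE emp COMPUTE STATISTICS FOR COLUMNS sal SIZE 10;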
Use estimation, rather than computation,
unless you feel you need exact values. Some statistics are always
computed exactly, regardless of whether you specify computation or
estimation. If you choose estimation, and the time saved by estimating
a statistic is negligible, Oracle computes the statistic exactly.
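The two forms look like this (the table name is illustrative; 30
percent is simply a common sample size):

   ANALYZE TABLE emp COMPUTE STATISTICS;
   ANALYZE TABLE emp ESTIMATE STATISTICS SAMPLE 30 PERCENT;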
If the data dictionary already contains
statistics for the analyzed object, Oracle updates the existing
statistics with the new ones.
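You can see what the dictionary currently holds for a schema's tables
with a query like this (the owner name is illustrative):

   SELECT table_name, num_rows, last_analyzed
     FROM dba_tables
    WHERE owner = 'SCOTT';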
Since ANALYZE was introduced, many myths about
its use have been circulated. Some of the more harmful ones are:
* You need to ANALYZE the entire schema.
* ANALYZE does a full row count no matter whether you use COMPUTE or ESTIMATE.
* You can sample 5 to 10 percent and get good results.
* You can sample 50 rows and get good results.
Let's take a quick look at these to determine
if any are valid.
First, the statement that you need to analyze
the entire schema to get good results is not true. You only need to
analyze the tables that have changed. Oracle provides the MONITORING
clause for the CREATE and ALTER TABLE commands, which places a table
into monitored mode; any changes to the table will cause it to be
reanalyzed the next time stale statistics are gathered with DBMS_STATS
(for example, by GATHER_SCHEMA_STATS with the GATHER STALE option).
However, this means that even a table that undergoes only a few
INSERT, UPDATE, and DELETE operations will be analyzed.
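A minimal sketch of that approach (the schema and table names are
illustrative):

   -- Place the table into monitored mode:
   ALTER TABLE scott.emp MONITORING;

   BEGIN
     -- Gather statistics only for monitored objects whose tracked
     -- changes have made their statistics stale:
     DBMS_STATS.GATHER_SCHEMA_STATS(
       ownname => 'SCOTT',
       options => 'GATHER STALE');
   END;
   /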
If the table has several million rows and you change only 10 of them,
there is no need to reanalyze it. Source 4.4 shows a procedure that
can be used to analyze tables based on a percent change in row count.
You may need to comment out the write_out
procedure and subsequent calls to it; I like to track which tables
need analysis using a dba_running_stats table.
CREATE OR REPLACE PROCEDURE check_tables (
  owner_name IN VARCHAR2,
  pchng      IN NUMBER,
  lim_rows   IN NUMBER) AS
  -- Dictionary row counts as of the last ANALYZE (dba_tables assumed).
  CURSOR get_tab_count (own VARCHAR2) IS
    SELECT table_name, NVL(num_rows,1)
      FROM dba_tables
     WHERE owner = UPPER(own);
  tab_name  VARCHAR2(30);
  rows      NUMBER;
  row_count NUMBER;
  -- Logs activity to dba_running_stats; comment this procedure and
  -- its calls out if you do not keep such a tracking table.
  PROCEDURE write_out (
    par_name  IN VARCHAR2,
    par_value IN NUMBER,
    rep_ord   IN NUMBER,
    m_date    IN DATE,
    par_delta IN NUMBER) IS
  BEGIN
    INSERT INTO dba_running_stats VALUES(
      par_name, par_value, rep_ord, m_date, par_delta);
  END write_out;
BEGIN
  OPEN get_tab_count (owner_name);
  LOOP
    FETCH get_tab_count INTO tab_name, rows;
    EXIT WHEN get_tab_count%NOTFOUND;
    IF rows = 0 THEN rows := 1; END IF;
    DBMS_OUTPUT.PUT_LINE('Table name: '||tab_name||' rows: '||TO_CHAR(rows));
    -- Need to have created the get_count procedure in the same schema;
    -- assumed to return the current actual row count via OUT parameter.
    get_count(owner_name||'.'||tab_name, row_count);
    IF row_count = 0 THEN row_count := 1; END IF;
    DBMS_OUTPUT.PUT_LINE('Row count for '||tab_name||': '||TO_CHAR(row_count));
    -- Reanalyze only on a change of more than pchng percent either way.
    IF (row_count/rows) > 1+(pchng/100) OR
       (row_count/rows) < 1-(pchng/100) THEN
      BEGIN
        IF (row_count < lim_rows) THEN
          EXECUTE IMMEDIATE 'ANALYZE TABLE '||owner_name||'.'||
            tab_name||' COMPUTE STATISTICS';
        ELSE
          EXECUTE IMMEDIATE 'ANALYZE TABLE '||owner_name||'.'||
            tab_name||' ESTIMATE STATISTICS SAMPLE 30 PERCENT';
        END IF;
        -- EXECUTE IMMEDIATE closes its cursor when done; if you don't
        -- close the cursors on schemas with many tables you can exceed
        -- open_cursor limits and flood the shared pool.
        DBMS_OUTPUT.PUT_LINE(' Table: '||tab_name||' had to be analyzed.');
        write_out('Table: '||tab_name||' had to be analyzed.',
                  row_count, 1, SYSDATE, row_count-rows);
      EXCEPTION
        WHEN OTHERS THEN
          write_out('Table: '||tab_name||' error during analyze.',
                    row_count, 1, SYSDATE, row_count-rows);
          raise_application_error(-20002,'Error in analyze: '||
            TO_CHAR(SQLCODE)||' on '||tab_name,TRUE);
      END;
    END IF;
  END LOOP;
  CLOSE get_tab_count;
EXCEPTION
  WHEN OTHERS THEN
    IF get_tab_count%ISOPEN THEN CLOSE get_tab_count; END IF;
    RAISE;
END check_tables;
/
Source 4.4 Procedure to conditionally ANALYZE tables.
Notice in the procedure in Source 4.4 that you
specify the schema (owner) whose tables you wish to analyze, the
percent change (I usually use 10), and the number of rows at which to
switch from a COMPUTE to an ESTIMATE SAMPLE 30 PERCENT type of ANALYZE.
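Assuming the supporting objects (get_count and, optionally,
dba_running_stats) exist, a typical call might be (values illustrative):

   SET SERVEROUTPUT ON
   -- Reanalyze SCOTT tables whose row count changed by more than
   -- 10 percent; COMPUTE below 100,000 rows, ESTIMATE above it.
   EXECUTE check_tables('SCOTT', 10, 100000);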
In one environment, using a procedure similar
to the one in Source 4.4 resulted in analysis times dropping from
over four hours to less than an hour.
To test the next myth, that ANALYZE does a
full row count no matter whether you use COMPUTE or ESTIMATE, I ran
various sample sizes. The tests were run first with a fixed row size
and then with a varying row size. Figure 4.3 is a chart of my results
on an Oracle8i (8.1.7) database, and the actual numeric results are
shown in Tables 4.1 and 4.2.
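To repeat the comparison on your own data, you can alternate ANALYZE
sample sizes and capture the resulting dictionary values (the table
name is illustrative):

   ANALYZE TABLE test_tab ESTIMATE STATISTICS SAMPLE 10 PERCENT;
   SELECT num_rows, blocks, avg_row_len
     FROM user_tables WHERE table_name = 'TEST_TAB';

   ANALYZE TABLE test_tab ESTIMATE STATISTICS SAMPLE 30 PERCENT;
   SELECT num_rows, blocks, avg_row_len
     FROM user_tables WHERE table_name = 'TEST_TAB';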
Figure 4.3 Chart of statistics from various sample sizes.
As you can see, the row counts did change with
sample size, as did other key statistics. In general, a sample size of
20 to 30 percent gives the best results on large tables. I don't
suggest sampling based on row counts at all.
Table 4.1 Results of Various Sample Sizes for a Fixed Row Size
Table 4.2 Results of Various Sample Sizes for a Varying Row Size
To understand sampling based on number of
rows, let's look at a graph that shows the results as the row count
increases from 0 to 5,000 rows for the same table. The graph is shown
in Figure 4.4.
Figure 4.4 Graph showing row count for
increasing values of rows sampled.
As you can see, the results vary widely until
a substantial number of rows have been sampled (the graph begins to
stabilize at around 2,800 rows).
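For reference, row-based sampling uses the SAMPLE ... ROWS form; at
the stabilization point above, that would be (table name illustrative):

   ANALYZE TABLE test_tab ESTIMATE STATISTICS SAMPLE 2800 ROWS;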
Based on this research, I suggest using
COMPUTE or ESTIMATE SAMPLE 20-30 PERCENT to ANALYZE your database
tables.