Call (800) 766-1884 for Oracle support & training
Free Oracle Tips


Oracle Consulting Support
Oracle Upgrades
Use New Oracle Features
Oracle Replication Support
Oracle Training
Remote Oracle DBA
System Documentation
Oracle Tips
Oracle Performance
 

Free Oracle Tips


 

HTML Text

BC Oracle tuning

Oracle training

Oracle support

Remote Oracle

 

 

   
 

Using Oracle Parallel Query


Oracle Tips by Burleson
 

There are several Oracle parameters that are set when using Oracle Parallel Query. Many of these are default values and are set by Oracle when your database is created. Oracle Parallel Query can be turned on in several ways. You can turn it on permanently for a table, or you can isolate the Parallel Query to a single table.

Permanent Parallelism

L 12-1

Alter table customer parallel degree 35;

Single Query Parallelism

L 12-2

select /*+ FULL(emp) PARALLEL(emp, 35) */
         emp_name
      from
        emp;

Note the use of the double hints in the preceding query. Most Oracle DBAs always use the FULL hint with the PARALLEL hint because they are both required to use Oracle Parallel Query.

Most Oracle DBAs identify those tables that perform full table scans and then alter those tables to specify the degree of parallelism. This way, all full table scans against the tables will invoke Oracle Parallel Query.

Parallel Query Parameters

There are several important Oracle parameters that have a direct impact on the behavior of Oracle Parallel Query:

  •  sort_area_size (pre Oracle)The higher the value, the more memory is available for individual sorts on each parallel process. Note that the sort_area_size parameter allocates memory for every query on the system that invokes a sort. For example, if a single query needs more memory, and you increase the sort_area_size, all Oracle tasks will allocate the new amount of sort area, regardless of whether they will use all of the space. It is also possible to dynamically change the sort_area_size for a specific session with the alter session command. This technique can be used when a specific transaction requires a larger sort area than the default for the database.  Note that in Oracle, the sort areas are managed by the pga_aggregate_target parameter.

  • parallel_min_serversThis value specifies the minimum number of query servers that will be active on the instance. There are system resources involved in starting a query server, and having the query server started and waiting for requests will accelerate processing. Note that if the actual number of required servers is less than the values of parallel_min_servers, the idle query servers will be consuming unnecessary overhead, and the value should be decreased.

  • parallel_max_serversThis value specifies the maximum number of query servers allowed on the instance. This parameter will prevent Oracle from starting so many query servers that the instance cannot service all of them properly.

  • optimizer_percent_parallelThis parameter defines the amount of parallelism that the optimizer uses in its cost functions. The default of 0 means that the optimizer chooses the best serial plan. A value of 100 means that the optimizer uses each object's degree of parallelism in computing the cost of a full table scan operation.

NOTE: Cost-based optimization will always be used for any query that references an object with a nonzero degree of parallelism. Hence, you should be careful when setting parallelism if your default is optimizer_mode=RULE.

Setting the Optimal Degree of Parallelism

Determining the optimal degree of parallelism for Oracle tasks is not easy. Because of the highly volatile nature of most SMP systems, there is no general rule that will apply to all situations. As you may know, the degree of parallelism is the number of operating system processes that are created by Oracle to service the query.

Oracle states that the optimal degree of parallelism for a query is based on several factors. These factors are presented in their order of importance:

  • The number of CPUs on the server.

  • The number of physical disks that the tables resides on.

  • For parallelizing by partition, the number of partitions that will be accessed, based upon partition pruning (if appropriate).

  • For Parallel DML operations with global index maintenance, the minimum number of transaction freelists among all the global indexes to be updated. The minimum number of transaction freelists for a partitioned global index is the minimum number across all index partitions. This is a requirement in order to prevent self-deadlock.

For example, if your system has 20 CPUs and you issue a Parallel Query on a table that is stored on 15 disk drives, the default degree of parallelism for your query is 15 query servers.

There has been a great deal of debate about what number of parallel processes results in the fastest response time. As a general rule, the optimal degree of parallelism can be safely set to n-1 where n is the number of processors in your SMP or MPP cluster.

In practice, the best method is a trial-and-error approach. When tuning a specific query, the DBA can set the query to force a full table scan and then experiment with different degrees of parallelism until the fastest response time is achieved.

Finding the Number of CPUs on Your Database Server

Sometimes the Oracle DBA does not know the number of CPUs on the database server. The following UNIX commands can be issued to report on the number of CPUs on the database server.

Windows NT

If you are using MS Windows NT, you can find the number of CPUs by entering the Control Panel and choosing the System icon.

Linux

To see the number of CPUs on a Linux server, you can cat the /proc/cpuinfo file. In the following example, we see that our Linux server has four CPUs:

L 12-3

>cat /proc/cpuinfo|grep processor|wc -l
      4

Solaris

In Sun Solaris, the prsinfo command can be used to count the number of CPUs on the processor.

L 12-4

>psrinfo -v|grep "Status of processor"|wc -l
       24

IBM-AIX

The following example is taken from an AIX server, and shows that the server has four CPUs:

L 12-5

>lsdev -C|grep Process|wc –l

       36

HP/UX

In HP UNIX, you can use the glance or top utilities to display the number of CPUs.

NOTE: Parallel hints will often speed up index creation even on uniprocessor machines. This is not because there is more processing power available, but because there is less I/O wait contention with multiple processes. On the other end of the spectrum, we generally see diminishing elapsed time when the degree of parallelism exceeds the number of processors in the cluster.

There are several formulas for computing the optimal parallelism. Oracle provides a formula for computing the optimal parallelism based on the number of CPUs and the number of disks that the file is striped onto. Assume that D is the number of devices that P is striped across (either SQL*Loader striping or OS striping). Assume that C is the number of CPUs available:

L 12-6

         P = ceil(D/max(floor(D/C), 1))

Simply put, the degree of parallelism for a table should generally be the number of devices on which the table is loaded, scaled down so that it isn't too much greater than the number of CPUs. For example, with ten devices and eight CPUs, a good choice for the degree of parallelism is ten. With only four CPUs, a better choice of parallelism might be five.

However, this complex rule is not always suitable for the real world. A better rule for setting the degree of parallelism is to simply use the number of CPUs:

L 12-7

P=(number of CPUs)-1

As a general rule, you can set the degree of parallelism to the number of CPUs on your server, minus one. This is because one processor will be required to handle the Parallel Query coordinator.

Setting Automatic Parallelism

Oracle Parallel Query allows you to control the number of Parallel Query slave processes that service a table. Oracle Parallel Query processes can be seen on the server because background processes will start when the query is serviced. These factotum processes are generally numbered from P000 through Pnnn. For example, if our server is on AIX, we can create a script to gather the optimal degree of parallelism and pass this argument to the SQL.

parallel_query.ksh

L 12-8

#!/bin/ksh
# Get the number of CPUs
num_cpu=`lsdev –C|grep mem|wc –l`
optimal_parallelism=`expr $num_cpu`-1

sqlplus system/manager<<!
select /*+ FULL(employee_table) PARALLEL(employee_table, $optimal_parallelism)*/
employee_name
from
employee_table;
exit
!

Resource Contention and Oracle Parallel Query

There are several sources of contention in Oracle Parallel Query. As we already mentioned, Oracle Parallel Query works best on servers that have multiple CPUs, but we can often see disk contention when the whole table resides on the same physical disk. In short, the use of Oracle Parallel Query can precipitate several external bottlenecks. These include:

  • Overloaded processors This is normally evidenced when the vmstat run queue values exceed the number of CPUs on the server.

  • Disk enqueues When multiple processes compete for data blocks on the same disk, I/O related slowdowns may occur. This is evidenced by high activity from the UNIX iostat utility.

  • Increased RAM usage The parallel sorting feature may increase the demands on the server RAM. Each parallel process can allocate sort_area_size in RAM to manage the sort.

Let's explore things that we can do to prevent contention-related slowdowns when using Oracle Parallel Query. To be most effective, the table should be partitioned onto separate disk devices, such that each process can do I/O against its segment of the table without interfering with the other simultaneous query processes. However, the client/server environment of the 1990s relies on RAID or a logical volume manager (LVM), which scrambles datafiles across disk packs in order to balance the I/O load. Consequently, full utilization of Parallel Query involves “striping” a table across numerous datafiles, each on a separate device. It is also important to note that large contiguous extents can help the query coordinator break up scan operations more efficiently for the query servers. Even if your system uses RAID or a logical volume manager (such as Veritas), there are still some performance gains from using Parallel Query. In addition to using multiple processes to retrieve the table, the Query Manager will also dedicate numerous processes to simultaneously sort the result set (see Figure 12-2).

Figure 12-108: Increase RAM memory demands with Oracle Parallel Query

As we know, a RAM memory overload can cause swapping on the database server. Because of the parallel sorting feature, it is also a good idea to beef up the memory on the processor. We may also see the TEMP tablespace fall short when using Parallel Query and Parallel DML. Here is an example of the error:

L 12-9

SQL> alter session enable parallel dml;

Session altered.

SQL> insert /*+ parallel(customer, 6) */ into customer;
2 select /*+ full(c) parallel(c, 6) */
3 from customer c;

ERROR at line 3:
ORA-12801: error signaled in parallel query server P000
ORA-01652: unable to extend temp segment by 128000 in tablespace CUSTOMER_
TS

Disk Contention with Oracle Parallel Query

Many DBAs are surprised to note that Oracle Parallel Query does not always improve the speed of queries where the whole table resides on a single physical disk.

The data retrieval for a table on a single disk will not be particularly fast, since all of the parallel retrieval processes may be competing for a channel on the same disk. But each sort process has its own sort area (as determined by the sort_area_size parameter before Oracle), so the sorting of the result set will progress very quickly. In addition to full table scans and sorting, the Parallel Query option also allows for parallel processes for merge JOINs and nested loops.

Using Parallel Query Hints

Invoking the Parallel Query with hints requires several steps. The most important is that the execution plan for the query specifies a full table scan. If the output of the execution plan does not indicate a full table scan, the query can be forced to ignore the index by using the FULL hint.

The number of processors dedicated to service an SQL request is ultimately determined by Oracle Query Manager, but the programmer can specify the upper limit on the number of simultaneous processes. When using the cost-based optimizer, the PARALLEL hint can be embedded into the SQL to specify the number of processes. For instance:

L 12-10

select /*+ FULL(employee_table) PARALLEL(employee_table, 35) */
employee_name
from
employee_table
where
emp_type = ''SALARIED'';

If you are using an SMP or MPP database server with many CPUs, you can issue a parallel request and leave it up to each Oracle instance to use its default degree of parallelism. For example:

L 12-11

select /*+ FULL(employee_table) PARALLEL(employee_table, DEFAULT, DEFAULT) */
employee_name
from
employee_table
where
emp_type = ''SALARIED'';

In most cases, it is better for the Oracle DBA to determine the optimal degree of parallelism and then set that degree in the data dictionary with the following command:

L 12-12

Alter table employee_table parallel degree 35;

This way, the DBA can always be sure of the degree of parallelism for any particular table.

Oracle also provides the parallel_automatic_tuning parameter to assist in setting the best degree of parallelism. When setting parallel_automatic_tuning, you only need to specify parallelism for a table, and Oracle will dynamically change the parallel_adaptive_multi_user parameter to override the execution plan in favor of maintaining an acceptable overall load on the database. You should also note that setting parallel_automatic_tuning will cause extra storage in the large pool because Oracle will allocate message buffers from the large pool instead of the shared pool.

Now let's move on to look at how the Oracle DBA can query the behavior of parallel queries.


This is an excerpt from "Oracle High Performance tuning with STATSPACK" by Oracle Press.


If you like Oracle tuning, you may enjoy the new book "Oracle Tuning: The Definitive Reference", over 900 pages of BC's favorite tuning tips & scripts. 

You can buy it direct from the publisher for 30%-off and get instant access to the code depot of Oracle tuning scripts.

  
 

Oracle performance tuning software 
 
 
 
 

Oracle performance tuning book

 

 
 
 
Oracle performance Tuning 10g reference poster
 
 
 
Oracle training in Linux commands
 
Oracle training Excel
 
Oracle training & performance tuning books
 

 

   

Copyright © 1996 -  2014 by Burleson. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation. 

Hit Counter