Technical Blog

Jul 13, 2011

UNDERSTANDING EXPLAIN PLAN 4

Set Operations

CONCATENATION

Merges multiple result sets in the same way as in an explicit UNION statement. This typically occurs when an Or statement is used with indexed columns.

CONCATENATION performs a UNION ALL (a UNION without elimination of duplicate values) of result sets.

Example

select Name, City, State from COMPANY where State = ‘TX’ and City in (‘Houston’, ‘Austin’, ‘Dallas’);

This query selects from the COMPANY Table based on two criteria in the WHERE clause. The criteria can use the COMPANY$CITY index and the COMPANY$STATE index to obtain ROWIDs for the rows to be returned. Since the NAME column is required by the query, the COMPANY table will have to be accessed; the other column values are already available via the index searches.

The query of the CITY column uses an IN clause, which is functionally equivalent to an OR clause. The query could be rewritten as:

select Name, City, State from COMPANY where State = ‘TX’ and (City = ‘Houston’
or City = ‘Austin’ or City = ‘Dallas’);

Taking this expansion one step further, the STATE portion of the query can be placed into each section of the OR clause. The revised query would now read:

select Name, City, State from COMPANY where (State = ‘TX’ and City = ‘Houston’)
or (State = ‘TX’ and City = ‘Austin’) or (State = ‘TX’ and City = ‘Dallas’);

This query helps with understanding the plan that is generated.

Execution Plan

CONCATENATION
TABLE ACCESS BY ROWID COMPANY
AND-EQUAL
INDEX RANGE SCAN COMPANY$CITY
INDEX RANGE SCAN COMPANY$STATE
TABLE ACCESS BY ROWID COMPANY
AND-EQUAL
INDEX RANGE SCAN COMPANY$CITY
INDEX RANGE SCAN COMPANY$STATE
TABLE ACCESS BY ROWID COMPANY
AND-EQUAL
INDEX RANGE SCAN COMPANY$CITY
INDEX RANGE SCAN COMPANY$STATE

Interpreting the Execution Plan

The Execution Plan shows that the query is executed as if the IN clause is rewritten as an OR clause, and the other criteria are placed within the OR clauses. Within each OR clause, an AND-EQUAL operation is performed to merge the lists of ROWIDs returned from the index scans. The ROWIDs returned by the AND-EQUAL operations are then used to select the requested columns from the COMPANY table via a TABLE ACCESS BY ROWID operation. The resulting records from each part of the query are then concatenated to carry out the OR clause.

Note: When the query is very complex, the Optimizer may decide not to use the CONCATENATION operation. Instead, it will use partial index range scans. If you want to force the use of the CONCATENATION operation, you may need to use the format shown in the last example query.

INTERSECTION

Compares two result sets and returns only rows common to both. This operation usually only takes place as a result of an explicit use of the INTERSECTION clause.

Set operation.

INTERSECTION is used to merge sets of records returned by multiple queries; in this sense, it is analogous to the index record merge performed by AND-EQUAL. INTERSECTION is used when the INTERSECT clause is used in a query.

Most INTERSECT queries should be rewritten to use joins instead, to improve their ability to use row operations instead of set operations. The following example could be rewritten to use a join, but for purposes of illustration is shown using an INTERSECT clause.

Example

select Company_ID from COMPANY where State = ‘AZ’ and Parent_Company_ID is null
INTERSECT
select Company_ID from COMPETITOR;

The query in the previous listing will select all COMPANY_IDs from the COMPANY table whose state value is ‘AZ’, whose Parent_Company_ID value is null, and whose Company_ID value also is present in the COMPETITOR table.

Execution Plan

PROJECTION
INTERSECTION
SORT UNIQUE
TABLE ACCESS BY ROWID COMPANY
INDEX RANGE SCAN COMPANY$STATE
SORT UNIQUE
TABLE ACCESS FULL COMPETITOR

Interpreting the Explain Plan

The Execution Plan shows the INTERSECTION operation, along with two operations described later: SORT UNIQUE and PROJECTION.

The Explain Plan shows that each of the queries is executed separately, and the results of the two queries are returned to the INTERSECTION operation. In the first (top) query, the index on the State column is used, and the ROWID values returned from that index scan are used to select records from the COMPANY table. During the TABLE ACCESS BY ROWID on the COMPANY table, the Parent_Company_ID is null criteria is applied (via an implicit FILTER operation).

The second (bottom) query does not have any WHERE clause criteria, so a TABLE ACCESS FULL (full table scan) of the COMPETITOR table is performed. The results of each query are then sorted separately via the SORT UNIQUE operations—only unique Company_ID values will be passed on to the INTERSECTION operation.

The INTERSECTION operation takes the sorted rows from the two queries and returns to the user the rows that are common to both queries via the PROJECTION operation (which makes the two result sets appear as one).

Advanced INTERSECTION Example

A second example of INTERSECTION involves the CONNECT BY operation. Assume that you want to add a row into the COMPANY table, with a Company_ID =10 and Parent_Company_ID =5. How can you be sure that there will be no loops in the hierarchical structure of the data in the COMPANY table? The query in the following listing checks for the intersection of two searches - one up the tree, and one down the tree. If the intersection of these two queries returns a row, then there is a loop in the hierarchy of the Company_ID values.

Example

select Company_ID, Name from COMPANY where State = ‘VA’
connect by Parent_Company_ID = prior Company_ID /*down the tree*/
start with Company_ID = 10 INTERSECTION
select Company_ID, Name from COMPANY where State = ‘VA’
connect by Company_ID = prior Parent_Company_ID /*up the tree*/
start with Company_ID = 5;

If the query in the preceding listing does not return any rows, then there will be no loop in the Company_ID hierarchy if you insert a row with a Company_ID value of 10 and a Parent_Company_ID value of 5.

PROJECTION

Returns a single set of records from the results of multiple queries. PROJECTION is used by the INTERSECTION, MINUS, and UNION operations.

PROJECTION is used by the INTERSECTION, MINUS, and UNION operations to return a single set of records from the results of multiple queries.

Example

select Company_ID from COMPANY MINUS select Company_ID from COMPETITOR;

Execution Plan

PROJECTION
MINUS
SORT UNIQUE
TABLE ACCESS FULL COMPANY
SORT UNIQUE
TABLE ACCESS FULL COMPETITOR

Interpreting the Execution Plan

The Execution Plan shows that after the MINUS operation is performed, the PROJECTION operation resolves the output into a single set of data for output to the user. PROJECTION is always used in conjunction with the MINUS, INTERSECTION, and UNION operations.

MINUS	Returns all result sets in the first result set, except those appearing in the second result set. This occurs as a result of the MINUS set operator.
UNION ALL	Combines two result sets and returns rows from both are returned.

When a query contains a WHERE clause with multiple conditions combined with OR operators, the optimizer transforms the query into an equivalent compound query that uses the UNION ALL set operator. The optimizer determines whether this modified query will execute more efficiently in the following manner:

· When each condition individually makes an index access path available, then the optimizer can make the transformation. The optimizer then chooses an Explain Plan for the resulting statement that accesses the table multiple times using the different indexes, and then puts the results together.

· When any condition requires a full table scan because it does not make an index available, then the optimizer does not transform the statement. The optimizer chooses a full table scan to execute the statement, and Oracle tests each row in the table to determine whether it satisfies any of the conditions.

· For statements that use the cost-based optimizer, the optimizer may use statistics to determine whether to make the transformation by estimating and then comparing the costs of executing the original statement versus the resulting statement.

· The cost-based optimizer does not use the OR transformation for IN-lists or ORs on the same column; instead, it uses the INLIST ITERATOR operator.

Example

In the following query, the WHERE clause contains two conditions combined with an OR operator:

SELECT * FROM emp WHERE job = 'CLERK' OR deptno = 10;

When there are indexes on both the job and deptno columns, the optimizer may transform this query into the equivalent query as follows:

SELECT * FROM emp WHERE job = 'CLERK' UNION ALL
SELECT * FROM emp WHERE deptno = 10 AND job <> 'CLERK';

When the cost-based optimizer is deciding whether to make a transformation, the optimizer compares the cost of executing the original query using a full table scan with that of executing the resulting query.

With the rule-based optimizer, the optimizer makes this UNION ALL transformation, because each component query of the resulting compound query can be executed using an index. The rule-based optimizer assumes that executing the compound query using two index scans is faster than executing the original query using a full table scan.

VIEW	Accesses a view definition or creates a temporary table to store a result set.

VIEW resolves any query nested deeper than the VIEW operation into a temporary area. The use of VIEW may be caused by correlated queries or by the inability of Oracle to pull a view’s query into the rest of a larger query, forcing it to resolve the view separately.

The example is in two parts: a view is created and then queried. The plan that follows applies only to the query of the view, not to the view creation itself.

Example

create view COMPANY_COUNT as select Zip, COUNT(*) Company_Count
from COMPANY group by Zip;

select Zip, Company_Count from COMPANY_COUNT where Company_Count BETWEEN 10 and 20;

Execution Plan

VIEW COMPANY_COUNT
FILTER
SORT GROUP BY
TABLE ACCESS FULL COMPANY

Interpreting the Explain Plan

Because there is a set operation (SORT GROUP BY) within the view syntax, the optimizer must resolve the view before executing the conditions specified in the query. All the rows are fetched from COMPANY table using a full table scan, then they are sorted and counted by Zip during the SORT GROUP BY operation. The WHERE clause condition in the query is applied during the FILTER operation on the result of the view.

WINDOW

Consists of a physical or logical set of rows used for calculating the result of an analytical function.

The WINDOW operation consists of a physical or logical set of rows used for calculating the result of an analytical function. The function is then applied to all the rows in the window. The window "slides" through the query result set or partition from top-to-bottom.

Jul 4, 2011

UNDERSTANDING EXPLAIN PLAN 3

Partition Operations

SINGLE

Accesses a single partition

Describes partition boundaries applicable to a single partitioned object (table or index). The partition boundaries are provided by the values of pstart and pstop of the PARTITION

ITERATOR

Accesses multiple partitions.

Accesses multiple partitions. Iterates over the operation below it, for each partition in the range given by the PARTITION_START and PARTITION_STOP columns.

ALL				Accesses all partitions.
INLIST		Accesses multiple partitions based on the values contained in an IN list predicate.
INVALID			Indicates an empty partition set.
RANGE	Partitions a table where each of the various segments contain table rows that share the same range of values for a particular column

Each partition is a separate segment, and partitions can be stored in separate tablespaces and have individual storage definitions. Although partitions are implemented transparently to an application and need not be specified in SQL, SQL statements can specify a partition name. For instance, the following statement applies an update to an individual partition (S1) only:

UPDATE sales_part PARTITION(s1) SET purge_ind=’Y’ WHERE sale_value < 10000

SINGLE RANGE		Creates a single partition of a table.
HASH	Partitions a table into various segments, subjecting the column to a hashing algorithm to determine which partition the rows will be stored in. This methodology ensures that each partition has approximately the same number of rows.

In Oracle8i hash partition, rows are distributed to the various partitions based on the value of specific columns. However, in the case of hash partitions the column is subjected to a hashing algorithm to determine which partition the rows will be stored in. The hashing algorithm applies a mathematical function to the column value and assigns the row to its partition based on the return value of the function. This ensures that each partition has approximately the same number of rows. It also means that you cannot reasonably anticipate which rows will end up in specific partitions. Rows that have adjacent column values will not necessarily end up in the same partition. They are usually allocated to separate partitions.

The following statement creates a variant of the SALES table that is partitioned by the hash of SALE_DATE. Six partitions are created with default storage attributes, though it could have explicitly named and supplied storage parameters for each partition:

CREATE TABLE SALES_HASH_PART (column definitions) PARTITION BY HASH (sale_date) PARTITIONS 8

The key advantage of hash partitions is that they are usually well-balanced: each partition will hold approximately the same number of rows. This improves the performance of many parallel operations on the partition.

Note: For a hash partition to be well-balanced using the default hashing algorithm, the number of partitions must be a power of 2 (i.e. 2, 4, 8, 16, 32).

The disadvantage of hash partitions is that you often will not get partition elimination for range-based queries such as "what were the total sales for quarter 4". Furthermore, you cannot purge old data using a hash partition. For example, a range partition that would have allowed you to easily remove all sales data for a specific year will require all the overhead of locating and removing each individual when you use hash partitioning.

Note: Consider hash partitioning when the balance of rows between partitions is more important than the benefits of partition elimination or purging data by dropping a partition. Remember to make the number of partitions a power of 2.

UNDERSTANDING EXPLAIN PLAN 2

Join Operations

OUTER

An outer join that returns rows from one of the tables, even if there is no matching row in the other table. This is achieved in Oracle using the "(+)" operator in the WHERE clause.

OUTER JOINS are used with CONNECT BY, MERGE JOIN, NESTED LOOPS and HASH JOIN operations. OUTER JOIN enables rows from the driving table to be returned to the calling query even though no matching rows were found in the joined table. The following example is based on the same query illustrated in the NESTED LOOPS topic, using an OUTER JOIN, instead.

Example

select c.Name from COMPANY c, SALES s where c.Company_ID = s.Company_ID (+) and s.Period_ID = 3 and s.Sales_Total >1000;

Execution Plan

NESTED LOOPS OUTER
TABLE ACCESS FULL SALES
TABLE ACCESS BY ROWID COMPANY
INDEX UNIQUE SCAN COMPANY_PK

Interpreting the Execution Plan

The Execution Plan shows that the SALES table is used as the driving table for the query. For each COMPANY_ID value in SALES, the COMPANY_ID index on the COMPANY table will be checked to see if a matching value exists. Even if a match does not exist, that record is returned to the user via the NESTED LOOPS OUTER join operation.

Note the difference in Explain Plan of below 2 queries .

select D.NN_EMPCODE,VC_EMPNAME,D.NN_MGRCODE from hr_emp_mast d ,hr_emp_personal dd where D.NN_EMPCODE=DD.NN_EMPCODE(+) and D.NN_EMPCODE=269

select D.NN_EMPCODE,VC_EMPNAME,D.NN_MGRCODE from hr_emp_mast d ,hr_emp_personal dd where D.NN_EMPCODE=DD.NN_EMPCODE(+) and dD.NN_EMPCODE=269

ANTI	An anti-join that returns all rows in one table that do not have a matching row in the other table. It is typically implemented using a NOT IN sub-query.

An ANTI-JOIN is a query that returns rows in one table that do not match some set of rows from another table. Since this is effectively the opposite of normal join behavior, the term ANTI-JOIN has been used to describe this operation. ANTI-JOINs are usually expressed using a sub-query, although there are alternative formulations.

One method of using an ANTI-JOIN query is to combine the IN operator with the NOT operator. This method works well when using the cost-based optimizer.

The rule-based optimizer method of using an ANTI-JOIN query is to use it with the NOT EXISTS operator in place of NOT IN. This method uses the WHERE clause in the sub-query.

Note: When Oracle is using rule-based optimization, avoid using NOT IN to perform an anti-join. Use NOT EXISTS instead.

You can also implement the ANTI-JOIN operation as an OUTER JOIN. An OUTER JOIN includes NULL for rows in the inner table, which have no match in the outer table. This feature can be used to include only rows that have no match in the inner table. However, the most efficient implementation is to use HASH JOIN and its hints.

To take advantage of Oracle’s ANTI-JOIN optimizations, the following must be true:

· Cost-based optimization must be enabled.

· The ANTI-JOIN columns used must not be NULL. This either means that they are not NULL in the table definition, or an IS NOT NULL clause appears in the query for all the relevant columns.

· The subquery is not correlated.

· The parent query does not contain an Or clause.

· The database parameter ALWAYS_ANTI_JOIN is set to either MERGE or HASH or a MERGE_AJ or HASH_AJ hint appears within the sub-query.

SEMI	A semi-join that returns rows from a table which have matching rows in a second table but which does not return multiple rows if there are multiple matches. This is usually expressed using a WHERE EXISTS sub-query.

A SEMI JOIN is a join which returns rows from a table which have matching rows in a second table but which does not return multiple rows if there are multiple matches. This is usually expressed in Oracle using a WHERE EXISTS sub-query.

CARTESIAN

Every row in one result set is joined to every row in the other result set.

A join with no join condition results in a CARTESIAN product, or a cross product. A CARTESIAN product is the set of all possible combinations of rows drawn from each table. In other words, for a join of two tables, each row in one table is matched with every row in the other. A CARTESIAN product for more than two tables is the result of pairing each row of one table with every row of the Cartesian product of the remaining tables.

All other kinds of joins are subsets of CARTESIAN products effectively created by deriving the CARTESIAN product and then excluding rows that fail the join condition.

Note: When using the ORDERED hint, it is important that the tables in the FROM clause are listed in the correct order to prevent CARTESIAN joins.

Hint: Consider using Oracle’s STAR query optimization when joining a very large "fact" table to smaller, unrelated "dimension" tables. You will need a concatenated index on the fact table and may need to specify the STAR hint.

OUTER, ANTI, SEMI, CARTESIAN WILL COME WITH CONNECT BY, MERGE JOIN, NESTED LOOP, HASH JOIN.

CONNECT BY

A hierarchical self-join is performed on the output of the preceding steps.

CONNECT BY does a recursive join of a table to itself, in a hierarchical fashion.

Example

select Company_ID, Name from COMPANY where State = ‘VA’ connect by Parent_Company_ID = prior Company_ID start with Company_ID = 1;

The query shown in the preceding statement selects companies from the COMPANY in a hierarchical fashion; that is, it returns the rows based on each Company’s parent company. If there are multiple levels of company parentage, those levels display in the report.

Execution Plan

FILTER
CONNECT BY
INDEX UNIQUE SCAN COMPANY_PK
TABLE ACCESS BY ROWID COMPANY
TABLE ACCESS BY ROWID COMPANY
INDEX RANGE SCAN COMPANY$PARENT

Interpreting the Execution Plan

The plan shows that first the COMPANY_PK index is used to find the root node (Company_ID = 1), then index on the Parent_Company_ID column is used to provide values for queries against the Company_ID column in an iterative fashion. After the hierarchy of Company_IDs is complete, the FILTER operation—the WHERE clause related to the STATE value—is applied. Notice that the query does not use the index on the STATE column, although it is available and the column is used in the WHERE clause.

MERGE

MERGE JOIN

A MERGE JOIN performed on the output of the preceding steps

MERGE JOIN joins tables by merging sorted lists of records from each table. It is effective for large batch operations, but may be ineffective for joins used by transaction-processing applications. MERGE JOIN is used whenever Oracle cannot use an index while conducting a join. In the following example, all of the tables are fully indexed. So the example deliberately disables the indexes by adding 0 to the numeric keys during the join to force a merge join to occur.

Example

select COMPANY.Name from COMPANY, SALES where COMPANY.Company_ID+0 = SALES.Company_ID+0 and SALES.Period_ID =3 and SALES.Sales_Total>1000;

Execution Plan

MERGE JOIN
SORT JOIN
TABLE ACCESS FULL SALES
SORT JOIN
TABLE ACCESS FULL COMPANY

Interpreting the Execution Plan

There are two potential indexes that could be used by a query joining the COMPANY table to the SALES table. First, there is an index on COMPANY.COMPANY_ID - but that index cannot be used because of the +0 value added to it (disabling indexes is described in detail in the Top SQL Tuning Tips topic). Second, there is an index whose first column is SALES.COMPANY_ID - but that index cannot be used, for the same reason.

As shown in the plan, Oracle will perform a full table scan (TABLE ACCESS FULL) on each table, sort the results (using the SORT JOIN operation), and merge the result sets. The use of merge joins indicates that indexes are either unavailable or disabled by the query’s syntax.

NESTED LOOPS

A nested loops join is performed on the preceding steps. For each row in the upper result set, the lower result set is scanned to find a matching row.

NESTED LOOPS joins table access operations when at least one of the joined columns is indexed.

Example

select COMPANY.Name from COMPANY, SALES where COMPANY.Company_ID = SALES.Company_ID and SALES.Period_ID =3 and SALES.Sales_Total>1000;

Execution Plan

NESTED LOOPS
TABLE ACCESS FULL SALES
TABLE ACCESS BY ROWID COMPANY
INDEX UNIQUE SCAN COMPANY_PK

Interpreting the Execution Plan

The Execution Plan shows that the SALES table is used as the driving table for the query. During NESTED LOOPS joins, one table is always used to drive the query. The Implications of the Driving Table in a NESTED LOOPS Join topic provides tuning guidance on the selection of a driving table for a NESTED LOOPS operation.

For each COMPANY_ID value in the SALES table, the COMPANY_ID index on the COMPANY table will be checked to see if a matching value exists. If a match exists, the record is returned to the user via the NESTED LOOPS operation.

There are several important things to note about this query:

· Although all of the primary key columns in the SALES table were specified in the query, the SALES_PK index was not used. The SALES_PK index was not used because there was not a limiting condition on the leading column (the COMPANY_ID column) of the SALES_PK index. The only condition on SALES.COMPANY_ID is a join condition.

· The optimizer could have selected either table as the driving table. When the COMPANY table is the driving table, Oracle performs a full table scan.

· In rule-based optimization, when there is equal chance of using an index regardless of the choice of the driving table, the driving table will be the one that is listed last in the FROM clause.

· In cost-based optimization, the optimizer will consider the size of the tables and the selectivity of the indexes while selecting a driving table.

Interpreting the Order of Operations within NESTED LOOPS

NESTED LOOPS operations pose a special challenge when reading the output from PLAN_TABLE. Given the Explain path shown in the following listing, it appears that the first step in the Explain path is the scan of the COMPANY_PK index, since that is the innermost step of the Explain path.

NESTED LOOPS
TABLE ACCESS FULL SALES
TABLE ACCESS BY ROWID COMPANY
INDEX UNIQUE SCAN COMPANY_PK

Despite its placement as the innermost step, the scan of the COMPANY_PK index is not the first step in the Explain path. A NESTED LOOPS join needs to be driven by a row source (such as a full table scan or an index scan) - so to determine the first step within a NESTED LOOPS join, you need to determine which operations directly provide data to the NESTED LOOPS operation. In this example, two operations provide data directly to the NESTED LOOPS operation - the full table scan of SALES, and the ROWID access of the COMPANY table.

Of the two operations that provide data to the NESTED LOOPS operation, the full table scan of SALES is listed first. Therefore, within the NESTED LOOPS operation, the order of operations is:

1. The full table scan of SALES.

2. For each record in SALES, access COMPANY by Company_ID. Since an index (COMPANY_PK) is available on COMPANY.Company_ID, use that index via a unique scan.

3. For each ROWID returned from the COMPANY_PK index, access the COMPANY table (to get the NAME value, as requested by the query).

When reading the Explain path for a NESTED LOOPS operation, you need to look first at the order of the operations that directly provide data to it, and determine their order.

HASH JOIN

A HASH JOIN is performed of two row sources.

HASH JOIN is one of the algorithms that Oracle can use to join two tables.

In a HASH JOIN a hash table, an on-the-fly index, is constructed for the larger of the two tables. The smaller table is then scanned, and the hash table used to find matching rows in the larger table.

HASH JOIN joins tables by creating an in-memory bitmap of one of the tables and then using a hashing function to locate the join rows in the second table.

In the following query, the COMPANY and SALES are joined based on their common COMPANY_ID column.

Example

select COMPANY.Name from COMPANY, SALES where COMPANY.Company_ID = SALES.Company_ID and SALES.Period_ID =3 and SALES.Sales_Total>1000;

Execution Plan

HASH JOIN
TABLE ACCESS FULL SALES
TABLE ACCESS FULL COMPANY

Interpreting the Execution Plan

The Execution Plan shows that the SALES table is used as the first table in the hash join. SALES table will be read into memory. Oracle will use a hashing function to compare the values in COMPANY table to the records that have been read into memory.

When one of the tables is significantly smaller than the other in the join, and the smaller table fits into the available memory area, then the optimizer will generally use a hash join instead of a traditional NESTED LOOPS join. Even if an index is available for the join, a hash join may be preferable to a NESTED LOOPS join.

Pages

Jul 13, 2011

UNDERSTANDING EXPLAIN PLAN 4

Example

Execution Plan

Interpreting the Execution Plan

Example

Execution Plan

Interpreting the Explain Plan

Advanced INTERSECTION Example

Example

Example

Execution Plan

Interpreting the Execution Plan

Example

Example

Execution Plan

Interpreting the Explain Plan

Jul 4, 2011

UNDERSTANDING EXPLAIN PLAN 3

UNDERSTANDING EXPLAIN PLAN 2

Example

Execution Plan

Interpreting the Execution Plan

Example

Execution Plan

Interpreting the Execution Plan

MERGE

Example

Execution Plan

Interpreting the Execution Plan

Example

Execution Plan

Interpreting the Execution Plan

Interpreting the Order of Operations within NESTED LOOPS

Example

Execution Plan

Interpreting the Execution Plan

Translate

Wikipedia