Jul 4, 2011

UNDERSTANDING EXPLAIN PLAN 2


Join Operations
OUTER
An outer join that returns rows from one of the tables, even if there is no matching row in the other table. This is achieved in Oracle using the "(+)" operator in the WHERE clause.
OUTER JOINS are used with CONNECT BY, MERGE JOIN, NESTED LOOPS and HASH JOIN operations. An OUTER JOIN enables rows from the driving table to be returned to the calling query even though no matching rows were found in the joined table. The following example is based on the same query illustrated in the NESTED LOOPS topic, rewritten as an OUTER JOIN.

Example

select c.Name from COMPANY c, SALES s where c.Company_ID (+) = s.Company_ID and s.Period_ID = 3 and s.Sales_Total > 1000;

Execution Plan

NESTED LOOPS OUTER
  TABLE ACCESS FULL SALES
  TABLE ACCESS BY ROWID COMPANY
    INDEX UNIQUE SCAN COMPANY_PK

Interpreting the Execution Plan

The Execution Plan shows that the SALES table is used as the driving table for the query. For each COMPANY_ID value in SALES, the COMPANY_ID index on the COMPANY table will be checked to see if a matching value exists. Even if a match does not exist, that record is returned to the user via the NESTED LOOPS OUTER join operation.
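Note the difference between the Explain Plans of the following two queries. In the first, the filter is on the preserved table (D), so the outer join is kept. In the second, the filter DD.NN_EMPCODE = 269 is on the outer-joined table and carries no (+), so rows with no match in hr_emp_personal are discarded and the optimizer can treat the join as an ordinary inner join.
select D.NN_EMPCODE,VC_EMPNAME,D.NN_MGRCODE from hr_emp_mast d, hr_emp_personal dd where D.NN_EMPCODE=DD.NN_EMPCODE(+) and D.NN_EMPCODE=269

select D.NN_EMPCODE,VC_EMPNAME,D.NN_MGRCODE from hr_emp_mast d, hr_emp_personal dd where D.NN_EMPCODE=DD.NN_EMPCODE(+) and DD.NN_EMPCODE=269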
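
A filter on the outer-joined table that lacks the (+) marker discards the very rows the outer join was meant to preserve. The following Python sketch illustrates that semantics only; it does not model how Oracle executes the queries. The sample rows, the column names and the `left_outer_join` helper are hypothetical.

```python
# Hypothetical rows standing in for hr_emp_mast (d) and hr_emp_personal (dd).
emp_mast = [{"empcode": 269, "mgrcode": 100}, {"empcode": 270, "mgrcode": 100}]
emp_personal = [{"empcode": 270, "empname": "SMITH"}]  # no row for 269


def left_outer_join(outer_rows, inner_rows, key):
    """Return (outer, inner) pairs; inner is None when no match exists."""
    result = []
    for o in outer_rows:
        matches = [i for i in inner_rows if i[key] == o[key]]
        if matches:
            result.extend((o, i) for i in matches)
        else:
            result.append((o, None))  # null-extended row
    return result


joined = left_outer_join(emp_mast, emp_personal, "empcode")

# Query 1: filter on the preserved table (d.empcode = 269).
q1 = [(o, i) for o, i in joined if o["empcode"] == 269]

# Query 2: filter on the optional table (dd.empcode = 269) without (+).
# A NULL never equals 269, so null-extended rows drop out.
q2 = [(o, i) for o, i in joined if i is not None and i["empcode"] == 269]

print(len(q1), len(q2))
```

With this data the first filter keeps the null-extended row for employee 269, while the second returns nothing, which is the same result an inner join would give.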


ANTI
An anti-join that returns all rows in one table that do not have a matching row in the other table. It is typically implemented using a NOT IN sub-query.
An ANTI-JOIN is a query that returns rows in one table that do not match some set of rows from another table. Since this is effectively the opposite of normal join behavior, the term ANTI-JOIN has been used to describe this operation. ANTI-JOINs are usually expressed using a sub-query, although there are alternative formulations.
One method of using an ANTI-JOIN query is to combine the IN operator with the NOT operator. This method works well when using the cost-based optimizer.
The rule-based optimizer method of using an ANTI-JOIN query is to use it with the NOT EXISTS operator in place of NOT IN. This method uses the WHERE clause in the sub-query.
Note: When Oracle is using rule-based optimization, avoid using NOT IN to perform an anti-join. Use NOT EXISTS instead.
You can also implement the ANTI-JOIN operation as an OUTER JOIN. An OUTER JOIN returns NULL for the inner table's columns whenever a row of the outer table has no match. Testing for that NULL keeps only the rows that have no match in the inner table. However, the most efficient implementation is usually to use HASH JOIN and its hints.
To take advantage of Oracle’s ANTI-JOIN optimizations, the following must be true:
·         Cost-based optimization must be enabled.
·         The ANTI-JOIN columns used must not be NULL. This either means that they are not NULL in the table definition, or an IS NOT NULL clause appears in the query for all the relevant columns.
·         The subquery is not correlated.
·         The parent query does not contain an OR clause.
·         The database parameter ALWAYS_ANTI_JOIN is set to either MERGE or HASH or a MERGE_AJ or HASH_AJ hint appears within the sub-query.
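
The requirement that the ANTI-JOIN columns be NOT NULL follows from how NOT IN treats NULLs. The Python sketch below is a rough model of those semantics, not Oracle's implementation; the function names and sample keys are hypothetical.

```python
def hash_anti_join(outer_keys, inner_keys):
    """Keep outer keys with no match in a hash (set) built on the inner side."""
    lookup = set(inner_keys)
    return [k for k in outer_keys if k not in lookup]


def sql_not_in(outer_keys, inner_keys):
    """Mimic SQL NOT IN, using None to stand for NULL."""
    if not inner_keys:
        return list(outer_keys)  # an empty subquery excludes nothing
    if any(k is None for k in inner_keys):
        return []  # the comparison is never TRUE when the subquery has a NULL
    return [k for k in outer_keys if k is not None and k not in inner_keys]


company_ids = [1, 2, 3, 4]

print(hash_anti_join(company_ids, [2, 4]))
print(sql_not_in(company_ids, [2, 4]))
print(sql_not_in(company_ids, [2, None]))
```

When no NULLs are present the two functions agree, which is why the simple hash-based form is only a safe substitute for NOT IN once NULLs have been ruled out.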
SEMI
A SEMI JOIN is a join that returns rows from a table which have matching rows in a second table, but which does not return multiple rows if there are multiple matches. This is usually expressed in Oracle using a WHERE EXISTS sub-query.
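
The difference between a semi-join and an ordinary join shows up when one row has several matches. A minimal Python sketch, with hypothetical sample data and helper names:

```python
companies = [
    {"company_id": 1, "name": "Acme"},
    {"company_id": 2, "name": "Bolt"},
]
# Company 1 has three sales rows, company 2 has none.
sales_company_ids = [1, 1, 1]


def inner_join_names(outer_rows, inner_keys):
    """Ordinary join: one output row per matching pair."""
    return [o["name"] for o in outer_rows for k in inner_keys if k == o["company_id"]]


def semi_join_names(outer_rows, inner_keys):
    """Semi-join: each outer row appears at most once, however many matches exist."""
    lookup = set(inner_keys)
    return [o["name"] for o in outer_rows if o["company_id"] in lookup]


print(inner_join_names(companies, sales_company_ids))
print(semi_join_names(companies, sales_company_ids))
```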

CARTESIAN
Every row in one result set is joined to every row in the other result set.
A join with no join condition results in a CARTESIAN product, or a cross product. A CARTESIAN product is the set of all possible combinations of rows drawn from each table. In other words, for a join of two tables, each row in one table is matched with every row in the other. A CARTESIAN product for more than two tables is the result of pairing each row of one table with every row of the Cartesian product of the remaining tables.
All other kinds of joins can be viewed as subsets of the CARTESIAN product: conceptually, the product is derived and the rows that fail the join condition are then excluded.
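
The idea that a join is a filtered Cartesian product can be sketched in a few lines of Python. This is a conceptual illustration with hypothetical sample data; a real optimizer avoids materializing the product wherever it can.

```python
from itertools import product

company_ids = [1, 2, 3]
sales = [(1, 500), (3, 1500)]  # (company_id, sales_total)

# Cartesian product: every company paired with every sales row.
cartesian = list(product(company_ids, sales))

# An equi-join is the subset of pairs that satisfy the join condition.
joined = [(c, s) for c, s in cartesian if c == s[0]]

print(len(cartesian), len(joined))
```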
Note: When using the ORDERED hint, it is important that the tables in the FROM clause are listed in the correct order to prevent CARTESIAN joins.
Hint: Consider using Oracle’s STAR query optimization when joining a very large "fact" table to smaller, unrelated "dimension" tables. You will need a concatenated index on the fact table and may need to specify the STAR hint.

The OUTER, ANTI, SEMI and CARTESIAN options appear in the plan as modifiers of the CONNECT BY, MERGE JOIN, NESTED LOOPS and HASH JOIN operations.


CONNECT BY
A hierarchical self-join is performed on the output of the preceding steps.
CONNECT BY does a recursive join of a table to itself, in a hierarchical fashion.

Example

select Company_ID, Name from COMPANY where State = 'VA' connect by Parent_Company_ID = prior Company_ID start with Company_ID = 1;
The query shown in the preceding statement selects companies from the COMPANY table in a hierarchical fashion; that is, it returns the rows based on each company's parent company. If there are multiple levels of company parentage, those levels display in the report.

Execution Plan

FILTER
  CONNECT BY
    INDEX UNIQUE SCAN COMPANY_PK
    TABLE ACCESS BY ROWID COMPANY
    TABLE ACCESS BY ROWID COMPANY
      INDEX RANGE SCAN COMPANY$PARENT

Interpreting the Execution Plan

The plan shows that first the COMPANY_PK index is used to find the root node (Company_ID = 1), then the index on the Parent_Company_ID column is used to provide values for queries against the Company_ID column in an iterative fashion. After the hierarchy of Company_IDs is complete, the FILTER operation (the WHERE clause related to the STATE value) is applied. Notice that the query does not use the index on the STATE column, although it is available and the column is used in the WHERE clause.
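
The order described above, hierarchy first and WHERE filter last, can be sketched in Python. The sample rows and the `connect_by` helper are hypothetical, and the traversal is only a rough stand-in for Oracle's CONNECT BY processing.

```python
# Hypothetical COMPANY rows: company_id -> parent id and state.
companies = {
    1: {"parent": None, "state": "VA"},
    2: {"parent": 1, "state": "MD"},
    3: {"parent": 1, "state": "VA"},
    4: {"parent": 2, "state": "VA"},
}


def connect_by(rows, start_id):
    """Walk the hierarchy from the START WITH row, parents before children."""
    children = {}  # plays the role of the index on Parent_Company_ID
    for cid, row in rows.items():
        children.setdefault(row["parent"], []).append(cid)
    ordered, stack = [], [start_id]
    while stack:
        cid = stack.pop()
        ordered.append(cid)
        stack.extend(sorted(children.get(cid, []), reverse=True))
    return ordered


hierarchy = connect_by(companies, 1)

# FILTER step: the WHERE clause is applied after the hierarchy is built.
result = [cid for cid in hierarchy if companies[cid]["state"] == "VA"]

print(hierarchy, result)
```

Company 4 survives the filter even though its parent (company 2, in MD) does not, because the state condition is checked only after the whole tree has been walked.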

MERGE JOIN
A MERGE JOIN is performed on the output of the preceding steps.
MERGE JOIN joins tables by merging sorted lists of records from each table. It is effective for large batch operations, but may be ineffective for joins used by transaction-processing applications. MERGE JOIN is typically used when Oracle cannot use an index while conducting a join. In the following example, all of the tables are fully indexed, so the example deliberately disables the indexes by adding 0 to the numeric keys during the join, forcing a merge join to occur.

Example

select COMPANY.Name from COMPANY, SALES where COMPANY.Company_ID+0 = SALES.Company_ID+0 and SALES.Period_ID =3 and SALES.Sales_Total>1000;

Execution Plan

MERGE JOIN
  SORT JOIN
    TABLE ACCESS FULL SALES
  SORT JOIN
    TABLE ACCESS FULL COMPANY

Interpreting the Execution Plan

There are two potential indexes that could be used by a query joining the COMPANY table to the SALES table. First, there is an index on COMPANY.COMPANY_ID - but that index cannot be used because of the +0 value added to it (disabling indexes is described in detail in the Top SQL Tuning Tips topic). Second, there is an index whose first column is SALES.COMPANY_ID - but that index cannot be used, for the same reason.
As shown in the plan, Oracle will perform a full table scan (TABLE ACCESS FULL) on each table, sort the results (using the SORT JOIN operation), and merge the result sets. The use of merge joins often indicates that indexes are either unavailable or disabled by the query's syntax.
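
The sort-then-merge sequence can be sketched in Python as follows. This is a simplified model with hypothetical sample rows, not Oracle's implementation: both inputs are sorted on the join key (the SORT JOIN steps) and then walked in step.

```python
def merge_join(left, right, key):
    """Sort both inputs on the join key, then merge them in a single pass."""
    left = sorted(left, key=key)    # SORT JOIN on the first input
    right = sorted(right, key=key)  # SORT JOIN on the second input
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == kl:
                j_end += 1
            # Pair every left row carrying this key with that run.
            while i < len(left) and key(left[i]) == kl:
                out.extend((left[i], r) for r in right[j:j_end])
                i += 1
            j = j_end
    return out


sales = [(3, "s1"), (1, "s2"), (3, "s3")]      # (company_id, sale)
companies = [(2, "B"), (3, "C"), (1, "A")]     # (company_id, name)
pairs = merge_join(sales, companies, key=lambda row: row[0])
print(pairs)
```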

NESTED LOOPS
A nested loops join is performed on the preceding steps. For each row in the upper result set, the lower result set is scanned to find a matching row.
NESTED LOOPS joins table access operations when at least one of the joined columns is indexed.

Example

select COMPANY.Name  from COMPANY, SALES where COMPANY.Company_ID = SALES.Company_ID and SALES.Period_ID =3 and SALES.Sales_Total>1000;

Execution Plan

NESTED LOOPS
  TABLE ACCESS FULL SALES
  TABLE ACCESS BY ROWID COMPANY
    INDEX UNIQUE SCAN COMPANY_PK

Interpreting the Execution Plan

The Execution Plan shows that the SALES table is used as the driving table for the query. During NESTED LOOPS joins, one table is always used to drive the query. The Implications of the Driving Table in a NESTED LOOPS Join topic provides tuning guidance on the selection of a driving table for a NESTED LOOPS operation.
For each COMPANY_ID value in the SALES table, the COMPANY_ID index on the COMPANY table will be checked to see if a matching value exists. If a match exists, the record is returned to the user via the NESTED LOOPS operation.
There are several important things to note about this query:
·         Although all of the primary key columns in the SALES table were specified in the query, the SALES_PK index was not used. The SALES_PK index was not used because there was not a limiting condition on the leading column (the COMPANY_ID column) of the SALES_PK index. The only condition on SALES.COMPANY_ID is a join condition.
·         The optimizer could have selected either table as the driving table. Had COMPANY been the driving table, it would have been read with a full table scan, since the query places no limiting condition on it.
·         In rule-based optimization, when there is equal chance of using an index regardless of the choice of the driving table, the driving table will be the one that is listed last in the FROM clause.
·         In cost-based optimization, the optimizer will consider the size of the tables and the selectivity of the indexes while selecting a driving table.

Interpreting the Order of Operations within NESTED LOOPS

NESTED LOOPS operations pose a special challenge when reading the output from PLAN_TABLE. Given the Explain path shown in the following listing, it appears that the first step in the Explain path is the scan of the COMPANY_PK index, since that is the innermost step of the Explain path.
NESTED LOOPS
  TABLE ACCESS FULL SALES
  TABLE ACCESS BY ROWID COMPANY
    INDEX UNIQUE SCAN COMPANY_PK
Despite its placement as the innermost step, the scan of the COMPANY_PK index is not the first step in the Explain path. A NESTED LOOPS join needs to be driven by a row source (such as a full table scan or an index scan) - so to determine the first step within a NESTED LOOPS join, you need to determine which operations directly provide data to the NESTED LOOPS operation. In this example, two operations provide data directly to the NESTED LOOPS operation - the full table scan of SALES, and the ROWID access of the COMPANY table.
Of the two operations that provide data to the NESTED LOOPS operation, the full table scan of SALES is listed first. Therefore, within the NESTED LOOPS operation, the order of operations is:
1.      The full table scan of SALES.
2.      For each record in SALES, access COMPANY by Company_ID. Since an index (COMPANY_PK) is available on COMPANY.Company_ID, use that index via a unique scan.
3.      For each ROWID returned from the COMPANY_PK index, access the COMPANY table (to get the NAME value, as requested by the query).
When reading the Explain path for a NESTED LOOPS operation, you need to look first at the order of the operations that directly provide data to it, and determine their order. 
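
The three steps listed above can be sketched in Python. The dictionaries standing in for the COMPANY table and the COMPANY_PK index, the ROWID strings and the sample rows are all hypothetical.

```python
# COMPANY table keyed by ROWID, and the COMPANY_PK index mapping key -> ROWID.
company_table = {
    "r1": {"company_id": 10, "name": "Acme"},
    "r2": {"company_id": 20, "name": "Bolt"},
}
company_pk = {10: "r1", 20: "r2"}

sales = [
    {"company_id": 20, "period_id": 3, "sales_total": 1500},
    {"company_id": 10, "period_id": 3, "sales_total": 200},
    {"company_id": 10, "period_id": 3, "sales_total": 5000},
]


def nested_loops(sales_rows, pk_index, table):
    names = []
    for s in sales_rows:  # step 1: full scan of SALES, the driving table
        if s["period_id"] != 3 or s["sales_total"] <= 1000:
            continue
        rowid = pk_index.get(s["company_id"])  # step 2: unique scan of COMPANY_PK
        if rowid is None:
            continue  # inner join: unmatched SALES rows are dropped
        names.append(table[rowid]["name"])  # step 3: COMPANY access by ROWID
    return names


print(nested_loops(sales, company_pk, company_table))
```

The index lookup sits inside the loop, which is why it is the innermost step of the plan even though the scan of SALES runs first.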
HASH JOIN
A HASH JOIN is performed on two row sources.
HASH JOIN is one of the algorithms that Oracle can use to join two tables.
In a HASH JOIN, a hash table (an on-the-fly index) is constructed for the smaller of the two tables. The larger table is then scanned, and the hash table is used to find matching rows from the smaller table.
HASH JOIN joins tables by creating an in-memory bitmap of one of the tables and then using a hashing function to locate the join rows in the second table.
In the following query, the COMPANY and SALES tables are joined based on their common COMPANY_ID column.

Example

select COMPANY.Name from COMPANY, SALES where COMPANY.Company_ID = SALES.Company_ID and SALES.Period_ID =3 and SALES.Sales_Total>1000;

Execution Plan

HASH JOIN
  TABLE ACCESS FULL SALES
  TABLE ACCESS FULL COMPANY

Interpreting the Execution Plan

The Execution Plan shows that the SALES table is used as the first table in the hash join. The SALES table will be read into memory. Oracle will then use a hashing function to compare the values in the COMPANY table to the records that have been read into memory.
When one of the tables is significantly smaller than the other in the join, and the smaller table fits into the available memory area, then the optimizer will generally use a hash join instead of a traditional NESTED LOOPS join. Even if an index is available for the join, a hash join may be preferable to a NESTED LOOPS join.
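
The build and probe phases of a hash join can be sketched in Python. This is a simplified in-memory model with hypothetical sample rows; it ignores what happens when the build input does not fit in memory.

```python
def hash_join(build_rows, probe_rows, key):
    """Build a hash table on the first input, then probe it with the second."""
    buckets = {}
    for b in build_rows:  # build phase: the first row source in the plan
        buckets.setdefault(b[key], []).append(b)
    # Probe phase: each probe row is hashed once, with no index needed.
    return [(b, p) for p in probe_rows for b in buckets.get(p[key], [])]


# SALES rows that already passed the Period_ID and Sales_Total filters.
sales = [
    {"company_id": 1, "sales_total": 1500},
    {"company_id": 3, "sales_total": 2000},
]
companies = [
    {"company_id": 1, "name": "Acme"},
    {"company_id": 2, "name": "Bolt"},
    {"company_id": 3, "name": "Coil"},
]

names = [p["name"] for _, p in hash_join(sales, companies, "company_id")]
print(names)
```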

Jun 29, 2011

UNDERSTANDING EXPLAIN PLAN 1



UNIQUE
Sorts to eliminate duplicate rows. This typically occurs as a result of using the DISTINCT clause.

SORT UNIQUE sorts result sets and eliminates duplicate records prior to processing with the MINUS, INTERSECT and UNION operations.

Example

A MINUS operation will be used in this example, although the SORT UNIQUE operation is also used in the INTERSECT and UNION operations.
select Company_ID  from COMPANY
MINUS
select Company_ID from COMPETITOR;

Execution Plan

PROJECTION
  MINUS
    SORT UNIQUE
      TABLE ACCESS FULL COMPANY
    SORT UNIQUE
      TABLE ACCESS FULL COMPETITOR

Interpreting the Execution Plan

The Execution Plan shows that after each of the queries is separately resolved (by the TABLE ACCESS FULL operations), the records are passed to the SORT UNIQUE operation prior to being input into the MINUS operation. The SORT UNIQUE operation sorts the records and eliminates any duplicates, then sends the records to the MINUS operation.
The same applies to INTERSECT.
For UNION:
select Company_ID  from COMPANY
UNION
select Company_ID from COMPETITOR;

Execution Plan

PROJECTION
  SORT UNIQUE
    UNION-ALL
      TABLE ACCESS FULL COMPANY
      TABLE ACCESS FULL COMPETITOR

For a UNION ALL operation, no SORT UNIQUE step appears in the Explain Plan, because duplicates are not removed.
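
The two plan shapes differ in where the SORT UNIQUE step sits: before MINUS on each input, but after UNION-ALL on the combined rows. A minimal Python sketch of both, with hypothetical sample data and helper names:

```python
def sort_unique(values):
    """Sort, then drop adjacent duplicates."""
    out = []
    for v in sorted(values):
        if not out or out[-1] != v:
            out.append(v)
    return out


def minus(left, right):
    """Rows of sorted, de-duplicated left that do not appear in right."""
    out, j = [], 0
    for v in left:
        while j < len(right) and right[j] < v:
            j += 1
        if j == len(right) or right[j] != v:
            out.append(v)
    return out


company = [3, 1, 2, 3, 5]
competitor = [2, 2, 4, 5]

# MINUS plan: SORT UNIQUE on each input, then MINUS.
minus_result = minus(sort_unique(company), sort_unique(competitor))

# UNION plan: UNION-ALL concatenates, then a single SORT UNIQUE on top.
union_result = sort_unique(company + competitor)

# UNION ALL: plain concatenation, no sort and no duplicate removal.
union_all_result = company + competitor

print(minus_result, union_result, len(union_all_result))
```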
GROUP BY
Sorts a result set to group it for the GROUP BY clause.
SORT GROUP BY performs grouping functions on sets of records.

Example

select Zip, COUNT(*) from COMPANY group by Zip;

Execution Plan

SORT GROUP BY
  TABLE ACCESS FULL COMPANY
GROUP BY NOSORT
GROUP BY clause that does not require a sort operation.
One cause of sorting is index creation. Creating an index for a table involves sorting all of the rows in the table based on the values of the indexed columns. Oracle also allows you to create indexes without sorting, using the NOSORT clause of the CREATE INDEX statement. When the rows in the table are loaded in ascending order, you can create the index faster without sorting. The NOSORT clause of CREATE INDEX is a separate feature from the SORT GROUP BY NOSORT plan operation.

NOSORT Clause

To create an index without sorting, load the rows into the table in ascending order of the indexed column values. Your operating system may provide a sorting utility to sort the rows before you load them. When you create the index, use the NOSORT clause on the CREATE INDEX statement. For example, this CREATE INDEX statement creates the index EMP_INDEX on the ENAME column of the emp table without sorting the rows in the EMP table:
CREATE INDEX emp_index ON emp(ename)  NOSORT;

When to Use the NOSORT Clause

Presorting your data and loading it in order may not always be the fastest way to load a table. When you have a multiple-CPU computer, you may be able to load data faster using multiple processors in parallel, each processor loading a different portion of the data. To take advantage of parallel processing, load the data without sorting it first. Then create the index without the NOSORT clause. When you have a single-CPU computer, you should sort your data before loading, when possible. Then create the index by using the NOSORT clause.

GROUP BY NOSORT

Sorting can be avoided when performing a GROUP BY operation when you know that the input data is already ordered, so that all rows in each group are clumped together. This may be the case when the rows are being retrieved from an index that matches the grouped columns, or when a sort-merge join produces the rows in the right order. ORDER BY sorts can be avoided in the same circumstances. When no sort takes place, the Explain Plan output indicates GROUP BY NOSORT.
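
Why pre-ordered input removes the need for a sort can be sketched with Python's `itertools.groupby`, which groups only adjacent equal values. The ZIP values are hypothetical sample data.

```python
from itertools import groupby


def count_adjacent_groups(values):
    """Group rows in one pass, assuming equal values are already adjacent."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Rows arriving in index order: each group is contiguous, so no sort is needed.
zips_in_index_order = ["20001", "20001", "22030", "22030", "22030", "24011"]
print(count_adjacent_groups(zips_in_index_order))

# Rows arriving in arbitrary order: the same pass splits one group in two,
# which is why a sort is normally needed first.
zips_unordered = ["22030", "20001", "22030"]
print(count_adjacent_groups(zips_unordered))
```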
GROUP BY ROLLUP
GROUP BY clause that includes the ROLLUP option.
SORT GROUP BY ROLLUP enables a SELECT statement to calculate multiple levels of subtotals across a specified group of dimensions. It also calculates a grand total. ROLLUP is a simple extension to the GROUP BY clause: it appears inside the GROUP BY clause of a SELECT statement, so its syntax is easy to use. The ROLLUP extension is highly efficient, adding minimal overhead to a query. ROLLUP creates subtotals at n+1 levels, where n is the number of grouping columns. For instance, if a query specifies ROLLUP on grouping columns of Time, Region, and Department (n=3), the result set will include rows at four aggregation levels.
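
The n+1 levels can be enumerated with a short Python sketch. The `rollup_groupings` helper is hypothetical; it lists only which column combinations are aggregated, not the aggregates themselves.

```python
def rollup_groupings(columns):
    """Column prefixes aggregated by ROLLUP, from most detailed to grand total."""
    return [tuple(columns[:i]) for i in range(len(columns), -1, -1)]


levels = rollup_groupings(["Time", "Region", "Department"])
for level in levels:
    print(level)
```

The empty tuple at the end represents the grand total row.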
GROUP BY CUBE
GROUP BY clause that includes the CUBE option.

The subtotals created by ROLLUP represent only a fraction of the possible subtotal combinations. The easiest way to generate the full set of subtotals needed for cross-tabular reports is to use the CUBE extension.
CUBE enables a SELECT statement to calculate subtotals for all of the possible combinations of a group of dimensions. It also calculates a grand total. This is the set of information typically needed for all cross-tabular reports, so CUBE can calculate a cross-tabular report with a single SELECT statement. Like ROLLUP, CUBE is a simple extension to the GROUP BY clause. When n columns are specified for a CUBE, there will be 2^n combinations of subtotals returned.
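
The count of combinations for CUBE can be checked with a short Python sketch. The `cube_groupings` helper is hypothetical; it enumerates every subset of the grouping columns, including the empty subset for the grand total.

```python
from itertools import combinations


def cube_groupings(columns):
    """Every subset of the grouping columns, largest first."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]


groupings = cube_groupings(["Time", "Region", "Department"])
print(len(groupings))
for g in groupings:
    print(g)
```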
Index Operations
AND-EQUAL
Combines the results from one or more index scans.
INDEX
Indicates an index lookup.

The following INDEX options are available:

Option | Description | Where Clause Example
SINGLE VALUE | Access a single value in the index and return a bitmap for all the matching rows. | where State = 'MD'
FULL SCAN | A complete scan of the index to find any matching values. | where State not in ('HI', 'AL')
RANGE SCAN | Access a range of values in the index and return multiple bitmaps. These bitmaps are then merged into one bitmap. | where City like 'New%'
INDEX UNIQUE SCAN
An index lookup that returns the address (ROWID) of only one row.
INDEX UNIQUE SCAN, which selects a unique value from a unique index, is the most efficient method of selecting a row from known field values.
Each unique index access is built from a separate access into the index’s B*-tree structure, drilling down from the index root to the leaf blocks. On average, three blocks are read to fulfill the unique index access.

Example

select Name, City, State from COMPANY where Company_ID = 12345;

Execution Plan

TABLE ACCESS BY ROWID COMPANY
  INDEX UNIQUE SCAN COMPANY_PK

Interpreting the Execution Plan

The query uses the COMPANY_ID column as the sole criterion in its WHERE clause. Since COMPANY_ID is the primary key of the COMPANY table, it has a unique index associated with it. The unique index for the COMPANY_ID primary key is named COMPANY_PK.
During the query, the COMPANY_PK index is scanned for one COMPANY_ID value (12345). When the COMPANY_ID value is found, the ROWID associated with that COMPANY_ID is used to query the COMPANY table.
RANGE SCAN
Returns the ROWID of more than one row. This can occur because the index is non-unique or because a range operator (>) was used. Indexed values are scanned in ascending order.
INDEX RANGE SCAN selects a range of values from an index; the index can be either unique or non-unique. Range scans are used when one of the following conditions is met:
·         A range operator (such as < or >) is used.
·         The BETWEEN clause is used.
·         A search string with a trailing wildcard is used (such as 'A%').
·         Only part of a concatenated index is used (such as by using only the leading column of a two-column index).
The access to the range of values within the index starts with an index search for the first row that is included in the range. After the first row has been located, there is a "horizontal" scan of the index blocks until the last row inside the range is found.
Note: The efficiency of an INDEX RANGE SCAN is directly related to two factors: (1) the number of keys in the selected range (the more values, the longer the search), (2) the condition of the index (the more fragmented, the longer the search).

Example

select Name, City, State from COMPANY where City > 'Roanoke';

Execution Plan

TABLE ACCESS BY ROWID COMPANY
  INDEX RANGE SCAN COMPANY$CITY

Interpreting the Execution Plan

The Execution Plan shows that the index on the City column is used to find ROWIDs in the COMPANY table that satisfy the limiting condition on the City value. Since a range of values is specified (City > 'Roanoke'), an INDEX RANGE SCAN is performed. The first value that falls within the range is found in the index; the rest of the index is then searched for the remaining values. For each matching value, the ROWID is recorded. The ROWIDs from the INDEX RANGE SCAN are used to query the COMPANY table for the Name and State values.
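
The "find the first entry, then scan along the index" behaviour can be sketched in Python with the standard `bisect` module. The sorted list standing in for the COMPANY$CITY index, the ROWID strings and the table rows are hypothetical.

```python
from bisect import bisect_right

# Index entries in key order: (City, ROWID).
city_index = [
    ("Arlington", "r4"),
    ("Norfolk", "r2"),
    ("Roanoke", "r1"),
    ("Salem", "r5"),
    ("Vienna", "r3"),
]
company_table = {
    "r1": {"name": "Acme", "state": "VA"},
    "r2": {"name": "Bolt", "state": "VA"},
    "r3": {"name": "Coil", "state": "VA"},
    "r4": {"name": "Dyne", "state": "VA"},
    "r5": {"name": "Echo", "state": "VA"},
}


def range_scan_greater_than(index, value):
    """Locate the first key above value, then read entries in key order."""
    keys = [key for key, _ in index]
    start = bisect_right(keys, value)  # search for the first entry in the range
    return [rowid for _, rowid in index[start:]]  # horizontal scan


rowids = range_scan_greater_than(city_index, "Roanoke")
names = [company_table[rowid]["name"] for rowid in rowids]  # access by ROWID
print(rowids, names)
```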
RANGE SCAN (MIN/MAX)
Finds the highest or lowest index entry in the range.

RANGE SCAN DESCENDING
Retrieves one or more ROWIDs from an index. Indexed values are scanned in descending order.
FULL SCAN
Scans every entry in the index in key order.


Reading rows in key order requires a block-by-block full scan of the index, which is incompatible with the Fast Full Scan. Although the fast full scan is much more efficient than the "normal" full index scan, the fast full scan does not return rows in index order.
Although using an index can eliminate the need to perform a sort, the overhead of reading all the index blocks and all the table blocks may be greater than the overhead of performing the sort. However, using the index should result in a faster retrieval of the first row, since a row may be returned as soon as it is retrieved, whereas the sort approach requires that all rows be retrieved before the first row is returned. As a result, the cost-based optimizer will tend to use the index if the optimizer goal is FIRST_ROWS, but will choose a full table scan if the goal is ALL_ROWS.
A way of avoiding both sort and table lookup overhead is to create an index which contains all the columns in the select list as well as the columns in the ORDER BY clause. Oracle can then resolve the query by using an index lookup alone.
Using an index to avoid a sort can give much better response time (time to retrieve the first row) but much poorer throughput (time to retrieve the last row).
FULL SCAN (MIN/MAX)
Finds the highest or lowest index entry.

FULL SCAN DESCENDING
Finds one or more index entries. Index entries are scanned in descending order.

FAST FULL SCAN
Scans every entry in the index in block order, possibly using multi-block read.
There are many examples in which an index alone has been used to resolve a query. Providing all the columns needed to resolve the query are in the index, there is no reason why Oracle cannot use the index alone to generate the result set.
The FAST FULL INDEX SCAN operation improves the efficiency of queries that can be resolved by reading an entire index. FAST FULL INDEX SCAN offers some significant advantages over other index scan methods, as follows:
·         In an index range scan or full index scan, index blocks are read in key order, one at a time. In a fast full scan, blocks are read in the order in which they appear on disk. Oracle is able to read multiple blocks in a single I/O, depending on the value of the server parameter DB_FILE_MULTIBLOCK_READ_COUNT (multi-block reads are discussed further later in this chapter).
·         The fast full index scan can be performed in parallel, while an index range scan or full index scan can only be processed serially. That is, Oracle can allocate multiple processes to perform a fast full index scan, but can only use a single process for traditional index scans.
Although a full table scan can use parallelism and multi-block read techniques, the number of blocks in a table will typically be many times the number of blocks in an index. The fast full index scan will therefore usually outperform an equivalent full table scan.
You can consider a fast full index scan in the following circumstances:
·         All the columns required to satisfy the query are included in the index.
·         At least one of the columns in the index is defined as NOT NULL.
·         The query will return more than 10-20% of the rows in the index.
The cost-based optimizer can use the fast full scan as it sees fit unless you have FAST_FULL_SCAN_ENABLED=FALSE or V733_PLANS_ENABLED=FALSE (depending on your version of Oracle).
The Index fast full scan can take advantage of optimizations normally only available to table scans, such as multi-block read and parallel query. Counting the number of rows in a table is a perfect application for the fast full scan because there will almost always be an index on a NOT NULL column which could be used to resolve the query.
When you are using an index to optimize a GROUP BY, a fast full index scan will probably result in better throughput, while an index full scan will probably result in better response time. When you need to scan an index-organized table, it is essential that you take advantage of the fast full index scan. Without it, you will be unable to use multi-block reads or exploit parallel query capabilities.
Note: In Oracle versions that provide the FAST_FULL_SCAN_ENABLED parameter, fast full scan is disabled by default and is enabled by setting the parameter to TRUE. Make sure that you do not inadvertently try to scan index-organized tables with fast full scans disabled.
The fast full index scan can provide a powerful alternative to the full table scan when the query references only columns in the index.
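
The trade-off between the two scan types, key order versus fewer and larger reads, can be sketched in Python. This is a toy model: the leaf-block layout, the `next` pointers and the read counter are hypothetical and greatly simplified.

```python
# Hypothetical leaf blocks keyed by their physical position on disk.
# "next" points to the following leaf block in key order.
leaf_blocks = {
    0: {"keys": [30, 40], "next": 2},
    1: {"keys": [10, 20], "next": 0},
    2: {"keys": [50, 60], "next": None},
}
FIRST_LEAF = 1  # the block holding the lowest keys


def full_scan(blocks, first):
    """Follow the leaf chain: keys come back in order, one block per read."""
    keys, reads, pos = [], 0, first
    while pos is not None:
        keys.extend(blocks[pos]["keys"])
        reads += 1
        pos = blocks[pos]["next"]
    return keys, reads


def fast_full_scan(blocks, multiblock_read_count):
    """Read blocks in physical order, several per read; keys arrive unsorted."""
    keys, reads = [], 0
    positions = sorted(blocks)
    for i in range(0, len(positions), multiblock_read_count):
        for pos in positions[i:i + multiblock_read_count]:
            keys.extend(blocks[pos]["keys"])
        reads += 1
    return keys, reads


print(full_scan(leaf_blocks, FIRST_LEAF))
print(fast_full_scan(leaf_blocks, 2))
```

In this model the fast full scan needs fewer reads but returns the keys out of order, so it cannot be used to avoid a sort.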
DOMAIN INDEX
Retrieves one or more ROWIDs from a user-defined index.
DOMAIN INDEX is a user-defined index typically created on complex datatypes whose algorithms and optimizer characteristics are provided by the user. DOMAIN INDEXES are created using the Oracle Data Cartridge Interface API.
You can use the Oracle Explain Plan to derive user-defined CPU and I/O costs for domain indexes. The Oracle Explain plan displays these statistics in the OTHER column of PLAN_TABLE.
For example, assume table EMP has user-defined operator CONTAINS with a Domain Index EMP_RESUME on the resume column, and the index type of EMP_RESUME supports the operator CONTAINS.

Example

SELECT * FROM emp WHERE CONTAINS(resume, 'Oracle') = 1

Execution Plan

OPERATION | OPTIONS | OBJECT_NAME | OTHER
SELECT STATEMENT | | |
TABLE ACCESS | BY ROWID | EMP |
DOMAIN INDEX | | EMP_RESUME | CPU: 300, I/O: 4