|


Five Performance Hints for Efficient SQL
Welcome to OReview's new performance-tuning column. In this column I
hope to provide regular and reliable performance-related strategies, news,
and information that will supplement - although not replace - the more in-depth
performance-related feature articles that appear periodically in OReview.
In particular, I'll provide:
- advice on new performance-related features of the Oracle server and
tools
- coverage of Oracle performance-optimization techniques at beginner,
intermediate, and advanced levels
- information about undocumented or little-known performance-tuning facilities
Most performance experts agree that tuning application SQL is usually
the most effective way of improving performance, and that SQL tuning is
an essential prerequisite to effective server or operating-system tuning.
Unfortunately, SQL tuning often takes second seat to server tuning because
SQL tuning is less of an exact science and often out of a DBA's control.
I have 10 hints that can help you tune your SQL and improve your application
performance. In this inaugural column, I give you five hints; look for five
more hints in my August column.
Hint #1: Establish a Tuning Environment
It's not uncommon for SQL that works well in a development environment
to exhibit poor performance once it is in a production system. One of the
primary causes of such unpleasant surprises is an inadequate development
or volume-testing environment. In particular, environments without realistic
or representative data volumes are bound to lead to unrealistic SQL performance.
The ideal tuning or development environment is one in which:
- data volumes are realistic or at least proportional - With today's
large production databases, it's often not possible to duplicate production
data volumes exactly. However, you should always be able to construct a
reasonable subset of the data. For example, a 20 percent sample of a 1GB
database may be adequate for performance tuning. You must, however, avoid
the situation in which SQL developers are constructing and testing code
against empty tables - even the proverbial "SQL query from hell"
will run fast in these environments.
- tuning facilities are available - Supply your volume and development
environments with as many tuning tools as you have available. This may
involve third-party SQL tuning tools, but it will at least involve enabling
the default Oracle tools. Make sure that developers know how to use
EXPLAIN
PLAN, SQL_TRACE, and tkprof facilities. Ensure that relevant database
options are effectively set.(See my article"Getting
the Most from the SQL_TRACE Facility,"OTJ,Winter1996)
- documentation is available - Documentation should include the
database design, index details, performance requirements, and volume metrics.
The SQL programmer needs all of this information in order to produce efficient
SQL.
Hint #2: Use SQL Tuning Tools
Oracle provides powerful facilities for measuring the execution of SQL
statements, such as the EXPLAIN PLAN command, which can generate
the execution plan for a SQL statement. To use EXPLAIN PLAN:
- Create the
PLAN_TABLE using the script in $Oracle_HOME/rdbms/admin/utlxplan.sql
(the location might be slightly different under non-Unix hosts).
- Issue an
EXPLAIN PLAN FOR sql_statement statement.
- Display the execution plan for the statement using a query such as:
SELECT rtrim(lpad(' ',2*level)||
rtrim(operation)||' '||
rtrim(options)||' '||
object_name) query_plan
FROM plan_table
CONNECT BY prior id=parent_id
START WITH id=0;
tkprof input_trace_file output_report_file
sort='(prsela,exeela,fchela)'
explain=username/password
Interpreting tkprof output and execution plans is a complex topic that
I can't give adequate coverage here. For more information, see my article
"Getting the Most from the SQL_TRACE Facility,"
OTJ, Winter 1996.
Hint #3: Index Carefully
Indexes exist to improve the performance of SQL statements. In order
to establish the "best" indexes to achieve optimal performance,
you should do the following:
- Index Selectively. A "selective" index is an index
with columns that have a large number of distinct values. Such indexes
are efficient because each index key lookup identifies only a small number
of rows and hence requires less database I/O. The most selective index
is a unique index - each index key points to only one row. It's usually
not worth indexing on columns with only a few distinct values, unless one
of the values is very rare - and even then you may need to implement histograms
(which I discuss below) to use the index appropriately.
- Use Concatenated Indexes. Try not to use two indexes when one
will suffice. For example, if you are searching for
SURNAME
and FIRSTNAME, don't create separate indexes for each column.
Instead, create a concatenated index on both SURNAME and FIRSTNAME.
You can then use the "leading" portion of the concatenated index
on its own; thus, if you sometimes query on the SURNAME column
without supplying the FIRSTNAME, SURNAME should
come first in the index. Note that you can't use the trailing portions
of a concatenated index. For example, the concatenated index on SURNAME
and FIRSTNAME can't help you find all people with a first
name of "Fred."
- Overindex to Avoid a Table Lookup. You can sometimes improve
SQL execution by "overindexing." Overindexing involves
concatenating columns that appear in the SELECT clause to
the index, but not in the WHERE clause. Imagine that you are
searching on SURNAME and FIRSTNAME in order to
find EMPLOYEE_ID. Your concatenated index on SURNAME
and FIRSTNAME will let you quickly locate the row containing
the appropriate EMPLOYEE_ID, but you will need to access both
the index and the table. If there is an index on SURNAME,
FIRSTNAME, and EMPLOYEE_ID, then the query can
be satisfied using the index alone. This technique is particularly useful
when optimizing joins, because intermediate tables in a join are sometimes
queried merely to obtain the join key for the next table.
- Consider Advanced Indexing Options. Oracle's default B-tree
indexes are flexible, efficient, and suitable for the majority of situations
you will encounter. However, Oracle offers a number of alternative indexing
schemes that can improve your performance in specific situations:
- Index clusters let you locate rows from one or more tables in cluster
key order. Clustering tables can result in a substantial improvement in
join performance. However, the performance of table scans of individual
tables in the cluster may be severely degraded. Index clusters are usually
recommended only for tables that are always accessed together. Even then,
you should consider alternatives such as denormalization.
- In hash clusters, the key values are translated mathematically to
a hash value. Rows are stored in the hash cluster based on this hash value.
Locating a row when the hash key is known may require only a single I/O
rather than the two or three I/Os required by an index lookup. However,
you cannot perform range scans of the hash key. Furthermore, if the cluster
is poorly configured or the size of the cluster changes, overflows on the
hash keys may occur or the cluster may become sparsely populated. In the
first case, the performance of hash-key retrieval can degrade; in the second
case, table scans will be less efficient.
- Bitmapped indexes were introduced in Oracle 7.3. These indexes suit
queries on multiple columns that each have only a few distinct values.
They are more compact than a corresponding concatenated index and, unlike
the concatenated index, you may query columns in any combination. However,
bitmapped indexes are not suitable for tables with high modification rates,
because locking of bitmapped indexes occurs at the block level rather than
the row level. Bitmapped indexes are also not suitable for columns with
large numbers of distinct values.
- Ensure that your query uses the best index. Novice SQL programmers
are often satisfied that the execution plan for their SQL statement uses
an index - any index. However, sometimes you have a choice of indexed retrievals,
and the Oracle optimizer - especially the older rule-based optimizer -
will not always choose the best index. Make sure that the indexes selected
by Oracle are the most appropriate, and use hints (which I discuss later
in this article) to change the index if necessary.
Hint #4: Reduce Parsing
Parsing a SQL statement includes the process of validating the SQL and
determining the optimal execution plan. For SQL that has low I/O requirements
but is frequently executed (for example, SQL generated by OLTP-type applications),
reducing the overhead of SQL parsing is extremely important.
When an Oracle session needs to parse an SQL statement, it first looks
for an identical shared SQL statement in the Oracle shared pool. If it cannot
find a matching statement, Oracle will determine the optimal execution plan
for the statement and store the parsed representation in the shared pool.
The process of parsing SQL is CPU-intensive. When I/O is well tuned,
the overhead of parsing a SQL statement can be a significant portion of
the total overhead of executing that statement. You can reduce the parsing
overhead by:
- using bind variables - Bind variables allow the variable part
of a query to be represented by "pointers" to program variables.
If you use bind variables, the text of a SQL statement will not change
from execution to execution, and Oracle will usually find a match in the
shared pool, which dramatically reduces parsing overhead. (For more on
bind variables, see Roger Snowden's article "Application
Tuning, Part 1," OReview, January/February 1997).
- reusing cursors - Cursors (or context areas) are areas of memory
that store the parsed representation of SQL statements. If you execute
the same SQL statement more than once, you can reopen an existing cursor
and completely avoid issuing a parse call. The mechanism of reusing cursors
varies from tool to tool. Table 1 summarizes
the techniques you can use in some popular development tools.
- using a cursor cache - If your development tool makes it difficult
or impossible to reuse cursors, you can instruct Oracle to create a cursor
"cache" for each session using the
SESSION_CACHED_CURSORS
server parameter. If SESSION_CACHED_CURSORS is greater than
zero, Oracle will store recently executed cursors in a cache and, if you
reexecute such a SQL statement, you can avoid reparsing the statement.
Hint #5: Take Advantage of the Cost-Based Optimizer
The Oracle software component that determines the execution plan for
a SQL statement is called the optimizer. Oracle supports two approaches
to query optimization: rule-based and cost-based. The rule-based optimizer
determines the execution plan based on a set of rules that rank various
access paths. For example, an index-based retrieval has a lower rank than
a full table scan; therefore, the rule-based optimizer will use indexes
wherever possible.
The cost-based optimizer determines the execution plan based on an estimate
of the computer resources (the cost) required to satisfy various access
methods. The cost-based optimizer uses statistics, including the number
of rows in a table and the number of distinct values in indexes, to determine
this ideal plan.
Many people's early experiences with the cost-based optimizer in Oracle
7.0 and 7.1 were disappointing, which gave the cost-based optimizer a bad
reputation in some quarters. However, the cost-based optimizer has been
improving with each release of Oracle7, and many advanced SQL access methods,
such as star and hash joins, are only available when you use the cost-based
optimizer.
The cost-based optimizer is your best choice for almost all new projects,
and it may be cost-effective to convert your existing projects from rule-based
to cost-based optimization. Consider the following guidelines for getting
the most from the cost-based optimizer:
In Retrospect
To briefly review this month's SQL tuning hints:
Hint #1. Establish a tuning environment that will enable you to
effectively write and tune efficient SQL. This environment should contain
realistic or representative data volumes, include tuning and diagnostic
tools, and have application design documentation available.
Hint #2. Use the tuning tools that are included in Oracle. In
particular, learn to use EXPLAIN PLAN, SQL_TRACE, and tkprof.
Hint #3. Make effective use of Oracle indexes. Take advantage
of concatenated indexes, overindexing, and "advanced" indexing
options, such as bitmapped indexes and hash and index clusters.
Hint #4. Reduce the overhead of SQL statement parsing by using
bind variables, retaining SQL "cursors," and implementing an Oracle
cursor cache.
Hint #5. Take advantage of the Oracle cost-based optimizer. If
you're still using the older rule-based optimizer, consider switching to
the cost-based optimizer. When using the cost-based optimizer, ensure that
you analyze all of your tables regularly. Also, make sure that OPTIMIZER_MODE
is set appropriately. Using hints and column histograms can improve the
performance of SQL statements that are particularly difficult to optimize.
Next month, I'll look at some of the performance features of Oracle version
8. In August, I'll present a final list of five hints for optimizing SQL,
which will include tips on avoiding accidental table scans, optimizing unavoidable
scans, using array processing, effective locking strategies, and leveraging
features introduced in recent releases of Oracle.
Guy Harrison is an independent Oracle consultant specializing
in Oracle development and performance issues. He is the author of Oracle
SQL High-Performance Tuning (Prentice Hall, 1997). You can contact Guy via
email at gharriso@werple.net.au or at his home page, werple.net.au/~gharriso.
Table 1
How to reuse cursors and reduce parsing in popular development tools.
| Development Environment |
How to Reduce Parsing by Retaining Cursors |
| PowerBuilder |
The SQLCache setting in the DBParm determines the number of cursors to keep
open after execution. The default value is zero, which means that no cursors
are retained. Setting this parameter to 20 or 30 usually reduces parse overhead
significantly without using excessive memory. |
| SQL*Windows |
It's hard to avoid destroying cursors in SQL*Windows, but it is possible
if you:
- Set the
DBP_PRESERVE parameter to TRUE.
- Create separate SQL statement handles for frequently executed SQL statements
with the
SqlConnect call.
- Use
SQLPrepare to associate these handles with the SQL
statements. Make sure you do this only once in your code.
- Use
SQLExecute when you want to execute your SQL statement
or refresh the result set.
|
| Precompilers (Pro*C,Pro*Cobol, and so on) compiler option |
The Precompilers will cache SQL cursors. The number of SQL statements in
the cache is determined by the MAXOPENCURSORS.You can use the
HOLD_CURSOR and RELEASE_CURSOR options to determine
which statements get preference in the cache. |
| Oracle Forms(Developer/2000) |
Oracle Forms automatically retains and reuses cursors effectively. If you
specify the OptimizeTP=NO option on the command line,only SELECT
statements are allocated separate cursors. |
| Oracle Objects for OLE |
Cursors are represented as dynasets in Oracle objects for OLE. As long as
your dynasets are not deallocated, your cursors will be reusable. |
| Oracle Call Interface (OCI) |
In OCI, you have complete control over cursor management. To keep a cursor
in memory, do not close the cursor using the OCLOSE call after
statement execution - this completely deallocates the memory and the SQL
statement. You can reuse the cursor simply by rebinding and reexecuting
the cursor. |
| Oracle Power Objects PL/SQL |
Cursor reuse is transparent and automatic. Cursors are kept in memory as
required. If you specify CLOSE_CACHED_OPEN_CURSORS=TRUE
in an ALTER SESSION statement or in the server's
parameter file, cursors are discarded on COMMIT or ROLLBACK. |
Back to text
Table 2
Some commonly used optimizer hints.
| Hint |
Usage |
ALL_ROWS |
Uses the cost-based optimizer to optimize for the retrieval of all rows. |
CACHE |
Encourages rows retrieved by a full table scan to remain in the buffer cache
of the SGA. |
FIRST_ROWS |
Specifies that the cost-based optimizer should optimize the statement to
reduce the cost of retrieving the first row only. |
FULL(table_name) |
Uses a full table scan to retrieve rows from the specified table. |
INDEX(table_name [index_name]) |
Uses the specified index to retrieve rows from the table or, if no index
is specified, to use any index. |
ORDERED |
Instructs the optimizer to join the tables in exactly the left-to-right
order specified in the FROM clause. |
PARALLEL(table_name, degree_of_parallelism) |
Instructs the optimizer to perform parallel scans on the nominated table.
If no degree of parallelism is specified, the default will be used. |
RULE |
Uses rule-based optimization. |
Back to text
Copyright 1997 Miller Freeman Inc. All Rights Reserved
Redistributed without permission is prohibited.
Questions? Comments? email us at
webmaster@oreview.com
|
|