Level: Introductory Scott Hayes (scott@database-guys.com), President, Database-Guys Inc.
09 Mar 2004 Every few weeks or so, we get performance calls from DBAs in distress. "Our Web site is crawling," they bemoan. "We're losing customers, and the situation is dire. Can you help?" To answer these calls, I've developed an analytics process for my consulting company that letsus rapidly find the cause of performance problems and develop remedial actions and tuning suggestions. Rarely do these callers inquire about fees or costs - they're only interested in stopping the bleeding. When a DB2 or e-business application isn't performing up to expectations, the entire organization and financial bottom line measurably suffers.
Reprinted with permission from DB2 Magazine.
To help DB2 DBAs
avoid performance fires and achieve high performance on their own,
I've summarized a troubleshooting process for our clients,
customers, and fellow DB2 professionals. The top 10 performance
tips for e-business OLTP applications in DB2 UDB for Unix, Windows,
and OS/2 environments are explained in detail below -- and summarized at the end of this article.
10. Monitor
switches
Make sure the
monitor switches are turned on. If they aren't, you won't have
access to the performance information you need. To turn the monitor
switches on, issue the command:
db2 "update monitor switches using
lock ON sort ON bufferpool ON uow ON
table ON statement ON" |
9.
Agents
Make sure there are
enough DB2 agents to handle the workload. To find out, issue the
command:
db2 "get snapshot for database manager"
|
and look for the
following lines:
High water mark for agents registered = 7
High water mark for agents waiting for a token = 0
Agents registered= 7
Agents waiting for a token= 0
Idle agents= 5
Agents assigned from pool= 158
Agents created from empty Pool = 7
Agents stolen from another application= 0
High water mark for coordinating agents= 7
Max agents overflow= 0
|
If you find either
Agents waiting for a token or
Agents stolen from another application,
increase the number of agents available to the database manager
(MAXAGENTS and/or MAX_COORDAGENTS as appropriate).
8. Maximum open files
DB2 tries to be a
good citizen within the constraints of operating system resources.
One of its "good citizen" acts is putting a ceiling, or upper
limit, on the maximum number of files open at any one time. The
MAXFILOP database configuration
parameter stipulates the maximum number of files that DB2 can have
open concurrently. After it reaches that point, DB2 will start
closing and opening its tablespace files (including raw devices).
Opening and closing files slows SQL response times and burns CPU
cycles. To find out if DB2 is closing files, issue the
command:
db2 "get snapshot for database on DBNAME"
|
and look for the
line that reads:
Database files closed = 0
|
If files are being
closed, increase the value of MAXFILOP
until the opening and closing stops. Use the command:
db2 "update db cfg for DBNAME using MAXFILOP N"
|
7.
Locks
The default value
for LOCKTIMEOUT is -1, which means that
there will be no lock timeouts - a situation that can be
catastrophic for OLTP applications. Nevertheless, I all too
frequently find many DB2 users with LOCKTIMEOUT = -1. Set LOCKTIMEOUT to a very short value, such as 10 or
15 seconds. Waiting on locks for extended periods of time can have
an avalanche effect on locks.
First, check the
value of LOCKTIMEOUT with this
command:
db2 "get db cfg for DBNAME"
|
and look for the
line containing this text:
Lock timeout (sec) (LOCKTIMEOUT) = -1
|
If the value is -1,
consider changing it to 15 seconds by using the following command
(be sure to consult with the application developers or your vendor
first to make sure the application is prepared to handle lock
timeouts):
db2 "update db cfg for DBNAME using LOCKTIMEOUT 15"
|
You should also
monitor the number of lock waits, lock wait time, and amount of
lock list memory in use. Issue the command:
db2 "get snapshot for database on DBNAME"
|
Look for the
following lines:
Locks held currently= 0
Lock waits= 0
Time database waited on locks (ms)= 0
Lock list memory in use (Bytes)= 576
Deadlocks detected= 0
Lock escalations= 0
Exclusive lock escalations= 0
Agents currently waiting on locks= 0
Lock Timeouts= 0
|
If the Lock list memory in use (Bytes) exceeds 50 percent
of the defined LOCKLIST size, then
increase the number of 4K pages in the LOCKLIST database configuration.
6. Temporary
tablespaces
Temporary
tablespaces should have at least three containers on three
different disk drives in order to help DB2 perform parallel I/O and
improve performance for sorts, hash joins, and other database
activities that use TEMPSPACE.
To find out how
many containers your temporary tablespace has, issue the
command:
db2 "list tablespaces show detail"
|
Look for the
TEMPSPACE tablespace definition similar
to this example:
Tablespace ID= 1
Name= TEMPSPACE1
Type= System managed space
Contents= Temporary data
State= 0x0000
Detailed explanation:Normal
Total pages= 1
Useable pages= 1
Used pages= 1
Free pages= Not applicable
High water mark (pages)= Not applicable
Page size (bytes)= 4096
Extent size (pages)= 32
Prefetch size (pages)= 96
Number of containers= 3
|
Notice that
Number of containers has the value 3,
and that Prefetch size is three times
Extent size. For best parallel I/O
performance, it is important for Prefetch
size to be a multiple of Extent
size. The multiple should be equal to the number of
containers.
To find the
definitions for the containers, issue the command:
db2 "list tablespace containers for 1 show detail"
|
The 1 refers to
tablespace ID #1, which is TEMPSPACE1 in the example just given.
5. Sort
memory
OLTP applications
should not be performing large sorts. They are too costly in terms
of CPU, I/O, and elapsed time and will slow down any OLTP
application. Therefore, the default SORTHEAP size of 256 4K pages (1MB) should be more
than adequate. You should also know the number of sort overflows
and the number of sorts per transaction.
Issue the
command:
Db2 "get snapshot for database on DBNAME"
|
and look for the
following lines:
Total sort heap allocated= 0
Total sorts = 1
Total sort time (ms)= 8
Sort overflows = 0
Active sorts = 0
Commit statements attempted = 3
Rollback statements attempted = 0
Let transactions = Commit statements attempted + Rollback
statements attempted
Let SortsPerTX= Total sorts / transactions
Let PercentSortOverflows = Sort overflows * 100 / Total sorts
|
If PercentSortOverflows ((Sort overflows * 100) / Total
sorts ) is greater than 3 percent, there may be serious and
unexpected sort problems in the application SQL. Because the very
presence of overflows indicates that large sorts are occurring,
finding zero sort overflows, or at least a percentage less than
one, would be ideal.
If excessive sort
overflows are present, the "band aid" solution is to increase the
size of SORTHEAP. However, doing so
only masks the real performance problem. Instead, you should
identify the SQL that is causing the sorts and change the SQL,
indexes, or clustering to avoid or reduce the sort cost.
If SortsPerTX is greater than 5 (as a rule of thumb),
the number of sorts per transaction may be high. Some application
transactions perform dozens of small composite sorts (which do not
overflow and have very short durations), but consume excessive CPU.
When SortsPerTX is high, my experience
indicates that these machines are typically CPU bound. Identifying
the SQL that is causing the sorts and improving the access plans
(via indexes, clustering, or SQL changes) is paramount to improving
transaction throughput rates.
4. Table access
For each table,
identify how many rows DB2 is reading for each transaction. You
must issue two commands:
db2 "get snapshot for database on DBNAME"
|
db2 "get snapshot for tables on DBNAME"
|
After you issue the
first command, determine how many transactions have occurred (by
taking the sum of Commit statements
attempted plus Rollback statements
attempted - see Tip 3).
After issuing the
second command, divide the number of rows read by the number of
transactions (RowsPerTX). OLTP
applications should typically read one to 20 rows from each table
per transaction. If you discover that hundreds or thousands of rows
are being read for each transaction, scans are taking place and
indexes may need to be created. (Sometimes simply running runstats
with distribution and detailed indexes all provides a
cure.)
Sample output from
"get snapshot for tables on DBNAME"
follows:
Snapshot timestamp = 09-25-2000
4:47:09.970811
Database name= DGIDB
Database path= /fs/inst1/inst1/NODE0000/SQL00001/
Input database alias= DGIDB
Number of accessed tables= 8
Table List
Table Schema= INST1
Table Name= DGI_
SALES_ LOGS_TB
Table Type= User
Rows Written= 0
Rows Read= 98857
Overflows= 0
Page Reorgs= 0
|
A high number of
Overflows probably means you need to reorganize the table.
Overflows occur when DB2 must locate a row on a suboptimal page due
to a change in a row's width.
3. Tablespace analysis
A tablespace
snapshot can be extremely valuable to understanding what data is
being accessed and how. To get one, issue the command:
db2 "get snapshot for tablespaces on DBNAME"
|
For each
tablespace, answer the following questions:
- What is the
average read time (ms)?
- What is the
average write time (ms)?
- What percentage of
the physical I/O is asynchronous (prefetched) vs. synchronous
(random)?
- What are the
buffer pool hit ratios for each tablespace?
- How many physical
pages are being read each minute?
- How many physical
and logical pages are being read for each transaction?
For all
tablespaces, answer the following questions:
- Which tablespaces
have the slowest read and write times? Why? Containers on slow
disks? Are container sizes unequal?
- Are the access
attributes, asynchronous versus synchronous access, consistent with
expectations? Randomly read tables should have randomly read
tablespaces, meaning high synchronous read percentages, usually
higher buffer pool hit ratios, and lower physical I/O
rates.
For each
tablespace, make sure that the prefetch size is equal to the extent
size multiplied by the number of containers. Issue the
command:
db2 "list tablespaces show detail"
|
The prefetch size
can be altered for a given tablespace if necessary. Container
definitions can be checked by using the command:
db2 "list tablespace containers for N show detail"
|
in which N is the
tablespace ID number.
2. Buffer pool
optimization
All too often I
find DB2 UDB sites where the machines have 2, 4, or 8GB of memory,
yet the DB2 database has one buffer pool, IBMDEFAULTBP, which is only 16MB in
size!
If this is the case
at your site, create a buffer pool for the SYSCATSPACE catalog
tablespace, one for the TEMPSPACE
tablespace, and at least two more buffer pools: BP_RAND and BP_SEQ.
Tablespaces that are accessed randomly should be assigned to a
buffer pool with random objectives, BP_RAND. Tablespaces that are accessed
sequentially (with asynchronous prefetch I/O) should be assigned to
a buffer pool with sequential objectives, BP_SEQ. You can create additional buffer pools
depending on performance objectives for certain transactions; for
example, you could make a buffer pool large enough to store an
entire "hot," or very frequently accessed, table. When large tables
are involved, some DB2 users have great success placing the indexes
for important tables into an index, BP_IX, buffer pool.
Buffer pools that
are too small result in excessive, unnecessary, physical I/O.
Buffer pools that are too large put a system at risk for operating
system paging and consume unnecessary CPU cycles managing the
overallocated memory. Somewhere between "too small" and "too large"
lies the size that is just right. The right size exists where the
point of diminishing returns is reached. If you're not using a tool
to automate the diminishing returns analysis, you should
scientifically test buffer pool performance (hit ratios, I/O times,
physical I/O read rates) at incremental sizes until an optimum size
is reached. Because businesses constantly change and grow, the
"optimum size" decision should be reevaluated
periodically.
1. SQL cost
analysis
One bad SQL
statement can ruin your whole day. Time and time again I've seen a
single, relatively simple SQL statement bring a finely tuned
database and machine to its knees. For many of these statements,
there isn't a DB2 UDB configuration parameter under the sun (or in
the doc) that can make right the high cost of an errant SQL
statement.
Making matters
worse, the DBA's hands are frequently tied: You can't change the
SQL can because it's provided by an application vendor (such as
SAP,
PeopleSoft,
or Siebel).
This leaves the DBA three courses of action:
1. Change or add
indexes
2. Change
clustering
3. Change catalog
statistics
What's more,
today's robust applications are made up of hundreds or thousands of
different SQL statements. These statements are executed at varying
rates of frequency depending on application functionality and the
business needs du jour. A SQL statement's true cost is the resource
cost to execute it once multiplied by the number of times it is
executed.
The monumental task
that confronts each DBA is the challenge of identifying the SQL
statements with the highest "true cost," and working to reduce the
costs of these statements.
You can find out
the resource cost to execute a SQL statement once from native DB2
Explain utilities, a number of tools from third-party vendors, or
the DB2 UDB SQL Event Monitor data. But the frequency of statement
execution can only be learned through careful and time-consuming
analysis of DB2 UDB SQL Event Monitor data.
In researching
problem SQL statements, the standard procedure used by DBAs
is:
1. Create an SQL
Event Monitor, write to file:
$> db2 "create event monitor SQLCOST for statements write to ..."
|
2. Activate the
event monitor (be sure ample free disk space is
available):
$> db2 "set event monitor SQLCOST state = 1"
|
3. Let the
application run.
4. Deactivate the
event monitor:
$> db2 "set event monitor SQLCOST state = 0"
|
5. Use the
DB2-supplied db2evmon tool to format the raw SQL Event Monitor data
(hundreds of megabytes of free disk space may be required depending
on SQL throughput rates):
$> db2evmon -db DBNAME -evm SQLCOST > sqltrace.txt
|
6. Browse through
the formatted file scanning for unusually large cost numbers, a
time-consuming process:
7. Undertake a more
complete analysis of the formatted file that attempts to identify
unique statements (independent of literal values), each unique
statement's frequency (how many times it occurred), and the
aggregate of its total CPU, sort, and other resource costs. Such a
thorough analysis could take a week or more on just a 30-minute
sample of application SQL activity.
To reduce the time
it takes to identify SQL statements with high costs, you can
consider many sources of available information:
- From Tip 4, be sure to compute the number of rows read from
each table per transaction. If the resulting number seems high, the
DBA may be able to identify problem statements by searching the SQL
Event Monitor formatted output for the table name in question (this
will narrow the search, and possibly save some time).
- From Tip 3, be sure to compute the asynchronous read
percentage and physical I/O read rates for each tablespace. If a
tablespace has a very high asynchronous read percentage and way
above average physical I/O read rates, one or more of the tables in
the tablespace is probably being scanned. Query the catalog and
find out which tables are assigned to the suspect tablespaces (one
table per tablespace provides the best performance
instrumentation), then search the SQL Event Monitor formatted
output for the tables. This, too, may help narrow the search for
costly SQL statements.
- Try to look at DB2
Explain information for each of the SQL statements being executed
by the application. However, I've found that high frequency,
lower-cost statements often rob a machine of its capacity and
ability to provide desired performance.
- If analysis time
is short and maximum performance is critical, consider vendor tools
that can quickly automate the process of identifying resource
intensive SQL statements. The SQL-GUY tool from Database-GUYS Inc.
provides an accurate, real-time, proportional, cost rank analysis
of SQL statements.
 |
Staying in
tune
Optimum performance
requires not only eliminating high cost SQL statements, but also
making sure that appropriate physical infrastructures are in place.
Peak performance results when all the tuning knobs are set just
right, memory is allocated to pools and heaps effectively, and I/O
is evenly balanced across disks. Although it takes time to measure
and make adjustments, the DBA who performs these 10 suggestions
will be very successful at satisfying internal and external DB2
customers. As e-businesses change and grow, even the
best-administered database will need regular fine-tuning. The DBA's
job is never done!
The Top 10 at a
glance
- Use sufficient
agents for the workload.
- Do not allow DB2
to needlessly close and open files.
- Do not allow
extended lock waits.
- Ensure parallel
I/O capabilities to the database
TEMPSPACE tablespace.
- Manage DB2 sort
memory conservatively and don't mask sort problems with large
SORTHEAPs.
- Analyze table
access activity and identify tables with unusually high rows read
per transaction or overflow counts.
- Analyze the
performance characteristics of each tablespace, and seek to improve
the performance of the tablespaces with the slowest read times,
longest write times, highest physical I/O read rates, worst hit
ratios, and access attributes that are inconsistent with
expectations.
- Create multiple
buffer pools, and make purposeful assignments of tablespaces to
buffer pools such that access attributes are shared.
- Examine DB2 UDB
SQL Event Monitor information to discover which SQL statements are
consuming the largest proportions of computing resources, and take
corrective actions.
- Reevaluate
configuration and physical design settings once high cost SQL is
eliminated.
Reprinted with
permission from the Spring 2001 issue of DB2 Magazine. Copyright CMP Media.
About the author  | |  | Scott Hayes is
president of Database-Guys Inc., a performance tools and consulting
company specializing in DB2 UDB on UNIX platforms. He is also a
member of the IBM DB2 Gold Consultant Group and a DB2 UDB Advanced
Certified Technical Expert. Scott has worked with DB2/AIX since V1,
and his diverse experiences have taught him a number of DB2 UDB
best practices. A frequent speaker at IDUG and DB2 technical
conferences, Scott has published a number of articles in
DB2-related magazines. For more information, visit www.database-guys.com or write
scott@database-guys.com. |
Rate this page
|