Thursday, June 14, 2007

Tuning Statspack Oracle 9i/10g

“Perfection is achieved, not when there is nothing left to add, but when there is nothing left to take away.”




Statspack Tuning Oracle9i & 10g

Overview

Statspack, Tools & Scripts that you can still use

Top Waits

Load Profile

Latch Waits

Top SQL

Instance Activity

File I/O

The Future OEM & ADDM

Helpful V$/X$

Tuning in General

Both an Art and a Science

Exceptions often rule the day…Not a “one size fits all”

Hardware & Architecture must be right for your application or difficult to succeed.

Statspack & Enterprise Manager (also 3rd party products) are best for simple tuning and ongoing maintenance.

V$/X$ are best for drilling deep into problems

Enterprise Manager 10g will radically change things

Statspack – Check Regularly

1 . Top 5 wait events

2. Load Profile

3. Instance Efficiency Hit Ratios

4. Wait Events / Wait Event Histograms

5. Latch Waits

6. Top SQL

7. Instance Activity / Time Model Stats / O/S Stats

8. File I/O / File Read Histogram / Undo Stats

9. Memory Allocation

10. Undo



Statspack – Miscellaneous notes

SQL> @spcreate (system/manager as SYSDBA) <1m

SQL> alter user PERFSTAT account lock; (also unlock)

SQL> EXECUTE STATSPACK.MODIFY_STATSPACK_PARAMETER – (i_snap_level=>5, i_buffer_gets_th=>100000, - i_modify_parameter=>'true');

SQL> execute STATSPACK.SNAP; (do this for start/end) PL/SQL procedure successfully completed.

(spauto.sql – Setup a snap on the hour every hour)

SQL> @ORACLE_HOME/rdbms/admin/spreport (also sprepsql)

Statspack – Header Information

DB Name DB Id Instance Inst Num Release Cluster Host

------------ ----------- ------------ -------- ----------- -------- ----

ORA92 968233682 P10 1 9.2.0.4.0 NO RJN1

Snap Id Snap Time Sessions Curs/Sess Comment

--------- ------------------ -------- --------- ------------

Begin Snap: 458 28-Nov-03 00:15:00 814 179.1

End Snap: 505 28-Nov-03 23:45:00 816 211.4

Elapsed: 1,410.00 (mins)

Cache Sizes (end)

~~~~~~~~~~~~~~~~~

Buffer Cache: 32,773M Std Block Size: 8K

Shared Pool Size: 2,048M Log Buffer: 1,024K

Statspack – Header Information (Take care)

Ensure that you’re running for the right instance.

Check the start/end times

Check the cache sizes

Could have been changed during the run

Last page will tell you more on starting/ending values of initialization parameters

Statspack – Load Profile

Load Profile

~~~~~~~~~~~~ Per Second Per Transaction

--------------- ---------------

Redo size: 1,409,245.79 36,596.21

Logical reads: 157,472.47 4,089.35

Block changes: 4,061.85 105.48

Physical reads: 5,965.05 154.90

Physical writes: 587.76 15.26

User calls: 5,922.08 153.79

Parses: 92.11 2.39

Hard parses: 0.17 0.00

Sorts: 93.88 2.44

Logons: 0.25 0.01

Executes: 5,686.76 147.68

Transactions: 38.51

% Blocks changed per Read: 2.58 Recursive Call %: 2.38

Rollback per transaction %: 1.22 Rows per Sort: 114.10

Statspack – Instance Efficiency

Instance Efficiency Percentages (Target 100%)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Buffer Nowait %: 99.08 Redo NoWait %: 99.86

Buffer Hit %: 96.39 In-memory Sort %: 99.95

Library Hit %: 100.00 Soft Parse %: 99.82

Execute to Parse %: 98.38 Latch Hit %: 99.64

Parse CPU to Parse Elapsd %: 85.11 % Non-Parse CPU: 99.86

Statspack - Instance Efficiency


Things to look for…

Buffer NoWait % of less than 99 percent. This is ratio of hits on a request for a specific buffer where the buffer was immediately available in memory. If the ratio is low, then could be a (hot) block(s) being contended for that should be found in the Buffer Wait Section.

Buffer Hit % of less than 95 percent. This is the ratio of hits on a request for a specific buffer and the buffer was in memory instead of needing to do a physical I/O.

When this varies greatly one day to the next, further investigation should be done as to the cause.

If you have unselective indexes that are frequently accessed, it will drive your hit ratio higher, which can be misleading indication of good performance.

When you effectively tune your SQL and have effective indexes on your entire system, this issue is not encountered as frequently and the hit ratio is a better performance indicator.

Library Hit % of less than 95 percent. A lower library hit ratio usually indicates that SQL is being pushed out of the shared pool early (could be due to a shared pool that is too small).

A lower ratio could also indicate that bind variables are not used or some other issue is causing SQL not to be reused (in which case a smaller shared pool may only be a band-aid that will potentially fix a library latch problem which may result).

You must fix the problem (use bind variables or CURSOR_SHARING) and then appropriately size the shared pool. I’ll discuss this further when we get to latch issues.

In-Memory Sort % of less than 95 percent in OLTP. In an OLTP system, you really don’t want to do disk sorts. Setting the PGA_AGGREGATE_TARGET (or SORT_AREA_SIZE) initialization parameter effectively will eliminate this problem.

Latch Hit % of less than 99 percent is usually a big problem. Finding the specific latch will lead you to solving this issue. More in the Latch Wait section.

Tuesday, June 12, 2007

Oracle Tuning Methodology


The Oracle Tuning Methodology consists of 10 ordered steps. Tune the:



  1. Business Rules

  2. Data Design

  3. Application Design

  4. Database Logical Design

  5. Database Operations

  6. Access Paths

  7. Memory

  8. Physical Database Implementation

  9. Resource Contention

  10. Platform

Broadly, the steps divide into Application Tuning and Database Tuning. Divide them more specifically into:
Application Tuning = steps 1, 2, 3
Database Design Tuning = steps 4, 5, 6
Server Tuning = steps 7, 8, 9, 10

To tune you must:



  1. Establish a benchmark (initial performance profile prior to tuning)

  2. Define specific tuning goals

  3. Measure progress towards those goals

  4. Keep statistics (to create the benchmark & measure improvements on it)

  5. Tuning measurements:

  6. Throughput – A measurement of how much work the system performs over a time period

  7. Response Time – How long it takes for a user to get a response from the system


  • There are three key hardware resources to optimize:

  • CPU

  • Disk

  • Memory

  • There are three kinds of systems or applications:
    Transactional - Many users, short DML transactions. Usually short rows favoring small block sizes. Values maximum throughput. Examples: inventory management or order taking. Aka (Also known as) Online Transaction Processing or OLTP.



  • Decision Support - Few to no updates (all updates are done during mass insertion or batch update in off-hours). Big read-only queries do full table scans (FTS). Favors large block size and tuning for scans. Might benefit from such features as bitmapped indexes and materialized views. Values response time. Examples: management information systems. Aka Decision Support Systems (DSS) or Data Warehouses (DW).



  • Hybrid - A mixture of OLTP and DSS/DW systems. Hard to tune since the goals and tuning parameters (parms) of the two basic types of systems conflict. If you can separate OLTP and DSS/DW and put them on different platforms, your tuning task is way easier.
    To collect timing statistics, set TIMED_STATISTICS=TRUE in the init.ora (default is FALSE). This requires recycling the instance. Or you can dynamically issue:
    ALTER SYSTEM SET TIMED_STATISTICS = TRUE ; This populates statistics into the data dictionary for the V$ views, UTLBSTAT/UTLESTAT utilities, and SQL Trace (aka TKPROF). (Use of these tools is explained below.)

Next Coming: Oracle Tuning Flow-chart

http://easyoracledba.blogspot.com/