Thursday, November 10, 2011

TQ-Oracle DB tuning basics

Minimum Tools Necessary for Troubleshooting

The bare minimum tools necessary for troubleshooting database related issues are as following

AWR Report

AWR (Automatic Workload Repository) report is a Oracle tool which provides a comprehensive list of database activities between a specified interval of time.

· To get the AWR report take the snapshots before and after the potential bottleneck.

SQL> exec dbms_workload_repository.create_snapshot ;

· After taking the AWR snapshot run the below command to take the snap_id (that will be used to generate the report)

SQL> select max(snap_id) from dba_hist_snapshot;

· Generate the AWR report using the following

sqlplus / @$ORACLE_HOME/rdbms/admin/awrrpt.sql

ADDM Report

ADDM (Automatic Database Diagnostic Monitor) report is an Oracle tool which provides upfront recommendations of database activities between a specified interval of time. For each of the identified issues it locates the root cause and provides recommendations for correcting the problem. An ADDM analysis task is performed and its findings and recommendations stored in the database every time an AWR snapshot is taken provided the STATISTICS_LEVEL parameter is set to TYPICAL or ALL. ADDM report can be execute by as following

· Take a snapshot before and after the potential bottleneck.

SQL> exec dbms_workload_repository.create_snapshot ;

· After taking the AWR snapshot run the below command to take the snap_id (that will be used to generate the report)

SQL> select max(snap_id) from dba_hist_snapshot;

· Generate the ADDM report using the following

sqlplus / @$ORACLE_HOME/rdbms/admin/addmrpt.sql

Statistics Collection

Make sure that statistics are current on the database and gathered using DBMS_STATS package as mentioned below

SQL> alter session enable parallel dml;

SQL> exec dbms_stats.gather_schema_stats(ownname => '', estimate_percent => 100, cascade => TRUE, degree => , granularity => 'ALL',force => TRUE);

Explain Plan of SQL statement

If the issue is related to a slow down of a particular pre-identified SQL statement then we need to get the SQL plan from the production system to make sure the right access path is taken. To do this the following three options are there.

· If you have already identified the SQL_ID of the SQL then it will be just a matter of querying the V$SQL_PLAN table to get the explain plan.

· The other way is to get the explain plan using the following way on sqlplus SQL> explain plan for < sql statement > ; SQL> SELECT PLAN_TABLE_OUTPUT FROM

TABLE (DBMS_XPLAN.DISPLAY ());

· The other way to get the explain plan is using Tkprof. Please refer to the “Requesting Tkprof” section to know the different kinds of Tkprofs.

To troubleshoot the SQL plan please refer to the section “Troubleshooting Bad SQL Plan”

Troubleshooting Bad SQL Plan

Troubleshooting Bad SQL Plan can be done in following ways

Cross check all base provided Indexes

The following SQL can be executed to give a comprehensive list of Indexes for the tables in question.

SQL> select distinct a.table_OWNER, c.table_name,a.index_owner,a.index_name,c.num_rows Table_Rows ,b.num_rows Index_Rows,b.sample_size,a.table_owner,a.column_name,a.column_position ,b.uniqueness,b.last_analyzed

from dba_ind_columns a, dba_indexes b, dba_tables c

Where A.Index_Name = B.Index_Name

And A.Index_Owner = B.Owner

And A.Table_Owner =B.Table_Owner

AND B.TABLE_OWNER = C.OWNER

And B.Table_Name = C.Table_Name

and a.table_name = b.table_name and a.table_name in (‘< List of Table Names>’)

order by a.table_owner,a.index_name,a.column_position

Full scans

Make sure that there is no full scans in the explain plan

Table access + Index access instead of Index only access

In almost all the cases if the SQL can make the Index only access then that is the best plan but that is subjected to the Index feasibility.

Clustering Factor of the Indexes

Make sure that that the clustering factor of the Indexes is good (which means close to the number of blocks then to the number of rows). Good clustering factor of Indexes means the rows are nicely packed in the blocks and the SQL query result can result from few numbers of blocks which ultimately leads to less CPU and IO overhead.

Indexes with Low Cardinality

Indexes which have low cardinality are likely to cause scalability issues. These Indexes leads to buffer busy waits when large numbers of threads are trying to obtain the lock on the same block. To mitigate this, Indexes with Low Cardinality Indexes should be avoided.

Database Optimizer Parameters for Indexes

There are three optimizer parameters which affect SQL plan in terms of giving preference to Index access in the SQL plan. As our application is mostly Index driven so make sure that these parameters are appropriately set. Please note that these are the recommended starting values but customer can adjust this to values that give maximum performance gains. These parameters are

· optimizer_index_cost_adj - This should be set to 1

· optimizer_index_caching - This should be set to 100.

· optimizer_mode - This should be ALL_ROWS from batch and FIRST_ROWS_10 for online

Requesting Tkprofs

Tkprofs are good source for troubleshooting customer’s issues as they provide information about wait events on SQL level, provide list of all SQLs running during the duration when you are tracing the sessions and also gives a run time explain plan. There are multiple ways to get the Tkprofs

Basic Tkprofs

To enable the Tkprofs set the below before you start the test

SQL> ALTER SYSTEM SET trace_enabled = TRUE ;

SQL> ALTER SYSTEM SET sql_trace = TRUE ;

SQL> ALTER SYSTEM SET timed_statistics=TRUE;

SQL> ALTER SYSTEM SET STATISTICS_LEVEL=ALL;

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context forever, level 2';

Once above is set the traces will start coming at the location defined by “user_dump_dest” parameter.

Once the test to troubleshoot the issue is completed run the following to stop the trace

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context off';

SQL> ALTER SYSTEM SET sql_trace = FALSE;

SQL> ALTER SYSTEM SET trace_enabled = FALSE;

Concatenate the *ora* files and take the Tkprof as below

tkprof sys=no sort=EXEELA

Tkprofs with Bind Variables

To enable the Tkprofs to get the bind variables set the below before you start the test

SQL> ALTER SYSTEM SET trace_enabled = TRUE ;

SQL> ALTER SYSTEM SET sql_trace = TRUE ;

SQL> ALTER SYSTEM SET timed_statistics=TRUE;

SQL> ALTER SYSTEM SET STATISTICS_LEVEL=ALL;

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context forever, level 4';

Once above is set the traces will start coming at the location defined by “user_dump_dest” parameter.

Once the test to troubleshoot the issue is completed run the following to stop the trace

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context off';

SQL> ALTER SYSTEM SET sql_trace = FALSE;

SQL> ALTER SYSTEM SET trace_enabled = FALSE;

Concatenate the *ora* files and take the Tkprof as below

tkprof sys=no sort=EXEELA

The concatenated trace file provides the bind variables used in the sessions.

Tkprofs with Wait Events

To enable the Tkprofs to get the wait events set the below before you start the test

SQL> ALTER SYSTEM SET trace_enabled = TRUE ;

SQL> ALTER SYSTEM SET sql_trace = TRUE ;

SQL> ALTER SYSTEM SET timed_statistics=TRUE;

SQL> ALTER SYSTEM SET STATISTICS_LEVEL=ALL;

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context forever, level 8';

Once above is set the traces will start coming at the location defined by “user_dump_dest” parameter.

Once the test to troubleshoot the issue is completed run the following to stop the trace

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context off';

SQL> ALTER SYSTEM SET sql_trace = FALSE;

SQL> ALTER SYSTEM SET trace_enabled = FALSE;

Concatenate the *ora* files and take the Tkprof as below

tkprof sys=no sort=EXEELA

Tkprofs with Bind Variables and Wait Events

To enable the Tkprofs to get the bind variables and wait events set the below before you start the test

To enable the Tkprofs to get the wait events set the below before you start the test

SQL> ALTER SYSTEM SET trace_enabled = TRUE ;

SQL> ALTER SYSTEM SET sql_trace = TRUE ;

SQL> ALTER SYSTEM SET timed_statistics=TRUE;

SQL> ALTER SYSTEM SET STATISTICS_LEVEL=ALL;

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context forever, level 12';

Once above is set the traces will start coming at the location defined by “user_dump_dest” parameter.

Once the test to troubleshoot the issue is completed run the following to stop the trace

SQL> ALTER SYSTEM SET EVENTS '10046 trace name context off';

SQL> ALTER SYSTEM SET sql_trace = FALSE;

SQL> ALTER SYSTEM SET trace_enabled = FALSE;

Concatenate the *ora* files and take the Tkprof as below

tkprof sys=no sort=EXEELA

The concatenated trace file provides the bind variables used in the session also.

AWR Analysis for Troubleshooting

AWR Report has a comprehensive list of data which is helpful in troubleshooting a database issue. There are few points which I want to highlight.

Top 5 Timed Foreground wait events

The top 5 timed foreground wait events is the first place to start the troubleshoot as it gives a good overview of potential issues.

SQL Statistics

This section of the AWR report gives a very comprehensive list of SQLs ordered by elapsed time, CPU, physical disk reads, consistent gets and executions. This is very helpful to see SQL statistics from these different perspectives.

Instance Activities

This section is helpful to see the Instance activities from per sec and per transaction level.

IO Stats

This section is helpful to see if there are issues related to IO. This helps us to visualize the IO bottleneck from the Tablespace as well as File level.

Advisory Pools

This section is helpful to see if the different pool s are sized properly and also gives an idea if they are undersized.

Wait Stats

This section is helpful to see the wait statistics.

Segment Statistics

This section is helpful to see issues related to the segment level and is one of the most important section for troubleshooting.

Database Initialization Parameters

The following database initialization parameters are good to start with and are recommended by the product. These can be further tuned based on the Implementation.

· processes=3000

· sessions=4500

· sga_max_size= < set according to physical memory available >

· sga_target= < set according to physical memory available >

· pga_aggregate_target=

available >

· memory_target=0 < DO NOT SET THIS >

· memory_max_target=0 < DO NOT SET THIS >

· db_writer_processes=12

· db_block_size=8192

· log_buffer=250M

· log_checkpoint_interval=0

· db_file_multiblock_read_count=8

· dml_locks=4860

· transactions=3000

· undo_retention=900

· sec_case_sensitive_logon=FALSE

· session_cached_cursors=0

· parallel_min_servers=32

· parallel_max_servers=256

· open_cursors=3000

· optimizer_index_cost_adj=1

· optimizer_index_caching=100

No comments:

Post a Comment