High level steps
to follow:-
1. Hanganalyze and systemstate Dumps
2. Collecting Hanganalyze and Systemstate dumps
3.
Collection
commands for Hanganalyze and Systemstate: Non-RAC:
4.
Collection
commands for Hanganalyze and Systemstate: RAC
5.
Collection
commands for Hanganalyze and Systemstate: RAC with fixes for bug 11800959 and bug 11827088
6. Collection commands for Hanganalyze and Systemstate:
RAC without fixes for Bug 11800959 and Bug 11827088
7. Explanation of Hanganalyze and
Systemstate Levels
8. Taking Systemstate Dumps when You cannot Connect to
Oracle
1. Hanganalyze and systemstate Dumps
Hanganalyze and
Systemstate dumps provide information on the processes in the database at a
specific point in time. Hanganalyze provides information on all processes
involved in the hang chain, whereas systemstate provides information on all
processes in the database. When looking at a potential hang situation, you need
to determine whether a process is stuck or moving slowly. By collecting these
dumps at 2 consecutive intervals this can be established. If a process is
stuck, these traces also provide the information to start further diagnosis and
possibly help to provide the solution.
- Hanganalyze is a
summary and will confirm if the db is really hung or just slow and
provides a consistent snapshot.
- Systemstate dump shows
what each process on the database is doing
2. Collecting
Hanganalyze and Systemstate dumps
Using
SQL*Plus connect as SYSDBA using the following command:
sqlplus
'/ as sysdba'
If
there are problems making this connection then in 10gR2 and above, the sqlplus
"preliminary connection" can be used :
sqlplus -prelim '/ as sysdba'
Note:
From 11.2.0.2 onwards, hanganalyze will not produce output under a sqlplus
"preliminary connection" since it requires a process state object and
a session state object. If a hanganalyze is attempted, the trace will output:
HANG ANALYSIS:
ERROR: Cannot perform hang analysis dump without a process state
object and a session state object.
( process=(nil), sess=(nil) )
See:
Note 986640.1 How To Connect Using A Sqlplus Preliminary Connection
3. Collection commands for Hanganalyze and
Systemstate: Non-RAC
Sometimes, database may
actually just be very slow and not actually hanging. It is therefore recommended,
where possible to get 2 hanganalyze and 2 systemstate dumps in order to
determine whether processes are moving at all or whether they are
"frozen".
Hanganalyze
sqlplus '/ as
sysdba'
oradebug setmypid
oradebug unlimit
oradebug
hanganalyze 3
-- Wait one
minute before getting the second hanganalyze
oradebug
hanganalyze 3
oradebug
tracefile_name
exit
Systemstate
sqlplus '/ as
sysdba'
oradebug setmypid
oradebug unlimit
oradebug dump
systemstate 266
oradebug dump
systemstate 266
oradebug
tracefile_name
exit
4. Collection commands for Hanganalyze and
Systemstate: RAC
There are 2 bugs affecting
RAC that without the relevant patches being applied on your system, make using
level 266 or 267 very costly.
Therefore without these
fixes in place it highly unadvisable to use these level
For information on these
patches see:
Document
11800959.8 Bug 11800959 - A SYSTEMSTATE dump with level >= 10 in RAC dumps
huge BUSY GLOBAL CACHE ELEMENTS - can hang/crash instances
Document
11827088.8 Bug 11827088 - Latch 'gc element' contention, LMHB terminates the
instance
5. Collection commands for Hanganalyze and
Systemstate: RAC with fixes for
bug 11800959 and bug 11827088
sqlplus '/ as sysdba'
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate
266
oradebug -g all dump systemstate
266
oradebug
tracefile_name
exit
6.
Collection
commands for Hanganalyze and Systemstate: RAC without fixes for Bug 11800959 and Bug 11827088
sqlplus '/ as sysdba'
oradebug setorapname reco
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate
258
oradebug -g all dump systemstate
258
oradebug
tracefile_name
exit
In RAC environment, a dump will
be created for all RAC instances in the DIAG trace file for each instance.
7. Explanation of Hanganalyze and Systemstate Levels
Hanganalyze levels:
Level 3: In 11g onwards, level 3 also
collects a short stack for relevant processes in hang chain
Systemstate levels:
Level 258 is a fast alternative but we'd lose
some lock element data
Level 267 can be used if additional buffer
cache / lock element data is needed with an understanding of the cost
8. Taking Systemstate Dumps when You cannot Connect to
Oracle
On very rare occasions it is not possible to
make the sqlplus connection necessary for issuing ordinary commands used
above.
In that event, there is still a back door
method of getting the systemstate dump using a debugger like dbx, if that is
available on your system.
The process you attach to will most likely
dump core and be killed, so do not attach to an Oracle background process.
Here is the syntax:
dbx -a PID (where PID = any oracle
shadow process)
dbx() print ksudss(10)
...return value printed here
dbx() detach
First, you will need
to find a shadow process:
$ ps -ef |grep
sqlplus
osupport 78526 154096
0 12:11:05 pts/1 0:00 sqlplus scott/tiger
osupport 94130
84332 1 12:11:20 pts/3
0:00 grep sqlplus
$ps -ef |grep 78526
osupport 28348
78526 0 12:11:05 -
0:00 oracles734 (DESCRIPTION=(LOCAL
osupport 78526 154096
0 12:11:05 pts/1 0:00 sqlplus scott/tiger
osupport 94132
84332 1 12:11:38 pts/3
0:00 grep 78526
You will attach to shadow PID 28348, enter
the print ksudss (10) command when the prompt comes back, and detach:
$ dbx -a 28348
Waiting to attach to process 28348 ...
Successfully attached to oracle.
warning: Directory containing oracle could
not be determined.
Apply 'use' command to initialize source
path.
Type 'help' for help.
reading symbolic information ...
stopped in read at 0xd016fdf0
0xd016fdf0 (read+0x114) 80410014 lwz
r2,0x14(r1)
(dbx) print ksudss(10) 2
(dbx) detach
You will find a systemstate in the
user_dump_dest directory under the PID trace (and probably a useless core files
as well).
$ ls -lrt *28348*
-rw-r-----
1 osupport dba 46922 Oct 10
12:12 ora_28348.trc
core_28348:
total 72
-rw-r--r--
1 osupport dba 16567 Oct 10
12:12 core
drwxr-xr-x
7 osupport dba 12288 Oct 10
12:12 ../
drwxr-x---
2 osupport dba 512 Oct 10
12:12 ./
You will find the usual header information in
the trace file.
Make sure there is an END OF SYSTEM STATE in
the file. You can grep for it or do a
search in vi. If it is not there then
the dump is not complete, possibly because max_dump_file size is too small in
the init.ora.
Update for 10g and
higher:
In some cases, no connections are allowed on
the instance (in some ORA-20 situations for example).
As of 10.1.x, there is a new option with
SQL*Plus to allow access to an instance to generate traces.
sqlplus -prelim / as sysdba
For example
export ORACLE_SID=PROD ###Replace PROD
with the SID you want to trace
sqlplus -prelim / as sysdba
oradebug setmypid
oradebug unlimit;
oradebug dump systemstate 10
oradebug
tracefile_name
exit
No comments:
Post a Comment