Collect Diagnostic Data during a database hang.

The following set of diagnostic data collection routines must be run when database is not responding to TNS connection request and queries to views like v$session, etc hang.

The symptoms of this hang are
  • Unable to connect using tnsnames to local instance. E.g. if ssh on node 01A, then cannot sqlplus to instance hosted on that server. E.g. sqlplus username/password@connectstring
  • Unable to connect using tnsname .e.g sqlplus username/password@connectstring_active
  • Bequeath connections are successful. E.g. sqlplus / as sysdba
  • Querying certain V$ views causes the session to hang.

Diagnostic Data set 1:
  SQL> set lines 250 pages 1000
  SQL> spool xle_xkclfx.out
  SQL> select * from x$le;
  SQL> select * from x$kclfx;
  SQL> exec dbms_lock.sleep(60);
  SQL> select * from x$le;
  SQL> select * from x$kclfx;
  SQL> exec dbms_lock.sleep(60);
  SQL> select * from x$le;
  SQL> select * from x$kclfx;
  SQL> exec dbms_lock.sleep(60);
  SQL> select * from x$le;
  SQL> select * from x$kclfx;
  SQL> exec dbms_lock.sleep(60);
  SQL> select * from x$le;
  SQL> select * from x$kclfx;
  SQL> spool off;

Diagnostic Data set 2:
  Fetch the OSPID for all the running lms processes using the following command
$ ps -ef | grep -i lms | grep -v grep  | awk '{print $2}'

   For each of the PID returned by the above command run the following
  SQL> oradebug setospid <OS PID of LMS process>
  SQL> oradebug unlimit
  SQL> oradebug tracefile_name
  SQL> oradebug dump errorstack 3
  SQL> !sleep 180s
  SQL> oradebug dump errorstack 3
  SQL> !sleep 180s
  SQL> oradebug dump errorstack 3
  SQL> exit;

Diagnostic Data set 3:
  SQL> oradebug setmypid
  SQL> oradebug tracefile_name
  SQL> oradebug unlimit
  SQL> oradebug -g all hanganalyze 3
  SQL> oradebug -g all dump systemstate 266
  SQL> !sleep 180s
  SQL> oradebug -g all dump systemstate 266
  SQL> !sleep 180s
  SQL> oradebug -g all dump systemstate 266

Diagnostic Data set 4:
Run the following to obtain the PID of the LMS processes
  $ ps -ef | grep -i lms | grep -v grep  | awk '{print $2}'

  For each of the PID returned by the above command
    SQL> oradebug setospid <OS PID of LMS process>
  SQL> oradebug unlimit
  SQL> oradebug tracefile_name
  SQL> oradebug event 10708 trace name context forever, level 15
  SQL> !sleep 180s

Reboot the cluster now.

Post reboot the files collected by in the above steps must be uploaded to Oracle.

Comments

Popular posts from this blog

Recover Standby database from ORA-00332: archived log is too small

Check Active Dataguard Replication Status

Top CPU consuming oracle sessions