Posts

Showing posts from February, 2013

PRKP-1024 : The service [service_name] is still running when removing a service

At times when you try to delete a service from CRS GI stack, you may the below error PRKP-1024 : The service [service_name] is still running. This is despite the fact that the service is definitely OFFLINE. While its not clear (to me at least) as  to why CRS has got itself into a twist, but a quick check using "crs_stat -t" may show that the service is indeed offline but a related entry ending with ".cs" may be online. On our rig no amount of restarting CRS or rebooting the cluster helped us, goes to show the amount of muddle that went in before, therefore the following can be done in order to get rid of it 1. Use the crs_stat -p | grep .srv to obtain the full name of the service.   e.g NAME=ora.[cluster_name].[service_name].cs 2. Use the crs_stop command to put offline the service    e.g crs_stop ora.[cluster_name].[service_name].cs 3. Repeat steps 1 and 2 to stop all other related resources e.g. srv You c...

Oracle HangAnalyze

There comes a time in your life (more often then not and when you least expect) you will find that your sql session is hanging or some client application is hung or the database itself is hanging. While utilities AWR and ADDM are helpful, these don't tell the internal status of the instance, in particular if you are stuck with a hung instance. In such cases running AWR/ADDM is out of the equation. Oracle provides a additional tool as part of "oradebug" called "hanganalyze". This tool can be used to either diagnose a known hung session or the database in general. This blog contains steps to generate hang analyze trace file Run the following query to obtain the SPID of the hung session Obtain session SPID select a.inst_id,         a.sid,         a.serial#,          b.spid ospid,         to_char(logon_time,'dd-Mon-rr hh24:mi') LogonTime,         lower(...

Extract ASM disk info

If you are asked to provide an extract of SAN allocation, then the following query can be used to provide the breakdown ASM Disk Info Query $ sqlplus / as sysdba set pages 200 lines 250 --colsep , col DiskPath format a50 col diskname format a20 col diskgroup_name format a20 break on Diskgroup_name skip 2 compute sum label 'Group_Size-GB' of total_gb on Diskgroup_name compute sum label 'Group_Size-GB' of Used_GB on Diskgroup_name compute sum label 'Group_Size-GB' of Free_GB on Diskgroup_name compute Avg label 'Free_percent%' of free_percent on Diskgroup_name select a.name Diskgroup_name,        b.name diskname,        b.path DiskPath,        round((b.total_mb/1024),2) Total_GB,        round(((b.total_mb-b.free_mb)/1024),2) Used_GB,        round((b.free_mb/1024),2) Free_GB,        100 - round((((b.total_mb - b.free_mb)*100)/b.total_mb),2) free_pe...

How to restore ASM based OCR after complete loss of the CRS diskgroup

It is not possible to directly restore a manual or automatic OCR backup if the OCR is located in an ASM disk group. This is caused by the fact that the command 'ocrconfig -restore' requires ASM to be up and running in order to restore an OCR backup to an ASM disk group. However, for ASM to be available, the CRS stack must have been successfully started. For the restore to succeed, the OCR also must not be in use (r/w), i.e. no CRS daemon must be running while the OCR is being restored The following steps must be used in order to recover from a situation where the disks belonging to the ASM diskgroup +MGMT have been lost. The instructions assumes that new/fixed disks are now available on the effected servers and the following have already been carried out on both nodes of the cluster. - Disks for OCR and Voting are visible to the oracle cluster - The disk ownership has been changed from root:disk to oracle:dba or its equivalent accounts - /etc/udev/rules/50-udev.rules fi...

Check Active Dataguard Replication Status

In order to check if ADG is up-to-date, one has to tail the alert log on both the Primary and Physical Standby databases, followed by a manual check to look for the log status of (in-transit) against each of the threads. While this method is perfect it can be bit tedious to perform. You can now do the same using one of the attached SQL from the Primary database. Deployment and usage steps - Open a vi editor on the Primary database. - Copy of the below SQL into the editor - Complete the  database link creation command based on the database    you are checking. - Save the text file as check_adg_status.sql - execute the sql file as sys user. e.g. sqlplus -s  / as sysdba @check_adg_status.sql  Note : These commands cannot be run from the Physical standby as the query relies on using a temporary database link that cannot be created on a read-only database. Troubleshooting 1. Some database clusters may report the following error when invoked   ...

Streams recreation - meta data mismatch

Sometimes the streams drop and recreate may not clean up all off relevant meta data. So far I have come across one instance  where a data mismatch caused the failure to create capture process. This blog will be used to catalogue these errors and where possible document the fix.   Error: ORA-00001: unique constraint (SYSTEM.LOGMNR_SESSION_UK1) violated The error was reported on one of test systems while re-creating Capture Message queue. The root case was down to a mismatch between DBA_CAPTURE view showing 0 rows and SYSTEM.LOGMNR_SESSIONS$ returning 1 row for the capture process. The soution in this case was to delete the offending row from SYSTEM.LOGMNR_SESSIONS$ and re-run DROP_RECREATE_STREAMS_CAPTURE stored procedure. Note: Some of the other options worth considering, but not used in this instance were use DBMS_STREAMS_ADM.REMOVE_STREAMS_CONFIGURATION() Refer How To Rebuild LogMiner Metadata Tables [ID 550197.1].  Bef...