10g RAC: Troubleshooting CRS Root.sh Problems
Doc ID: 240001.1 Type: TROUBLESHOOTING
Modified Date : 19-MAR-2008 Status: PUBLISHED
Symptom(s)
~~~~~~~~~~
The CRS stack does not come up while running root.sh after installing CRS
(Cluster Ready Services).
You may see the startup time out or fail. Example:
Successfully accumulated necessary OCR keys.
Using ports: CSS=49895 CRS=49896 EVMC=49898 and EVMR=49897.
node
node 1: opcbhp1 int-opcbhp3 opcbhp1
node 2: opcbhp2 int-opcbhp4 opcbhp2
Creating OCR keys for user 'root', privgrp 'sys'..
Operation successful.
Now formatting voting device: /dev/usupport_vg/rV10B_vote.dbf
Successful in setting block0 for voting disk.
Format complete.
Adding daemons to inittab
Preparing Oracle Cluster Ready Services (CRS):
Expecting the CRS daemons to be up within 600 seconds.
Failure at final check of Oracle CRS stack.
Or you may see one of the daemons core dump:
Expecting the CRS daemons to be up within 600 seconds.
4714 Abort - core dumped
Or you may get another error.
Change(s)
~~~~~~~~~~
Installing CRS (Cluster Ready Services)
Cause
~~~~~~~
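Usually a configuration problem, such as unresolvable node names, wrong
permissions on the OCR or voting files, or CRS daemons that are already running.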
Fix
~~~~
1. Make sure that public and private node names are defined and that each of
these node names can be pinged from every node of the cluster.
2. Verify that the OCR file and Voting file are readable and writable by the
Oracle user and the root user. The permissions that CRS uses for these files
are:
Pre Install:
OCR    - root:oinstall   - 640
Voting - oracle:oinstall - 660
Post Install:
OCR    - root:oinstall   - 640
Voting - oracle:oinstall - 644
In RHAS 4.0, permissions should be added to /etc/rc.d/rc.local. See
Note 293819.1 for more information.
3. Unless you are upgrading your OCR from a previous version, make sure that
the OCR file and voting files have been cleared prior to running root.sh.
Example:
dd if=/dev/zero of=/dev/traindata_dg/ocrV1064_100m.dbf bs=8192 count=12800
dd if=/dev/zero of=/dev/traindata_dg/V1064_vote_01_20m.dbf bs=8192 count=2560
4. Verify that the Oracle user has permissions on /var/tmp (specifically
/var/tmp/.oracle).
5. Is PAM being used? Look for pam_unix messages in the messages file. The
PAM configuration might need to be altered to allow root.sh to complete.
6. If vendor clusterware is being used, verify that the correct vendor
clusterware version is in use. On Sun, make sure you are using the latest UDLM
and that the UDLM version string contains the keyword "reentrant". Example:
> more /var/sadm/pkg/ORCLudlm/pkginfo | grep VERSION
VERSION=Dev Release 10/29/03, 64bit 3.3.4.7 reentrant
7. Verify that crs, css, and evm are not already running ( ps -ef | grep d.bin )
8. If you are not hitting any of these issues, run root.sh again with a
debugging flag. Example:
sh -x root.sh
This time you should be able to see more information about what root.sh is
doing prior to the failure. Example:
Adding daemons to inittab
+ /usr/bin/cp /etc/inittab.crs /etc/inittab
+ /sbin/init.d/init.crs start
Preparing Oracle Cluster Ready Services (CRS):
+ /sbin/init q
+ /u01/64bit/app/oracle/product/crs/bin/crsctl check install -wait 600
Expecting the CRS daemons to be up within 600 seconds.
+ /usr/bin/echo Failure at final check of Oracle CRS stack.
Failure at final check of Oracle CRS stack.
+ /usr/bin/echo 0
You should also review the files collected by RAC-DDT ( Note 301138.1 ), in
particular the OCR dump, CRS logs, CSS logs, and messages file. Also
check for /tmp/crsctl.
If a core file was written, it is useful to obtain a stack trace from the
core file using Note 1812.1 "TECH Getting a Stack Trace from a CORE file".
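
Several of the checks in the Fix section (steps 1, 2, 4 and 7) are easy to
repeat on every node before rerunning root.sh. The following is a minimal
sketch of how they might be scripted, not an Oracle-supplied tool. The function
names are hypothetical, the ping options assume a Linux-style ping (other
platforms use different arguments), and the daemon names in the comments
(crsd.bin, ocssd.bin, evmd.bin) are the usual 10g process names.

```shell
#!/bin/sh
# Pre-flight sketch for Fix steps 1, 2, 4 and 7.
# All function names are hypothetical helpers, not part of CRS.

# Step 1: succeed if HOST answers a single ping.
# ASSUMPTION: Linux-style "ping -c COUNT"; adjust for HP-UX or Solaris.
node_pingable() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# Step 2: print "owner:group mode" for FILE, e.g. "root:oinstall -rw-r-----".
# Only the first 10 characters of the mode column are kept, so ACL or
# SELinux suffixes such as "+" or "." are dropped.
owner_and_mode() {
    ls -ld "$1" | awk '{ print $3 ":" $4, substr($1, 1, 10) }'
}

# Step 4: succeed if DIR exists and the current user can write to it.
# Run this as the Oracle user for the check to be meaningful.
dir_writable() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Step 7: read "ps -ef" output on stdin and print only the lines that
# look like CRS daemons (crsd.bin, ocssd.bin, evmd.bin), skipping the
# grep process itself.
crs_daemon_lines() {
    grep 'd\.bin' | grep -v grep
}
```

Typical use, with the node name and voting device taken from the example
output in the Symptom(s) section:

    node_pingable opcbhp1 || echo "opcbhp1 is not pingable"
    owner_and_mode /dev/usupport_vg/rV10B_vote.dbf
    dir_writable /var/tmp/.oracle || echo "/var/tmp/.oracle is not writable"
    ps -ef | crs_daemon_lines

Compare the output of owner_and_mode with the permissions listed in step 2.
Any line printed by crs_daemon_lines means a daemon is still running.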