ORA-15077 PROC-26 CRSD Fails During CRS Startup on 11gR2 [ID 1152583.1]



Modified 04-AUG-2010     Type PROBLEM     Status PUBLISHED

 

In this Document
Symptoms
Changes
Cause
Solution
References



Applies to:


Oracle Server - Enterprise Edition - Version: 11.2.0.1 and later [Release: 11.2 and later]
Information in this document applies to any platform.


Symptoms


In a two-node RAC, node 2 was rebooted manually. After the node came back up and CRS was restarted, CRSD crashed with the following errors in crsd.log:

The OCR location +DG_DATA_01 is inaccessible
2010-06-27 09:58:56.869: [  OCRASM][4156924400]proprasmo: Error in open/create file in dg [DG_DATA_01]
[  OCRASM][4156924400]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup
2010-06-27 09:58:56.871: [  CRSOCR][4156924400] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup] [7]
2010-06-27 09:58:56.871: [   CRSD][4156924400][PANIC] CRSD exiting: Could not init OCR, code: 26

alertracnode2.log shows:

2010-06-27 09:45:04.759
[cssd(13087)]CRS-1713:CSSD daemon is started in clustered mode
2010-06-27 09:45:24.911
[cssd(13087)]CRS-1601:CSSD Reconfiguration complete. Active nodes are racnode1 racnode2 .
2010-06-27 09:45:43.399
[crsd(13556)]CRS-1201:CRSD started on node racnode2.
2010-06-27 09:58:43.026
[crsd(13556)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /opt/oracle/11.2.0/grid/log/racnode2/crsd/crsd.log.
2010-06-27 09:58:43.207
[/opt/oracle/11.2.0/grid/bin/oraagent.bin(14944)]CRS-5822:Agent '/opt/oracle/11.2.0/grid/bin/oraagent_oracle' disconnected from server. Details at (:CRSAGF00117:) in /opt/oracle/11.2.0/grid/log/racnode2/agent/crsd/oraagent_oracle/oraagent_oracle.log.
2010-06-27 09:58:43.465
[ohasd(12493)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode2'.
...
2010-06-27 09:59:02.943
[crsd(15055)]CRS-1013:The OCR location in an ASM disk group is inaccessible. Details in /opt/oracle/11.2.0/grid/log/racnode2/crsd/crsd.log.
2010-06-27 09:59:03.713
[ohasd(12493)]CRS-2765:Resource 'ora.crsd' has failed on server 'racnode2'.
2010-06-27 09:59:03.713
[ohasd(12493)]CRS-2771:Maximum restart attempts reached for resource 'ora.crsd'; will not restart.

 


Changes


The node was rebooted.


Cause


This issue is caused by the VIP address already being assigned elsewhere in the network, due to an incorrect operating system network configuration.
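As a quick way to confirm a duplicate address from another host (not part of the original note; arping from iputils is assumed to be available), the VIP can be probed while it is expected to be unplumbed:

# Hypothetical check from node 1 or a third host on the same subnet.
# Any reply means some machine on the network already holds the
# racnode2 VIP.
arping -c 3 -I eth0 10.12.14.16

# A plain ping also works if the current owner answers ICMP:
ping -c 3 10.12.14.16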

In the crsd.log, we can see:

2010-06-27 09:49:15.743: [UiServer][1519442240] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-2672: Attempting to start 'ora.racnode2.vip' on 'racnode2']

2010-06-27 09:49:35.827: [UiServer][1519442240] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-5005: IP Address: 10.12.14.16 is already in use in the network]

2010-06-27 09:49:35.829: [UiServer][1519442240] Container [ Name: ORDER
MESSAGE:
TextMessage[CRS-2674: Start of 'ora.racnode2.vip' on 'racnode2' failed]

2010-06-27 09:51:32.746: [UiServer][1519442240] Container [ Name: ORDER
MESSAGE:
TextMessage[Attempting to stop `ora.asm` on member `racnode2`]

2010-06-27 09:58:44.543: [  CRSOCR][1147494896] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=15077, loc=kgfokge
ORA-15077: could not locate ASM instance serving a required diskgroup] [7]
2010-06-27 09:58:44.543: [   CRSD][1147494896][PANIC] CRSD exiting: Could not init OCR, code: 26
2010-06-27 09:58:44.543: [   CRSD][1147494896] Done.

So ASM and the OCR disk group were initially ONLINE, and CRSD was starting its resources. When it tried to start the VIP, the start of ora.racnode2.vip failed because the VIP address was already in use on the network. CRSD then shut down ASM, which made the OCR device inaccessible and caused CRSD to abort.
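To watch this dependency chain on a live system, the state of the lower-stack (OHASD-managed) resources can be checked; a sketch using the grid home from this note:

# Run as root or the grid owner. ora.asm must be ONLINE for CRSD to
# read an OCR that is stored in an ASM disk group.
/opt/oracle/11.2.0/grid/bin/crsctl stat res -t -init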

Checking the network configuration, we see:

/etc/hosts:
# public node names
10.12.14.13  racnode1
10.12.14.14  racnode2
# Oracle RAC VIP
10.12.14.15  racnode1-vip
10.12.14.16  racnode2-vip

The ifconfig output from node 2 shows that the VIP address for racnode2 is permanently assigned to eth1:

eth1      Link encap:Ethernet  HWaddr 00:22:64:F7:0C:E8
          inet addr:10.12.14.16  Bcast:10.12.14.255  Mask:255.255.248.0
          inet6 addr: fe80::222:64ff:fef7:ce8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2772 errors:0 dropped:0 overruns:0 frame:0
          TX packets:119 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:203472 (198.7 KiB)  TX bytes:22689 (22.1 KiB)
          Interrupt:177 Memory:f4000000-f4012100

while, with CRS down on node 2, the racnode2 VIP should have failed over to the public interface on node 1 (as an eth0:<n> alias):

eth0      Link encap:Ethernet  HWaddr 00:22:64:F7:0B:22
          inet addr:10.12.14.13  Bcast:10.12.14.255  Mask:255.255.248.0
          inet6 addr: fe80::222:64ff:fef7:b22/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth0:1    Link encap:Ethernet  HWaddr 00:22:64:F7:0B:22
          inet addr:10.12.14.15  Bcast:10.12.14.255  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          Interrupt:169 Memory:f2000000-f2012100
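To confirm that the address on node 2 is statically configured rather than a clusterware-plumbed alias, inspect the interface definition file. The content below is a hypothetical sketch of what a misconfigured ifcfg-eth1 might look like on a RHEL-style system:

# cat /etc/sysconfig/network-scripts/ifcfg-eth1
# Hypothetical misconfiguration: the racnode2 VIP hard-coded as the
# static address of eth1, so the OS claims it at every boot.
DEVICE=eth1
BOOTPROTO=static
IPADDR=10.12.14.16
NETMASK=255.255.248.0
ONBOOT=yes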



Solution


Modify the network configuration at the OS layer, e.g. in the /etc/sysconfig/network-scripts/ifcfg-eth* scripts, and remove the VIP address from the ifcfg-eth1 definition.
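A sketch of a corrected ifcfg-eth1, assuming eth1 is not meant to carry a static address of its own (adjust to the intended role of the interface):

# /etc/sysconfig/network-scripts/ifcfg-eth1 after removing the VIP.
# If eth1 does need its own address, pick one that cannot collide
# with any clusterware-managed VIP.
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes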

Restart the network service and check the ifconfig -a output to ensure the VIP is not assigned to any network interface before CRSD is started (unless it has failed over to the other node).
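For example, on a RHEL-style system (command names are an assumption; syntax varies by platform):

# Restart networking so the changed ifcfg files take effect
service network restart

# Verify the VIP no longer appears on any local interface
ifconfig -a | grep 10.12.14.16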

Restart CRSD on node 2.
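With the duplicate address gone, CRSD can be restarted through OHASD. A sketch using the grid home from this note, run as root:

# Start only the crsd resource in the lower stack...
/opt/oracle/11.2.0/grid/bin/crsctl start res ora.crsd -init

# ...or restart the whole clusterware stack on node 2:
/opt/oracle/11.2.0/grid/bin/crsctl stop crs
/opt/oracle/11.2.0/grid/bin/crsctl start crs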


References


NOTE:1050908.1 - How to Troubleshoot Grid Infrastructure Startup Issues
