Troubleshooting record: "terminating the instance due to ORA error 481"

芥子书屋 2022-07-12

Database: Oracle 19.12

OS: RHEL 7.9

Environment: RAC (2 nodes) + Data Guard

Node 1 was running normally. Starting the instance on node 2 produced the following alert log entries:

2022-06-06T13:02:28.608936+08:00

PMON (ospid: 114407): terminating the instance due to ORA error 481

2022-06-06T13:02:28.609147+08:00

Cause - 'Instance is being terminated due to fatal process death (pid: 22, ospid: 114470, LMON)'

2022-06-06T13:02:28.657006+08:00

System state dump requested by (instance=2, osid=114407 (PMON)), summary=[abnormal instance termination]. error - 'Instance is terminating.

Earlier, the alert log had also recorded the following entries:

2022-06-01T23:49:24.188316+08:00

ORA-1092 : opitsk aborting process

2022-06-01T23:49:24.780985+08:00

Instance terminated by PMON, pid = 66912
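
When an alert log is long, it can help to pull out only the termination events together with their timestamps. The following is a minimal sketch, assuming the alert log uses the layout shown above (a timestamp line followed by the message line). The function name `find_terminations` is made up for illustration, and the log file location is not given in this post, so the function simply reads from standard input.

```shell
# Print each "terminating the instance" message together with the most
# recent timestamp line seen before it. Reads alert-log text on stdin.
find_terminations() {
  awk '
    # Remember the latest timestamp line (e.g. 2022-06-06T13:02:28.608936+08:00).
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ { ts = $0; next }
    # Emit the termination message prefixed with that timestamp.
    /terminating the instance due to ORA error/ { print ts " | " $0 }
  '
}
```

It would be used as `find_terminations < /path/to/alert.log`, where the path is whatever applies in your environment.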

Check the cluster status:

[grid@hisdb2 ~]$ crsctl stat res -t

--------------------------------------------------------------------------------

Name           Target  State        Server                   State details      

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.LISTENER.lsnr

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.chad

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.net1.network

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.ons

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.proxy_advm

               OFFLINE OFFLINE      hisdb1                   STABLE

               OFFLINE OFFLINE      hisdb2                   STABLE

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  ONLINE       hisdb2                   STABLE

ora.DATA.dg(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  OFFLINE                               STABLE

ora.LISTENER_SCAN1.lsnr

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.VOTDISK.dg(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        OFFLINE OFFLINE                               STABLE

ora.asm(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   Started,STABLE

      2        ONLINE  OFFLINE                               STABLE

ora.asmnet1.asmnetwork(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  ONLINE       hisdb2                   STABLE

ora.cvu

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.hisdb1.vip

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.hisdb2.vip

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.orclrac.db

      1        ONLINE  ONLINE       hisdb1                   Open,Readonly,HOME=/

                                                               u01/app/oracle/produ

                                                               ct/19.12.0/db_1,STAB

                                                               LE

      2        ONLINE  OFFLINE                               STABLE

ora.qosmserver

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.scan1.vip

      1        ONLINE  ONLINE       hisdb1                   STABLE

--------------------------------------------------------------------------------

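Note: as shown above (highlighted in yellow in the original post), the entries for instance 2 of ora.asm, ora.VOTDISK.dg, and ora.DATA.dg are in the OFFLINE state, which is abnormal.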
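
Spotting such rows by eye gets tedious on larger clusters. Below is a rough sketch of a filter for the tabular `crsctl stat res -t` output that lists every resource with a row whose Target is ONLINE but whose State is OFFLINE. The function name `list_down_resources` is hypothetical, and the parsing assumes the column layout shown above.

```shell
# Read `crsctl stat res -t` output on stdin and print the name of each
# resource that has a row with Target=ONLINE and State=OFFLINE.
list_down_resources() {
  awk '
    /^ora\./ { res = $1; next }          # resource name line, e.g. ora.asm(ora.asmgroup)
    {
      # Target and State are adjacent columns; look for ONLINE followed by OFFLINE.
      for (i = 1; i < NF; i++)
        if ($i == "ONLINE" && $(i + 1) == "OFFLINE") {
          print res
          break
        }
    }
  '
}
```

Typical use would be `crsctl stat res -t | list_down_resources`. Rows where both Target and State are OFFLINE (such as ora.proxy_advm, or instance 2 of ora.VOTDISK.dg above) are deliberately not reported by this sketch, since an OFFLINE target is often intentional.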

Cause:

The problem occurs because HAIP is not ONLINE on either the running node or the problem node(s).

Basically, the ASM or DB instance(s) cannot start up if they use a different cluster_interconnect than the running instance.

With HAIP ONLINE, all instances (DB and ASM) should use a HAIP IP address: 169.254.x.x.

If HAIP is OFFLINE on any node, the ASM and DB instances on that node fall back to the native private network address, which causes communication problems with the instances that use HAIP.

Use the following command, as the grid user, to verify the HAIP status:
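
To make the mismatch concrete, here is a small sketch that classifies interconnect addresses. It assumes you already have one "instance address" pair per line, for example copied from a query on `gv$cluster_interconnects`. The function names `is_haip_addr` and `check_interconnects` and the sample addresses in the usage note are hypothetical.

```shell
# Return success if the address is in the 169.254.x.x range used by HAIP.
is_haip_addr() {
  case "$1" in
    169.254.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read "instance address" pairs on stdin. Print MIXED if some instances use
# a HAIP address and others a native private address, otherwise CONSISTENT.
check_interconnects() {
  local haip=0 native=0 inst ip
  while read -r inst ip; do
    if [ -z "$ip" ]; then
      continue                          # skip blank or malformed lines
    fi
    if is_haip_addr "$ip"; then
      haip=$((haip + 1))
    else
      native=$((native + 1))
    fi
  done
  if [ "$haip" -gt 0 ] && [ "$native" -gt 0 ]; then
    echo MIXED
  else
    echo CONSISTENT
  fi
}
```

For instance, feeding it `1 169.254.10.1` and `2 192.168.10.2` (made-up values) reports MIXED, which corresponds to the situation described above.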

$ crsctl stat res -t -init

Check the HAIP status on both nodes:

Node 2

[grid@hisdb2 ~]$ crsctl stat res -t -init

--------------------------------------------------------------------------------

Name           Target  State        Server                   State details      

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.asm

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.cluster_interconnect.haip

      1        ONLINE  OFFLINE                               STABLE

ora.crf

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.crsd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.cssd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.cssdmonitor

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.ctssd

      1        ONLINE  ONLINE       hisdb2                   OBSERVER,STABLE

ora.diskmon

      1        OFFLINE OFFLINE                               STABLE

ora.drivers.acfs

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.evmd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.gipcd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.gpnpd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.mdnsd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.storage

      1        ONLINE  ONLINE       hisdb2                   STABLE

--------------------------------------------------------------------------------

Note: ora.cluster_interconnect.haip is OFFLINE on node 2, whereas it is ONLINE on node 1.

Solution:

The solution is to start HAIP on all nodes before starting the ASM or DB instances, either by restarting the HAIP resource or by restarting the GI stack.
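
For comparing nodes quickly, the HAIP state can be extracted from the tabular output instead of being read by eye. This is only a sketch: the function name `haip_state` is invented here, and it assumes the `crsctl stat res -t -init` layout shown above, where the status row follows the resource name and the third column is the State.

```shell
# Read `crsctl stat res -t -init` output on stdin and print the State
# column (ONLINE/OFFLINE/...) of ora.cluster_interconnect.haip.
haip_state() {
  awk '
    $1 == "ora.cluster_interconnect.haip" { found = 1; next }
    # First non-empty line after the resource name is its status row:
    # columns are instance number, Target, State, ...
    found && NF { print $3; exit }
  '
}
```

Running `crsctl stat res -t -init | haip_state` as the grid user on each node would then print one word per node, which makes a difference such as OFFLINE on node 2 versus ONLINE on node 1 easy to see.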

1. Try to start HAIP manually on node 2

as grid user:

[grid@hisdb2 ~]$ crsctl start res ora.cluster_interconnect.haip -init

CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'hisdb2'

CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded

To verify:

[grid@hisdb2 ~]$ crsctl stat res -t -init

--------------------------------------------------------------------------------

Name           Target  State        Server                   State details      

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.asm

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.cluster_interconnect.haip

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.crf

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.crsd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.cssd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.cssdmonitor     

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.ctssd

      1        ONLINE  ONLINE       hisdb2                   OBSERVER,STABLE

ora.diskmon

      1        OFFLINE OFFLINE                               STABLE

ora.drivers.acfs

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.evmd     

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.gipcd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.gpnpd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.mdnsd

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.storage

      1        ONLINE  ONLINE       hisdb2                   STABLE

--------------------------------------------------------------------------------
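
If the resource takes a while to come up, re-running the status command by hand can be replaced with a small polling loop. The sketch below is generic and makes no claim about how long HAIP normally needs. The function name `wait_until_online` is hypothetical, and the command passed to it is assumed to print a single word such as ONLINE or OFFLINE.

```shell
# Poll a command until it prints ONLINE.
#   $1 = maximum number of attempts
#   $2 = seconds to sleep between attempts
#   remaining arguments = command that prints the current state
# Returns 0 once ONLINE is seen, 1 if all attempts are used up.
wait_until_online() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if [ "$("$@")" = "ONLINE" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```

A call such as `wait_until_online 30 10 my_state_command` would check every 10 seconds for up to 30 attempts, where `my_state_command` stands for whatever script you use to print the HAIP state.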

 

2. If this succeeds, restart the ora.asm resource (note: this will bring down all dependent disk group and database resources):

as root user:

[root@hisdb2 ~]# /u01/app/19.12.0/grid/bin/crsctl stop res ora.crsd -init

CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb2'

CRS-2677: Stop of 'ora.crsd' on 'hisdb2' succeeded

[root@hisdb2 ~]# /u01/app/19.12.0/grid/bin/crsctl stop res ora.asm -init -f

CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'

CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded

[root@hisdb2 ~]# /u01/app/19.12.0/grid/bin/crsctl start res ora.asm -init

CRS-2672: Attempting to start 'ora.asm' on 'hisdb2'

CRS-2676: Start of 'ora.asm' on 'hisdb2' succeeded

[root@hisdb2 ~]# /u01/app/19.12.0/grid/bin/crsctl start res ora.crsd -init

CRS-2672: Attempting to start 'ora.crsd' on 'hisdb2'

CRS-2676: Start of 'ora.crsd' on 'hisdb2' succeeded

Start up any dependent resources as necessary.

3. If the above does not help, try restarting the GI stack on node 1 and check whether HAIP comes ONLINE afterwards.
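
Because the order of these four commands matters (crsd is stopped before ASM and started after it), it may be worth keeping the sequence in a small helper. The sketch below only prints the commands as a dry run and does not execute anything. The function name `asm_restart_plan` is made up, and the Grid home is passed in as an argument (in this environment it was /u01/app/19.12.0/grid).

```shell
# Print, in order, the crsctl commands used above to restart ora.asm.
#   $1 = Grid Infrastructure home directory
# Nothing is executed; review the output and run it as root if appropriate.
asm_restart_plan() {
  local crsctl="${1:?grid home required}/bin/crsctl"
  printf '%s\n' \
    "$crsctl stop res ora.crsd -init" \
    "$crsctl stop res ora.asm -init -f" \
    "$crsctl start res ora.asm -init" \
    "$crsctl start res ora.crsd -init"
}
```

Calling `asm_restart_plan /u01/app/19.12.0/grid` prints the same four commands that were run in step 2, which makes the sequence easy to review before touching a live node.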

As root user:

# crsctl stop crs

# crsctl start crs

Note: step 3 was not performed here, because step 2 had already started ora.asm and ora.crsd successfully.

4. Once HAIP is ONLINE on node 2, proceed to start ASM on the rest of the cluster nodes and ensure that HAIP is ONLINE on all nodes.

[grid@hisdb2 ~]$ crsctl start res ora.asm -init

CRS-5702: Resource 'ora.asm' is already running on 'hisdb2'

CRS-4000: Command Start failed, or completed with errors.

 

[grid@hisdb2 ~]$ crsctl stat res -t

--------------------------------------------------------------------------------

Name           Target  State        Server                   State details      

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.LISTENER.lsnr

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.chad

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.net1.network

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.ons

               ONLINE  ONLINE       hisdb1                   STABLE

               ONLINE  ONLINE       hisdb2                   STABLE

ora.proxy_advm

               OFFLINE OFFLINE      hisdb1                   STABLE

               OFFLINE OFFLINE      hisdb2                   STABLE

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

ora.ASMNET1LSNR_ASM.lsnr(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  ONLINE       hisdb2                   STABLE

ora.DATA.dg(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  ONLINE       hisdb2                   STABLE

ora.LISTENER_SCAN1.lsnr

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.VOTDISK.dg(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  ONLINE       hisdb2                   STABLE

ora.asm(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   Started,STABLE

      2        ONLINE  ONLINE       hisdb2                   Started,STABLE

ora.asmnet1.asmnetwork(ora.asmgroup)

      1        ONLINE  ONLINE       hisdb1                   STABLE

      2        ONLINE  ONLINE       hisdb2                   STABLE

ora.cvu

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.hisdb1.vip

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.hisdb2.vip

      1        ONLINE  ONLINE       hisdb2                   STABLE

ora.orclrac.db

      1        ONLINE  ONLINE       hisdb1                   Open,Readonly,HOME=/

                                                                             u01/app/oracle/produ

                                                                             ct/19.12.0/db_1,STAB

                                                                             LE

      2        ONLINE  ONLINE       hisdb2                   Open,Readonly,HOME=/

                                                                             u01/app/oracle/produ

                                                                             ct/19.12.0/db_1,STAB

                                                                             LE

ora.qosmserver

      1        ONLINE  ONLINE       hisdb1                   STABLE

ora.scan1.vip

      1        ONLINE  ONLINE       hisdb1                   STABLE

--------------------------------------------------------------------------------

Conclusion: the cluster is back to normal.
