OS environment: Oracle Linux 6.8 (64-bit)
DB environment: Oracle Database 11.2.0.4
Error: ORA-27302: failure occurred at: sskgxpsnd2
Check the current time
$ date
Tue Jul 31 09:40:52 KST 2018
Several ORA messages appear when starting the DB
SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/ORAC/spfileORAC.ora'
ORA-17503: ksfdopn:10 Failed to open file +DATA/ORAC/spfileORAC.ora
ORA-00603: ORACLE server session terminated by fatal error
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:sendmsg failed with status: 105
ORA-27301: OS failure message: No buffer space available
ORA-27302: failure occurred at: sskgxpsnd2
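The key detail in this stack is the OS status reported by ORA-27300: on Linux, errno 105 is ENOBUFS, which matches the "No buffer space available" text in ORA-27301. As a minimal sketch, the status code can be pulled out of a saved error stack with a small shell helper. The function name `ora27300_status` is hypothetical and not part of any Oracle tool.

```shell
# Hypothetical helper: print the OS status code from any ORA-27300 line on stdin.
ora27300_status() {
  sed -n 's/^ORA-27300: .* failed with status: \([0-9][0-9]*\).*$/\1/p'
}

# Example: feed it the ORA-27300 line from the startup output.
echo 'ORA-27300: OS system dependent operation:sendmsg failed with status: 105' | ora27300_status
```

Lines that are not ORA-27300 produce no output, so the helper can be run over a whole saved session log.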
Check resource status with crsctl
$ crsctl status resource -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE OFFLINE rac1
ONLINE ONLINE rac2
ora.FRA.dg
ONLINE UNKNOWN rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.OCR_VOTE.dg
ONLINE OFFLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2
ora.cvu
1 ONLINE ONLINE rac2
ora.oc4j
1 ONLINE ONLINE rac2
ora.orac.db
1 OFFLINE OFFLINE Instance Shutdown
2 ONLINE ONLINE rac2 Open
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac2
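=> The DB instance on node 1 is OFFLINE (on rac1, ora.DATA.dg and ora.OCR_VOTE.dg are also OFFLINE and ora.FRA.dg is UNKNOWN)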
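Scanning the full table by eye is error-prone, so here is a rough sketch of a filter that keeps only the rows whose STATE is not ONLINE. It assumes the `crsctl status resource -t` layout shown above (a resource name line starting with `ora.`, followed by one row per server or instance); the helper name `list_not_online` is hypothetical.

```shell
# Hypothetical helper: read `crsctl status resource -t` output on stdin and
# print "resource target state" for every row whose STATE is not ONLINE.
list_not_online() {
  awk '
    /^ora\./ { res = $1; next }            # remember the current resource name
    {
      i = ($1 ~ /^[0-9]+$/) ? 2 : 1        # cluster rows start with an ordinal
      if (($i == "ONLINE" || $i == "OFFLINE") && $(i + 1) != "ONLINE")
        print res, $i, $(i + 1)
    }
  '
}

# Usage (as the Grid Infrastructure owner):
#   crsctl status resource -t | list_not_online
```

Resources whose target is also OFFLINE (ora.gsd in the output above) are listed too, so the result still needs a quick review.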
Check the recent alert log on node 1
Shutting down archive processes
Archiving is disabled
Fri Jun 29 08:28:44 2018
Stopping background process VKTM
Fri Jun 29 08:28:44 2018
NOTE: Shutting down MARK background process
Fri Jun 29 08:28:46 2018
freeing rdom 0
Fri Jun 29 08:28:51 2018
Instance shutdown complete
Check the recent crsd log on node 1
$ vi crsd.log
2018-07-31 05:09:30.218: [UiServer][3474949888]{1:26996:1121} Done for ctx=0x7f53ec19f520
2018-07-31 09:08:17.196: [UiServer][3472848640] CS(0x7f53d00e9cf0)set Properties ( oracle,0x7f53e00ecda0)
2018-07-31 09:08:17.206: [UiServer][3474949888]{1:26996:1122} Sending message to PE. ctx= 0x7f53ec1a8200, Client PID: 4944
2018-07-31 09:08:17.224: [UiServer][3474949888]{1:26996:1122} Done for ctx=0x7f53ec1a8200
2018-07-31 09:42:24.413: [UiServer][3472848640] CS(0x7f53d00e9cf0)set Properties ( oracle,0x7f53e00ecda0)
2018-07-31 09:42:24.423: [UiServer][3474949888]{1:26996:1123} Sending message to PE. ctx= 0x7f53ec1b8870, Client PID: 13797
2018-07-31 09:42:24.436: [UiServer][3474949888]{1:26996:1123} Done for ctx=0x7f53ec1b8870
Check the recent cssd log on node 1
$ vi ocssd.log
2018-07-31 09:45:54.488: [ CSSD][4213798656]clssgmpcBuildNodeList: nodename for node 0 is NULL
2018-07-31 09:45:59.487: [ CSSD][4181890816]clssnmSendingThread: sending status msg to all nodes
2018-07-31 09:45:59.487: [ CSSD][4181890816]clssnmSendingThread: sent 5 status msgs to all nodes
2018-07-31 09:45:59.490: [ CSSD][4213798656]clssgmpcBuildNodeList: nodename for node 0 is NULL
2018-07-31 09:46:04.488: [ CSSD][4181890816]clssnmSendingThread: sending status msg to all nodes
2018-07-31 09:46:04.489: [ CSSD][4181890816]clssnmSendingThread: sent 5 status msgs to all nodes
2018-07-31 09:46:04.491: [ CSSD][4213798656]clssgmpcBuildNodeList: nodename for node 0 is NULL
Solution: change the loopback adapter MTU to 16436
Check the MTU on node 1
$ ifconfig lo | grep MTU
UP LOOPBACK RUNNING MTU:65536 Metric:1
Check the MTU on node 2
$ ifconfig lo | grep MTU
UP LOOPBACK RUNNING MTU:65536 Metric:1
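=> Both nodes have the same value
Metalink (My Oracle Support) solution (2041723.1, 2322410.1):
the problem occurs because the loopback adapter's MTU is too high,
so the loopback adapter's MTU must be changed to 16436.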
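The check can also be scripted. The sketch below reads the loopback MTU from `/sys/class/net/lo/mtu` (the standard Linux sysfs location) and compares it with the 16436 target from the notes; the variable `TARGET_MTU` and the helper `mtu_too_high` are hypothetical names used only for illustration.

```shell
# Recommended loopback MTU, taken from the notes referenced in this post.
TARGET_MTU=16436

# Hypothetical helper: succeed (exit 0) when the given MTU exceeds the target.
mtu_too_high() {
  [ "$1" -gt "$TARGET_MTU" ]
}

# /sys/class/net/<iface>/mtu holds the current MTU on Linux.
if [ -r /sys/class/net/lo/mtu ]; then
  current=$(cat /sys/class/net/lo/mtu)
  if mtu_too_high "$current"; then
    echo "lo MTU is $current, above $TARGET_MTU"
  else
    echo "lo MTU is $current, no change needed"
  fi
fi
```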
Change the MTU as root on node 1
$ su -
# ifconfig lo mtu 16436
Verify the change on node 1
# ifconfig lo | grep MTU
UP LOOPBACK RUNNING MTU:16436 Metric:1
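A change made with `ifconfig` applies only to the running system and is lost at the next reboot. The following is a minimal sketch of making it persistent, assuming the loopback interface is configured through `/etc/sysconfig/network-scripts/ifcfg-lo` and that the network scripts honor an `MTU=` line in that file; this is an assumption to verify against the referenced notes. The helper name `set_mtu_line` is hypothetical.

```shell
# Hypothetical helper: set or add an MTU= line in an ifcfg-style file.
# Running it twice leaves a single MTU= line (idempotent).
set_mtu_line() {
  file=$1
  mtu=$2
  if grep -q '^MTU=' "$file"; then
    # Replace the existing value in place (GNU sed).
    sed -i "s/^MTU=.*/MTU=$mtu/" "$file"
  else
    # No MTU line yet: append one.
    printf 'MTU=%s\n' "$mtu" >> "$file"
  fi
}

# Usage (as root, path is an assumption):
#   set_mtu_line /etc/sysconfig/network-scripts/ifcfg-lo 16436
```

Backing up the file before editing it is a sensible precaution.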
Start the DB on node 1
SQL> startup
ORACLE instance started.
Total System Global Area 835104768 bytes
Fixed Size 2257840 bytes
Variable Size 620760144 bytes
Database Buffers 209715200 bytes
Redo Buffers 2371584 bytes
ORA-00205: error in identifying control file, check alert log for more info
=> A control file error occurred, so check the alert log
Check the recent alert log on node 1
ALTER DATABASE MOUNT
NOTE: Loaded library: /opt/oracle/extapi/64/asm/orcl/1/libasm.so
NOTE: Loaded library: System
SUCCESS: diskgroup DATA was mounted
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+FRA/orac/controlfile/current.256.979730787'
ORA-17503: ksfdopn:2 Failed to open file +FRA/orac/controlfile/current.256.979730787
ORA-15001: diskgroup "FRA" does not exist or is not mounted
ORA-15001: diskgroup "FRA" does not exist or is not mounted
ORA-205 signalled during: ALTER DATABASE MOUNT...
NOTE: dependency between database ORAC and diskgroup resource ora.DATA.dg is established
Tue Jul 31 09:53:51 2018
ALTER SYSTEM SET local_listener=' (ADDRESS=(PROTOCOL=TCP)(HOST=192.168.0.102)(PORT=1521))' SCOPE=MEMORY SID='ORAC1';
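The ORA-15001 lines name the disk group that was not mounted when the instance tried to open its control file. As a small sketch, those names can be extracted from an alert log and then compared with the mounted disk groups listed by `asmcmd lsdg`. The helper name `unmounted_diskgroups` and the log file name in the usage comment are hypothetical.

```shell
# Hypothetical helper: print each disk group named in an ORA-15001 line on stdin,
# once per disk group.
unmounted_diskgroups() {
  sed -n 's/^ORA-15001: diskgroup "\([^"]*\)" does not exist or is not mounted.*$/\1/p' | sort -u
}

# Usage (log file name is a placeholder):
#   unmounted_diskgroups < alert_ORAC1.log
```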
Check the control file with asmcmd
$ export ORACLE_SID=+ASM1
$ asmcmd
asmcmd> cd FRA/ORAC/CONTROLFILE/
asmcmd> ls
Current.256.979730787
=> The file exists
Start the DB on node 1 again
SQL> alter database mount;
Database altered.
SQL> alter database open;
Database altered.
=> Since the DB now opens, this looks like a transient error (the alert log shows the FRA disk group was not mounted during the first attempt)
Check resource status with crsctl
$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.FRA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.OCR_VOTE.dg
ONLINE OFFLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac2
ora.cvu
1 ONLINE ONLINE rac2
ora.oc4j
1 ONLINE ONLINE rac2
ora.orac.db
1 ONLINE ONLINE rac1 Open
2 ONLINE ONLINE rac2 Open
ora.rac1.vip
1 ONLINE ONLINE rac1
ora.rac2.vip
1 ONLINE ONLINE rac2
ora.scan1.vip
1 ONLINE ONLINE rac2
=> The database is back to normal on both nodes
Check the recent crsd log on node 1
$ vi crsd.log
2018-07-31 10:10:22.405: [ AGFW][3487557376]{0:1:14} Agfw Proxy Server replying to the message: RESOURCE_STATUS[Proxy] ID 20481:65119
2018-07-31 10:10:22.413: [ AGFW][3487557376]{0:1:14} Agfw Proxy Server received the message: CMD_COMPLETED[Proxy] ID 20482:1315276
2018-07-31 10:10:22.414: [ AGFW][3487557376]{0:1:14} Agfw Proxy Server replying to the message: CMD_COMPLETED[Proxy] ID 20482:1315276
2018-07-31 10:10:22.414: [ AGFW][3487557376]{0:1:14} Agfw received reply from PE for resource state change for ora.orac.db 1 1
2018-07-31 10:11:12.522: [UiServer][3472848640] CS(0x7f53d00e9cf0)set Properties ( oracle,0x7f53e00f1150)
2018-07-31 10:11:12.532: [UiServer][3474949888]{1:26996:1141} Sending message to PE. ctx= 0x7f53ec1a8200, Client PID: 21497
2018-07-31 10:11:12.544: [UiServer][3474949888]{1:26996:1141} Done for ctx=0x7f53ec1a8200
Check the recent cssd log on node 1
$ vi ocssd.log
2018-07-31 10:25:46.161: [ CSSD][4181890816]clssnmSendingThread: sending status msg to all nodes
2018-07-31 10:25:46.161: [ CSSD][4181890816]clssnmSendingThread: sent 5 status msgs to all nodes
2018-07-31 10:25:50.164: [ CSSD][4213798656]clssgmpcBuildNodeList: nodename for node 0 is NULL
2018-07-31 10:25:51.162: [ CSSD][4181890816]clssnmSendingThread: sending status msg to all nodes
2018-07-31 10:25:51.162: [ CSSD][4181890816]clssnmSendingThread: sent 5 status msgs to all nodes
2018-07-31 10:25:55.165: [ CSSD][4213798656]clssgmpcBuildNodeList: nodename for node 0 is NULL
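=> The "nodename for node 0 is NULL" message is reportedly a bug (2023159.1)
In addition, it is probably a good idea to set the same MTU on node 2
Change the MTU as root on node 2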
$ su -
# ifconfig lo mtu 16436
Verify the change on node 2
# ifconfig lo | grep MTU
UP LOOPBACK RUNNING MTU:16436 Metric:1
Cause: reportedly the loopback adapter MTU was too high, followed by a separate control file error
The same setting is also required on Linux 7
References: Metalink (2322410.1), (2041723.1)
https://community.oracle.com/mosc/discussion/4167776/mtu-value-for-loopback-adapter-in-four-node-rac
https://community.oracle.com/mosc/discussion/4167774/mtu-value-for-loop-back-adapter-in-11g-rac
https://www.ora-solutions.net/web/2018/05/12/beware-of-loopback-mtu-size-with-rac-on-oracle-linux-7/