Daily Inspection of Oracle Minicomputers


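An enterprise's business database systems are the top priority of IT operations. To keep the database running stably over the long term, the responsible staff need to inspect it every day and record the results. The following is a complete, detailed plan for daily database inspection: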

I. Daily minicomputer inspection:

1. Check the hardware health of the minicomputer
1.1 Show whether the running kernel is 32-bit or 64-bit:
# bootinfo -K
64
1.2 Show whether the hardware is 32-bit or 64-bit:
# bootinfo -y
64
1.3 Show real memory in KB:
# bootinfo -r
32505856
1.4 List the physical disks on the system:
# lspv
hdisk0 00c7c505bc0669c5 rootvg active
hdisk1 00c7c50592cdd77a rootvg active
hdisk2 00cb9934c0a92e73 datavg active
hdisk3 00c7c505ce5e6688 datavg active
1.5 Show detailed information about disk hdisk1:
# lspv hdisk1
PHYSICAL VOLUME: hdisk1 VOLUME GROUP: rootvg
PV IDENTIFIER: 00c7c50592cdd77a VG IDENTIFIER 00c7c50500004c0000000129bc06773f
PV STATE: active
STALE PARTITIONS: 0 ALLOCATABLE: yes
PP SIZE: 512 megabyte(s) LOGICAL VOLUMES: 14
TOTAL PPs: 558 (285696 megabytes) VG DESCRIPTORS: 2
FREE PPs: 224 (114688 megabytes) HOT SPARE: no
USED PPs: 334 (171008 megabytes) MAX REQUEST: 1 megabyte
FREE DISTRIBUTION: 01..00..00..111..112
USED DISTRIBUTION: 111..112..111..00..00
MIRROR POOL: None

# smitty fs
# smitty lvm

1.6 List the processors:
# lscfg | grep proc
+ proc0 Processor
+ proc2 Processor
+ proc4 Processor
+ proc6 Processor
1.7 Show detailed information about one CPU:
# lsattr -El proc0
frequency 4204000000 Processor Speed False
smt_enabled true Processor SMT enabled False
smt_threads 2 Processor SMT threads False
state enable Processor state False
type PowerPC_POWER6 Processor type False
#
1.8 List the system's hardware resources:
# lscfg
1.9 Show the processor architecture:
# uname -p
powerpc
1.10 Show the operating system level:
# oslevel
1.11 Show the system name:
# uname -s
AIX
1.12 Show the node name:
# uname -n
DL-DB-02
1.13 Show the combined uname information (system name, node name, release, version, machine ID):
# uname -a
AIX DL-DB-02 1 6 00C7C5054C00
1.14 Show the system model:
# uname -M
IBM,8204-E8A
1.15 Show the operating system version:
# uname -v
6
1.16 Show the machine ID of the hardware the system is running on:
# uname -m
00C7C5054C00
1.17 Show the system ID number:
# uname -u
IBM,02067C505
1.18 Show the AIX major version, minor version, and maintenance level:
# oslevel -r
6100-04
# lslpp -h bos.rte
Fileset Level Action Status Date Time
----------------------------------------------------------------------------
Path: /usr/lib/objrepos
bos.rte
6.1.4.0 COMMIT COMPLETE 07/10/10 19:07:31


Path: /etc/objrepos
bos.rte
6.1.4.0 COMMIT COMPLETE 07/10/10 19:07:31
#
1.19 Check disk usage (-k reports in KB, -m in MB):
# df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 5242880 5039512 4% 14271 2% /
/dev/hd2 11534336 5382688 54% 52471 5% /usr
/dev/hd9var 5242880 4544720 14% 7487 1% /var
/dev/hd3 10485760 10397956 1% 4002 1% /tmp
/dev/fwdump 1048576 1046932 1% 13 1% /var/adm/ras/platform
/dev/hd1 5242880 5241708 1% 8 1% /home
/dev/hd11admin 524288 523848 1% 5 1% /admin
/proc - - - - - /proc
/dev/hd10opt 10485760 5696856 46% 10713 1% /opt
/dev/livedump 524288 523880 1% 4 1% /var/adm/ras/livedump
/dev/oradmpbak 10485760 4488028 58% 28042 3% /orainstbak1
/dev/oraclebak 62914560 9605248 85% 33 1% /oradatabak1
/dev/oradata 367001600 321016968 13% 33 1% /oradata
/dev/orainst 20971520 14943512 29% 28707 1% /orainst
1.20 Check the disk usage of a file or directory (AIX du reports 512-byte blocks by default):
# du -s tmp
166552 tmp


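2. Check system error reports
2.1 Show the summary error report:
# errpt | more
TIMESTAMP: MMDDHHMMYY (month, day, hour, minute, year)
T (type): P permanent; T temporary; I informational; U unknown (permanent errors deserve attention)
C (class): H hardware; S software; O operator message; U undetermined
2.2 List all hardware errors: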
# errpt -d H
2.3 List all software errors:
# errpt -d S
2.4 Show the details of one error ID:
# errpt -aj D666A8C7 > aaa.txt
D666A8C7 is an identifier taken from the summary error report.
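
Since permanent hardware errors are the ones to watch, the summary report can be filtered for them. The following is a minimal sketch, assuming the six-column `errpt` layout shown in this document (IDENTIFIER, TIMESTAMP, T, C, RESOURCE_NAME, DESCRIPTION); the function name `errpt_perm_hw` is hypothetical.

```shell
# Print identifier, timestamp and resource of permanent (T=P) hardware (C=H)
# entries, one line per distinct entry.
# Reads the summary output of `errpt` on stdin.
errpt_perm_hw() {
    awk 'NR > 1 && $3 == "P" && $4 == "H" { print $1, $2, $5 }' | sort -u
}
```

Typical use would be `errpt | errpt_perm_hw`; each identifier it prints can then be passed to `errpt -aj` for the detailed report.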
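2.5 LED codes on the control panel
. 8-digit codes: the system fault light usually comes on at the same time. Some models also display the location code of the failing device.
. 4-digit codes: usually Exxx.
. 3-digit codes: usually shown as 0yyy; read only the last 3 digits.
. Look up 8-digit and 4-digit codes in the system Service Guide; look up 3-digit codes in the diagnostic manual (Diagnostic Information for Multiple Bus Systems).
. A flashing 888 means the system has crashed because of a hardware or software fault. Press the reset button to display more information.
888-102: usually a software fault (888-102-207 is an exception). The system produces a dump.
888-102-xxx-0C9: the system is writing the dump; wait.
888-102-xxx-0C0: the dump is complete; the system can be powered off and restarted.
888-103 or 888-105: a hardware fault, usually accompanied by an SRN code and a location code.
2.6 SMS (System Management Services) error log
When the keyboard icon appears on the console (the LED shows E1F1), press the 1 key to enter the SMS menu.
Select "Utilities".
Select "Error Log" and write down the 8-digit fault code.
(The system boot order can also be changed from SMS.)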

3. View the operating system log
# errpt -a | head -150

4. Check mail for the relevant users
# mail
# su - oracle
$ mail
31 esaadmin Sun Jan 9 03:01 15/735 "Electronic Service Agent not"
?

A list of all messages appears. At the "?" prompt, type a number to read that message, or type "h" to show the message list again.

5. Check CPU utilization
5.1 # sar 1 5


AIX DL-DB-02 1 6 00C7C5054C00 10/29/13


System configuration: lcpu=8 mode=Capped


15:10:57 %usr %sys %wio %idle physc
15:10:58 16 1 0 83 4.10
15:10:59 44 2 0 54 3.95
15:11:00 11 1 0 88 3.97
15:11:01 1 0 0 99 4.00
15:11:02 24 2 0 75 4.00


Average 19 1 0 80 4.00
#
When %usr + %sys exceeds 80%, the CPU is likely to be the bottleneck.
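
The 80% rule can be applied to the Average line mechanically. This is a rough sketch, assuming the `sar` column order shown above (%usr in the second column, %sys in the third); the function name `cpu_busy` is hypothetical and the limit defaults to the 80% figure from the text.

```shell
# Read `sar` output on stdin, add %usr and %sys from the Average line,
# and report whether the sum exceeds the limit (default 80).
cpu_busy() {
    limit="${1:-80}"
    awk -v limit="$limit" '
        $1 == "Average" {
            busy = $2 + $3
            verdict = (busy > limit + 0) ? "HIGH" : "OK"
            printf "%d %s\n", busy, verdict
        }
    '
}
```

For example, `sar 1 5 | cpu_busy` would print the combined percentage followed by OK or HIGH.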

5.2 # topas


6. Check memory utilization

6.1 # vmstat


System configuration: lcpu=8 mem=31744MB


kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
1 2 6211275 29987 0 0 0 130 261 0 82 3730 558 2 0 97 0
#
# vmstat -v
8126464 memory pages
7873088 lruable pages
15376 free pages
2 memory pools
691381 pinned pages
80.0 maxpin percentage
3.0 minperm percentage
90.0 maxperm percentage
22.8 numperm percentage
1801544 file pages
0.0 compressed percentage
0 compressed pages
22.8 numclient percentage
90.0 maxclient percentage
1801544 client pages
0 remote pageouts scheduled
0 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
2484 filesystem I/Os blocked with no fsbuf
754 client filesystem I/Os blocked with no fsbuf
2483337 external pager filesystem I/Os blocked with no fsbuf
#

6.2 # svmon
size inuse free pin virtual mmode
memory 8126464 8041605 84859 688230 6156257 Ded
pg space 12582912 19307


work pers clnt other
pin 446593 0 2773 238864
in use 6156257 0 1885348


PageSize PoolSize inuse pgsp pin virtual
s 4 KB - 7408421 19307 327814 5523073
m 64 KB - 39574 0 22526 39574
#
# svmon -G
size inuse free pin virtual mmode
memory 8126464 8056058 70406 689450 6185976 Ded
pg space 12582912 19692


work pers clnt other
pin 447813 0 2773 238864
in use 6185976 0 1870082


PageSize PoolSize inuse pgsp pin virtual
s 4 KB - 7420874 19692 327818 5550792
m 64 KB - 39699 0 22602 39699
#
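
The `svmon -G` figures are page counts, which are awkward to read during an inspection. The sketch below converts the memory line to megabytes, assuming the default 4 KB page unit and the column order shown above (size, inuse, free); the function name `svmon_mem_mb` is hypothetical.

```shell
# Read `svmon -G` output on stdin and print total and free memory in MB.
# Assumes the counts on the "memory" line are 4 KB pages.
svmon_mem_mb() {
    awk '$1 == "memory" {
        printf "total_mb=%d free_mb=%d\n", $2 * 4 / 1024, $4 * 4 / 1024
    }'
}
```

With the sample above, the total works out to 31744 MB, matching the `mem=31744MB` value reported by vmstat. A small free figure is not necessarily a problem by itself, because AIX also uses memory for file caching; it is best read together with the paging space check in the next item.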


7. Check paging space utilization
# lsps -a
Page Space Physical Volume Volume Group Size %Used Active Auto Type Chksum
hd6 hdisk0 rootvg 49152MB 1 yes yes lv 0
#
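
A threshold check on the %Used column could look like the following sketch. It assumes the `lsps -a` layout shown above, where %Used is the fifth field of each data line; the function name `paging_over` and the default limit of 50 are illustrative choices, not values from this document.

```shell
# Read `lsps -a` output on stdin and print paging spaces whose %Used
# exceeds the limit (default 50, an illustrative value).
paging_over() {
    limit="${1:-50}"
    awk -v limit="$limit" 'NR > 1 && $5 + 0 > limit + 0 { print $1, $5 "%" }'
}
```

Run as `lsps -a | paging_over 50`; no output means every paging space is below the limit.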

8. Check filesystem disk-space utilization
# df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 5242880 5039212 4% 14272 2% /
/dev/hd2 11534336 5382376 54% 52471 5% /usr
/dev/hd9var 5242880 4553428 14% 7487 1% /var
/dev/hd3 10485760 10397952 1% 4003 1% /tmp
/dev/fwdump 1048576 1046932 1% 13 1% /var/adm/ras/platform
/dev/hd1 5242880 5241708 1% 8 1% /home
/dev/hd11admin 524288 523848 1% 5 1% /admin
/proc - - - - - /proc
/dev/hd10opt 10485760 5696856 46% 10713 1% /opt
/dev/livedump 524288 523880 1% 4 1% /var/adm/ras/livedump
/dev/oradmpbak 10485760 4488028 58% 28042 3% /orainstbak1
/dev/oraclebak 62914560 9605248 85% 33 1% /oradatabak1
/dev/oradata 367001600 321016968 13% 33 1% /oradata
/dev/orainst 20971520 14900920 29% 28709 1% /orainst
#
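
To avoid reading the whole table by eye, the %Used column can be filtered against a limit. This is a minimal sketch, assuming the AIX `df -k` column layout shown above (%Used in the fourth field, mount point in the seventh); the function name `fs_over_threshold` and the default limit of 80 are hypothetical.

```shell
# Read `df -k` output (AIX layout) on stdin and print each mount point
# whose %Used exceeds the limit (default 80, an illustrative value).
fs_over_threshold() {
    limit="${1:-80}"
    awk -v limit="$limit" '
        NR == 1   { next }          # skip the header line
        $4 == "-" { next }          # pseudo filesystems such as /proc
        {
            used = $4
            sub(/%/, "", used)      # "85%" becomes "85"
            if (used + 0 > limit + 0)
                printf "%s %s%%\n", $7, used
        }
    '
}
```

With the output above, `df -k | fs_over_threshold 80` would report only /oradatabak1 at 85%.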

9. Check logical volume status
# lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 2 2 closed/syncd N/A
hd6 paging 96 192 2 open/syncd N/A
hd8 jfs2log 1 2 2 open/syncd N/A
hd4 jfs2 10 20 2 open/syncd /
hd2 jfs2 22 44 2 open/syncd /usr
hd9var jfs2 10 20 2 open/syncd /var
hd3 jfs2 20 40 2 open/syncd /tmp
hd1 jfs2 10 20 2 open/syncd /home
hd10opt jfs2 20 40 2 open/syncd /opt
hd11admin jfs2 1 2 2 open/syncd /admin
fwdump jfs2 2 4 2 open/syncd /var/adm/ras/platform
lg_dumplv sysdump 6 6 1 open/syncd N/A
livedump jfs2 1 2 2 open/syncd /var/adm/ras/livedump
oradmpbak jfs2 20 20 1 open/syncd /orainstbak1
oraclebak jfs2 120 120 1 open/syncd /oradatabak1
# lsvg -l datavg
datavg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
orainst jfs2 40 40 1 open/syncd /orainst
oradata jfs2 700 700 2 open/syncd /oradata
loglv00 jfs2log 1 1 1 open/syncd N/A
#
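
In the listings above, every logical volume reports a state ending in `syncd`. A sketch for flagging any that do not (for example a stale mirror copy) follows; it assumes the LV STATE value is the sixth field, as in the output above, and the function name `lv_not_synced` is hypothetical.

```shell
# Read `lsvg -l <vg>` output on stdin and print logical volumes whose
# LV STATE does not end in "/syncd".
lv_not_synced() {
    awk '$6 ~ /\// && $6 !~ /\/syncd$/ { print $1, $6 }'
}
```

Usage would be `lsvg -l rootvg | lv_not_synced`; empty output means all logical volumes are in sync.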

10. Check network connectivity
# ifconfig -a
en0: flags=5e080863,c0
inet 10.76.16.25 netmask 0xffffff00 broadcast 10.76.16.255
tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
en2: flags=1e080863,c0
inet 10.76.16.24 netmask 0xffffff00 broadcast 10.76.16.255
tcp_sendspace 131072 tcp_recvspace 65536 rfc1323 0
lo0: flags=e08084b
inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
inet6 ::1/0
tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
# netstat -in
Name Mtu Network Address ZoneID Ipkts Ierrs Opkts Oerrs Coll
en0 1500 link#2 0.15.17.d3.19.a8 - 57553153 0 50270548 6 0
en0 1500 10.76.16 10.76.16.25 - 57553153 0 50270548 6 0
en2 1500 link#3 0.21.5e.b3.e6.20 - 462983882 0 50159523 0 0
en2 1500 10.76.16 10.76.16.24 - 462983882 0 50159523 0 0
lo0 16896 link#1 - 6216486 0 6224734 0 0
lo0 16896 127 127.0.0.1 - 6216486 0 6224734 0 0
lo0 16896 ::1 1 6216486 0 6224734 0 0
#
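
The Ierrs and Oerrs columns are the ones worth scanning in the `netstat -in` output. The sketch below lists interfaces with a nonzero error count. Because some rows have no Address or ZoneID value, it counts fields from the right-hand end of each line; the function name `nic_errors` is hypothetical.

```shell
# Read `netstat -in` output on stdin and print each interface (once)
# that shows input or output errors.
nic_errors() {
    awk 'NR > 1 && NF >= 9 {
        ierrs = $(NF - 3)           # Ierrs is fourth from the end
        oerrs = $(NF - 1)           # Oerrs is second from the end
        if ((ierrs + oerrs) > 0 && !seen[$1]++)
            print $1, "Ierrs=" ierrs, "Oerrs=" oerrs
    }'
}
```

Against the output above, `netstat -in | nic_errors` would report en0 with 6 output errors.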


11. Check HACMP status and logs
# /usr/sbin/cluster/diag/clconfig -v '-tr'
Expected result:
The output contains no Fail items.
11.1 Log of the network monitoring process for HACMP adapter enX:
/var/ha/log/nim.topsvcs.en0.DL_DB
11.2 Log of how RSCT analyzes the network topology after heartbeats are lost:
/var/ha/log/nmDiag.nim.topsvcs.en0.DL_DB
11.3 Main daemon log file:
/var/ha/log/topsvcs.08.140605.DL_DB
11.4 Check HACMP status:
# /usr/sbin/cluster/clstat

# lssrc -ls topsvcs | more
12. Check /etc/hosts for special definitions
# cat /etc/hosts

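13. Check system backups

14. Check whether the tape drive needs cleaning
# /usr/lpp/diagnostics/bin/utape -cd rmt0 -n
1.316667
The result is the number of hours the tape drive has been in use. If it exceeds 72 hours, clean the drive with a cleaning cartridge whether or not the drive's amber light is on.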
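
Because the reported value is fractional, the 72-hour comparison is easier to do in awk than with shell integer tests. This is a small sketch; the function name `tape_needs_cleaning` is hypothetical, and the hours value is expected to be passed in as an argument.

```shell
# Succeed (exit status 0) when the usage hours exceed the limit (default 72).
tape_needs_cleaning() {
    hours="$1"
    limit="${2:-72}"
    awk -v h="$hours" -v l="$limit" 'BEGIN { exit !(h + 0 > l + 0) }'
}
```

For instance, `tape_needs_cleaning 1.316667 || echo "no cleaning needed yet"` prints the message for the reading shown above.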

15. Check cluster status:
15.1 # lssrc -g cluster
Subsystem Group PID Status
clstrmgrES cluster 164078 active
15.2 # /usr/sbin/cluster/clstat
15.3 # netstat -in (check whether the public IP address is in effect)
15.4 # ps -ef | grep cluster
root 286878 200838 0 Sep 19 - 6:45 /usr/es/sbin/cluster/clinfo
root 311494 200838 0 Sep 19 - 26:02 /usr/es/sbin/cluster/clstrmgr
root 368846 200838 0 Sep 19 - 11:16 /usr/es/sbin/cluster/clcomd -d
root 1601538 938202 0 08:44:46 pts/0 0:00 grep cluster
15.5 # lsvg -o
Both the local VGs and the shared VGs should be listed at this point.
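
The subsystem check in 15.1 can be reduced to a list of anything that is not active. The sketch below assumes the `lssrc` layout shown above, with the status as the last field of each line (a stopped subsystem has no PID, so the field count varies); the function name `src_not_active` is hypothetical.

```shell
# Read `lssrc -g <group>` output on stdin and print subsystems whose
# status is anything other than "active".
src_not_active() {
    awk 'NR > 1 && NF >= 3 && $NF != "active" { print $1, $NF }'
}
```

Run as `lssrc -g cluster | src_not_active`; empty output means every listed subsystem is active.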

16. Check the status of scheduled jobs

17. Check official AIX patches

18. Follow official AIX news
19. Check disk I/O
# iostat

System configuration: lcpu=8 drives=6 paths=6 vdisks=0

tty: tin tout avg-cpu: % user % sys % idle % iowait
0.0 3.5 2.5 0.5 96.9 0.2

Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk2 0.0 0.1 0.0 12152 167280
hdisk3 0.0 0.0 0.0 344 24684
hdisk0 0.0 0.0 0.0 12489 18780
hdisk1 0.0 0.0 0.0 17 18780
cd0 0.0 0.0 0.0 0 0
usbms0 0.0 0.0 0.0 0 0
#
20. Check whether the heartbeat link is usable:

Appendix: quarterly inspection performed by an IBM engineer:
# uname -Mu
IBM,8203-E4A IBM,020680E75
# lsdev -Cc adapter
ent0 Available 03-00 2-Port 10/100/1000 Base-TX PCI-Express Adapter (14104003)
ent1 Available 03-01 2-Port 10/100/1000 Base-TX PCI-Express Adapter (14104003)
ent2 Available Logical Host Ethernet Port (lp-hea)
ent3 Available Logical Host Ethernet Port (lp-hea)
fcs0 Available 06-00 4Gb FC PCI Express Adapter (df1000fe)
fcs1 Available 07-00 4Gb FC PCI Express Adapter (df1000fe)
lai0 Available 08-00 GXT135P Graphics Adapter
lhea0 Available Logical Host Ethernet Adapter (l-hea)
sa0 Available 01-08 2-Port Asynchronous EIA-232 PCI Adapter
sissas0 Available 00-08 PCI-X266 Planar 3Gb SAS Adapter
usbhc0 Available 02-08 USB Host Controller (33103500)
usbhc1 Available 02-09 USB Host Controller (33103500)
usbhc2 Available 02-0a USB Enhanced Host Controller (3310e000)
vsa0 Available LPAR Virtual Serial Adapter
vsa1 Available LPAR Virtual Serial Adapter
# lsdev -Cc disk
hdisk0 Available 00-08-00 SAS Disk Drive
hdisk1 Available 00-08-00 SAS Disk Drive
hdisk2 Available 00-08-00 SAS Disk Drive
hdisk3 Available 00-08-00 SAS Disk Drive
hdisk4 Available 07-00-01 MPIO Other DS4K Array Disk
# lspv
hdisk0 00c80e7540b90b70 rootvg active
hdisk1 00c80e7540b8ec50 rootvg active
hdisk2 00c80e7540b91b24 None
hdisk3 00c80e75e305248b None
hdisk4 00c80e6592b8e36a datavg active
#
#
# df -k
Filesystem 1024-blocks Free %Used Iused %Iused Mounted on
/dev/hd4 5242880 2021884 62% 14365 4% /
/dev/hd2 10485760 7541524 29% 46638 3% /usr
/dev/hd9var 5242880 4460040 15% 7647 1% /var
/dev/hd3 10485760 5972844 44% 11176 1% /tmp
/dev/fwdump 1048576 1047540 1% 13 1% /var/adm/ras/platform
/dev/hd1 5242880 5240532 1% 37 1% /home
/dev/hd11admin 262144 261744 1% 5 1% /admin
/proc - - - - - /proc
/dev/hd10opt 5242880 1536220 71% 12543 4% /opt
/dev/livedump 262144 261776 1% 4 1% /var/adm/ras/livedump
/dev/orainstbak 10485760 4998540 53% 28301 3% /orainstbak
/dev/oradatabak 41943040 12439948 71% 57 1% /oradatabak
/dev/oradata 72089600 42977892 41% 32 1% /oradata
/dev/orainst 20971520 14404416 32% 36210 2% /orainst
# lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 1 2 2 closed/syncd N/A
hd6 paging 96 192 2 open/syncd N/A
hd8 jfs2log 1 2 2 open/syncd N/A
hd4 jfs2 20 40 2 open/syncd /
hd2 jfs2 40 80 2 open/syncd /usr
hd9var jfs2 20 40 2 open/syncd /var
hd3 jfs2 40 80 2 open/syncd /tmp
hd1 jfs2 20 40 2 open/syncd /home
hd10opt jfs2 20 40 2 open/syncd /opt
hd11admin jfs2 1 2 2 open/syncd /admin
fwdump jfs2 4 8 2 open/syncd /var/adm/ras/platform
lg_dumplv sysdump 8 8 1 open/syncd N/A
livedump jfs2 1 2 2 open/syncd /var/adm/ras/livedump
orainstbak jfs2 40 40 1 open/syncd /orainstbak
oradatabak jfs2 160 160 1 open/syncd /oradatabak
#
#
# lspv hdisk0
PHYSICAL VOLUME: hdisk0 VOLUME GROUP: rootvg
PV IDENTIFIER: 00c80e7540b90b70 VG IDENTIFIER 00c80e7500004c00000001268e094434
PV STATE: active
STALE PARTITIONS: 0 ALLOCATABLE: yes
PP SIZE: 256 megabyte(s) LOGICAL VOLUMES: 13
TOTAL PPs: 546 (139776 megabytes) VG DESCRIPTORS: 2
FREE PPs: 274 (70144 megabytes) HOT SPARE: no
USED PPs: 272 (69632 megabytes) MAX REQUEST: 1 megabyte
FREE DISTRIBUTION: 109..00..00..56..109
USED DISTRIBUTION: 01..109..109..53..00
MIRROR POOL: None
#
#
# lsvg rootvg
VOLUME GROUP: rootvg VG IDENTIFIER: 00c80e7500004c00000001268e094434
VG STATE: active PP SIZE: 256 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1092 (279552 megabytes)
MAX LVs: 256 FREE PPs: 356 (91136 megabytes)
LVs: 15 USED PPs: 736 (188416 megabytes)
OPEN LVs: 14 QUORUM: 1 (Disabled)
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 2 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 1024 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
#
#
# lsvg
rootvg
datavg
# lsvg -o
datavg
rootvg
#
#
# lsmcode
#
#
# lssrc -ls clstrmgrES
Current state: ST_STABLE
sccsid = "@(#)36 1.135.1.80 src/43haes/usr/sbin/cluster/hacmprd/main.C, hacmp.pe, 52haes_r541, 0736A_hacmp541 7/19/07 06:36:20"
i_local_nodeid 0, i_local_siteid -1, my_handle 1
ml_idx[1]=0 ml_idx[2]=1
There are 0 events on the Ibcast queue
There are 0 events on the RM Ibcast queue
CLversion: 9
local node vrmf is 5410
cluster fix level is "0"
The following timer(s) are currently active:
Current DNP values
DNP Values for NodeId - 1 NodeName - node1
PgSpFree = 6275593 PvPctBusy = 0 PctTotalTimeIdle = 33.336896
DNP Values for NodeId - 2 NodeName - node2
PgSpFree = 6286798 PvPctBusy = 0 PctTotalTimeIdle = 33.357282
#
#
# lsdev -Cc tape
rmt0 Available 00-08-00 SAS 4mm Tape Drive
rmt1 Available 06-00-01 IBM 3580 Ultrium Tape Drive (FCP)
rmt2 Available 07-00-01 IBM 3580 Ultrium Tape Drive (FCP)
smc0 Available 07-00-01 IBM 3573 Tape Medium Changer (FCP)
#
#
# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DCB47997 0622090614 T H hdisk4 DISK OPERATION ERROR
D666A8C7 0622090514 T H fcs0 ADAPTER ERROR
D666A8C7 0622090514 T H fcs0 ADAPTER ERROR
C43F90ED 0526090414 P H hdisk4 SUBSYSTEM COMPONENT FAILURE
D666A8C7 0526090314 T H fcs1 ADAPTER ERROR
D666A8C7 0526090314 T H fcs1 ADAPTER ERROR
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
A29426DA 020******* P U topsvcs Local adapter misconfiguration detected
A29426DA 020******* P U topsvcs Local adapter misconfiguration detected
# errpt -aj dcb47997|pg
---------------------------------------------------------------------------
LABEL: SC_DISK_ERR4
IDENTIFIER: DCB47997


Date/Time: Sun Jun 22 09:06:13 BEIST 2014
Sequence Number: 1365
Machine Id: 00C80E754C00
Node Id: HCBA-DB-2
Class: H
Type: TEMP
WPAR: Global
Resource Name: hdisk4
Resource Class: disk
Resource Type: mpioapdisk
Location: U789C.001.DQD4P95-P1-C3-T1-W200500A0B850E829-L0
VPD:
Manufacturer................IBM
Machine Type and Model......1814 FAStT
ROS Level and ID............30393136
Serial Number...............
Device Specific.(Z0)........0000053245004032
Device Specific.(Z1)........


Description
DISK OPERATION ERROR


Probable Causes
MEDIA
DASD DEVICE


User Causes
MEDIA DEFECTIVE


Recommended Actions
FOR REMOVABLE MEDIA, CHANGE MEDIA AND RETRY
PERFORM PROBLEM DETERMINATION PROCEDURES


Failure Causes
MEDIA
DISK DRIVE


Recommended Actions
FOR REMOVABLE MEDIA, CHANGE MEDIA AND RETRY
PERFORM PROBLEM DETERMINATION PROCEDURES


Detail Data
PATH ID
0
SENSE DATA
0A00 2800 05F0 2CC0 0000 1004 0000 0000 0000 0000 0000 0000 0102 0000 7000 0600
0000 009E 0000 0000 2904 0000 0000 0000 0100 0000 0000 0000 0000 0000 0000 0000
0008 3600 0028 0005 F02C C000 0010 0400 0000 0000 0000 0000 534B 3932 3336 3130
3739 2020 2020 2020 0715 1000 0000 0000 0900 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 18C7 3036 3231 3134 2F31 3835 3434 3000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 8E01
81C5 0082 0080
#
#
#
#
# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DCB47997 0622090614 T H hdisk4 DISK OPERATION ERROR
D666A8C7 0622090514 T H fcs0 ADAPTER ERROR
D666A8C7 0622090514 T H fcs0 ADAPTER ERROR
C43F90ED 0526090414 P H hdisk4 SUBSYSTEM COMPONENT FAILURE
D666A8C7 0526090314 T H fcs1 ADAPTER ERROR
D666A8C7 0526090314 T H fcs1 ADAPTER ERROR
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
BD7DCB17 0501002414 I H tty0 FAILURE CAUSED AUTOMATIC RESET
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
9D3B9A10 0627172811 P U SYSDUMP platform_dump processing failure
A29426DA 020******* P U topsvcs Local adapter misconfiguration detected
A29426DA 020******* P U topsvcs Local adapter misconfiguration detected
#
#
# lsvg -l datavg
datavg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
orainst jfs2 160 160 1 open/syncd /orainst
oradata jfs2 550 550 1 open/syncd /oradata
loglv00 jfs2log 1 1 1 open/syncd N/A
# lsvg datavg
VOLUME GROUP: datavg VG IDENTIFIER: 00c80e6500004c000000012692b8e443
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 959 (122752 megabytes)
MAX LVs: 256 FREE PPs: 248 (31744 megabytes)
LVs: 3 USED PPs: 711 (91008 megabytes)
OPEN LVs: 3 QUORUM: 2 (Enabled)
TOTAL PVs: 1 VG DESCRIPTORS: 2
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 1 AUTO ON: no
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
#


II. Daily Oracle database inspection:

1. Check database logs
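1.1 Alert log:
# cat /orainst/admin/cba/bdump/alert_cba.log
1.2 Online redo logs (the member files are listed in v$logfile):
SQL> select group#, member from v$logfile;
1.3 Archived redo logs:
# ls /orainst/flash_recovery_area/CBA/archivelog/
1.4 Trace logs:
Trace files are plain text and are written to the directories named by the background_dump_dest and user_dump_dest parameters. LogMiner is the tool for analyzing the online and archived redo logs.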
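
Rather than paging through the whole alert log with `cat`, a daily check usually only needs the recent ORA- errors. The following is a minimal sketch; the function name `alert_log_errors` and the default of 20 lines are illustrative assumptions.

```shell
# Print the last N lines (default 20) of a log file that contain an
# ORA-nnnnn error code, prefixed with their line numbers.
alert_log_errors() {
    grep -n 'ORA-[0-9]' "$1" | tail -n "${2:-20}"
}
```

With the path from 1.1, this would be called as `alert_log_errors /orainst/admin/cba/bdump/alert_cba.log`.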

2. Check mail for the relevant users
$ mail

3. Check for core dumps

4. Check for all invalid objects
select * from dba_objects where status!='VALID';

5. Check tablespace usage

select
  a.a1 tablespace_name,
  c.c2 contents,
  c.c3 extent_management,
  b.b2/1024/1024 size_mb,
  (b.b2-a.a2)/1024/1024 used_mb,
  substr((b.b2-a.a2)/b.b2*100,1,5) pct_used
from
  (select tablespace_name a1, sum(nvl(bytes,0)) a2 from dba_free_space group by tablespace_name) a,
  (select tablespace_name b1, sum(bytes) b2 from dba_data_files group by tablespace_name) b,
  (select tablespace_name c1, contents c2, extent_management c3 from dba_tablespaces) c
where a.a1=b.b1 and c.c1=b.b1;


6. Check whether the database data files autoextend
6.1 List all tablespaces and their data files:

select tablespace_name, file_id, file_name,
round(bytes/(1024*1024),0) total_space
from dba_data_files
order by tablespace_name;
6.2 Check whether a tablespace's data files autoextend:
select tablespace_name,file_name,autoextensible from dba_data_files where tablespace_name = 'CIS_DATA';
select tablespace_name,file_name,autoextensible from dba_data_files where tablespace_name = '2012';
1. 2012
2. CIS_DATA
3. LHCBA_DATA
4. MMDB_DAT1
5. MMDB_LOB1
6. MMDB_NDX1
7. PMDB_DAT1
8. PMDB_LOB1
9. PMDB_NDX1
10. SYSAUX
11. SYSTEM
12. UNDOTBS1
13. USERS

7. Check whether backups reported errors

8. Check Oracle instance status
SQL> select status from v$instance;


STATUS
------------
OPEN

9. Check Oracle database status
SQL> select open_mode from v$database;

10. Check for Oracle deadlocks (sessions that hold locked objects)
select username,lockwait,status,machine,program from v$session where sid in
(select session_id from v$locked_object);

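11. Check the buffer cache hit ratio

12. Check the shared pool hit ratio

13. Data dictionary hit ratio

14. Library cache hit ratio

15. Share of total memory reads consumed by the 10 most memory-wasteful statements

16. Check for unusable indexes

17. Check the status of Oracle objects such as log files, control files, parameter files, data files, tablespaces, and rollback segments

18. Check the status of every object with abnormal extent growth

19. Query wait events

20. Query SQL statement utilization and efficiency

21. Monitor the operating system in real time and raise SMS or email alerts on anomalies

22. Check the Oracle database processes

23. Check the Oracle listener process

24. Check the Oracle Automatic Workload Repository (AWR) report

25. Check official Oracle patches

26. Follow official Oracle news

27. Check the Oracle Automatic Database Diagnostic Monitor (ADDM) report

28. Use LogMiner to analyze the database logs
29. Show the current instance SID: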
# echo $ORACLE_SID
# env
# set
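
The environment variable only shows which SID the current shell points at. To see which instances are actually running, one option is to look for the PMON background process, which Oracle names `ora_pmon_<SID>`. The sketch below extracts the SIDs from a process listing; the function name `running_instances` is hypothetical.

```shell
# Read `ps -ef` output on stdin and print the SID of every running
# instance, taken from its ora_pmon_<SID> background process.
running_instances() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^ora_pmon_/) {
                sub(/^ora_pmon_/, "", $i)
                print $i
            }
    }'
}
```

Usage would be `ps -ef | running_instances`; comparing the result with `$ORACLE_SID` confirms that the shell is pointed at a live instance.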
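30. Set the SID to work with (for a server that runs multiple instances)
# export ORACLE_SID=orcl
31. Check whether the instance was started from an spfile or a pfile:
SQL> show parameter spfile;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
spfile string /orainst/product/10g/dbs/spfilecba.ora
If a value is shown, the instance was started from an spfile; otherwise it was started from a pfile.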
