http://www.loveunix.net/thread-126455-1-1.html
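Last night I got an SMS alert: paging space usage on one of our database nodes had reached 31% (the alert threshold is 30%). When I got on site this morning, paging space usage had grown from 31% to 58%, and physical memory usage was at 100%. The node is a partition on an IBM p595 running AIX 5304, HACMP 5.3 and Oracle 9.2.0.8 RAC.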
ossresdb2:[/]lsps -a
page space      physical volume   volume group    size %used active  auto  type
hd6             hdisk0            rootvg       32768mb    58    yes   yes    lv
Looking at per-process memory usage, I found that the oracle process with PID 438434 was using 53% of system memory, about 19 GB in total:
ossresdb2:[/]ps aux | head -1 ; ps aux | sort -rn +3 | head
user         pid %cpu %mem       sz      rss    tty stat    stime    time command
oracle    438434  0.9 53.0 19353060 17625500      -    a   mar 02 2137:42 oracleresdb2 (l
zte       450808  0.0  0.0      720      756  pts/0    a 10:02:26    0:00 -ksh
zte       327788  0.3  0.0     9728     9744  pts/0    a 10:05:04    0:14 topas
root     6914224  0.0  0.0     1988     1956      -    a   feb 28    5:27 /usr/sbin/rsct/
root     5878006  0.0  0.0       52       48      -    a   jan 30    0:09 aioserver
root     5419188  0.0  0.0       60       32      -    a   jan 31    2:49 aioserver
root     4788268  0.0  0.0     1988     1956      -    a   feb 28    5:28 /usr/sbin/rsct/
root     4755470  0.0  0.0     1888     1860      -    a   feb 28    9:02 /usr/sbin/rsct/
root     4616446  0.0  0.0       48       32      -    a 15:55:52    0:00 aioserver
root     3989894  0.0  0.0      320      108      -    a   jan 24    0:00 storwatchd star
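To make the sz and rss columns easier to read, here is a small sketch that converts them to GiB. It assumes that sz and rss are the 5th and 6th whitespace-separated fields of each `ps aux` row and are reported in 1 KB units, as in the output above. The function name `ps_mem_gib` is made up for this example.

```shell
# Hypothetical helper: reads `ps aux` style rows on stdin and prints
# "PID SZ_GiB RSS_GiB" for every row whose second field is a numeric PID.
# Assumption: SZ and RSS are fields 5 and 6 and are given in 1 KB units.
ps_mem_gib() {
    awk '$2 ~ /^[0-9]+$/ {
        # 1048576 KB = 1 GiB
        printf "%s %.2f %.2f\n", $2, $5 / 1048576, $6 / 1048576
    }'
}
```

For example, `ps aux | sort -rn +3 | head | ps_mem_gib` would list the top memory consumers. For the oracle row above it gives an sz of about 18.46 GiB and an rss of about 16.81 GiB.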
Then I queried the database for the SQL statement this process was running:
sql> select sql_text from v$sqlarea where address in (select sql_address from v$session where paddr in (select addr from v$process where spid = 438434));
sql_text
--------------------------------------------------------------------------------
begin pg_topo_430021.recreatetopodate(:1,:2,:3); end;
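I reported the abnormal memory usage of this stored procedure to the application team and, after confirming with them, killed the process. Killing it released a large amount of memory: paging space usage dropped to 35.5% and physical memory usage dropped to 50.6%.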
nmon--------l=longterm-cpu-----host=ossresdb2------refresh=2 secs---10:53.23-----------------------------------------------------+
| memory --------------------------------------------------------------------------------------------------------------------------|
| physical pagespace | pages/sec in out | filesystemcache |
|% used 50.6% 35.5% | to paging space 1.5 0.0 | (numperm) 0.1% |
|% free 49.4% 64.5% | to file system 0.0 0.0 | process 36.2% |
|mb used 16588.3mb 11632.4mb | page scans 0.0 | system 14.3% |
|mb free 16179.6mb 21135.6mb | page cycles 0.0 | free 49.4% |
|total(mb) 32768.0mb 32768.0mb | page steals 0.0 | ------ |
| | page faults 154.9 | total 100.0% |
|------------------------------------------------------------ | numclient 0.2% |
|min/maxperm 1555mb( 5%) 3110mb( 9%) |min/maxfree 960 1088 total virtual 64.0gb | user 32.2% |
|min/maxpgahead 2 8 accessed virtual 24.3gb 38.0% pinned 18.0% |
| |
|---------------------------------------------------------------------------------
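This partition had previously hit 100% paging space usage, which brought the partition down, and at the time we never found the cause. Today we finally caught the real culprit. An application-level problem can take down the whole system, so be careful, everyone, and keep a close eye on paging space usage in particular.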
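For anyone who wants a simple paging space check of their own, here is a minimal sketch, not the monitoring we actually run. It assumes the `lsps -a` layout shown earlier, where %used is the 5th whitespace-separated field of each data row. The function name `check_paging_space` and the default of 30 (taken from our alert threshold) are only for illustration.

```shell
# Hypothetical helper: reads `lsps -a` style output on stdin and prints a
# warning for every paging space whose %used is at or above the threshold.
# Assumption: data rows look like "hd6 hdisk0 rootvg 32768MB 58 yes yes lv".
check_paging_space() {
    threshold="${1:-30}"
    awk -v limit="$threshold" '
        NR == 1 { next }              # skip the header row
        $5 + 0 >= limit + 0 {
            printf "WARNING: paging space %s on %s is %d%% used (threshold %d%%)\n", $1, $2, $5, limit
            found = 1
        }
        END { exit found ? 1 : 0 }    # non-zero exit status if any row is over the limit
    '
}
```

It could be run from cron as `lsps -a | check_paging_space 30`, with the warning text or the exit status wired into whatever sends the alerts.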