February 11, 2012

RMAN memory leak

Today, while checking one server, I happened to notice an RMAN memory leak on the host:
cihcispdb788.corporate.ge.com[oracle]_cdi11> top
top - 03:12:07 up 12 days, 15:26, 29 users,  load average: 3.60, 3.91, 3.82
Tasks: 1999 total,   5 running, 1994 sleeping,   0 stopped,   0 zombie
Cpu(s):  5.2%us,  3.6%sy,  0.0%ni, 89.0%id,  1.5%wa,  0.1%hi,  0.7%si,  0.0%st
Mem:  131955268k total, 84549420k used, 47405848k free,   950096k buffers
Swap: 67108856k total,  2679932k used, 64428924k free, 40716860k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
28506 oracle    25   0 12348 2620  776 R 99.8  0.0   8012:57 top
 1251 oracle    15   0 1274m  52m  47m S 20.5  0.0 213:58.15 oracle
31892 oracle    15   0 1255m 230m 225m R 18.9  0.2   0:32.08 oracle
 1660 oracle    25   0 1424m  35m  10m S 17.3  0.0   0:00.54 java
 5284 oracle    16   0 3065m 466m 459m S 13.8  0.4   0:12.62 oracle
 1641 oracle    18   0  142m  20m 7740 S  6.7  0.0   0:00.21 perl
29194 oracle    15   0 3208m 119m  16m S  6.4  0.1   1413:46 oraagent.bin
23998 oracle    RT   0  335m 163m  53m S  5.8  0.1   1027:28 ocssd.bin
 5274 oracle    15   0 13.5g  13g  13m S  5.1 10.7   2218:01 rman
 2660 oracle    15   0 12.6g  12g  13m S  4.5 10.0   2086:55 rman

28624 root      15   0  570m 236m  22m S  3.9  0.2 152:00.70 crsd.bin
 1054 oracle    16   0 12344 2560  776 R  3.5  0.0   0:01.27 top
 1075 oracle    -2   0 1265m 547m 533m S  3.2  0.4  69:36.10 oracle
 1087 oracle    -2   0 1265m 548m 534m S  3.2  0.4  77:13.70 oracle
 1102 oracle    -2   0 1265m 548m 534m S  2.9  0.4  70:56.91 oracle
15739 oracle    19   0  842m 168m  16m S  2.6  0.1 191:52.59 emagent
..................
..................

The two rman lines (PIDs 5274 and 2660) show two rman processes consuming about 25 GB of resident memory in total (13g and 12g in the RES column).
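As a rough cross-check of that total, the RES column of the top output can be summed per command. This is only a sketch: the function names `to_kib` and `resident_kib` are made up for illustration, and it assumes the default top layout shown above, where RES is the sixth field and a bare number is in KiB.

```python
# Multipliers for the size suffixes top prints; a bare number is already KiB.
UNITS = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}

def to_kib(field):
    """Convert a top size field such as '2620', '230m' or '13g' to KiB."""
    suffix = field[-1].lower()
    if suffix in UNITS:
        return float(field[:-1]) * UNITS[suffix]
    return float(field)

def resident_kib(top_lines, command):
    """Sum the RES column (6th field) of every process row whose COMMAND matches."""
    total = 0.0
    for line in top_lines:
        fields = line.split()
        # A process row has 12 fields and starts with a numeric PID.
        if len(fields) >= 12 and fields[0].isdigit() and fields[-1] == command:
            total += to_kib(fields[5])
    return total

# Three rows copied from the top output above.
sample = [
    " 5274 oracle    15   0 13.5g  13g  13m S  5.1 10.7   2218:01 rman",
    " 2660 oracle    15   0 12.6g  12g  13m S  4.5 10.0   2086:55 rman",
    " 1054 oracle    16   0 12344 2560  776 R  3.5  0.0   0:01.27 top",
]
print(resident_kib(sample, "rman") / 1024 ** 2)  # GiB, prints 25.0
```

Because top rounds RES to whole gigabytes here, the result is only as precise as the display.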


These two rman processes are standalone: their parent is init (PPID 1), they have no child processes, and they have been running since Feb 03, about a week.
cihcispdb788.corporate.ge.com[oracle]_cdi11> ps -ef|grep 5274
oracle     769 29600  0 03:11 pts/25   00:00:00 grep 5274
oracle    5274     1 21 Feb03 ?        1-12:57:58 rman target /

cihcispdb788.corporate.ge.com[oracle]_cdi11> ps -ef|grep 5274
oracle    1011 29600  0 03:11 pts/25   00:00:00 grep 5274
oracle    5274     1 21 Feb03 ?        1-12:57:59 rman target /
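The same check can be scripted against `ps -ef` output. A minimal sketch, assuming the eight-column `ps -ef` layout shown above; `parse_ps_line` and `orphans` are hypothetical helper names. Note that PPID 1 alone does not prove a leak, since ordinary daemons also run under init; it only narrows down the candidates.

```python
def parse_ps_line(line):
    """Split one `ps -ef` row into its eight columns (CMD keeps its spaces)."""
    uid, pid, ppid, c, stime, tty, cpu_time, cmd = line.split(None, 7)
    return {"uid": uid, "pid": int(pid), "ppid": int(ppid),
            "stime": stime, "time": cpu_time, "cmd": cmd}

def orphans(ps_lines, command_prefix):
    """Return processes owned by init (PPID 1) whose command starts with the prefix."""
    procs = [parse_ps_line(line) for line in ps_lines]
    return [p for p in procs
            if p["ppid"] == 1 and p["cmd"].startswith(command_prefix)]

# Rows copied from the ps output above.
sample = [
    "oracle     769 29600  0 03:11 pts/25   00:00:00 grep 5274",
    "oracle    5274     1 21 Feb03 ?        1-12:57:58 rman target /",
]
for proc in orphans(sample, "rman"):
    print(proc["pid"], proc["stime"], proc["cmd"])  # 5274 Feb03 rman target /
```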

cihcispdb788.corporate.ge.com[oracle]_cdi11> pmap 5274
5274:   rman target /
0000000000400000  13332K r-x--  /prod/product/oracle/product/11.2.0.2/bin/rman
0000000001305000    412K rw---  /prod/product/oracle/product/11.2.0.2/bin/rman
000000000136c000     12K rw---    [ anon ]
0000000018dc2000 14070476K rw---    [ anon ]
0000003e7a600000    112K r-x--  /lib64/ld-2.5.so
0000003e7a81b000      4K r----  /lib64/ld-2.5.so
0000003e7a81c000      4K rw---  /lib64/ld-2.5.so
0000003e7aa00000    520K r-x--  /lib64/libm-2.5.so
0000003e7aa82000   2044K -----  /lib64/libm-2.5.so
0000003e7ac81000      4K r----  /lib64/libm-2.5.so
0000003e7ac82000      4K rw---  /lib64/libm-2.5.so
.............
 total         14161260K

cihcispdb788.corporate.ge.com[oracle]_cdi11> pmap 5274
5274:   rman target /
0000000000400000  13332K r-x--  /prod/product/oracle/product/11.2.0.2/bin/rman
0000000001305000    412K rw---  /prod/product/oracle/product/11.2.0.2/bin/rman
000000000136c000     12K rw---    [ anon ]
0000000018dc2000 14070884K rw---    [ anon ]
0000003e7a600000    112K r-x--  /lib64/ld-2.5.so
0000003e7a81b000      4K r----  /lib64/ld-2.5.so
0000003e7a81c000      4K rw---  /lib64/ld-2.5.so
0000003e7aa00000    520K r-x--  /lib64/libm-2.5.so
0000003e7aa82000   2044K -----  /lib64/libm-2.5.so
0000003e7ac81000      4K r----  /lib64/libm-2.5.so
0000003e7ac82000      4K rw---  /lib64/libm-2.5.so
0000003e7ae00000   1332K r-x--  /lib64/libc-2.5.so
0000003e7af4d000   2048K -----  /lib64/libc-2.5.so
0000003e7b14d000     16K r----  /lib64/libc-2.5.so
0000003e7b151000      4K rw---  /lib64/libc-2.5.so
0000003e7b152000     20K rw---    [ anon ]
0000003e7b200000      8K r-x--  /lib64/libdl-2.5.so
0000003e7b202000   2048K -----  /lib64/libdl-2.5.so
0000003e7b402000      4K r----  /lib64/libdl-2.5.so
............. 
total         14161668K

We can see that the anonymous segment accounts for almost all of the memory, and it grew by 408K between the two pmap runs (14070476K to 14070884K). This is a typical RMAN memory leak.
As time passes, the memory used by these two rman processes keeps increasing.
cihcispdb788.corporate.ge.com[oracle]_cdi11> free -m
             total       used       free     shared    buffers     cached
Mem:        128862      82493      46368          0        927      39774
-/+ buffers/cache:      41792      87070
Swap:        65535       2617      62918
We still have plenty of free memory, but sooner or later these two processes will consume all of it.
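The growth of the anonymous segment between the two pmap runs above can be measured rather than read off by eye. A small sketch, assuming the plain `pmap <pid>` row format shown earlier (address, size in K, permissions, mapping); `anon_kib` is a hypothetical helper name, and only the rows common to both truncated listings are compared.

```python
def anon_kib(pmap_lines):
    """Sum the sizes (in KiB) of all '[ anon ]' mappings in pmap output."""
    total = 0
    for line in pmap_lines:
        fields = line.split()
        # Mapping rows look like: <address> <size>K <perms> <mapping...>
        if len(fields) >= 4 and fields[1].endswith("K") and "anon" in fields[3:]:
            total += int(fields[1][:-1])
    return total

# Rows copied from the first and second pmap runs of PID 5274.
first = [
    "5274:   rman target /",
    "0000000000400000  13332K r-x--  /prod/product/oracle/product/11.2.0.2/bin/rman",
    "000000000136c000     12K rw---    [ anon ]",
    "0000000018dc2000 14070476K rw---    [ anon ]",
]
second = [
    "5274:   rman target /",
    "0000000000400000  13332K r-x--  /prod/product/oracle/product/11.2.0.2/bin/rman",
    "000000000136c000     12K rw---    [ anon ]",
    "0000000018dc2000 14070884K rw---    [ anon ]",
]
print(anon_kib(second) - anon_kib(first))  # growth in KiB, prints 408
```

Sampling this at a fixed interval would give a growth rate, which helps estimate how long the host has before free memory runs out.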


Now kill them to release the memory.
[root@cihcispdb788 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:        128862      82354      46508          0        928      39787
-/+ buffers/cache:      41638      87224
Swap:        65535       2617      62918
[root@cihcispdb788 ~]# kill -9 5274
[root@cihcispdb788 ~]# kill -9 2660
[root@cihcispdb788 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:        128862      55696      73165          0        928      39776
-/+ buffers/cache:      14992     113870
Swap:        65535       2617      62918

The memory has now been released: used memory dropped from 82354 MB to 55696 MB, about 26 GB.
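The amount released can be confirmed from the two `free -m` outputs above. A minimal sketch, assuming the older `free` layout shown here, where "used" is the third field of the `Mem:` line; `mem_used_mb` is a hypothetical helper name.

```python
def mem_used_mb(free_output):
    """Return the 'used' value (in MB) from the Mem: line of `free -m` output."""
    for line in free_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "Mem:":
            return int(fields[2])
    raise ValueError("no Mem: line found")

# Mem: lines copied from the free -m output before and after the kill.
before = "Mem:        128862      82354      46508          0        928      39787"
after = "Mem:        128862      55696      73165          0        928      39776"

released = mem_used_mb(before) - mem_used_mb(after)
print(released)  # MB released, prints 26658
```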
