All of lore.kernel.org
* [0.48.3] OSD memory leak when scrubbing
@ 2013-01-22 20:01 Sylvain Munaut
  2013-01-22 21:19 ` Sébastien Han
  0 siblings, 1 reply; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-22 20:01 UTC (permalink / raw)
  To: ceph-devel

Hi,

Since putting Ceph in production, I've experienced a memory leak in the
OSDs, forcing me to restart them every 5 or 6 days. Without that, the
OSD processes just grow indefinitely and eventually get killed by the
OOM killer. (To make sure it wasn't "legitimate" usage, I let one grow
to 4 GB of RSS...)

As an example, here's the RSS usage of the 12 OSD processes over a few
hours: http://i.imgur.com/ZJxyldq.png
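(For reference, a minimal sketch of how such per-process RSS samples can be collected; the pgrep pattern in the commented example is an assumption, adjust it to your OSD process names.)

```shell
#!/bin/sh
# Sample the resident set size (RSS, in kB) of a list of PIDs,
# printing one "epoch pid rss_kb" line per process.
sample_rss() {
    # $1 = whitespace-separated list of PIDs
    for pid in $1; do
        rss=$(ps -o rss= -p "$pid" | tr -d ' ')
        [ -n "$rss" ] && echo "$(date +%s) $pid $rss"
    done
}

# Hypothetical usage: sample all ceph-osd processes once a minute.
# while true; do sample_rss "$(pgrep -d' ' -f ceph-osd)"; sleep 60; done
```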

What I've just noticed is that if I look at the logs of an OSD process
right when its memory grows, I can see it's scrubbing PGs from pool #3.
When scrubbing PGs from other pools, nothing much happens memory-wise.

Pool #3 is the pool holding all the RBD images for the VMs, so it sees
a bunch of small read/modify/write operations. The other pools are used
by RGW for object storage and are mostly write-once, read-many-times
with relatively large objects.

I'm planning to upgrade to 0.56.1 this weekend, and I was hoping to
find out whether this issue has been fixed in the scrubbing code.

I've seen other posts about memory leaks, but at the time the source
was never confirmed. Here I can clearly see it's the scrubbing of pools
that contain RBD images.

Cheers,

      Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-22 20:01 [0.48.3] OSD memory leak when scrubbing Sylvain Munaut
@ 2013-01-22 21:19 ` Sébastien Han
  2013-01-22 21:32   ` Sylvain Munaut
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-01-22 21:19 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: ceph-devel

Hi,

I originally started a thread about these memory leak problems here:
http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg11000.html

I'm happy to see that someone supports my theory that the scrubbing
process is leaking memory. I only use RBD with Ceph, so your theory
makes sense as well. Unfortunately, since I run a production platform,
I don't really want to try the memory profiler; I had quite a bad
experience with it on a test cluster, where some OSDs crashed while
the profiler was running...
The only way to track this down is with a heap dump. Could you provide one?

Moreover, I can't reproduce the problem in my test environment... :(

--
Regards,
Sébastien Han.


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-22 21:19 ` Sébastien Han
@ 2013-01-22 21:32   ` Sylvain Munaut
  2013-01-22 21:38     ` Sébastien Han
  0 siblings, 1 reply; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-22 21:32 UTC (permalink / raw)
  To: Sébastien Han; +Cc: ceph-devel

Hi,

> I don't really want to try the memory profiler; I had quite a bad
> experience with it on a test cluster, where some OSDs crashed while
> the profiler was running...
> The only way to track this down is with a heap dump. Could you provide one?

I just did:

ceph osd tell 0 heap start_profiler
ceph osd tell 0 heap dump
ceph osd tell 0 heap stop_profiler

and it produced osd.0.profile.0001.heap

Is that enough, or do I actually have to leave it running?

I had to stop the profiler because, after doing the dump, the OSD
process was using 100% CPU... stopping the profiler restored it to
normal.
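(A sketch of how those three steps can be wrapped so the profiler only runs for a bounded window; the $CEPH indirection is mine so the script can be dry-run, and the OSD id and duration are parameters, not values from this thread.)

```shell
#!/bin/sh
# Run the OSD heap profiler for a bounded window: start it, wait,
# dump, then stop it again so the CPU overhead does not persist.
CEPH="${CEPH:-ceph}"

profile_osd() {
    osd_id=$1    # OSD id, e.g. 0
    duration=$2  # seconds to leave the profiler running
    $CEPH osd tell "$osd_id" heap start_profiler
    sleep "$duration"
    $CEPH osd tell "$osd_id" heap dump
    $CEPH osd tell "$osd_id" heap stop_profiler
}

# Hypothetical usage: profile osd.0 across a 10-minute scrub window.
# profile_osd 0 600
```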

Cheers,

    Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-22 21:32   ` Sylvain Munaut
@ 2013-01-22 21:38     ` Sébastien Han
  2013-01-25 16:29       ` Sébastien Han
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-01-22 21:38 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: ceph-devel

Well, ideally you want to run the profiler during the scrubbing
process, when the memory leaks appear :-).
--
Regards,
Sébastien Han.




* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-22 21:38     ` Sébastien Han
@ 2013-01-25 16:29       ` Sébastien Han
  2013-01-25 20:16         ` Sylvain Munaut
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-01-25 16:29 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: ceph-devel

Hi,

Could you provide those heaps? Is that possible?

--
Regards,
Sébastien Han.




* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-25 16:29       ` Sébastien Han
@ 2013-01-25 20:16         ` Sylvain Munaut
  2013-01-27 16:17           ` Sylvain Munaut
  0 siblings, 1 reply; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-25 20:16 UTC (permalink / raw)
  To: Sébastien Han; +Cc: ceph-devel

> Could you provide those heaps? Is that possible?

We're upgrading to 0.56.1 this weekend.

If it still happens after the update, I'll try to reproduce it on our
test infrastructure and do the profiling there, because unfortunately
running the profiler seems to make the OSD eat a lot of CPU and RAM...

I also need to test whether it happens when I force a scrub myself,
because I can't let the profiler run all day waiting for it to happen
naturally, so I need a way to trigger a scrub of all PGs in a given
pool.
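(A sketch of one way to do that, assuming PG ids have the usual <pool>.<hex> form; the `ceph pg dump` parsing in the commented example is an assumption about its output format, and $CEPH lets the loop be dry-run.)

```shell
#!/bin/sh
# Read PG ids on stdin and issue "ceph pg scrub" for every PG that
# belongs to the given pool (PG ids look like "<pool>.<hex suffix>").
CEPH="${CEPH:-ceph}"

scrub_pool() {
    pool=$1
    while read -r pgid; do
        case "$pgid" in
            "$pool".*) $CEPH pg scrub "$pgid" ;;
        esac
    done
}

# Hypothetical usage against a live cluster, scrubbing all PGs of pool 3:
# ceph pg dump 2>/dev/null | awk '/^[0-9]+\./ {print $1}' | scrub_pool 3
```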


Cheers,

    Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-25 20:16         ` Sylvain Munaut
@ 2013-01-27 16:17           ` Sylvain Munaut
  2013-01-27 17:47             ` Sage Weil
  0 siblings, 1 reply; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-27 16:17 UTC (permalink / raw)
  To: Sébastien Han; +Cc: ceph-devel

Hi,

Just to keep you posted: we upgraded our cluster yesterday to a
custom-compiled 0.56.1, and after more than 24h there is no sign of
the memory leak anymore. Previously it would rise by ~100 MB every
24h almost like clockwork; now, slightly more than 24h in, memory is
stable. (It fluctuates, but with no large jumps that stay forever.)

Cheers,

    Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-27 16:17           ` Sylvain Munaut
@ 2013-01-27 17:47             ` Sage Weil
  2013-01-27 18:17               ` Sylvain Munaut
  2013-01-30  9:12               ` Sylvain Munaut
  0 siblings, 2 replies; 30+ messages in thread
From: Sage Weil @ 2013-01-27 17:47 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Sébastien Han, ceph-devel

On Sun, 27 Jan 2013, Sylvain Munaut wrote:
> Hi,
> 
> Just to keep you posted: we upgraded our cluster yesterday to a
> custom-compiled 0.56.1, and after more than 24h there is no sign of
> the memory leak anymore. Previously it would rise by ~100 MB every
> 24h almost like clockwork; now, slightly more than 24h in, memory is
> stable. (It fluctuates, but with no large jumps that stay forever.)

That's great news.  We've been trying to replicate the argonaut leak here 
on argonaut and haven't succeeded so far.

sage


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-27 17:47             ` Sage Weil
@ 2013-01-27 18:17               ` Sylvain Munaut
  2013-01-30  9:12               ` Sylvain Munaut
  1 sibling, 0 replies; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-27 18:17 UTC (permalink / raw)
  To: Sage Weil; +Cc: Sébastien Han, ceph-devel

Hi,

>> Just to keep you posted: we upgraded our cluster yesterday to a
>> custom-compiled 0.56.1, and after more than 24h there is no sign of
>> the memory leak anymore. Previously it would rise by ~100 MB every
>> 24h almost like clockwork; now, slightly more than 24h in, memory is
>> stable. (It fluctuates, but with no large jumps that stay forever.)
>
> That's great news.  We've been trying to replicate the argonaut leak here
> on argonaut and haven't succeeded so far.

To be entirely complete: I also upgraded the kernel RBD clients, and
since the leak happens while scrubbing the RBD pool, maybe the client
behavior makes a difference.

Previously they were running kernel 3.6.8; they're now running 3.6.11
with all the Ceph-related patches from 3.8 backported (~150 patches).

Cheers,

    Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-27 17:47             ` Sage Weil
  2013-01-27 18:17               ` Sylvain Munaut
@ 2013-01-30  9:12               ` Sylvain Munaut
  2013-01-30  9:18                 ` Sage Weil
  1 sibling, 1 reply; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-30  9:12 UTC (permalink / raw)
  To: Sage Weil; +Cc: Sébastien Han, ceph-devel

>> Just to keep you posted: we upgraded our cluster yesterday to a
>> custom-compiled 0.56.1, and after more than 24h there is no sign of
>> the memory leak anymore. Previously it would rise by ~100 MB every
>> 24h almost like clockwork; now, slightly more than 24h in, memory is
>> stable. (It fluctuates, but with no large jumps that stay forever.)
>
> That's great news.  We've been trying to replicate the argonaut leak here
> on argonaut and haven't succeeded so far.

I'm sorry to report that my excitement was premature... it didn't grow
during the first 24h, but each day since then has seen a ~100 MB
increase in OSD memory, so pretty much the same behavior as before. And
again, it happens when scrubbing PGs from the RBD pool.


:(

Cheers,

    Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-30  9:12               ` Sylvain Munaut
@ 2013-01-30  9:18                 ` Sage Weil
  2013-01-30 13:26                   ` Sylvain Munaut
  0 siblings, 1 reply; 30+ messages in thread
From: Sage Weil @ 2013-01-30  9:18 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Sébastien Han, ceph-devel

On Wed, 30 Jan 2013, Sylvain Munaut wrote:
> >> Just to keep you posted: we upgraded our cluster yesterday to a
> >> custom-compiled 0.56.1, and after more than 24h there is no sign of
> >> the memory leak anymore. Previously it would rise by ~100 MB every
> >> 24h almost like clockwork; now, slightly more than 24h in, memory is
> >> stable. (It fluctuates, but with no large jumps that stay forever.)
> >
> > That's great news.  We've been trying to replicate the argonaut leak here
> > on argonaut and haven't succeeded so far.
> 
> I'm sorry to report that my excitement was premature ...  it didn't
> grow during the first 24h but each day since then has seen a 100 M
> increase of OSD memory, so pretty much the same behavior as before.
> And again, happens when scrubbing PGs from the rbd pool.

Can you try disabling scrubbing and see if the leak stops?

	ceph osd tell \* injectargs '--osd-scrub-load-threshold .01'

(that will work for 0.56.1, but is fixed in later versions, btw.)  On 
newer code,

	ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
	ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'

Tracking this via

	http://tracker.ceph.com/issues/3883

Thanks!
sage



* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-30  9:18                 ` Sage Weil
@ 2013-01-30 13:26                   ` Sylvain Munaut
  2013-01-30 19:40                     ` Sage Weil
  0 siblings, 1 reply; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-30 13:26 UTC (permalink / raw)
  To: Sage Weil; +Cc: Sébastien Han, ceph-devel

Hi,


> Can you try disabling scrubbing and see if the leak stops?
>
>         ceph osd tell \* injectargs '--osd-scrub-load-threshold .01'
>
> (that will work for 0.56.1, but is fixed in later versions, btw.)  On
> newer code,
>
>         ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>         ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'

OK, I just did that. (I'm running 0.56.1 plus a few more patches from
the bobtail branch, up to c5fe0965572c07...)

I'll report back tomorrow.


> Tracking this via
>
>         http://tracker.ceph.com/issues/3883

Should I post updates on the ML or on the ticket?

Cheers,

   Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-30 13:26                   ` Sylvain Munaut
@ 2013-01-30 19:40                     ` Sage Weil
  2013-01-31 13:20                       ` Sylvain Munaut
  0 siblings, 1 reply; 30+ messages in thread
From: Sage Weil @ 2013-01-30 19:40 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Sébastien Han, ceph-devel

On Wed, 30 Jan 2013, Sylvain Munaut wrote:
> Should I post updates on the ML or on the ticket?

Either or both.  We try to keep the ticket up to date, either way.

Thanks!
s


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-30 19:40                     ` Sage Weil
@ 2013-01-31 13:20                       ` Sylvain Munaut
       [not found]                         ` <31226757.422.1359645742478.JavaMail.dspano@it1>
  2013-01-31 19:57                         ` Sage Weil
  0 siblings, 2 replies; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-31 13:20 UTC (permalink / raw)
  To: Sage Weil; +Cc: Sébastien Han, ceph-devel

Hi,

I disabled scrubbing using

> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'

and the leak seems to be gone.

See the graph at http://i.imgur.com/A0KmVot.png showing the OSD memory
of the 12 OSD processes over the last 3.5 days. Memory was rising every
24h; I made the change yesterday around 13:00 and the OSDs stopped
growing. OSD memory even seems to go down slowly in small steps.

Of course, I assume disabling scrubbing is not a long-term solution and
I should re-enable it... (How do I do that, btw? What were the default
values for those parameters?)

Cheers,

   Sylvain


* Re: [0.48.3] OSD memory leak when scrubbing
       [not found]                         ` <31226757.422.1359645742478.JavaMail.dspano@it1>
@ 2013-01-31 15:26                           ` Sylvain Munaut
  0 siblings, 0 replies; 30+ messages in thread
From: Sylvain Munaut @ 2013-01-31 15:26 UTC (permalink / raw)
  To: Dave Spano; +Cc: Sébastien Han, ceph-devel, Sage Weil

Hi,

> I'm crossing my fingers, but I just noticed that since I upgraded to kernel
> version 3.2.0-36-generic on Ubuntu 12.04 the other day, ceph-osd memory
> usage has stayed stable.

Unfortunately for me, I'm already on 3.2.0-36-generic  (Ubuntu 12.04 as well).

Cheers,

    Sylvain


PS: Dave, sorry for the double; I forgot to reply-to-all...


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-31 13:20                       ` Sylvain Munaut
       [not found]                         ` <31226757.422.1359645742478.JavaMail.dspano@it1>
@ 2013-01-31 19:57                         ` Sage Weil
  2013-02-03 18:17                           ` Loic Dachary
  1 sibling, 1 reply; 30+ messages in thread
From: Sage Weil @ 2013-01-31 19:57 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Sébastien Han, ceph-devel

On Thu, 31 Jan 2013, Sylvain Munaut wrote:
> Hi,
> 
> I disabled scrubbing using
> 
> > ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
> > ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
> 
> and the leak seems to be gone.
> 
> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD memory
> for the 12 osd processes over the last 3.5 days.
> Memory was rising every 24h. I did the change yesterday around 13h00
> and OSDs stopped growing. OSD memory even seems to go down slowly by
> small blocks.
> 
> Of course I assume disabling scrubbing is not a long term solution and
> I should re-enable it ... (how do I do that btw ? what were the
> default values for those parameters)

It depends on the exact commit you're on.  You can see the defaults if you 
do

 ceph-osd --show-config | grep osd_scrub

Thanks for testing this... I have a few other ideas to try to reproduce.  

sage


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-01-31 19:57                         ` Sage Weil
@ 2013-02-03 18:17                           ` Loic Dachary
       [not found]                             ` <CAOLwVUkUFvLihb6KbxG9Et7R_-ZTZpLQJYTjXm9TEe40V_ZRHg@mail.gmail.com>
  0 siblings, 1 reply; 30+ messages in thread
From: Loic Dachary @ 2013-02-03 18:17 UTC (permalink / raw)
  To: Sébastien Han; +Cc: Sage Weil, Sylvain Munaut, ceph-devel


Hi,

As discussed at FOSDEM, the script you wrote to kill an OSD when it
grows too much could be amended to make it core dump instead of just
being killed and restarted. The binary plus the core could probably be
used to figure out where the leak is.

You should make sure the OSD's current working directory is on a file
system with enough free disk space to accommodate the dump, and set

ulimit -c unlimited

before running it (your system default is probably ulimit -c 0, which
inhibits core dumps). When you detect that an OSD has grown too much,
kill it with

kill -SEGV $pid

and upload the core found in the working directory, together with the
binary, to a public place. If the osd binary is compiled with -g but
without changing the -O settings, you get a larger binary file but no
negative impact on performance, and forensic analysis will be a lot
easier with the debugging symbols.
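(The kill step could be sketched as a small helper like the one below; the RSS threshold and the pgrep pattern in the commented example are assumptions, and $KILL is an indirection of mine so the dangerous part can be dry-run.)

```shell
#!/bin/sh
# If a process's RSS exceeds a limit (in kB), send it SIGSEGV so it
# dumps core (assumes "ulimit -c unlimited" was in effect when the
# process started). Returns 0 if the signal was sent, 1 otherwise.
KILL="${KILL:-kill}"

kill_if_too_big() {
    pid=$1
    limit_kb=$2
    rss=$(ps -o rss= -p "$pid" | tr -d ' ')
    if [ -n "$rss" ] && [ "$rss" -gt "$limit_kb" ]; then
        $KILL -SEGV "$pid"
        return 0
    fi
    return 1
}

# Hypothetical usage: core-dump osd.0 once it passes ~2 GB of RSS.
# kill_if_too_big "$(pgrep -f 'ceph-osd -i 0')" 2000000
```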

My 2cts


-- 
Loïc Dachary, Artisan Logiciel Libre




* Re: [0.48.3] OSD memory leak when scrubbing
       [not found]                             ` <CAOLwVUkUFvLihb6KbxG9Et7R_-ZTZpLQJYTjXm9TEe40V_ZRHg@mail.gmail.com>
@ 2013-02-03 21:03                               ` Sébastien Han
  2013-02-04 17:29                                 ` Sébastien Han
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-02-03 21:03 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Sage Weil, Sylvain Munaut, ceph-devel

Hi Loïc,

Thanks for bringing our discussion to the ML. I'll check that tomorrow :-).

Cheers
--
Regards,
Sébastien Han.




* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-03 21:03                               ` Sébastien Han
@ 2013-02-04 17:29                                 ` Sébastien Han
  2013-02-04 19:29                                   ` Sage Weil
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-02-04 17:29 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Sage Weil, Sylvain Munaut, ceph-devel

Hmm, I just tried several times on my test cluster and I can't get any
core dump. Does Ceph commit suicide or something? Is this expected
behavior?
--
Regards,
Sébastien Han.




* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-04 17:29                                 ` Sébastien Han
@ 2013-02-04 19:29                                   ` Sage Weil
  2013-02-04 21:03                                     ` Dan Mick
  0 siblings, 1 reply; 30+ messages in thread
From: Sage Weil @ 2013-02-04 19:29 UTC (permalink / raw)
  To: Sébastien Han; +Cc: Loic Dachary, Sylvain Munaut, ceph-devel

On Mon, 4 Feb 2013, Sébastien Han wrote:
> Hum just tried several times on my test cluster and I can't get any
> core dump. Does Ceph commit suicide or something? Is it expected
> behavior?

SIGSEGV should trigger the usual path that dumps a stack trace and then 
dumps core.  Was your ulimit -c set before the daemon was started?
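(One way to check that on a live daemon, a Linux-specific sketch: a running process's actual limit is visible in /proc/<pid>/limits, whereas ulimit -c in a new shell only shows that shell's limit. The pgrep pattern in the commented example is an assumption.)

```shell
#!/bin/sh
# Print the core-dump size limit of an already-running process by
# reading /proc/<pid>/limits (Linux-specific).
core_limit() {
    pid=$1
    grep -i 'core file' "/proc/$pid/limits"
}

# Hypothetical usage: inspect the limit the OSD daemon actually has.
# core_limit "$(pgrep -f 'ceph-osd -i 0')"
```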

sage



> --
> Regards,
> Sébastien Han.
> 
> 
> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> > Hi Loïc,
> >
> > Thanks for bringing our discussion on the ML. I'll check that tomorrow :-).
> >
> > Cheers
> > --
> > Regards,
> > Sébastien Han.
> >
> >
> > On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> >> Hi Loïc,
> >>
> >> Thanks for bringing our discussion on the ML. I'll check that tomorrow :-).
> >>
> >> Cheers
> >>
> >> --
> >> Regards,
> >> Sébastien Han.
> >>
> >>
> >> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> As discussed during FOSDEM, the script you wrote to kill the OSD when it
> >>> grows too much could be amended to core dump instead of just being killed &
> >>> restarted. The binary + core could probably be used to figure out where the
> >>> leak is.
> >>>
> >>> You should make sure the OSD current working directory is in a file system
> >>> with enough free disk space to accommodate the dump and set
> >>>
> >>> ulimit -c unlimited
> >>>
> >>> before running it ( your system default is probably ulimit -c 0 which
> >>> inhibits core dumps ). When you detect that OSD grows too much kill it with
> >>>
> >>> kill -SEGV $pid
> >>>
> >>> and upload the core found in the working directory, together with the
> >>> binary in a public place. If the osd binary is compiled with -g but without
> >>> changing the -O settings, you should have a larger binary file but no
> >>> negative impact on performance. Forensic analysis will be made a lot
> >>> easier with the debugging symbols.
> >>>
> >>> My 2cts
> >>>
> >>> On 01/31/2013 08:57 PM, Sage Weil wrote:
> >>> > On Thu, 31 Jan 2013, Sylvain Munaut wrote:
> >>> >> Hi,
> >>> >>
> >>> >> I disabled scrubbing using
> >>> >>
> >>> >>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
> >>> >>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
> >>> >>
> >>> >> and the leak seems to be gone.
> >>> >>
> >>> >> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD memory
> >>> >> for the 12 osd processes over the last 3.5 days.
> >>> >> Memory was rising every 24h. I did the change yesterday around 13h00
> >>> >> and OSDs stopped growing. OSD memory even seems to go down slowly by
> >>> >> small blocks.
> >>> >>
> >>> >> Of course I assume disabling scrubbing is not a long term solution and
> >>> >> I should re-enable it ... (how do I do that btw ? what were the
> >>> >> default values for those parameters)
> >>> >
> >>> > It depends on the exact commit you're on.  You can see the defaults if
> >>> > you
> >>> > do
> >>> >
> >>> >  ceph-osd --show-config | grep osd_scrub
> >>> >
> >>> > Thanks for testing this... I have a few other ideas to try to reproduce.
> >>> >
> >>> > sage
> >>>
> >>> --
> >>> Loïc Dachary, Artisan Logiciel Libre
> >>>
> >>
> 
> 


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-04 19:29                                   ` Sage Weil
@ 2013-02-04 21:03                                     ` Dan Mick
  2013-02-04 21:08                                       ` Sébastien Han
  0 siblings, 1 reply; 30+ messages in thread
From: Dan Mick @ 2013-02-04 21:03 UTC (permalink / raw)
  To: Sage Weil; +Cc: Sébastien Han, Loic Dachary, Sylvain Munaut, ceph-devel

...and/or do you have the corepath set interestingly, or one of the 
core-trapping mechanisms turned on?

On 02/04/2013 11:29 AM, Sage Weil wrote:
> On Mon, 4 Feb 2013, Sébastien Han wrote:
>> Hum just tried several times on my test cluster and I can't get any
>> core dump. Does Ceph commit suicide or something? Is it expected
>> behavior?
>
> SIGSEGV should trigger the usual path that dumps a stack trace and then
> dumps core.  Was your ulimit -c set before the daemon was started?
>
> sage
>
>
>
>> --
>> Regards,
>> Sébastien Han.
>>
>>
>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>>> Hi Loïc,
>>>
>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow :-).
>>>
>>> Cheers
>>> --
>>> Regards,
>>> Sébastien Han.
>>>
>>>
>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>>>> Hi Loïc,
>>>>
>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow :-).
>>>>
>>>> Cheers
>>>>
>>>> --
>>>> Regards,
>>>> Sébastien Han.
>>>>
>>>>
>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when it
>>>>> grows too much could be amended to core dump instead of just being killed &
>>>>> restarted. The binary + core could probably be used to figure out where the
>>>>> leak is.
>>>>>
>>>>> You should make sure the OSD current working directory is in a file system
>>>>> with enough free disk space to accommodate the dump and set
>>>>>
>>>>> ulimit -c unlimited
>>>>>
>>>>> before running it ( your system default is probably ulimit -c 0 which
>>>>> inhibits core dumps ). When you detect that OSD grows too much kill it with
>>>>>
>>>>> kill -SEGV $pid
>>>>>
>>>>> and upload the core found in the working directory, together with the
>>>>> binary in a public place. If the osd binary is compiled with -g but without
>>>>> changing the -O settings, you should have a larger binary file but no
>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>> easier with the debugging symbols.
>>>>>
>>>>> My 2cts
>>>>>
>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I disabled scrubbing using
>>>>>>>
>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>
>>>>>>> and the leak seems to be gone.
>>>>>>>
>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD memory
>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>> small blocks.
>>>>>>>
>>>>>>> Of course I assume disabling scrubbing is not a long term solution and
>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>> default values for those parameters)
>>>>>>
>>>>>> It depends on the exact commit you're on.  You can see the defaults if
>>>>>> you
>>>>>> do
>>>>>>
>>>>>>   ceph-osd --show-config | grep osd_scrub
>>>>>>
>>>>>> Thanks for testing this... I have a few other ideas to try to reproduce.
>>>>>>
>>>>>> sage
>>>>>
>>>>> --
>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>
>>>>
>>
>>
>


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-04 21:03                                     ` Dan Mick
@ 2013-02-04 21:08                                       ` Sébastien Han
  2013-02-04 21:22                                         ` Gregory Farnum
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-02-04 21:08 UTC (permalink / raw)
  To: Dan Mick; +Cc: Sage Weil, Loic Dachary, Sylvain Munaut, ceph-devel

ok I finally managed to get something on my test cluster,
unfortunately, the dump goes to /

any idea to change the destination path?

My production / won't be big enough...

--
Regards,
Sébastien Han.


On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick <dan.mick@inktank.com> wrote:
> ...and/or do you have the corepath set interestingly, or one of the
> core-trapping mechanisms turned on?
>
>
> On 02/04/2013 11:29 AM, Sage Weil wrote:
>>
>> On Mon, 4 Feb 2013, Sébastien Han wrote:
>>>
>>> Hum just tried several times on my test cluster and I can't get any
>>> core dump. Does Ceph commit suicide or something? Is it expected
>>> behavior?
>>
>>
>> SIGSEGV should trigger the usual path that dumps a stack trace and then
>> dumps core.  Was your ulimit -c set before the daemon was started?
>>
>> sage
>>
>>
>>
>>> --
>>> Regards,
>>> Sébastien Han.
>>>
>>>
>>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com>
>>> wrote:
>>>>
>>>> Hi Loïc,
>>>>
>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>> :-).
>>>>
>>>> Cheers
>>>> --
>>>> Regards,
>>>> Sébastien Han.
>>>>
>>>>
>>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi Loïc,
>>>>>
>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>> :-).
>>>>>
>>>>> Cheers
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Sébastien Han.
>>>>>
>>>>>
>>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when
>>>>>> it
>>>>>> grows too much could be amended to core dump instead of just being
>>>>>> killed &
>>>>>> restarted. The binary + core could probably be used to figure out
>>>>>> where the
>>>>>> leak is.
>>>>>>
>>>>>> You should make sure the OSD current working directory is in a file
>>>>>> system
>>>>>> with enough free disk space to accommodate the dump and set
>>>>>>
>>>>>> ulimit -c unlimited
>>>>>>
>>>>>> before running it ( your system default is probably ulimit -c 0 which
>>>>>> inhibits core dumps ). When you detect that OSD grows too much kill it
>>>>>> with
>>>>>>
>>>>>> kill -SEGV $pid
>>>>>>
>>>>>> and upload the core found in the working directory, together with the
>>>>>> binary in a public place. If the osd binary is compiled with -g but
>>>>>> without
>>>>>> changing the -O settings, you should have a larger binary file but no
>>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>>> easier with the debugging symbols.
>>>>>>
>>>>>> My 2cts
>>>>>>
>>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>>>
>>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I disabled scrubbing using
>>>>>>>>
>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>>
>>>>>>>>
>>>>>>>> and the leak seems to be gone.
>>>>>>>>
>>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD
>>>>>>>> memory
>>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>>> small blocks.
>>>>>>>>
>>>>>>>> Of course I assume disabling scrubbing is not a long term solution
>>>>>>>> and
>>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>>> default values for those parameters)
>>>>>>>
>>>>>>>
>>>>>>> It depends on the exact commit you're on.  You can see the defaults
>>>>>>> if
>>>>>>> you
>>>>>>> do
>>>>>>>
>>>>>>>   ceph-osd --show-config | grep osd_scrub
>>>>>>>
>>>>>>> Thanks for testing this... I have a few other ideas to try to
>>>>>>> reproduce.
>>>>>>>
>>>>>>> sage
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>
>>>>>
>>>
>>>
>>
>


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-04 21:08                                       ` Sébastien Han
@ 2013-02-04 21:22                                         ` Gregory Farnum
  2013-02-04 21:27                                           ` Sébastien Han
  0 siblings, 1 reply; 30+ messages in thread
From: Gregory Farnum @ 2013-02-04 21:22 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Dan Mick, Sage Weil, Loic Dachary, Sylvain Munaut, ceph-devel

Set your /proc/sys/kernel/core_pattern file. :) http://linux.die.net/man/5/core
-Greg
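
The pattern tokens are substituted by the kernel when it writes the dump; a small sketch of that expansion (the `/var/crash` path and the `expand_pattern` helper are illustrative, not real tools — see core(5) for the full token list):

```shell
# Sketch: emulate the substitution the kernel performs on core_pattern for
# a few common tokens (see core(5)). expand_pattern is a hypothetical
# helper for illustration -- the kernel does this itself when dumping.
expand_pattern() {
  local pat=$1 exe=$2 pid=$3 ts=$4
  pat=${pat//%e/$exe}    # %e -> executable name
  pat=${pat//%p/$pid}    # %p -> PID of the dumping process
  pat=${pat//%t/$ts}     # %t -> time of dump (seconds since the epoch)
  printf '%s\n' "$pat"
}

# Routing ceph-osd dumps off the root filesystem would then be (as root):
#   echo '/var/crash/core.%e.%p.%t' > /proc/sys/kernel/core_pattern
expand_pattern '/var/crash/core.%e.%p.%t' ceph-osd 4242 1360000000
# -> /var/crash/core.ceph-osd.4242.1360000000
```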

On Mon, Feb 4, 2013 at 1:08 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
> ok I finally managed to get something on my test cluster,
> unfortunately, the dump goes to /
>
> any idea to change the destination path?
>
> My production / won't be big enough...
>
> --
> Regards,
> Sébastien Han.
>
>
> On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick <dan.mick@inktank.com> wrote:
>> ...and/or do you have the corepath set interestingly, or one of the
>> core-trapping mechanisms turned on?
>>
>>
>> On 02/04/2013 11:29 AM, Sage Weil wrote:
>>>
>>> On Mon, 4 Feb 2013, Sébastien Han wrote:
>>>>
>>>> Hum just tried several times on my test cluster and I can't get any
>>>> core dump. Does Ceph commit suicide or something? Is it expected
>>>> behavior?
>>>
>>>
>>> SIGSEGV should trigger the usual path that dumps a stack trace and then
>>> dumps core.  Was your ulimit -c set before the daemon was started?
>>>
>>> sage
>>>
>>>
>>>
>>>> --
>>>> Regards,
>>>> Sébastien Han.
>>>>
>>>>
>>>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi Loïc,
>>>>>
>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>> :-).
>>>>>
>>>>> Cheers
>>>>> --
>>>>> Regards,
>>>>> Sébastien Han.
>>>>>
>>>>>
>>>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Loïc,
>>>>>>
>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>> :-).
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Sébastien Han.
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when
>>>>>>> it
>>>>>>> grows too much could be amended to core dump instead of just being
>>>>>>> killed &
>>>>>>> restarted. The binary + core could probably be used to figure out
>>>>>>> where the
>>>>>>> leak is.
>>>>>>>
>>>>>>> You should make sure the OSD current working directory is in a file
>>>>>>> system
>>>>>>> with enough free disk space to accommodate the dump and set
>>>>>>>
>>>>>>> ulimit -c unlimited
>>>>>>>
>>>>>>> before running it ( your system default is probably ulimit -c 0 which
>>>>>>> inhibits core dumps ). When you detect that OSD grows too much kill it
>>>>>>> with
>>>>>>>
>>>>>>> kill -SEGV $pid
>>>>>>>
>>>>>>> and upload the core found in the working directory, together with the
>>>>>>> binary in a public place. If the osd binary is compiled with -g but
>>>>>>> without
>>>>>>> changing the -O settings, you should have a larger binary file but no
>>>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>>>> easier with the debugging symbols.
>>>>>>>
>>>>>>> My 2cts
>>>>>>>
>>>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>>>>
>>>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I disabled scrubbing using
>>>>>>>>>
>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> and the leak seems to be gone.
>>>>>>>>>
>>>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD
>>>>>>>>> memory
>>>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>>>> small blocks.
>>>>>>>>>
>>>>>>>>> Of course I assume disabling scrubbing is not a long term solution
>>>>>>>>> and
>>>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>>>> default values for those parameters)
>>>>>>>>
>>>>>>>>
>>>>>>>> It depends on the exact commit you're on.  You can see the defaults
>>>>>>>> if
>>>>>>>> you
>>>>>>>> do
>>>>>>>>
>>>>>>>>   ceph-osd --show-config | grep osd_scrub
>>>>>>>>
>>>>>>>> Thanks for testing this... I have a few other ideas to try to
>>>>>>>> reproduce.
>>>>>>>>
>>>>>>>> sage
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-04 21:22                                         ` Gregory Farnum
@ 2013-02-04 21:27                                           ` Sébastien Han
  2013-02-16  7:09                                             ` Andrey Korolyov
  0 siblings, 1 reply; 30+ messages in thread
From: Sébastien Han @ 2013-02-04 21:27 UTC (permalink / raw)
  To: Gregory Farnum
  Cc: Dan Mick, Sage Weil, Loic Dachary, Sylvain Munaut, ceph-devel

oh nice, the pattern also matches path :D, didn't know that
thanks Greg
--
Regards,
Sébastien Han.


On Mon, Feb 4, 2013 at 10:22 PM, Gregory Farnum <greg@inktank.com> wrote:
> Set your /proc/sys/kernel/core_pattern file. :) http://linux.die.net/man/5/core
> -Greg
>
> On Mon, Feb 4, 2013 at 1:08 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>> ok I finally managed to get something on my test cluster,
>> unfortunately, the dump goes to /
>>
>> any idea to change the destination path?
>>
>> My production / won't be big enough...
>>
>> --
>> Regards,
>> Sébastien Han.
>>
>>
>> On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick <dan.mick@inktank.com> wrote:
>>> ...and/or do you have the corepath set interestingly, or one of the
>>> core-trapping mechanisms turned on?
>>>
>>>
>>> On 02/04/2013 11:29 AM, Sage Weil wrote:
>>>>
>>>> On Mon, 4 Feb 2013, Sébastien Han wrote:
>>>>>
>>>>> Hum just tried several times on my test cluster and I can't get any
>>>>> core dump. Does Ceph commit suicide or something? Is it expected
>>>>> behavior?
>>>>
>>>>
>>>> SIGSEGV should trigger the usual path that dumps a stack trace and then
>>>> dumps core.  Was your ulimit -c set before the daemon was started?
>>>>
>>>> sage
>>>>
>>>>
>>>>
>>>>> --
>>>>> Regards,
>>>>> Sébastien Han.
>>>>>
>>>>>
>>>>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hi Loïc,
>>>>>>
>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>> :-).
>>>>>>
>>>>>> Cheers
>>>>>> --
>>>>>> Regards,
>>>>>> Sébastien Han.
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Loïc,
>>>>>>>
>>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>>> :-).
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Sébastien Han.
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when
>>>>>>>> it
>>>>>>>> grows too much could be amended to core dump instead of just being
>>>>>>>> killed &
>>>>>>>> restarted. The binary + core could probably be used to figure out
>>>>>>>> where the
>>>>>>>> leak is.
>>>>>>>>
>>>>>>>> You should make sure the OSD current working directory is in a file
>>>>>>>> system
>>>>>>>> with enough free disk space to accommodate the dump and set
>>>>>>>>
>>>>>>>> ulimit -c unlimited
>>>>>>>>
>>>>>>>> before running it ( your system default is probably ulimit -c 0 which
>>>>>>>> inhibits core dumps ). When you detect that OSD grows too much kill it
>>>>>>>> with
>>>>>>>>
>>>>>>>> kill -SEGV $pid
>>>>>>>>
>>>>>>>> and upload the core found in the working directory, together with the
>>>>>>>> binary in a public place. If the osd binary is compiled with -g but
>>>>>>>> without
>>>>>>>> changing the -O settings, you should have a larger binary file but no
>>>>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>>>>> easier with the debugging symbols.
>>>>>>>>
>>>>>>>> My 2cts
>>>>>>>>
>>>>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>>>>>
>>>>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I disabled scrubbing using
>>>>>>>>>>
>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> and the leak seems to be gone.
>>>>>>>>>>
>>>>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD
>>>>>>>>>> memory
>>>>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>>>>> small blocks.
>>>>>>>>>>
>>>>>>>>>> Of course I assume disabling scrubbing is not a long term solution
>>>>>>>>>> and
>>>>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>>>>> default values for those parameters)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It depends on the exact commit you're on.  You can see the defaults
>>>>>>>>> if
>>>>>>>>> you
>>>>>>>>> do
>>>>>>>>>
>>>>>>>>>   ceph-osd --show-config | grep osd_scrub
>>>>>>>>>
>>>>>>>>> Thanks for testing this... I have a few other ideas to try to
>>>>>>>>> reproduce.
>>>>>>>>>
>>>>>>>>> sage
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>>


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-04 21:27                                           ` Sébastien Han
@ 2013-02-16  7:09                                             ` Andrey Korolyov
  2013-02-16  9:09                                               ` Wido den Hollander
  0 siblings, 1 reply; 30+ messages in thread
From: Andrey Korolyov @ 2013-02-16  7:09 UTC (permalink / raw)
  To: Sébastien Han
  Cc: Gregory Farnum, Dan Mick, Sage Weil, Loic Dachary,
	Sylvain Munaut, ceph-devel

Can anyone who hit this bug please confirm that your system contains libc 2.15+?

On Tue, Feb 5, 2013 at 1:27 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
> oh nice, the pattern also matches path :D, didn't know that
> thanks Greg
> --
> Regards,
> Sébastien Han.
>
>
> On Mon, Feb 4, 2013 at 10:22 PM, Gregory Farnum <greg@inktank.com> wrote:
>> Set your /proc/sys/kernel/core_pattern file. :) http://linux.die.net/man/5/core
>> -Greg
>>
>> On Mon, Feb 4, 2013 at 1:08 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>>> ok I finally managed to get something on my test cluster,
>>> unfortunately, the dump goes to /
>>>
>>> any idea to change the destination path?
>>>
>>> My production / won't be big enough...
>>>
>>> --
>>> Regards,
>>> Sébastien Han.
>>>
>>>
>>> On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick <dan.mick@inktank.com> wrote:
>>>> ...and/or do you have the corepath set interestingly, or one of the
>>>> core-trapping mechanisms turned on?
>>>>
>>>>
>>>> On 02/04/2013 11:29 AM, Sage Weil wrote:
>>>>>
>>>>> On Mon, 4 Feb 2013, Sébastien Han wrote:
>>>>>>
>>>>>> Hum just tried several times on my test cluster and I can't get any
>>>>>> core dump. Does Ceph commit suicide or something? Is it expected
>>>>>> behavior?
>>>>>
>>>>>
>>>>> SIGSEGV should trigger the usual path that dumps a stack trace and then
>>>>> dumps core.  Was your ulimit -c set before the daemon was started?
>>>>>
>>>>> sage
>>>>>
>>>>>
>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Sébastien Han.
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi Loïc,
>>>>>>>
>>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>>> :-).
>>>>>>>
>>>>>>> Cheers
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Sébastien Han.
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Loïc,
>>>>>>>>
>>>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>>>> :-).
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>> Sébastien Han.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when
>>>>>>>>> it
>>>>>>>>> grows too much could be amended to core dump instead of just being
>>>>>>>>> killed &
>>>>>>>>> restarted. The binary + core could probably be used to figure out
>>>>>>>>> where the
>>>>>>>>> leak is.
>>>>>>>>>
>>>>>>>>> You should make sure the OSD current working directory is in a file
>>>>>>>>> system
>>>>>>>>> with enough free disk space to accommodate the dump and set
>>>>>>>>>
>>>>>>>>> ulimit -c unlimited
>>>>>>>>>
>>>>>>>>> before running it ( your system default is probably ulimit -c 0 which
>>>>>>>>> inhibits core dumps ). When you detect that OSD grows too much kill it
>>>>>>>>> with
>>>>>>>>>
>>>>>>>>> kill -SEGV $pid
>>>>>>>>>
>>>>>>>>> and upload the core found in the working directory, together with the
>>>>>>>>> binary in a public place. If the osd binary is compiled with -g but
>>>>>>>>> without
>>>>>>>>> changing the -O settings, you should have a larger binary file but no
>>>>>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>>>>>> easier with the debugging symbols.
>>>>>>>>>
>>>>>>>>> My 2cts
>>>>>>>>>
>>>>>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>>>>>>
>>>>>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I disabled scrubbing using
>>>>>>>>>>>
>>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> and the leak seems to be gone.
>>>>>>>>>>>
>>>>>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD
>>>>>>>>>>> memory
>>>>>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>>>>>> small blocks.
>>>>>>>>>>>
>>>>>>>>>>> Of course I assume disabling scrubbing is not a long term solution
>>>>>>>>>>> and
>>>>>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>>>>>> default values for those parameters)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It depends on the exact commit you're on.  You can see the defaults
>>>>>>>>>> if
>>>>>>>>>> you
>>>>>>>>>> do
>>>>>>>>>>
>>>>>>>>>>   ceph-osd --show-config | grep osd_scrub
>>>>>>>>>>
>>>>>>>>>> Thanks for testing this... I have a few other ideas to try to
>>>>>>>>>> reproduce.
>>>>>>>>>>
>>>>>>>>>> sage
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>


* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-16  7:09                                             ` Andrey Korolyov
@ 2013-02-16  9:09                                               ` Wido den Hollander
  2013-02-17 17:21                                                 ` Sébastien Han
  2013-02-18 16:46                                                 ` 0.56 scrub OSD memleaks, WAS " Christopher Kunz
  0 siblings, 2 replies; 30+ messages in thread
From: Wido den Hollander @ 2013-02-16  9:09 UTC (permalink / raw)
  To: Andrey Korolyov
  Cc: Sébastien Han, Gregory Farnum, Dan Mick, Sage Weil,
	Loic Dachary, Sylvain Munaut, ceph-devel

On 02/16/2013 08:09 AM, Andrey Korolyov wrote:
> Can anyone who hit this bug please confirm that your system contains libc 2.15+?
>

I've seen this with 0.56.2 as well on Ubuntu 12.04. Ubuntu 12.04 comes 
with 2.15-0ubuntu10.3

Haven't gotten around to adding a heap profiler to it.

Wido
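
For what it's worth, a sketch of what hooking up the heap profiler could look like, assuming the tcmalloc-based profiler bundled with Ceph builds of this era; the exact command spelling varied between releases, so treat these lines as illustrative rather than authoritative:

```shell
# Illustrative only: assumes Ceph's tcmalloc heap profiler and the
# bobtail-era "ceph osd tell" syntax; adjust the OSD id as needed.
ceph osd tell 0 heap start_profiler   # begin sampling allocations
# ... let scrubbing run until the RSS growth shows up ...
ceph osd tell 0 heap dump             # write a profile next to the OSD logs
ceph osd tell 0 heap stats            # print tcmalloc's view of the heap
ceph osd tell 0 heap stop_profiler
# The dumped profiles can then be compared with google-pprof to see which
# call sites account for the growth.
```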

> On Tue, Feb 5, 2013 at 1:27 AM, Sébastien Han <han.sebastien@gmail.com> wrote:
>> oh nice, the pattern also matches path :D, didn't know that
>> thanks Greg
>> --
>> Regards,
>> Sébastien Han.
>>
>>
>> On Mon, Feb 4, 2013 at 10:22 PM, Gregory Farnum <greg@inktank.com> wrote:
>>> Set your /proc/sys/kernel/core_pattern file. :) http://linux.die.net/man/5/core
>>> -Greg
>>>
>>> On Mon, Feb 4, 2013 at 1:08 PM, Sébastien Han <han.sebastien@gmail.com> wrote:
>>>> ok I finally managed to get something on my test cluster,
>>>> unfortunately, the dump goes to /
>>>>
>>>> any idea to change the destination path?
>>>>
>>>> My production / won't be big enough...
>>>>
>>>> --
>>>> Regards,
>>>> Sébastien Han.
>>>>
>>>>
>>>> On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick <dan.mick@inktank.com> wrote:
>>>>> ...and/or do you have the corepath set interestingly, or one of the
>>>>> core-trapping mechanisms turned on?
>>>>>
>>>>>
>>>>> On 02/04/2013 11:29 AM, Sage Weil wrote:
>>>>>>
>>>>>> On Mon, 4 Feb 2013, Sébastien Han wrote:
>>>>>>>
>>>>>>> Hum just tried several times on my test cluster and I can't get any
>>>>>>> core dump. Does Ceph commit suicide or something? Is it expected
>>>>>>> behavior?
>>>>>>
>>>>>>
>>>>>> SIGSEGV should trigger the usual path that dumps a stack trace and then
>>>>>> dumps core.  Was your ulimit -c set before the daemon was started?
>>>>>>
>>>>>> sage
>>>>>>
>>>>>>
>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Sébastien Han.
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Loïc,
>>>>>>>>
>>>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>>>> :-).
>>>>>>>>
>>>>>>>> Cheer
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>> Sébastien Han.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han <han.sebastien@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Hi Loïc,
>>>>>>>>>
>>>>>>>>> Thanks for bringing our discussion on the ML. I'll check that tomorrow
>>>>>>>>> :-).
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Sébastien Han.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD when
>>>>>>>>>> it
>>>>>>>>>> grows too much could be amended to core dump instead of just being
>>>>>>>>>> killed &
>>>>>>>>>> restarted. The binary + core could probably be used to figure out
>>>>>>>>>> where the
>>>>>>>>>> leak is.
>>>>>>>>>>
>>>>>>>>>> You should make sure the OSD current working directory is in a file
>>>>>>>>>> system
>>>>>>>>>> with enough free disk space to accommodate the dump and set
>>>>>>>>>>
>>>>>>>>>> ulimit -c unlimited
>>>>>>>>>>
>>>>>>>>>> before running it ( your system default is probably ulimit -c 0 which
>>>>>>>>>> inhibits core dumps ). When you detect that OSD grows too much kill it
>>>>>>>>>> with
>>>>>>>>>>
>>>>>>>>>> kill -SEGV $pid
>>>>>>>>>>
>>>>>>>>>> and upload the core found in the working directory, together with the
>>>>>>>>>> binary in a public place. If the osd binary is compiled with -g but
>>>>>>>>>> without
>>>>>>>>>> changing the -O settings, you should have a larger binary file but no
>>>>>>>>>> negative impact on performance. Forensic analysis will be made a lot
>>>>>>>>>> easier with the debugging symbols.
>>>>>>>>>>
>>>>>>>>>> My 2cts
>>>>>>>>>>
>>>>>>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I disabled scrubbing using
>>>>>>>>>>>>
>>>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval 10000000'
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> and the leak seems to be gone.
>>>>>>>>>>>>
>>>>>>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD
>>>>>>>>>>>> memory
>>>>>>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>>>>>>> Memory was rising every 24h. I did the change yesterday around 13h00
>>>>>>>>>>>> and OSDs stopped growing. OSD memory even seems to go down slowly by
>>>>>>>>>>>> small blocks.
>>>>>>>>>>>>
>>>>>>>>>>>> Of course I assume disabling scrubbing is not a long term solution
>>>>>>>>>>>> and
>>>>>>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>>>>>>> default values for those parameters)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It depends on the exact commit you're on.  You can see the defaults
>>>>>>>>>>> if
>>>>>>>>>>> you
>>>>>>>>>>> do
>>>>>>>>>>>
>>>>>>>>>>>    ceph-osd --show-config | grep osd_scrub
>>>>>>>>>>>
>>>>>>>>>>> Thanks for testing this... I have a few other ideas to try to
>>>>>>>>>>> reproduce.
>>>>>>>>>>>
>>>>>>>>>>> sage
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>


-- 
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
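The watchdog approach quoted above (when an OSD grows past a limit, send SIGSEGV instead of restarting it, so a core is left behind for analysis) can be sketched roughly as follows. This is an illustration, not anything posted in the thread: the 4 GB threshold and 60 s poll interval are arbitrary choices, and it assumes a Linux /proc filesystem:

```shell
#!/bin/sh
# Rough watchdog sketch: instead of restarting an oversized ceph-osd,
# SIGSEGV it so it dumps core for later forensics.
# Assumptions: Linux /proc; 4 GB threshold and 60 s interval are arbitrary.

limit_kb=$((4 * 1024 * 1024))            # 4 GB expressed in kB

rss_kb() {
    # Resident set size of a pid, in kB, from /proc/<pid>/status
    awk '/^VmRSS:/ {print $2}' "/proc/$1/status"
}

watch_osd() {
    pid="$1"
    while kill -0 "$pid" 2>/dev/null; do
        rss="$(rss_kb "$pid")"
        if [ "${rss:-0}" -gt "$limit_kb" ]; then
            echo "osd pid $pid: RSS ${rss} kB > ${limit_kb} kB, forcing core dump"
            kill -SEGV "$pid"            # needs 'ulimit -c unlimited' in the OSD's environment
            return 0
        fi
        sleep 60
    done
}
```

Invoke it as e.g. `watch_osd <osd-pid>`; as noted in the quoted advice, the OSD must have been started with `ulimit -c unlimited` (and enough free space under its working directory) for the core to actually appear.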

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-16  9:09                                               ` Wido den Hollander
@ 2013-02-17 17:21                                                 ` Sébastien Han
  2013-02-18 16:46                                                 ` 0.56 scrub OSD memleaks, WAS " Christopher Kunz
  1 sibling, 0 replies; 30+ messages in thread
From: Sébastien Han @ 2013-02-17 17:21 UTC (permalink / raw)
  To: Wido den Hollander
  Cc: Andrey Korolyov, Gregory Farnum, Dan Mick, Sage Weil,
	Loic Dachary, Sylvain Munaut, ceph-devel

+1
--
Regards,
Sébastien Han.


On Sat, Feb 16, 2013 at 10:09 AM, Wido den Hollander <wido@42on.com> wrote:
> On 02/16/2013 08:09 AM, Andrey Korolyov wrote:
>>
>> Can anyone who hit this bug please confirm that your system contains libc
>> 2.15+?
>>
>
> I've seen this with 0.56.2 as well on Ubuntu 12.04. Ubuntu 12.04 comes with
> 2.15-0ubuntu10.3
>
> Haven't gotten around to adding a heap profiler to it.
>
> Wido
>
>
>> On Tue, Feb 5, 2013 at 1:27 AM, Sébastien Han <han.sebastien@gmail.com>
>> wrote:
>>>
>>> oh nice, the pattern also matches path :D, didn't know that
>>> thanks Greg
>>> --
>>> Regards,
>>> Sébastien Han.
>>>
>>>
>>> On Mon, Feb 4, 2013 at 10:22 PM, Gregory Farnum <greg@inktank.com> wrote:
>>>>
>>>> Set your /proc/sys/kernel/core_pattern file. :)
>>>> http://linux.die.net/man/5/core
>>>> -Greg
>>>>
>>>> On Mon, Feb 4, 2013 at 1:08 PM, Sébastien Han <han.sebastien@gmail.com>
>>>> wrote:
>>>>>
>>>>> ok I finally managed to get something on my test cluster,
>>>>> unfortunately, the dump goes to /
>>>>>
>>>>> any idea to change the destination path?
>>>>>
>>>>> My production / won't be big enough...
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Sébastien Han.
>>>>>
>>>>>
>>>>> On Mon, Feb 4, 2013 at 10:03 PM, Dan Mick <dan.mick@inktank.com> wrote:
>>>>>>
>>>>>> ...and/or do you have the corepath set interestingly, or one of the
>>>>>> core-trapping mechanisms turned on?
>>>>>>
>>>>>>
>>>>>> On 02/04/2013 11:29 AM, Sage Weil wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 4 Feb 2013, Sébastien Han wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hum just tried several times on my test cluster and I can't get any
>>>>>>>> core dump. Does Ceph commit suicide or something? Is it expected
>>>>>>>> behavior?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> SIGSEGV should trigger the usual path that dumps a stack trace and
>>>>>>> then
>>>>>>> dumps core.  Was your ulimit -c set before the daemon was started?
>>>>>>>
>>>>>>> sage
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>> Sébastien Han.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 3, 2013 at 10:03 PM, Sébastien Han
>>>>>>>> <han.sebastien@gmail.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Loïc,
>>>>>>>>>
>>>>>>>>> Thanks for bringing our discussion on the ML. I'll check that
>>>>>>>>> tomorrow
>>>>>>>>> :-).
>>>>>>>>>
>>>>>>>>> Cheer
>>>>>>>>> --
>>>>>>>>> Regards,
>>>>>>>>> Sébastien Han.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 3, 2013 at 10:01 PM, Sébastien Han
>>>>>>>>> <han.sebastien@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Loïc,
>>>>>>>>>>
>>>>>>>>>> Thanks for bringing our discussion on the ML. I'll check that
>>>>>>>>>> tomorrow
>>>>>>>>>> :-).
>>>>>>>>>>
>>>>>>>>>> Cheers
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Regards,
>>>>>>>>>> Sébastien Han.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 3, 2013 at 7:17 PM, Loic Dachary <loic@dachary.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> As discussed during FOSDEM, the script you wrote to kill the OSD
>>>>>>>>>>> when
>>>>>>>>>>> it
>>>>>>>>>>> grows too much could be amended to core dump instead of just
>>>>>>>>>>> being
>>>>>>>>>>> killed &
>>>>>>>>>>> restarted. The binary + core could probably be used to figure out
>>>>>>>>>>> where the
>>>>>>>>>>> leak is.
>>>>>>>>>>>
>>>>>>>>>>> You should make sure the OSD current working directory is in a
>>>>>>>>>>> file
>>>>>>>>>>> system
>>>>>>>>>>> with enough free disk space to accommodate the dump and set
>>>>>>>>>>>
>>>>>>>>>>> ulimit -c unlimited
>>>>>>>>>>>
>>>>>>>>>>> before running it ( your system default is probably ulimit -c 0
>>>>>>>>>>> which
>>>>>>>>>>> inhibits core dumps ). When you detect that OSD grows too much
>>>>>>>>>>> kill it
>>>>>>>>>>> with
>>>>>>>>>>>
>>>>>>>>>>> kill -SEGV $pid
>>>>>>>>>>>
>>>>>>>>>>> and upload the core found in the working directory, together with
>>>>>>>>>>> the
>>>>>>>>>>> binary in a public place. If the osd binary is compiled with -g
>>>>>>>>>>> but
>>>>>>>>>>> without
>>>>>>>>>>> changing the -O settings, you should have a larger binary file
>>>>>>>>>>> but no
>>>>>>>>>>> negative impact on performance. Forensic analysis will be made
>>>>>>>>>>> a lot
>>>>>>>>>>> easier with the debugging symbols.
>>>>>>>>>>>
>>>>>>>>>>> My 2cts
>>>>>>>>>>>
>>>>>>>>>>> On 01/31/2013 08:57 PM, Sage Weil wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, 31 Jan 2013, Sylvain Munaut wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I disabled scrubbing using
>>>>>>>>>>>>>
>>>>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-min-interval 1000000'
>>>>>>>>>>>>>> ceph osd tell \* injectargs '--osd-scrub-max-interval
>>>>>>>>>>>>>> 10000000'
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> and the leak seems to be gone.
>>>>>>>>>>>>>
>>>>>>>>>>>>> See the graph at  http://i.imgur.com/A0KmVot.png  with the OSD
>>>>>>>>>>>>> memory
>>>>>>>>>>>>> for the 12 osd processes over the last 3.5 days.
>>>>>>>>>>>>> Memory was rising every 24h. I did the change yesterday around
>>>>>>>>>>>>> 13h00
>>>>>>>>>>>>> and OSDs stopped growing. OSD memory even seems to go down
>>>>>>>>>>>>> slowly by
>>>>>>>>>>>>> small blocks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Of course I assume disabling scrubbing is not a long term
>>>>>>>>>>>>> solution
>>>>>>>>>>>>> and
>>>>>>>>>>>>> I should re-enable it ... (how do I do that btw ? what were the
>>>>>>>>>>>>> default values for those parameters)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> It depends on the exact commit you're on.  You can see the
>>>>>>>>>>>> defaults
>>>>>>>>>>>> if
>>>>>>>>>>>> you
>>>>>>>>>>>> do
>>>>>>>>>>>>
>>>>>>>>>>>>    ceph-osd --show-config | grep osd_scrub
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for testing this... I have a few other ideas to try to
>>>>>>>>>>>> reproduce.
>>>>>>>>>>>>
>>>>>>>>>>>> sage
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>>
>
>
> --
> Wido den Hollander
> 42on B.V.
>
> Phone: +31 (0)20 700 9902
> Skype: contact42on

^ permalink raw reply	[flat|nested] 30+ messages in thread

* 0.56 scrub OSD memleaks, WAS Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-16  9:09                                               ` Wido den Hollander
  2013-02-17 17:21                                                 ` Sébastien Han
@ 2013-02-18 16:46                                                 ` Christopher Kunz
  2013-02-19 19:23                                                   ` Samuel Just
  1 sibling, 1 reply; 30+ messages in thread
From: Christopher Kunz @ 2013-02-18 16:46 UTC (permalink / raw)
  To: ceph-devel

Am 16.02.13 10:09, schrieb Wido den Hollander:
> On 02/16/2013 08:09 AM, Andrey Korolyov wrote:
>> Can anyone who hit this bug please confirm that your system contains
>> libc 2.15+?
>>
> 
Hello,

when we started a deep scrub on our 0.56.2 cluster today, we saw a
massive memleak about 1 hour into the scrub. One OSD claimed over
53GByte within 10 minutes. We had to restart the OSD to keep the cluster
stable.

Another OSD is currently claiming about 27GByte and will be restarted
soon. All circumstantial evidence points to the deep scrub as the source
of the leak.

One affected node is running libc 2.15 (Ubuntu 12.04 LTS), the other one
is using libc 2.11.3 (Debian Squeeze). So it seems this is not a
libc-dependent issue.

We have disabled scrub completely.

Regards,

--ck

PS: Do we have any idea when this will be fixed?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 0.56 scrub OSD memleaks, WAS Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-18 16:46                                                 ` 0.56 scrub OSD memleaks, WAS " Christopher Kunz
@ 2013-02-19 19:23                                                   ` Samuel Just
  2013-02-19 19:50                                                     ` Christopher Kunz
  0 siblings, 1 reply; 30+ messages in thread
From: Samuel Just @ 2013-02-19 19:23 UTC (permalink / raw)
  To: Christopher Kunz; +Cc: ceph-devel

Can you confirm that the memory size reported is res?
-Sam

On Mon, Feb 18, 2013 at 8:46 AM, Christopher Kunz <chrislist@de-punkt.de> wrote:
> Am 16.02.13 10:09, schrieb Wido den Hollander:
>> On 02/16/2013 08:09 AM, Andrey Korolyov wrote:
>>> Can anyone who hit this bug please confirm that your system contains
>>> libc 2.15+?
>>>
>>
> Hello,
>
> when we started a deep scrub on our 0.56.2 cluster today, we saw a
> massive memleak about 1 hour into the scrub. One OSD claimed over
> 53GByte within 10 minutes. We had to restart the OSD to keep the cluster
> stable.
>
> Another OSD is currently claiming about 27GByte and will be restarted
> soon. All circumstantial evidence points to the deep scrub as the source
> of the leak.
>
> One affected node is running libc 2.15 (Ubuntu 12.04 LTS), the other one
> is using libc 2.11.3 (Debian Squeeze). So it seems this is not a
> libc-dependent issue.
>
> We have disabled scrub completely.
>
> Regards,
>
> --ck
>
> PS: Do we have any idea when this will be fixed?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: 0.56 scrub OSD memleaks, WAS Re: [0.48.3] OSD memory leak when scrubbing
  2013-02-19 19:23                                                   ` Samuel Just
@ 2013-02-19 19:50                                                     ` Christopher Kunz
  0 siblings, 0 replies; 30+ messages in thread
From: Christopher Kunz @ 2013-02-19 19:50 UTC (permalink / raw)
  To: Samuel Just; +Cc: ceph-devel

Am 19.02.13 20:23, schrieb Samuel Just:
> Can you confirm that the memory size reported is res?
> -Sam

I think it was virtual, seeing it was the SIZE parameter in ps.
However, we ran into massive slow request issues as soon as the memory
started ballooning.

--ck
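For anyone comparing these numbers later: ps's SIZE/VSZ columns report virtual address space (VmSize in /proc/<pid>/status), which for a tcmalloc process can be far larger than what is actually resident (VmRSS), and it is resident memory that drives OOM pressure. A small sketch of pulling both values apart, using a fabricated status excerpt purely for illustration:

```shell
# Fabricated /proc/<pid>/status excerpt, for illustration only.
status_sample='VmSize: 54000000 kB
VmRSS:   2100000 kB'

# On a live system, read /proc/<pid>/status instead of the sample.
vsz_kb=$(printf '%s\n' "$status_sample" | awk '/^VmSize:/ {print $2}')
rss_kb=$(printf '%s\n' "$status_sample" | awk '/^VmRSS:/ {print $2}')
echo "virtual: ${vsz_kb} kB, resident: ${rss_kb} kB"
```

So a ballooning SIZE column does not by itself prove leaked resident memory; checking VmRSS (or top's RES) distinguishes the two.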


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2013-02-19 19:50 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-22 20:01 [0.48.3] OSD memory leak when scrubbing Sylvain Munaut
2013-01-22 21:19 ` Sébastien Han
2013-01-22 21:32   ` Sylvain Munaut
2013-01-22 21:38     ` Sébastien Han
2013-01-25 16:29       ` Sébastien Han
2013-01-25 20:16         ` Sylvain Munaut
2013-01-27 16:17           ` Sylvain Munaut
2013-01-27 17:47             ` Sage Weil
2013-01-27 18:17               ` Sylvain Munaut
2013-01-30  9:12               ` Sylvain Munaut
2013-01-30  9:18                 ` Sage Weil
2013-01-30 13:26                   ` Sylvain Munaut
2013-01-30 19:40                     ` Sage Weil
2013-01-31 13:20                       ` Sylvain Munaut
     [not found]                         ` <31226757.422.1359645742478.JavaMail.dspano@it1>
2013-01-31 15:26                           ` Sylvain Munaut
2013-01-31 19:57                         ` Sage Weil
2013-02-03 18:17                           ` Loic Dachary
     [not found]                             ` <CAOLwVUkUFvLihb6KbxG9Et7R_-ZTZpLQJYTjXm9TEe40V_ZRHg@mail.gmail.com>
2013-02-03 21:03                               ` Sébastien Han
2013-02-04 17:29                                 ` Sébastien Han
2013-02-04 19:29                                   ` Sage Weil
2013-02-04 21:03                                     ` Dan Mick
2013-02-04 21:08                                       ` Sébastien Han
2013-02-04 21:22                                         ` Gregory Farnum
2013-02-04 21:27                                           ` Sébastien Han
2013-02-16  7:09                                             ` Andrey Korolyov
2013-02-16  9:09                                               ` Wido den Hollander
2013-02-17 17:21                                                 ` Sébastien Han
2013-02-18 16:46                                                 ` 0.56 scrub OSD memleaks, WAS " Christopher Kunz
2013-02-19 19:23                                                   ` Samuel Just
2013-02-19 19:50                                                     ` Christopher Kunz
