[Lustre-devel] Hangs with cgroup memory controller

* [Lustre-devel] Hangs with cgroup memory controller
@ 2011-07-27 16:21 Mark Hills
  2011-07-27 17:11 ` Andreas Dilger
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Hills @ 2011-07-27 16:21 UTC (permalink / raw)
  To: lustre-devel

We are unable to use the combination of Lustre and the cgroup memory 
controller, because of intermittent hangs when trying to close the cgroup.

In a thread on LKML [1] we diagnosed that the problem was a leak of page 
accounting or resources.

Memory pages are charged to the cgroup, but the cgroup is unable to 
un-charge them, and so it spins. It suggests that, perhaps, at least one 
page gets allocated but not placed in the LRU.

Using the NFS client, via a gateway, has never shown this problem.

I'm in the client code, but I really need some pointers. And disadvantaged 
by being unable to find a reproducable test case. Any ideas?

Our system is Lustre 1.8.6 server, with clients on Linux 2.6.32 and Lustre 
1.8.5.

Thanks

[1] https://lkml.org/lkml/2010/9/9/534

-- 
Mark

^ permalink raw reply	[flat|nested] 11+ messages in thread