[PATCH 0/7] memcg background reclaim , yet another one.

* [PATCH 0/7] memcg background reclaim , yet another one.
@ 2011-04-25  9:25 KAMEZAWA Hiroyuki
  2011-04-25  9:28 ` [PATCH 1/7] memcg: add high/low watermark to res_counter KAMEZAWA Hiroyuki
                   ` (10 more replies)
  0 siblings, 11 replies; 68+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-04-25  9:25 UTC (permalink / raw)
  To: Ying Han
  Cc: linux-mm, kosaki.motohiro, balbir, nishimura, akpm,
	Johannes Weiner, minchan.kim, Michal Hocko

This patch is based on Ying Han's one....at its origin, but I changed too much ;)
Then, start this as new thread.

(*) This work is not related to the topic "rewriting global LRU using memcg"
    discussion, at all. This kind of hi/low watermark has been planned since
    memcg was born. 

At first, per-memcg background reclaim is used for
  - helping memory reclaim and avoid direct reclaim.
  - set a not-hard limit of memory usage.

For example, assume a memcg has its hard-limit as 500M bytes.
Then, set high-watermark as 400M. Here, memory usage can exceed 400M up to 500M
but memory usage will be reduced automatically to 400M as time goes by.

This is useful when a user want to limit memory usage to 400M but don't want to
see big performance regression by hitting limit when memory usage spike happens.

1) == hard limit = 400M ==
[root@rhel6-test hilow]# time cp ./tmpfile xxx                
real    0m7.353s
user    0m0.009s
sys     0m3.280s

2) == hard limit 500M/ hi_watermark = 400M ==
[root@rhel6-test hilow]# time cp ./tmpfile xxx

real    0m6.421s
user    0m0.059s
sys     0m2.707s

Above is a brief result on VM and needs more study. But my impression is positive.
I'd like to use bigger real machine in the next time.

Here is a short list of updates from Ying Han's one.

 1. use workqueue and visit memcg in round robin.
 2. only allow setting hi watermark. low-watermark is automatically determined.
    This is good for avoiding bad cpu usage by background reclaim.
 3. totally rewrite algorithm of shrink_mem_cgroup for round-robin.
 4. fixed get_scan_count() , this was problematic.
 5. added some statistics, which I think necessary.
 6. added documenation

Then, the algorithm is not a cut-n-paste from kswapd. I thought kswapd should be
updated...and 'priority' in vmscan.c seems to be an enemy of memcg ;)

Thanks
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 68+ messages in thread