[RFC][PATCH 0/4] memcg: per cgroup background reclaim

* [RFC][PATCH 0/4] memcg: per cgroup background reclaim
@ 2010-11-30  6:49 Ying Han
  2010-11-30  6:49 ` [PATCH 1/4] Add kswapd descriptor Ying Han
                   ` (5 more replies)
  0 siblings, 6 replies; 52+ messages in thread
From: Ying Han @ 2010-11-30  6:49 UTC (permalink / raw)
  To: Balbir Singh, Daisuke Nishimura, KAMEZAWA Hiroyuki,
	Andrew Morton, Mel Gorman, Johannes Weiner, Christoph Lameter,
	Wu Fengguang, Andi Kleen, Hugh Dickins, Rik van Riel,
	KOSAKI Motohiro, Tejun Heo
  Cc: linux-mm

The current implementation of memcg only supports direct reclaim and this
patchset adds the support for background reclaim. Per cgroup background
reclaim is needed which spreads out the memory pressure over longer period
of time and smoothes out the system performance.

The current implementation is not a stable version, and it crashes sometimes
on my NUMA machine. Before going further for debugging, I would like to start
the discussion and hear the feedbacks of the initial design.

Current status:
I run through some simple tests which reads/writes a large file and makes sure
it triggers per cgroup kswapd on the low_wmark. Also, I compared at
pg_steal/pg_scan ratio w/o background reclaim.

Step1: Create a cgroup with 500M memory_limit and set the min_free_kbytes to 1024.
$ mount -t cgroup -o cpuset,memory cpuset /dev/cgroup
$ mkdir /dev/cgroup/A
$ echo 0 >/dev/cgroup/A/cpuset.cpus
$ echo 0 >/dev/cgroup/A/cpuset.mems
$ echo 500m >/dev/cgroup/A/memory.limit_in_bytes
$ echo 1024 >/dev/cgroup/A/memory.min_free_kbytes
$ echo $$ >/dev/cgroup/A/tasks

Step2: Check the wmarks.
$ cat /dev/cgroup/A/memory.reclaim_wmarks
memcg_low_wmark 98304000
memcg_high_wmark 81920000

Step3: Dirty the pages by creating a 20g file on hard drive.
$ ddtest -D /export/hdc3/dd -b 1024 -n 20971520 -t 1

Checked the memory.stat w/o background reclaim. It used to be all the pages are
reclaimed from direct reclaim, and now about half of them are reclaimed at
background. (note: writing '0' to min_free_kbytes disables per cgroup kswapd)

Only direct reclaim                                                With background reclaim:
kswapd_steal 0                                                     kswapd_steal 2751822
pg_pgsteal 5100401                                               pg_pgsteal 2476676
kswapd_pgscan 0                                                  kswapd_pgscan 6019373
pg_scan 5542464                                                   pg_scan 3851281
pgrefill 304505                                                       pgrefill 348077
pgoutrun 0                                                             pgoutrun 44568
allocstall 159278                                                    allocstall 75669

Step4: Cleanup
$ echo $$ >/dev/cgroup/tasks
$ echo 0 > /dev/cgroup/A/memory.force_empty

Step5: Read the 20g file into the pagecache.
$ cat /export/hdc3/dd/tf0 > /dev/zero;

Checked the memory.stat w/o background reclaim. All the clean pages are reclaimed at
background instead of direct reclaim.

Only direct reclaim                                                With background reclaim
kswapd_steal 0                                                      kswapd_steal 3512424
pg_pgsteal 3461280                                               pg_pgsteal 0
kswapd_pgscan 0                                                  kswapd_pgscan 3512440
pg_scan 3461280                                                   pg_scan 0
pgrefill 0                                                                pgrefill 0
pgoutrun 0                                                             pgoutrun 74973
allocstall 108165                                                    allocstall 0

Ying Han (4):
  Add kswapd descriptor.
  Add per cgroup reclaim watermarks.
  Per cgroup background reclaim.
  Add more per memcg stats.

 include/linux/memcontrol.h  |  112 +++++++++++
 include/linux/mmzone.h      |    3 +-
 include/linux/res_counter.h |   88 +++++++++-
 include/linux/swap.h        |   10 +
 kernel/res_counter.c        |   26 ++-
 mm/memcontrol.c             |  447 ++++++++++++++++++++++++++++++++++++++++++-
 mm/mmzone.c                 |    2 +-
 mm/page_alloc.c             |   11 +-
 mm/vmscan.c                 |  346 ++++++++++++++++++++++++++++++----
 9 files changed, 994 insertions(+), 51 deletions(-)

-- 
1.7.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 52+ messages in thread