linux-kernel.vger.kernel.org archive mirror
* [RFC][PATCH] memcg soft limit (yet another new design) v1
@ 2009-03-27  4:59 KAMEZAWA Hiroyuki
  2009-03-27  5:01 ` [RFC][PATCH 1/8] soft limit support in res_counter KAMEZAWA Hiroyuki
                   ` (11 more replies)
  0 siblings, 12 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  4:59 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-mm, balbir, kosaki.motohiro, nishimura

Hi,

Memory cgroup's soft limit is a feature that tells the global LRU
"please reclaim from this memcg first at memory shortage".

Both Balbir's implementation and mine have been proposed before.
This is a new one (so it restarts from v1) and is still very young.

While testing soft limit, my dilemma was the following.

 - It adds extra scan cost if the implementation is naive (unavoidable?)

 - The Inactive/Active rotation scheme of the global LRU will be broken.

 - The File/Anon reclaim ratio scheme of the global LRU will be broken.
    - vm.swappiness will be ignored.

 - If we reuse memcg's memory reclaim routine,
    - shrink_slab() will never be called.
    - stale SwapCache has no chance to be reclaimed (stale SwapCache means
      pages read in but never used.)
    - a memcg can have no memory in a given zone.
    - a memcg can have no Anon memory.
    - lumpy reclaim is not used.


This patch set avoids reusing memcg's existing reclaim routine and instead
just gives "hints" to the global LRU. It is briefly tested and shows
good results for me. (But maybe not for you; if so, please blame me.)

Major characteristics are:
 - a memcg is inserted into the softlimit-queue at charge() if its usage
   exceeds the soft limit.
 - the softlimit-queue is a priority queue; priority is determined by the
   amount by which usage exceeds the soft limit.
 - memcg's soft limit hooks are called by shrink_xxx_list() to give hints.
 - behavior is affected by vm.swappiness, and the LRU scan rate is
   determined by the global LRU's status.
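
To make the charge-path behavior concrete, here is a rough user-space
sketch (all names are mine, not the patch's; the real code uses
res_counter charging and a spinlock-protected queue):

```c
#include <assert.h>
#include <stddef.h>

#define NPRIO 11

/* Hypothetical stand-in for the real mem_cgroup structure. */
struct memcg_sketch {
	unsigned long usage;
	unsigned long soft_limit;
	struct memcg_sketch *next;	/* singly linked, for the sketch only */
};

static struct memcg_sketch *queue[NPRIO];

/* Bucket index grows with log2(excess / unit), capped at NPRIO - 1. */
static int prio_of(const struct memcg_sketch *m, unsigned long unit)
{
	unsigned long ratio;
	int prio = 0;

	if (m->usage <= m->soft_limit)
		return 0;
	for (ratio = (m->usage - m->soft_limit) / unit; ratio; ratio >>= 1)
		prio++;
	return prio < NPRIO ? prio : NPRIO - 1;
}

/* charge-path hook: enqueue only when the soft limit is exceeded */
static void charge_hook(struct memcg_sketch *m, unsigned long unit)
{
	int prio;

	if (m->usage <= m->soft_limit)
		return;
	prio = prio_of(m, unit);
	m->next = queue[prio];
	queue[prio] = m;
}
```

So a memcg under its soft limit costs nothing extra on the reclaim side,
and one far over its soft limit lands in a high-priority bucket.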

I'm sorry if the explanation is not detailed enough; please ask me.
There will be many discussion points, anyway. As usual, I'm not in a hurry.


==brief test result==
On a 2CPU/1.6GB machine, create groups A and B:
  A.  soft limit=300M
  B.  no soft limit

  Run a malloc() program on B and allocate 1GB of memory. The program just
  sleeps after allocating the memory and never references it again.
  Run make -j 6 and compile the kernel.

  When vm.swappiness = 60  => 60MB of memory is swapped out from B.
  When vm.swappiness = 10  => 1MB of memory is swapped out from B.

  With no soft limit, 350MB is swapped out from B. (swappiness=60)

I'll try more complex tests over the weekend.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RFC][PATCH 1/8] soft limit support in res_counter
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
@ 2009-03-27  5:01 ` KAMEZAWA Hiroyuki
  2009-03-27  5:03 ` [RFC][PATCH 2/8] soft limit framework in memcg KAMEZAWA Hiroyuki
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

This is based on Balbir's patch.
-Kame
==
From: Balbir Singh <balbir@linux.vnet.ibm.com>
Changelog v2...v1
1. Add support for res_counter_check_soft_limit_locked. This is used
   by the hierarchy code.

Add an interface to allow get/set of soft limits. Soft limits for the
memory-plus-swap controller (memsw) are currently not supported. Resource
counters have been enhanced to support soft limits, and a new type
RES_SOFT_LIMIT has been added. Unlike hard limits, soft limits can be
directly set and do not need any reclaim or checks before being set to a
new value.

Kamezawa-San raised a question as to whether soft limit should belong
to res_counter. Since all resources understand the basic concepts of
hard and soft limits, it is justified to add soft limits here. Soft limits
are a generic resource usage feature, even file system quotas support
soft limits.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
---
Index: mmotm-2.6.29-Mar23/include/linux/res_counter.h
===================================================================
--- mmotm-2.6.29-Mar23.orig/include/linux/res_counter.h
+++ mmotm-2.6.29-Mar23/include/linux/res_counter.h
@@ -35,6 +35,10 @@ struct res_counter {
 	 */
 	unsigned long long limit;
 	/*
+	 * the limit that usage is allowed to exceed
+	 */
+	unsigned long long soft_limit;
+	/*
 	 * the number of unsuccessful attempts to consume the resource
 	 */
 	unsigned long long failcnt;
@@ -85,6 +89,7 @@ enum {
 	RES_MAX_USAGE,
 	RES_LIMIT,
 	RES_FAILCNT,
+	RES_SOFT_LIMIT,
 };
 
 /*
@@ -130,6 +135,36 @@ static inline bool res_counter_limit_che
 	return false;
 }
 
+static inline bool res_counter_soft_limit_check_locked(struct res_counter *cnt)
+{
+	if (cnt->usage < cnt->soft_limit)
+		return true;
+
+	return false;
+}
+
+/**
+ * res_counter_soft_limit_excess - get the difference between usage and soft limit
+ * @cnt: The counter
+ *
+ * Returns 0 if usage is less than or equal to the soft limit;
+ * otherwise, the difference between usage and the soft limit.
+ */
+static inline unsigned long long
+res_counter_soft_limit_excess(struct res_counter *cnt)
+{
+	unsigned long long excess;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	if (cnt->usage <= cnt->soft_limit)
+		excess = 0;
+	else
+		excess = cnt->usage - cnt->soft_limit;
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return excess;
+}
+
 /*
  * Helper function to detect if the cgroup is within it's limit or
  * not. It's currently called from cgroup_rss_prepare()
@@ -145,6 +180,17 @@ static inline bool res_counter_check_und
 	return ret;
 }
 
+static inline bool res_counter_check_under_soft_limit(struct res_counter *cnt)
+{
+	bool ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	ret = res_counter_soft_limit_check_locked(cnt);
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return ret;
+}
+
 static inline void res_counter_reset_max(struct res_counter *cnt)
 {
 	unsigned long flags;
@@ -178,4 +224,16 @@ static inline int res_counter_set_limit(
 	return ret;
 }
 
+static inline int
+res_counter_set_soft_limit(struct res_counter *cnt,
+				unsigned long long soft_limit)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	cnt->soft_limit = soft_limit;
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return 0;
+}
+
 #endif
Index: mmotm-2.6.29-Mar23/kernel/res_counter.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/kernel/res_counter.c
+++ mmotm-2.6.29-Mar23/kernel/res_counter.c
@@ -19,6 +19,7 @@ void res_counter_init(struct res_counter
 {
 	spin_lock_init(&counter->lock);
 	counter->limit = (unsigned long long)LLONG_MAX;
+	counter->soft_limit = (unsigned long long)LLONG_MAX;
 	counter->parent = parent;
 }
 
@@ -101,6 +102,8 @@ res_counter_member(struct res_counter *c
 		return &counter->limit;
 	case RES_FAILCNT:
 		return &counter->failcnt;
+	case RES_SOFT_LIMIT:
+		return &counter->soft_limit;
 	};
 
 	BUG();
Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -2002,6 +2002,20 @@ static int mem_cgroup_write(struct cgrou
 		else
 			ret = mem_cgroup_resize_memsw_limit(memcg, val);
 		break;
+	case RES_SOFT_LIMIT:
+		ret = res_counter_memparse_write_strategy(buffer, &val);
+		if (ret)
+			break;
+		/*
+		 * For memsw, soft limits are hard to implement in terms
+		 * of semantics, for now, we support soft limits for
+		 * control without swap
+		 */
+		if (type == _MEM)
+			ret = res_counter_set_soft_limit(&memcg->res, val);
+		else
+			ret = -EINVAL;
+		break;
 	default:
 		ret = -EINVAL; /* should be BUG() ? */
 		break;
@@ -2251,6 +2265,12 @@ static struct cftype mem_cgroup_files[] 
 		.read_u64 = mem_cgroup_read,
 	},
 	{
+		.name = "soft_limit_in_bytes",
+		.private = MEMFILE_PRIVATE(_MEM, RES_SOFT_LIMIT),
+		.write_string = mem_cgroup_write,
+		.read_u64 = mem_cgroup_read,
+	},
+	{
 		.name = "failcnt",
 		.private = MEMFILE_PRIVATE(_MEM, RES_FAILCNT),
 		.trigger = mem_cgroup_reset,



* [RFC][PATCH 2/8] soft limit framework in memcg.
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
  2009-03-27  5:01 ` [RFC][PATCH 1/8] soft limit support in res_counter KAMEZAWA Hiroyuki
@ 2009-03-27  5:03 ` KAMEZAWA Hiroyuki
  2009-03-27  8:01   ` KAMEZAWA Hiroyuki
  2009-03-29 17:22   ` Balbir Singh
  2009-03-27  5:05 ` [RFC][PATCH 3/8] trigger for updating soft limit information KAMEZAWA Hiroyuki
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:03 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Add minimal modifications for soft limit to res_counter_charge() and
memcontrol.c. Based on Balbir Singh <balbir@linux.vnet.ibm.com>'s work,
but most of the features are removed. (dropped or moved to later patches.)

This builds a frame for implementing the soft limit handler in memcg.
 - Checks soft limit status at every charge.
 - Adds mem_cgroup_soft_limit_check() as a function to detect whether we
   need a check now or not.
 - mem_cgroup_update_soft_limit() is a function for updating the internal
   status of memcg's soft limit controller.
 - No hooks in the uncharge path. (see later patch.)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Mar23/include/linux/res_counter.h
===================================================================
--- mmotm-2.6.29-Mar23.orig/include/linux/res_counter.h
+++ mmotm-2.6.29-Mar23/include/linux/res_counter.h
@@ -112,7 +112,8 @@ void res_counter_init(struct res_counter
 int __must_check res_counter_charge_locked(struct res_counter *counter,
 		unsigned long val);
 int __must_check res_counter_charge(struct res_counter *counter,
-		unsigned long val, struct res_counter **limit_fail_at);
+		unsigned long val, struct res_counter **limit_fail_at,
+		bool *soft_limit_failure);
 
 /*
  * uncharge - tell that some portion of the resource is released
Index: mmotm-2.6.29-Mar23/kernel/res_counter.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/kernel/res_counter.c
+++ mmotm-2.6.29-Mar23/kernel/res_counter.c
@@ -37,9 +37,11 @@ int res_counter_charge_locked(struct res
 }
 
 int res_counter_charge(struct res_counter *counter, unsigned long val,
-			struct res_counter **limit_fail_at)
+			struct res_counter **limit_fail_at,
+			bool *soft_limit_failure)
 {
 	int ret;
+	int soft_cnt = 0;
 	unsigned long flags;
 	struct res_counter *c, *u;
 
@@ -48,6 +50,8 @@ int res_counter_charge(struct res_counte
 	for (c = counter; c != NULL; c = c->parent) {
 		spin_lock(&c->lock);
 		ret = res_counter_charge_locked(c, val);
+		if (!res_counter_soft_limit_check_locked(c))
+			soft_cnt += 1;
 		spin_unlock(&c->lock);
 		if (ret < 0) {
 			*limit_fail_at = c;
@@ -55,6 +59,12 @@ int res_counter_charge(struct res_counte
 		}
 	}
 	ret = 0;
+	if (soft_limit_failure) {
+		if (!soft_cnt)
+			*soft_limit_failure = false;
+		else
+			*soft_limit_failure = true;
+	}
 	goto done;
 undo:
 	for (u = counter; u != c; u = u->parent) {
Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -897,6 +897,15 @@ static void record_last_oom(struct mem_c
 	mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
 }
 
+static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
+{
+	return false;
+}
+
+static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
+{
+	return;
+}
 
 /*
  * Unlike exported interface, "oom" parameter is added. if oom==true,
@@ -909,6 +918,7 @@ static int __mem_cgroup_try_charge(struc
 	struct mem_cgroup *mem, *mem_over_limit;
 	int nr_retries = MEM_CGROUP_RECLAIM_RETRIES;
 	struct res_counter *fail_res;
+	bool soft_fail;
 
 	if (unlikely(test_thread_flag(TIF_MEMDIE))) {
 		/* Don't account this! */
@@ -938,12 +948,13 @@ static int __mem_cgroup_try_charge(struc
 		int ret;
 		bool noswap = false;
 
-		ret = res_counter_charge(&mem->res, PAGE_SIZE, &fail_res);
+		ret = res_counter_charge(&mem->res, PAGE_SIZE, &fail_res,
+						&soft_fail);
 		if (likely(!ret)) {
 			if (!do_swap_account)
 				break;
 			ret = res_counter_charge(&mem->memsw, PAGE_SIZE,
-							&fail_res);
+							&fail_res, NULL);
 			if (likely(!ret))
 				break;
 			/* mem+swap counter fails */
@@ -985,6 +996,10 @@ static int __mem_cgroup_try_charge(struc
 			goto nomem;
 		}
 	}
+
+	if (soft_fail && mem_cgroup_soft_limit_check(mem))
+		mem_cgroup_update_soft_limit(mem);
+
 	return 0;
 nomem:
 	css_put(&mem->css);
@@ -2409,6 +2424,7 @@ static void __mem_cgroup_free(struct mem
 {
 	int node;
 
+	mem_cgroup_update_soft_limit(mem);
 	free_css_id(&mem_cgroup_subsys, &mem->css);
 
 	for_each_node_state(node, N_POSSIBLE)



* [RFC][PATCH 3/8] trigger for updating soft limit information
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
  2009-03-27  5:01 ` [RFC][PATCH 1/8] soft limit support in res_counter KAMEZAWA Hiroyuki
  2009-03-27  5:03 ` [RFC][PATCH 2/8] soft limit framework in memcg KAMEZAWA Hiroyuki
@ 2009-03-27  5:05 ` KAMEZAWA Hiroyuki
  2009-03-27  5:06 ` [RFC][PATCH 4/8] memcg soft limit priority array queue KAMEZAWA Hiroyuki
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:05 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Checking/updating soft limit information at every charge is overkill, so
we need some filter.

This patch counts events in the memcg and, if events > threshold, tries
to update the memcg's soft limit status and resets the event counter to 0.
Both page-in and page-out are counted as events.

The event counter is maintained in the per-cpu statistics that are
already in use, so no significant overhead (extra cache misses etc.)
in theory.
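
The filter described above is just a counter with a threshold. A minimal
single-counter sketch (the patch keeps one counter per cpu inside the
memcg statistics; the names here are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>

#define SOFTLIMIT_EVENTS_THRESH 1024

static unsigned long nr_events;	/* per-cpu in the real patch */

/* Called on every page-in/page-out; returns true when it is time to
 * re-check this memcg's soft limit status, and resets the counter. */
static bool soft_limit_check_event(void)
{
	if (++nr_events > SOFTLIMIT_EVENTS_THRESH) {
		nr_events = 0;
		return true;
	}
	return false;
}
```

So the (relatively expensive) soft limit update runs at most once per
~1024 page-in/out events per cpu.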

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -66,6 +66,7 @@ enum mem_cgroup_stat_index {
 	MEM_CGROUP_STAT_PGPGIN_COUNT,	/* # of pages paged in */
 	MEM_CGROUP_STAT_PGPGOUT_COUNT,	/* # of pages paged out */
 
+	MEM_CGROUP_STAT_EVENTS,  /* sum of page-in/page-out for internal use */
 	MEM_CGROUP_STAT_NSTATS,
 };
 
@@ -105,6 +106,22 @@ static s64 mem_cgroup_local_usage(struct
 	return ret;
 }
 
+/* For internal use of per-cpu event counting. */
+
+static inline void
+__mem_cgroup_stat_reset_safe(struct mem_cgroup_stat_cpu *stat,
+		enum mem_cgroup_stat_index idx)
+{
+	stat->count[idx] = 0;
+}
+
+static inline s64
+__mem_cgroup_stat_read_local(struct mem_cgroup_stat_cpu *stat,
+			    enum mem_cgroup_stat_index idx)
+{
+	return stat->count[idx];
+}
+
 /*
  * per-zone information in memory controller.
  */
@@ -235,6 +252,8 @@ static void mem_cgroup_charge_statistics
 	else
 		__mem_cgroup_stat_add_safe(cpustat,
 				MEM_CGROUP_STAT_PGPGOUT_COUNT, 1);
+	__mem_cgroup_stat_add_safe(cpustat, MEM_CGROUP_STAT_EVENTS, 1);
+
 	put_cpu();
 }
 
@@ -897,9 +916,26 @@ static void record_last_oom(struct mem_c
 	mem_cgroup_walk_tree(mem, NULL, record_last_oom_cb);
 }
 
+#define SOFTLIMIT_EVENTS_THRESH (1024) /* 1024 page-in/out events */
+/*
+ * Returns true if the sum of page-in/page-out events since the last check
+ * is over SOFTLIMIT_EVENTS_THRESH. (the counter is per-cpu.)
+ */
 static bool mem_cgroup_soft_limit_check(struct mem_cgroup *mem)
 {
-	return false;
+	bool ret = false;
+	int cpu = get_cpu();
+	s64 val;
+	struct mem_cgroup_stat_cpu *cpustat;
+
+	cpustat = &mem->stat.cpustat[cpu];
+	val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
+	if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
+		__mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
+		ret = true;
+	}
+	put_cpu();
+	return ret;
 }
 
 static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)



* [RFC][PATCH 4/8] memcg soft limit priority array queue.
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (2 preceding siblings ...)
  2009-03-27  5:05 ` [RFC][PATCH 3/8] trigger for updating soft limit information KAMEZAWA Hiroyuki
@ 2009-03-27  5:06 ` KAMEZAWA Hiroyuki
  2009-03-29 16:56   ` Balbir Singh
  2009-03-27  5:09 ` [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

I'm now searching for a way to reduce lock contention without adding complexity...
-Kame
==
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Softlimitq. for memcg.

Implement an array of queues to list memcgs; the array index is determined
by the amount by which memory usage exceeds the soft limit.

While Balbir's version uses an RB-tree and my old one used a per-zone queue
(with round-robin), this is a mixture of the two.
(I'd like to use rotation of the queue in later patches.)

Priority is determined as follows.
   unit = total pages/1024.
   if excess is...
      < unit,           priority = 0
      < unit*2,         priority = 1
      < unit*2^2,       priority = 2
      ...
      < unit*2^9,       priority = 9
      < unit*2^10,      priority = 10
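
Checked against the table above, the mapping is just fls() of
(excess / unit). A user-space sketch (`fls_sketch()` mimics the kernel's
fls(); in the patch, unit is computed as totalram_pages / 1024):

```c
#include <assert.h>

/* Stand-in for the kernel's fls(): position of the highest set bit,
 * 1-based; fls_sketch(0) == 0. */
static int fls_sketch(unsigned long v)
{
	int r = 0;

	while (v) {
		v >>= 1;
		r++;
	}
	return r;
}

/* priority = fls(excess_pages / unit), matching the table above */
static int calc_soft_limit_prio(unsigned long excess_pages, unsigned long unit)
{
	return fls_sketch(excess_pages / unit);
}
```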

This patch includes only the queue management part, not the selection
logic from the queue. Some trick will be used for selecting victims at
soft limit in an efficient way.

This also equips 2 queues, for anon and file. Insert/delete on both lists
is done at once, but scans will be independent. (These 2 queues are used
later.)

The major difference from Balbir's version, other than the RB-tree, is the
behavior under hierarchy. This one adds all children to the queue by
checking hierarchical priority. This helps the per-zone usage check in the
victim-selection logic.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |  121 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 120 insertions(+), 1 deletion(-)

Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -192,7 +192,13 @@ struct mem_cgroup {
 	atomic_t	refcnt;
 
 	unsigned int	swappiness;
-
+	/*
+	 * For soft limit.
+	 */
+	int soft_limit_priority;
+	struct list_head soft_limit_anon;
+	struct list_head soft_limit_file;
+	spinlock_t soft_limit_lock;
 	/*
 	 * statistics. This must be placed at the end of memcg.
 	 */
@@ -938,11 +944,116 @@ static bool mem_cgroup_soft_limit_check(
 	return ret;
 }
 
+/*
+ * Assume "base_amount", and excess = usage - soft limit.
+ *
+ * 0...... if excess < base_amount
+ * 1...... if excess < base_amount * 2
+ * 2...... if excess < base_amount * 2^2
+ * 3.......if excess < base_amount * 2^3
+ * ....
+ * 9.......if excess < base_amount * 2^9
+ * 10 .....if excess < base_amount * 2^10
+ */
+
+#define SLQ_MAXPRIO (11)
+static struct {
+	spinlock_t lock;
+	struct list_head queue[SLQ_MAXPRIO][2]; /* 0:anon 1:file */
+#define SL_ANON (0)
+#define SL_FILE (1)
+} softlimitq;
+
+#define SLQ_PRIO_FACTOR (1024) /* 2^10 */
+static unsigned long memcg_softlimit_base __read_mostly;
+
+static int __calc_soft_limit_prio(unsigned long long excess)
+{
+	unsigned long val;
+
+	val = excess / PAGE_SIZE;
+	val = val / memcg_softlimit_base;
+	return fls(val);
+}
+
+static int mem_cgroup_soft_limit_prio(struct mem_cgroup *mem)
+{
+	unsigned long long excess, max_excess;
+	struct res_counter *c;
+
+	max_excess = 0;
+	for (c = &mem->res; c; c = c->parent) {
+		excess = res_counter_soft_limit_excess(c);
+		if (max_excess < excess)
+			max_excess = excess;
+	}
+	return __calc_soft_limit_prio(max_excess);
+}
+
+static void __mem_cgroup_requeue(struct mem_cgroup *mem)
+{
+	/* enqueue to softlimit queue */
+	int prio = mem->soft_limit_priority;
+
+	spin_lock(&softlimitq.lock);
+	list_del_init(&mem->soft_limit_anon);
+	list_add_tail(&mem->soft_limit_anon, &softlimitq.queue[prio][SL_ANON]);
+	list_del_init(&mem->soft_limit_file);
+	list_add_tail(&mem->soft_limit_file, &softlimitq.queue[prio][SL_FILE]);
+	spin_unlock(&softlimitq.lock);
+}
+
+static void __mem_cgroup_dequeue(struct mem_cgroup *mem)
+{
+	spin_lock(&softlimitq.lock);
+	list_del_init(&mem->soft_limit_anon);
+	list_del_init(&mem->soft_limit_file);
+	spin_unlock(&softlimitq.lock);
+}
+
+static int
+__mem_cgroup_update_soft_limit_cb(struct mem_cgroup *mem, void *data)
+{
+	int priority;
+	/* If someone updates, we don't need more */
+	if (!spin_trylock(&mem->soft_limit_lock))
+		return 0;
+
+	priority = mem_cgroup_soft_limit_prio(mem);
+
+	if (priority != mem->soft_limit_priority) {
+		mem->soft_limit_priority = priority;
+		__mem_cgroup_requeue(mem);
+	}
+	spin_unlock(&mem->soft_limit_lock);
+	return 0;
+}
+
 static void mem_cgroup_update_soft_limit(struct mem_cgroup *mem)
 {
+	int priority;
+
+	/* check status change */
+	priority = mem_cgroup_soft_limit_prio(mem);
+	if (priority != mem->soft_limit_priority) {
+		mem_cgroup_walk_tree(mem, NULL,
+				     __mem_cgroup_update_soft_limit_cb);
+	}
 	return;
 }
 
+static void softlimitq_init(void)
+{
+	int i;
+
+	spin_lock_init(&softlimitq.lock);
+	for (i = 0; i < SLQ_MAXPRIO; i++) {
+		INIT_LIST_HEAD(&softlimitq.queue[i][SL_ANON]);
+		INIT_LIST_HEAD(&softlimitq.queue[i][SL_FILE]);
+	}
+	memcg_softlimit_base = totalram_pages / SLQ_PRIO_FACTOR;
+}
+
 /*
  * Unlike exported interface, "oom" parameter is added. if oom==true,
  * oom-killer can be invoked.
@@ -2527,6 +2638,7 @@ mem_cgroup_create(struct cgroup_subsys *
 	if (cont->parent == NULL) {
 		enable_swap_cgroup();
 		parent = NULL;
+		softlimitq_init();
 	} else {
 		parent = mem_cgroup_from_cont(cont->parent);
 		mem->use_hierarchy = parent->use_hierarchy;
@@ -2547,6 +2659,10 @@ mem_cgroup_create(struct cgroup_subsys *
 		res_counter_init(&mem->memsw, NULL);
 	}
 	mem->last_scanned_child = 0;
+	mem->soft_limit_priority = 0;
+	INIT_LIST_HEAD(&mem->soft_limit_anon);
+	INIT_LIST_HEAD(&mem->soft_limit_file);
+	spin_lock_init(&mem->soft_limit_lock);
 	spin_lock_init(&mem->reclaim_param_lock);
 
 	if (parent)
@@ -2571,6 +2687,9 @@ static void mem_cgroup_destroy(struct cg
 {
 	struct mem_cgroup *mem = mem_cgroup_from_cont(cont);
 
+	spin_lock(&mem->soft_limit_lock);
+	__mem_cgroup_dequeue(mem);
+	spin_unlock(&mem->soft_limit_lock);
 	mem_cgroup_put(mem);
 }
 



* [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (3 preceding siblings ...)
  2009-03-27  5:06 ` [RFC][PATCH 4/8] memcg soft limit priority array queue KAMEZAWA Hiroyuki
@ 2009-03-27  5:09 ` KAMEZAWA Hiroyuki
  2009-03-27  5:14   ` KAMEZAWA Hiroyuki
  2009-03-31  8:18   ` KAMEZAWA Hiroyuki
  2009-03-27  5:11 ` [RFC][PATCH 6/8] soft limit victim select KAMEZAWA Hiroyuki
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:09 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

memcg's reclaim routine is designed to ignore locality and placement, and
so the inactive_anon_is_low() function doesn't take "zone" as an argument.

A later soft limit patch will use "zone" as an argument.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -561,15 +561,28 @@ void mem_cgroup_record_reclaim_priority(
 	spin_unlock(&mem->reclaim_param_lock);
 }
 
-static int calc_inactive_ratio(struct mem_cgroup *memcg, unsigned long *present_pages)
+static int calc_inactive_ratio(struct mem_cgroup *memcg,
+			       unsigned long *present_pages,
+			       struct zone *z)
 {
 	unsigned long active;
 	unsigned long inactive;
 	unsigned long gb;
 	unsigned long inactive_ratio;
 
-	inactive = mem_cgroup_get_local_zonestat(memcg, LRU_INACTIVE_ANON);
-	active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_ANON);
+	if (!z) {
+		inactive = mem_cgroup_get_local_zonestat(memcg,
+							 LRU_INACTIVE_ANON);
+		active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_ANON);
+	} else {
+		int nid = z->zone_pgdat->node_id;
+		int zid = zone_idx(z);
+		struct mem_cgroup_per_zone *mz;
+
+		mz = mem_cgroup_zoneinfo(memcg, nid, zid);
+		inactive = MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_ANON);
+		active = MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_ANON);
+	}
 
 	gb = (inactive + active) >> (30 - PAGE_SHIFT);
 	if (gb)
@@ -585,14 +598,14 @@ static int calc_inactive_ratio(struct me
 	return inactive_ratio;
 }
 
-int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
+int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z)
 {
 	unsigned long active;
 	unsigned long inactive;
 	unsigned long present_pages[2];
 	unsigned long inactive_ratio;
 
-	inactive_ratio = calc_inactive_ratio(memcg, present_pages);
+	inactive_ratio = calc_inactive_ratio(memcg, present_pages, NULL);
 
 	inactive = present_pages[0];
 	active = present_pages[1];
@@ -2337,7 +2350,8 @@ static int mem_control_stat_show(struct 
 
 
 #ifdef CONFIG_DEBUG_VM
-	cb->fill(cb, "inactive_ratio", calc_inactive_ratio(mem_cont, NULL));
+	cb->fill(cb, "inactive_ratio",
+			calc_inactive_ratio(mem_cont, NULL, NULL));
 
 	{
 		int nid, zid;
Index: mmotm-2.6.29-Mar23/include/linux/memcontrol.h
===================================================================
--- mmotm-2.6.29-Mar23.orig/include/linux/memcontrol.h
+++ mmotm-2.6.29-Mar23/include/linux/memcontrol.h
@@ -93,7 +93,7 @@ extern void mem_cgroup_note_reclaim_prio
 							int priority);
 extern void mem_cgroup_record_reclaim_priority(struct mem_cgroup *mem,
 							int priority);
-int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg);
+int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z);
 unsigned long mem_cgroup_zone_nr_pages(struct mem_cgroup *memcg,
 				       struct zone *zone,
 				       enum lru_list lru);
@@ -234,7 +234,7 @@ static inline bool mem_cgroup_oom_called
 }
 
 static inline int
-mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
+mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z)
 {
 	return 1;
 }
Index: mmotm-2.6.29-Mar23/mm/vmscan.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/vmscan.c
+++ mmotm-2.6.29-Mar23/mm/vmscan.c
@@ -1341,7 +1341,7 @@ static int inactive_anon_is_low(struct z
 	if (scanning_global_lru(sc))
 		low = inactive_anon_is_low_global(zone);
 	else
-		low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup);
+		low = mem_cgroup_inactive_anon_is_low(sc->mem_cgroup, NULL);
 	return low;
 }
 



* [RFC][PATCH 6/8] soft limit victim select
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (4 preceding siblings ...)
  2009-03-27  5:09 ` [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
@ 2009-03-27  5:11 ` KAMEZAWA Hiroyuki
  2009-03-27  5:12 ` [RFC][PATCH 7/8] memcg soft limit LRU reorder KAMEZAWA Hiroyuki
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:11 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Soft Limit victim selection/cache logic.

This patch implements the victim selection logic and a caching method.

The victim memcg is selected in the following way, assuming a zone under
shrinking is specified. The selected memcg
  - has the highest priority (largest excess usage), and
  - has memory in that zone.

When a memcg is selected, it's rotated on its queue and cached per cpu
with tickets.

This cache is refreshed when
  - the given tickets are exhausted,
  - a very long time has passed since the last update, or
  - the cached memcg has no pages in the proper zone.

Even when no proper memcg is found by the victim selection logic,
some tickets are assigned to the NULL victim.

As with softlimitq, this cache's information has 2 entries, for anon and
file.

TODO:
  - need to handle cpu hotplug (in another patch)
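
A stripped-down, single-CPU sketch of the ticket scheme (hypothetical
names; in the patch the cache is per cpu, has separate anon/file entries,
and reloading walks the softlimit queues):

```c
#include <assert.h>
#include <stddef.h>

#define BASE_TICKET 4	/* SLCACHE_NULL_TICKET in the patch */

struct victim_cache {
	int ticket;
	const char *victim;	/* stands in for the cached memcg pointer */
};

/* Hypothetical reload: cache the next victim and grant the base ticket
 * plus a bonus that grows with the victim's softlimit priority. */
static void reload(struct victim_cache *c, const char *next, int prio)
{
	c->victim = next;
	c->ticket = BASE_TICKET + prio * 2;
}

/* Each reclaim call consumes one ticket; when tickets run out, reload. */
static const char *get_victim(struct victim_cache *c, const char *next, int prio)
{
	if (c->ticket-- <= 0)
		reload(c, next, prio);
	return c->victim;
}
```

The priority bonus means higher-excess memcgs stay cached longer, so more
reclaim calls are directed at them before the victim is re-selected.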

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |  121 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 121 insertions(+)

Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -1055,6 +1055,127 @@ static void mem_cgroup_update_soft_limit
 	return;
 }
 
+/* softlimit victim selection logic */
+
+/* Returns the amount of evictable memory in memcg */
+static int mem_cgroup_usage(struct mem_cgroup *mem, struct zone *zone, int file)
+{
+	struct mem_cgroup_per_zone *mz;
+	int nid = zone->zone_pgdat->node_id;
+	int zid = zone_idx(zone);
+	unsigned long usage = 0;
+
+	mz = mem_cgroup_zoneinfo(mem, nid, zid);
+	if (!file) {
+		usage = MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_ANON)
+			+ MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_ANON);
+	} else {
+		usage = MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_FILE)
+			+ MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_FILE);
+	}
+	return usage;
+}
+
+struct soft_limit_cache {
+	/* If ticket is 0, refresh and refill the cache.*/
+	unsigned long ticket[2];
+	/* next update time for ticket(jiffies)*/
+	unsigned long next_update;
+	/* An event count per cpu. */
+	unsigned long total_events;
+	/* victim memcg */
+	struct mem_cgroup *mem[2];
+};
+/* In the fast path, 32 pages are reclaimed per call; 4*32=128 pages as the base ticket. */
+#define SLCACHE_NULL_TICKET (4)
+#define SLCACHE_UPDATE_JIFFIES (HZ*5) /* refresh at least every 5 seconds */
+DEFINE_PER_CPU(struct soft_limit_cache, soft_limit_cache);
+
+/* This is called under preempt disabled context....*/
+static void reload_softlimit_victim(struct soft_limit_cache *slc,
+				    struct zone *zone, int file)
+{
+	struct mem_cgroup *mem = NULL;
+	struct mem_cgroup *tmp;
+	struct list_head *queue;
+	int prio, bonus;
+
+	if (slc->mem[file]) {
+		mem_cgroup_put(slc->mem[file]);
+		slc->mem[file] = NULL;
+	}
+	slc->ticket[file] = SLCACHE_NULL_TICKET;
+	slc->next_update = jiffies + SLCACHE_UPDATE_JIFFIES;
+	slc->total_events++;
+
+	/* brief check the queue */
+	for (prio = SLQ_MAXPRIO - 1; prio > 0; prio--) {
+		if (!list_empty(&softlimitq.queue[prio][file]))
+			break;
+	}
+retry:
+	if (prio == 0)
+		return;
+
+	/* check queue in priority order */
+
+	queue = &softlimitq.queue[prio][file];
+	spin_lock(&softlimitq.lock);
+	if (file) {
+		list_for_each_entry(tmp, queue, soft_limit_file) {
+			if (mem_cgroup_usage(tmp, zone, file)) {
+				mem = tmp;
+				break;
+			}
+		}
+		if (mem)
+			list_move_tail(&mem->soft_limit_file, queue);
+	} else {
+		list_for_each_entry(tmp, queue, soft_limit_anon) {
+			if (mem_cgroup_usage(tmp, zone, file)) {
+				mem = tmp;
+				break;
+			}
+		}
+		if (mem)
+			list_move_tail(&mem->soft_limit_anon, queue);
+	}
+	spin_unlock(&softlimitq.lock);
+	/* If not found, goes to next priority */
+	if (!mem) {
+		prio--;
+		goto retry;
+	}
+	if (!css_is_removed(&mem->css)) {
+		slc->mem[file] = mem;
+		bonus = prio * 2;
+		slc->ticket[file] += bonus;
+		mem_cgroup_get(mem);
+	}
+}
+
+static struct mem_cgroup *get_soft_limit_victim(struct zone *zone, int file)
+{
+	struct mem_cgroup *ret;
+	struct soft_limit_cache *slc;
+
+	slc = &get_cpu_var(soft_limit_cache);
+	/*
+	 * If the ticket is expired, it has been a long time since the last
+	 * reload, or the memcg has no evictable pages here, reload the victim.
+	 */
+	ret = slc->mem[file];
+	if ((!slc->ticket[file]-- ||
+	     time_after(jiffies, slc->next_update)) ||
+	    (ret && !mem_cgroup_usage(ret, zone, file))) {
+		reload_softlimit_victim(slc, zone, file);
+		ret = slc->mem[file];
+	}
+	put_cpu_var(soft_limit_cache);
+	return ret;
+}
+
+
 static void softlimitq_init(void)
 {
 	int i;


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [RFC][PATCH 7/8] memcg soft limit LRU reorder
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (5 preceding siblings ...)
  2009-03-27  5:11 ` [RFC][PATCH 6/8] soft limit victim select KAMEZAWA Hiroyuki
@ 2009-03-27  5:12 ` KAMEZAWA Hiroyuki
  2009-03-30  7:52   ` Balbir Singh
  2009-03-27  5:13 ` [RFC][PATCH 8/8] extends soft limit event filter KAMEZAWA Hiroyuki
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:12 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

This patch adds a function to change the LRU order of pages in the global LRU
under the direction of the memcg chosen as the soft limit victim.

FILE and ANON victims are handled separately, so LRU rotation is done
independently for each. (A memcg that contains only FILE cache or only ANON
pages can exist.)

The routine finds the specified number of pages on the memcg's LRU and moves
them to the top of the global LRU. They will be the first targets of
shrink_xxx_list().
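
The reordering can be sketched in plain userspace C. This is only an
illustration of the list manipulation, not kernel code: struct node, reorder()
and the helpers are made-up stand-ins for the kernel's list.h, and a single
link field is used per node, whereas the real patch keeps pages linked on both
the memcg LRU (via page_cgroup) and the global zone LRU (via page->lru):

```c
/* Simplified stand-in for the kernel's circular doubly linked list. */
struct node {
	int id;
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Move up to nr nodes from the tail of the victim list to the tail of the
 * global list.  vmscan isolates pages starting from lru->prev, so nodes at
 * the global tail are scanned first -- the same effect that
 * mem_cgroup_soft_limit_reorder_lru() achieves with list_splice_tail().
 */
static void reorder(struct node *victim, struct node *global, int nr)
{
	while (nr-- > 0 && victim->prev != victim) {
		struct node *n = victim->prev;

		/* unlink from the victim list */
		n->prev->next = n->next;
		n->next->prev = n->prev;
		list_add_tail(global, n);
	}
}
```

After reorder(&victim, &global, 2), the two nodes nearest the victim's tail
sit at the global tail, which is where shrink_inactive_list() and
shrink_active_list() start isolating pages.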

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 include/linux/memcontrol.h |   15 +++++++++++
 mm/memcontrol.c            |   60 +++++++++++++++++++++++++++++++++++++++++++++
 mm/vmscan.c                |   18 ++++++++++++-
 3 files changed, 92 insertions(+), 1 deletion(-)

Index: mmotm-2.6.29-Mar23/include/linux/memcontrol.h
===================================================================
--- mmotm-2.6.29-Mar23.orig/include/linux/memcontrol.h
+++ mmotm-2.6.29-Mar23/include/linux/memcontrol.h
@@ -117,6 +117,9 @@ static inline bool mem_cgroup_disabled(v
 
 extern bool mem_cgroup_oom_called(struct task_struct *task);
 
+void mem_cgroup_soft_limit_reorder_lru(struct zone *zone,
+			       unsigned long nr_to_scan, enum lru_list l);
+int mem_cgroup_soft_limit_inactive_anon_is_low(struct zone *zone);
 #else /* CONFIG_CGROUP_MEM_RES_CTLR */
 struct mem_cgroup;
 
@@ -264,6 +267,18 @@ mem_cgroup_print_oom_info(struct mem_cgr
 {
 }
 
+static inline void
+mem_cgroup_soft_limit_reorder_lru(struct zone *zone, unsigned long nr_to_scan,
+				  enum lru_list lru)
+{
+}
+
+static inline
+int mem_cgroup_soft_limit_inactive_anon_is_low(struct zone *zone)
+{
+	return 0;
+}
+
 #endif /* CONFIG_CGROUP_MEM_CONT */
 
 #endif /* _LINUX_MEMCONTROL_H */
Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -1175,6 +1175,66 @@ static struct mem_cgroup *get_soft_limit
 	return ret;
 }
 
+/*
+ * zone->lru and the memcg's lru are kept in sync under zone->lru_lock.
+ * This rotates pages in the specified LRU.
+ */
+void mem_cgroup_soft_limit_reorder_lru(struct zone *zone,
+				      unsigned long nr_to_scan,
+				      enum lru_list l)
+{
+	struct mem_cgroup *mem;
+	struct mem_cgroup_per_zone *mz;
+	int nid, zid, file;
+	unsigned long scan, flags;
+	struct list_head *src;
+	LIST_HEAD(found);
+	struct page_cgroup *pc;
+	struct page *page;
+
+	nid = zone->zone_pgdat->node_id;
+	zid = zone_idx(zone);
+
+	file = is_file_lru(l);
+
+	mem = get_soft_limit_victim(zone, file);
+	if (!mem)
+		return;
+	mz = mem_cgroup_zoneinfo(mem, nid, zid);
+	src = &mz->lists[l];
+	scan = 0;
+
+	/* Find at most nr_to_scan pages from local LRU */
+	spin_lock_irqsave(&zone->lru_lock, flags);
+	list_for_each_entry_reverse(pc, src, lru) {
+		if (scan >= nr_to_scan)
+			break;
+		/* We don't check Used bit */
+		page = pc->page;
+		/* Can happen ? */
+		if (unlikely(!PageLRU(page)))
+			continue;
+		/* This page is on (the same) LRU */
+		list_move(&page->lru, &found);
+		scan++;
+	}
+	/* vmscan searches pages from lru->prev. link this to lru->prev. */
+	list_splice_tail(&found, &zone->lru[l].list);
+	spin_unlock_irqrestore(&zone->lru_lock, flags);
+}
+
+/* Returns 1 if a soft limit victim is set and its inactive anon is low in this zone */
+int mem_cgroup_soft_limit_inactive_anon_is_low(struct zone *zone)
+{
+	struct soft_limit_cache *slc;
+	int ret = 0;
+
+	slc = &get_cpu_var(soft_limit_cache);
+	if (slc->mem[SL_ANON])
+		ret = mem_cgroup_inactive_anon_is_low(slc->mem[SL_ANON], zone);
+	put_cpu_var(soft_limit_cache);
+	return ret;
+}
 
 static void softlimitq_init(void)
 {
Index: mmotm-2.6.29-Mar23/mm/vmscan.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/vmscan.c
+++ mmotm-2.6.29-Mar23/mm/vmscan.c
@@ -1060,6 +1060,13 @@ static unsigned long shrink_inactive_lis
 	pagevec_init(&pvec, 1);
 
 	lru_add_drain();
+	if (scanning_global_lru(sc)) {
+		enum lru_list l = LRU_INACTIVE_ANON;
+		if (file)
+			l = LRU_INACTIVE_FILE;
+		mem_cgroup_soft_limit_reorder_lru(zone, max_scan, l);
+	}
+
 	spin_lock_irq(&zone->lru_lock);
 	do {
 		struct page *page;
@@ -1227,6 +1234,13 @@ static void shrink_active_list(unsigned 
 	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
 
 	lru_add_drain();
+	if (scanning_global_lru(sc)) {
+		enum lru_list l = LRU_ACTIVE_ANON;
+		if (file)
+			l = LRU_ACTIVE_FILE;
+		mem_cgroup_soft_limit_reorder_lru(zone, nr_pages, l);
+	}
+
 	spin_lock_irq(&zone->lru_lock);
 	pgmoved = sc->isolate_pages(nr_pages, &l_hold, &pgscanned, sc->order,
 					ISOLATE_ACTIVE, zone,
@@ -1322,7 +1336,9 @@ static int inactive_anon_is_low_global(s
 
 	if (inactive * zone->inactive_ratio < active)
 		return 1;
-
+	/* check the soft limit victim's status */
+	if (mem_cgroup_soft_limit_inactive_anon_is_low(zone))
+		return 1;
 	return 0;
 }
 



* [RFC][PATCH 8/8] extends soft limit event filter
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (6 preceding siblings ...)
  2009-03-27  5:12 ` [RFC][PATCH 7/8] memcg soft limit LRU reorder KAMEZAWA Hiroyuki
@ 2009-03-27  5:13 ` KAMEZAWA Hiroyuki
  2009-03-28  8:23 ` [RFC][PATCH] memcg soft limit (yet another new design) v1 Balbir Singh
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:13 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

Reduce the soft limit update rate depending on the group's priority (i.e. its usage excess).
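
The scaling in the hunk below can be sketched as a small function: the per-cpu
event threshold doubles for every two priority levels, so groups far over
their soft limit are checked more cheaply but less precisely. The
SOFTLIMIT_EVENTS_THRESH value here is illustrative only; the real constant is
defined earlier in the series:

```c
/* Illustrative value; the real SOFTLIMIT_EVENTS_THRESH is defined
 * earlier in the series. */
#define SOFTLIMIT_EVENTS_THRESH 1000

/*
 * The per-cpu event threshold grows with the group's soft limit priority:
 * it doubles for every two priority levels, making the check rougher but
 * cheaper for groups with a large usage excess.
 */
static int softlimit_event_thresh(int soft_limit_priority)
{
	int thresh = SOFTLIMIT_EVENTS_THRESH;

	thresh <<= (soft_limit_priority >> 1);
	return thresh;
}
```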

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
===================================================================
--- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
+++ mmotm-2.6.29-Mar23/mm/memcontrol.c
@@ -945,11 +945,15 @@ static bool mem_cgroup_soft_limit_check(
 	bool ret = false;
 	int cpu = get_cpu();
 	s64 val;
+	int thresh;
 	struct mem_cgroup_stat_cpu *cpustat;
 
 	cpustat = &mem->stat.cpustat[cpu];
 	val = __mem_cgroup_stat_read_local(cpustat, MEM_CGROUP_STAT_EVENTS);
-	if (unlikely(val > SOFTLIMIT_EVENTS_THRESH)) {
+	/* If usage is big, this check can be rough */
+	thresh = SOFTLIMIT_EVENTS_THRESH;
+	thresh <<= (mem->soft_limit_priority >> 1);
+	if (unlikely(val > thresh)) {
 		__mem_cgroup_stat_reset_safe(cpustat, MEM_CGROUP_STAT_EVENTS);
 		ret = true;
 	}



* Re: [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1
  2009-03-27  5:09 ` [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
@ 2009-03-27  5:14   ` KAMEZAWA Hiroyuki
  2009-03-31  8:18   ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  5:14 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

On Fri, 27 Mar 2009 14:09:23 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> memcg's reclaim routine is designed to ignore locality and placement, and
> then, inactive_anon_is_low() function doesn't take "zone" as its argument.
> 
The subject should be "modify inactive_anon_is_low"...

-Kame



* Re: [RFC][PATCH 2/8] soft limit framework in memcg.
  2009-03-27  5:03 ` [RFC][PATCH 2/8] soft limit framework in memcg KAMEZAWA Hiroyuki
@ 2009-03-27  8:01   ` KAMEZAWA Hiroyuki
  2009-03-29 17:22   ` Balbir Singh
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-27  8:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

On Fri, 27 Mar 2009 14:03:46 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Add minimal modification for soft limit to res_counter_charge() and memcontol.c
> Based on Balbir Singh <balbir@linux.vnet.ibm.com> 's work but most of
> features are removed. (dropped or moved to later patch.)
> 
> This is for building a frame to implement soft limit handler in memcg.
>  - Checks soft limit status at every charge.
>  - Adds mem_cgroup_soft_limit_check() as a function to detect we need
>    check now or not.
>  - mem_cgroup_update_soft_limit() is a function for updates internal status
>    of soft limit controller of memcg.
>  - This has no hooks in uncharge path. (see later patch.)
Note:
The reason I don't insert a hook into uncharge() is that uncharge() is called
under spinlocks (and my soft limit update routine is heavy).
But some hook is needed anyway. I'll take care of this in another patch if I
get a new idea.

Thanks,
-Kame



* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (7 preceding siblings ...)
  2009-03-27  5:13 ` [RFC][PATCH 8/8] extends soft limit event filter KAMEZAWA Hiroyuki
@ 2009-03-28  8:23 ` Balbir Singh
  2009-03-28 16:10   ` KAMEZAWA Hiroyuki
  2009-03-28 18:11 ` Balbir Singh
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-28  8:23 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:

> Hi,
> 
> Memory cgroup's soft limit feature is a feature to tell global LRU 
> "please reclaim from this memcg at memory shortage".
> 
> Both Balbir's version and mine were proposed.
> This is a new one (so it restarts from v1), and it is very young.
> 
> While testing soft limits, my dilemmas were the following.
> 
>  - needs additional scan cost if the implementation is naive (unavoidable?)
> 
>  - Inactive/Active rotation scheme of global LRU will be broken.
> 
>  - File/Anon reclaim ratio scheme of global LRU will be broken.
>     - vm.swappiness will be ignored.
> 
>  - If using memcg's memory reclaim routine, 
>     - shrink_slab() will never be called.
>     - stale SwapCache has no chance to be reclaimed (stale SwapCache means
>       pages that were read in but never used.)
>     - memcg can have no memory in a zone.
>     - memcg can have no Anon memory
>     - lumpy reclaim is not done.
> 
> 
> This patch tries to avoid using the existing memcg reclaim routine and
> just gives "hints" to the global LRU. This patch is briefly tested and shows
> good results to me. (But maybe not to you; please blame me.)
> 
> Major characteristics are:
>  - memcg will be inserted into the softlimit-queue at charge() if its usage
>    exceeds the soft limit.
>  - softlimit-queue is a priority queue; priority is determined by the size
>    of the excess usage.
>  - memcg's soft limit hooks are called by shrink_xxx_list() to provide hints.
>  - Behavior is affected by vm.swappiness, and the LRU scan rate is determined
>    by the global LRU's status.
> 
> I'm sorry that I tend not to give enough explanation. Please ask me.
> There will be many discussion points, anyway. As usual, I'm not in a hurry.
> 
> 
> ==brief test result==
> On a 2CPU/1.6GB machine, create groups A and B:
>   A.  soft limit=300M
>   B.  no soft limit
> 
>   Run a malloc() program on B and allocate 1G of memory. The program just
>   sleeps after allocating the memory and never references it again.
>   Run make -j 6 and compile the kernel.
> 
>   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
>   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> 
>   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
> 

How did you calculate the swap usage of group B?

> I'll try much more complex ones over the weekend.

You might want to try experiments where the group with the higher soft
limit starts much later than the group with lower soft limit and both
have a high demand for memory. Also try corner cases such as soft
limits being 0, or groups where soft limits are equal, etc.

We have a long weekend, so I've been unable to test/review your
patches. I'll do so soon if possible.

-- 
	Balbir


* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-28  8:23 ` [RFC][PATCH] memcg soft limit (yet another new design) v1 Balbir Singh
@ 2009-03-28 16:10   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-28 16:10 UTC (permalink / raw)
  To: balbir
  Cc: KAMEZAWA Hiroyuki, linux-kernel, linux-mm, kosaki.motohiro, nishimura

Balbir Singh wrote:
> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27
> 13:59:33]:

>>   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
>>   When vm.swappiness = 10  => 1MB of memory are swapped out from B
>>
>>   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
>>
>
> How did you calculate the swap usage of group B?
>
memory.memsw.usage_in_bytes - memory.usage_in_bytes.
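
For reference, the arithmetic is just a subtraction of the numbers in the two
usage files; a minimal userspace sketch, with illustrative helper names (each
file contains a single decimal number):

```c
#include <stdlib.h>

/*
 * memory.memsw.usage_in_bytes and memory.usage_in_bytes each contain one
 * decimal number.  Swap used by the group is the difference of the two.
 */
static long long parse_usage(const char *file_contents)
{
	return atoll(file_contents);
}

static long long swapped_bytes(const char *memsw_contents,
			       const char *mem_contents)
{
	return parse_usage(memsw_contents) - parse_usage(mem_contents);
}
```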

>> I'll try much more complex ones over the weekend.
>
> You might want to try experiments where the group with the higher soft
> limit starts much later than the group with lower soft limit and both
> have a high demand for memory. Also try corner cases such as soft
> limits being 0, or groups where soft limits are equal, etc.
>
Thanks, good input. Maybe I need some hook in the "set soft limit" path.

> We have a long weekend, so I've been unable to test/review your
> patches. I'll do so soon if possible.
>
thank you.
-Kame

> --
> 	Balbir
>




* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (8 preceding siblings ...)
  2009-03-28  8:23 ` [RFC][PATCH] memcg soft limit (yet another new design) v1 Balbir Singh
@ 2009-03-28 18:11 ` Balbir Singh
  2009-03-28 18:27   ` Balbir Singh
  2009-03-30 23:54   ` KAMEZAWA Hiroyuki
  2009-03-29 13:01 ` Balbir Singh
  2009-04-01 14:42 ` Balbir Singh
  11 siblings, 2 replies; 41+ messages in thread
From: Balbir Singh @ 2009-03-28 18:11 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:

> ==brief test result==
> On a 2CPU/1.6GB machine, create groups A and B:
>   A.  soft limit=300M
>   B.  no soft limit
> 
>   Run a malloc() program on B and allocate 1G of memory. The program just
>   sleeps after allocating the memory and never references it again.
>   Run make -j 6 and compile the kernel.
> 
>   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
>   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> 
>   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
>

I ran the same tests, booted the machine with mem=1700M and maxcpus=2

Here is what I see with

A has a swapout of 344M and B has no swapout at all, since B is
always under its soft limit. vm.swappiness is set to 60.

I think the above is more along the lines of the expected functional behaviour. 

-- 
	Balbir


* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-28 18:11 ` Balbir Singh
@ 2009-03-28 18:27   ` Balbir Singh
  2009-03-30 23:55     ` KAMEZAWA Hiroyuki
  2009-03-31  0:06     ` KAMEZAWA Hiroyuki
  2009-03-30 23:54   ` KAMEZAWA Hiroyuki
  1 sibling, 2 replies; 41+ messages in thread
From: Balbir Singh @ 2009-03-28 18:27 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> 
> > ==brief test result==
> > On a 2CPU/1.6GB machine, create groups A and B:
> >   A.  soft limit=300M
> >   B.  no soft limit
> > 
> >   Run a malloc() program on B and allocate 1G of memory. The program just
> >   sleeps after allocating the memory and never references it again.
> >   Run make -j 6 and compile the kernel.
> > 
> >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> >   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> > 
> >   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
> >
> 
> I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> 
> Here is what I see with

I meant to say: here is what I see with my patches (v7)

> 
> A has a swapout of 344M and B has no swapout at all, since B is
> always under its soft limit. vm.swappiness is set to 60
> 
> I think the above is more along the lines of the expected functional behaviour. 
> 
> -- 
> 	Balbir

-- 
	Balbir


* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (9 preceding siblings ...)
  2009-03-28 18:11 ` Balbir Singh
@ 2009-03-29 13:01 ` Balbir Singh
  2009-03-30 23:57   ` KAMEZAWA Hiroyuki
  2009-04-01 14:42 ` Balbir Singh
  11 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-29 13:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:

> Hi,
> 
> Memory cgroup's soft limit feature is a feature to tell global LRU 
> "please reclaim from this memcg at memory shortage".
> 
> And Balbir's one and my one was proposed.
> This is new one. (so restart from v1), this is very new-born.
> 
> While testing soft limit, my dilemma was following.
> 
>  - needs additional scan cost if the implementation is naive (unavoidable?)

I think, speaking for my patches, that you should look at soft limit
reclaim as helping global reclaim rather than working against it. It
provides an opportunity to reclaim from groups that might not be
important to the system.

> 
>  - Inactive/Active rotation scheme of global LRU will be broken.
> 
>  - File/Anon reclaim ratio scheme of global LRU will be broken.
>     - vm.swappiness will be ignored.
> 

Not true; with my patches none of these are affected, since reclaim
for soft limits is limited to the mem cgroup LRU lists only. Zone reclaim
that happens in parallel can of course change the global LRU.

>  - If using memcg's memory reclaim routine, 
>     - shrink_slab() will never be called.
>     - stale SwapCache has no chance to be reclaimed (stale SwapCache means
>       pages that were read in but never used.)
>     - memcg can have no memory in a zone.
>     - memcg can have no Anon memory
>     - lumpy reclaim is not done.
> 
> 
> This patch tries to avoid using the existing memcg reclaim routine and
> just gives "hints" to the global LRU. This patch is briefly tested and shows
> good results to me. (But maybe not to you; please blame me.)
> 

I don't like the results; they are functionally broken (see my other
email). Why should "B" get reclaimed from if it is not above its soft
limit? Why is there a swapout from "B"?


> Major characteristics are:
>  - memcg will be inserted into the softlimit-queue at charge() if its usage
>    exceeds the soft limit.
>  - softlimit-queue is a priority queue; priority is determined by the size
>    of the excess usage.
>  - memcg's soft limit hooks are called by shrink_xxx_list() to provide hints.
>  - Behavior is affected by vm.swappiness, and the LRU scan rate is determined
>    by the global LRU's status.
> 
> I'm sorry that I tend not to give enough explanation. Please ask me.
> There will be many discussion points, anyway. As usual, I'm not in a hurry.
>

The code seems to add a lot of complexity and does not achieve the expected
functionality. I am going to start testing this series soon.
 
> 
> ==brief test result==
> On a 2CPU/1.6GB machine, create groups A and B:
>   A.  soft limit=300M
>   B.  no soft limit
> 
>   Run a malloc() program on B and allocate 1G of memory. The program just
>   sleeps after allocating the memory and never references it again.
>   Run make -j 6 and compile the kernel.
> 
>   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
>   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> 
>   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
> 
> I'll try much more complex ones over the weekend.

Please see my response to this test result in a previous email.

-- 
	Balbir


* Re: [RFC][PATCH 4/8] memcg soft limit priority array queue.
  2009-03-27  5:06 ` [RFC][PATCH 4/8] memcg soft limit priority array queue KAMEZAWA Hiroyuki
@ 2009-03-29 16:56   ` Balbir Singh
  2009-03-30 23:58     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-29 16:56 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:06:53]:

> I'm now searching for a way to reduce lock contention without complexity...
> -Kame
> ==
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> +static void __mem_cgroup_requeue(struct mem_cgroup *mem)
> +{
> +	/* enqueue to softlimit queue */
> +	int prio = mem->soft_limit_priority;
> +
> +	spin_lock(&softlimitq.lock);
> +	list_del_init(&mem->soft_limit_anon);
> +	list_add_tail(&mem->soft_limit_anon, &softlimitq.queue[prio][SL_ANON]);
> +	list_del_init(&mem->soft_limit_file,ist[SL_FILE]);

Patch fails to build here, what is ist?

> +	list_add_tail(&mem->soft_limit_file, &softlimitq.queue[prio][SL_FILE]);
> +	spin_unlock(&softlimitq.lock);
> +}
> 

-- 
	Balbir


* Re: [RFC][PATCH 2/8] soft limit framework in memcg.
  2009-03-27  5:03 ` [RFC][PATCH 2/8] soft limit framework in memcg KAMEZAWA Hiroyuki
  2009-03-27  8:01   ` KAMEZAWA Hiroyuki
@ 2009-03-29 17:22   ` Balbir Singh
  1 sibling, 0 replies; 41+ messages in thread
From: Balbir Singh @ 2009-03-29 17:22 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:03:46]:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Add minimal modification for soft limit to res_counter_charge() and memcontol.c
> Based on Balbir Singh <balbir@linux.vnet.ibm.com> 's work but most of
> features are removed. (dropped or moved to later patch.)
>

Feel free to use my signed-off-by on this. 

-- 
	Balbir


* Re: [RFC][PATCH 7/8] memcg soft limit LRU reorder
  2009-03-27  5:12 ` [RFC][PATCH 7/8] memcg soft limit LRU reorder KAMEZAWA Hiroyuki
@ 2009-03-30  7:52   ` Balbir Singh
  2009-03-31  0:00     ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-30  7:52 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:12:25]:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> This patch adds a function to change the LRU order of pages in global LRU
> under control of memcg's victim of soft limit.
> 
> FILE and ANON victim is divided and LRU rotation will be done independently.
> (memcg which only includes FILE cache or ANON can exists.)
> 
> The routine finds specfied number of pages from memcg's LRU and
> move it to top of global LRU. They will be the first target of shrink_xxx_list.

This seems to be the core of the patch, but I don't like it very
much. Moving the mem cgroup's LRU pages around seems very subtle; why
can't we directly use try_to_free_mem_cgroup_pages()?

-- 
	Balbir


* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-28 18:11 ` Balbir Singh
  2009-03-28 18:27   ` Balbir Singh
@ 2009-03-30 23:54   ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-30 23:54 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Sat, 28 Mar 2009 23:41:00 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> 
> > ==brief test result==
> > On a 2CPU/1.6GB machine, create groups A and B:
> >   A.  soft limit=300M
> >   B.  no soft limit
> > 
> >   Run a malloc() program on B and allocate 1G of memory. The program just
> >   sleeps after allocating the memory and never references it again.
> >   Run make -j 6 and compile the kernel.
> > 
> >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> >   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> > 
> >   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
> >
> 
> I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> 
with your patch?

> Here is what I see with
> 
> A has a swapout of 344M and B has no swapout at all, since B is
> always under its soft limit. vm.swappiness is set to 60
> 
> I think the above is more along the lines of the expected functional behaviour. 
> 

Yes, but it depends on the workload (and fortune?) of A in this implementation.
The following is what I think now. We need some changes to vmscan.c later.

explanation)
    This patch rotates the memcg's pages to the top of the LRU. But the LRU is
    divided into INACTIVE/ACTIVE, so sometimes the memcg's INACTIVE LRU can be
    empty and pages from other groups get reclaimed.
    In my test, group A's RSS usage can drop to 1-2MB sometimes.

Thanks,
-Kame

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-28 18:27   ` Balbir Singh
@ 2009-03-30 23:55     ` KAMEZAWA Hiroyuki
  2009-03-31  5:00       ` Balbir Singh
  2009-03-31  0:06     ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-30 23:55 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Sat, 28 Mar 2009 23:57:47 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> 
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > 
> > > ==brief test result==
> > > On a 2CPU/1.6GB machine, create groups A and B:
> > >   A.  soft limit=300M
> > >   B.  no soft limit
> > > 
> > >   Run a malloc() program on B and allocate 1G of memory. The program just
> > >   sleeps after allocating the memory and never references it again.
> > >   Run make -j 6 and compile the kernel.
> > > 
> > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> > > 
> > >   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
> > >
> > 
> > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > 
> > Here is what I see with
> 
> I meant to say, Here is what I see with my patches (v7)
> 
Hmm, I saw 250MB of swap out ;) As I reported before.

-Kame




* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-29 13:01 ` Balbir Singh
@ 2009-03-30 23:57   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-30 23:57 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Sun, 29 Mar 2009 18:31:38 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> > 
> >  - Inactive/Active rotation scheme of global LRU will be broken.
> > 
> >  - File/Anon reclaim ratio scheme of global LRU will be broken.
> >     - vm.swappiness will be ignored.
> > 
> 
> Not true, with my patches none of these are affected since the reclaim
> for soft limits is limited to mem cgroup LRU lists only. Zone reclaim
> that happens in parallel can of-course change the global LRU.
> 
> >  - If using memcg's memory reclaim routine, 
> >     - shrink_slab() will never be called.
> >     - stale SwapCache has no chance to be reclaimed (stale SwapCache means
> >       pages that were read in but never used.)
> >     - memcg can have no memory in a zone.
> >     - memcg can have no Anon memory
> >     - lumpy reclaim is not done.
> > 
> > 
> > This patch tries to avoid using the existing memcg reclaim routine and
> > just gives "hints" to the global LRU. This patch is briefly tested and shows
> > good results to me. (But maybe not to you; please blame me.)
> > 
> 
> I don't like the results, they are functionaly broken (see my other
> email). Why should "B" get reclaimed from if it is not above its soft
> limit? Why is there a swapout from "B"?
> 
I explained this in another mail.




> 
> > Major characteristics are:
> >  - memcg will be inserted into the softlimit-queue at charge() if its usage
> >    exceeds the soft limit.
> >  - softlimit-queue is a priority queue; priority is determined by the size
> >    of the excess usage.
> >  - memcg's soft limit hooks are called by shrink_xxx_list() to provide hints.
> >  - Behavior is affected by vm.swappiness, and the LRU scan rate is determined
> >    by the global LRU's status.
> > 
> > I'm sorry that I tend not to give enough explanation. Please ask me.
> > There will be many discussion points, anyway. As usual, I'm not in a hurry.
> >
> 
> The code seems to add a lot of complexity and does not achieve expected
> functionality. I am going to start testing this series soon
>  



> > 
> > ==brief test result==
> > On a 2CPU/1.6GB machine, create groups A and B:
> >   A.  soft limit=300M
> >   B.  no soft limit
> > 
> >   Run a malloc() program on B and allocate 1G of memory. The program just
> >   sleeps after allocating the memory and never references it again.
> >   Run make -j 6 and compile the kernel.
> > 
> >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> >   When vm.swappiness = 10  => 1MB of memory are swapped out from B
> > 
> >   With no soft limit, 350MB of swap out happens from B. (swappiness=60)
> > 
> > I'll try much more complexed ones in the weekend.
> 
> Please see my response to this test result in a previous email.
> 
You too; I reported this in your thread one week ago.

Thanks,
-Kame

> -- 
> 	Balbir


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH 4/8] memcg soft limit priority array queue.
  2009-03-29 16:56   ` Balbir Singh
@ 2009-03-30 23:58     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-30 23:58 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Sun, 29 Mar 2009 22:26:20 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:06:53]:
> 
> > I'm now search a way to reduce lock contention without complex...
> > -Kame
> > ==
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > 
> > +static void __mem_cgroup_requeue(struct mem_cgroup *mem)
> > +{
> > +	/* enqueue to softlimit queue */
> > +	int prio = mem->soft_limit_priority;
> > +
> > +	spin_lock(&softlimitq.lock);
> > +	list_del_init(&mem->soft_limit_anon);
> > +	list_add_tail(&mem->soft_limit_anon, &softlimitq.queue[prio][SL_ANON]);
> > +	list_del_init(&mem->soft_limit_file,ist[SL_FILE]);
> 
> Patch fails to build here, what is ist?
> 
Hmm... is my patch broken? That argument should be:
&softlimitq.queue[prio][SL_FILE]

-Kame.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH 7/8] memcg soft limit LRU reorder
  2009-03-30  7:52   ` Balbir Singh
@ 2009-03-31  0:00     ` KAMEZAWA Hiroyuki
  2009-03-31  6:06       ` Balbir Singh
  0 siblings, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  0:00 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Mon, 30 Mar 2009 13:22:46 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:12:25]:
> 
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > 
> > This patch adds a function to change the LRU order of pages in global LRU
> > under control of memcg's victim of soft limit.
> > 
> > FILE and ANON victim is divided and LRU rotation will be done independently.
> > (memcg which only includes FILE cache or ANON can exists.)
> > 
> > The routine finds specfied number of pages from memcg's LRU and
> > move it to top of global LRU. They will be the first target of shrink_xxx_list.
> 
> This seems to be the core of the patch, but I don't like this very
> much. Moving LRU pages of the mem cgroup seems very subtle, why can't
> we directly use try_to_free_mem_cgroup_pages()?
> 
It ignores many things: shrink_slab(), lumpy reclaim, vm.swappiness, and the
other points I listed in the cover letter.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-28 18:27   ` Balbir Singh
  2009-03-30 23:55     ` KAMEZAWA Hiroyuki
@ 2009-03-31  0:06     ` KAMEZAWA Hiroyuki
  2009-03-31  5:01       ` Balbir Singh
  1 sibling, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  0:06 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Sat, 28 Mar 2009 23:57:47 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> 
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > 
> > > ==brief test result==
> > > On 2CPU/1.6GB bytes machine. create group A and B
> > >   A.  soft limit=300M
> > >   B.  no soft limit
> > > 
> > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > >   sleeps after allocating memory and no memory refernce after it.
> > >   Run make -j 6 and compile the kernel.
> > > 
> > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > 
> > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > >
> > 
> > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > 
> > Here is what I see with
> 
> I meant to say, Here is what I see with my patches (v7)
> 

Is your malloc program like this?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ONE_GB (1UL << 30)

int main(int argc, char *argv[])
{
	char *c = malloc(ONE_GB);

	memset(c, 0, ONE_GB);
	getchar();
	return 0;
}


Thanks,
-Kame



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-30 23:55     ` KAMEZAWA Hiroyuki
@ 2009-03-31  5:00       ` Balbir Singh
  2009-03-31  5:05         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-31  5:00 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 08:55:38]:

> On Sat, 28 Mar 2009 23:57:47 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > 
> > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > 
> > > > ==brief test result==
> > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > >   A.  soft limit=300M
> > > >   B.  no soft limit
> > > > 
> > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > >   sleeps after allocating memory and no memory refernce after it.
> > > >   Run make -j 6 and compile the kernel.
> > > > 
> > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > 
> > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > >
> > > 
> > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > 
> > > Here is what I see with
> > 
> > I meant to say, Here is what I see with my patches (v7)
> > 
> Hmm, I saw 250MB of swap out ;) As I reported before.

Swapout for A? For A it is expected, but for B it is not. How many
nodes do you have on your machine? Any fake numa nodes?

-- 
	Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  0:06     ` KAMEZAWA Hiroyuki
@ 2009-03-31  5:01       ` Balbir Singh
  2009-03-31  5:11         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-31  5:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 09:06:07]:

> On Sat, 28 Mar 2009 23:57:47 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > 
> > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > 
> > > > ==brief test result==
> > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > >   A.  soft limit=300M
> > > >   B.  no soft limit
> > > > 
> > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > >   sleeps after allocating memory and no memory refernce after it.
> > > >   Run make -j 6 and compile the kernel.
> > > > 
> > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > 
> > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > >
> > > 
> > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > 
> > > Here is what I see with
> > 
> > I meant to say, Here is what I see with my patches (v7)
> > 
> 
> your malloc program is like this ?
> 
> int main(int argc, char *argv[])
> {
>     c = malloc(1G);
>     memset(c, 0, 1G);
>     getc();
> }
>

Very similar; instead of memset, we go integer by integer and set each
to 0, do two loops of touching, and wait for user input before exiting.
 

-- 
	Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  5:00       ` Balbir Singh
@ 2009-03-31  5:05         ` KAMEZAWA Hiroyuki
  2009-03-31  5:18           ` KAMEZAWA Hiroyuki
  2009-03-31  6:10           ` Balbir Singh
  0 siblings, 2 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  5:05 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 10:30:55 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 08:55:38]:
> 
> > On Sat, 28 Mar 2009 23:57:47 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > 
> > > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > > 
> > > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > > 
> > > > > ==brief test result==
> > > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > > >   A.  soft limit=300M
> > > > >   B.  no soft limit
> > > > > 
> > > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > > >   sleeps after allocating memory and no memory refernce after it.
> > > > >   Run make -j 6 and compile the kernel.
> > > > > 
> > > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > > 
> > > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > > >
> > > > 
> > > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > > 
> > > > Here is what I see with
> > > 
> > > I meant to say, Here is what I see with my patches (v7)
> > > 
> > Hmm, I saw 250MB of swap out ;) As I reported before.
> 
> Swapout for A? For A it is expected, but for B it is not. How many
> nodes do you have on your machine? Any fake numa nodes?
> 
Of course, from B.

No special boot options. My test was on VMware with 2 CPUs/1.6GB of memory.

I wonder why swapout can be 0 in your test. Did you add some extra hooks to
kswapd?

Thanks,
-Kame






^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  5:01       ` Balbir Singh
@ 2009-03-31  5:11         ` KAMEZAWA Hiroyuki
  2009-03-31  6:07           ` Balbir Singh
  0 siblings, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  5:11 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 10:31:43 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 09:06:07]:
> 
> > On Sat, 28 Mar 2009 23:57:47 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > 
> > > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > > 
> > > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > > 
> > > > > ==brief test result==
> > > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > > >   A.  soft limit=300M
> > > > >   B.  no soft limit
> > > > > 
> > > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > > >   sleeps after allocating memory and no memory refernce after it.
> > > > >   Run make -j 6 and compile the kernel.
> > > > > 
> > > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > > 
> > > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > > >
> > > > 
> > > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > > 
> > > > Here is what I see with
> > > 
> > > I meant to say, Here is what I see with my patches (v7)
> > > 
> > 
> > your malloc program is like this ?
> > 
> > int main(int argc, char *argv[])
> > {
> >     c = malloc(1G);
> >     memset(c, 0, 1G);
> >     getc();
> > }
> >
> 
> Very similar, instead of memset, we go integer by integer and set it
> to 0, do two loops of touching and wait for user input before exiting.
>  
Why two loops of touching? Does it have a special meaning?

-Kame



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  5:05         ` KAMEZAWA Hiroyuki
@ 2009-03-31  5:18           ` KAMEZAWA Hiroyuki
  2009-03-31  6:10           ` Balbir Singh
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  5:18 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: balbir, linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 14:05:02 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Tue, 31 Mar 2009 10:30:55 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 08:55:38]:
> > 
> > > On Sat, 28 Mar 2009 23:57:47 +0530
> > > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > > 
> > > > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > > > 
> > > > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > > > 
> > > > > > ==brief test result==
> > > > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > > > >   A.  soft limit=300M
> > > > > >   B.  no soft limit
> > > > > > 
> > > > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > > > >   sleeps after allocating memory and no memory refernce after it.
> > > > > >   Run make -j 6 and compile the kernel.
> > > > > > 
> > > > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > > > 
> > > > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > > > >
> > > > > 
> > > > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > > > 
> > > > > Here is what I see with
> > > > 
> > > > I meant to say, Here is what I see with my patches (v7)
> > > > 
> > > Hmm, I saw 250MB of swap out ;) As I reported before.
> > 
> > Swapout for A? For A it is expected, but for B it is not. How many
> > nodes do you have on your machine? Any fake numa nodes?
> > 
> Of course, from B.
> 
> Nothing special boot options. My test was on VMware 2cpus/1.6GB memory.
> 
To be more precise: the host has 1576444kB of memory, not 1700MB.


Thanks,
-Kame


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH 7/8] memcg soft limit LRU reorder
  2009-03-31  0:00     ` KAMEZAWA Hiroyuki
@ 2009-03-31  6:06       ` Balbir Singh
  2009-03-31  6:19         ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-31  6:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 09:00:23]:

> On Mon, 30 Mar 2009 13:22:46 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:12:25]:
> > 
> > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > 
> > > This patch adds a function to change the LRU order of pages in global LRU
> > > under control of memcg's victim of soft limit.
> > > 
> > > FILE and ANON victim is divided and LRU rotation will be done independently.
> > > (memcg which only includes FILE cache or ANON can exists.)
> > > 
> > > The routine finds specfied number of pages from memcg's LRU and
> > > move it to top of global LRU. They will be the first target of shrink_xxx_list.
> > 
> > This seems to be the core of the patch, but I don't like this very
> > much. Moving LRU pages of the mem cgroup seems very subtle, why can't
> > we directly use try_to_free_mem_cgroup_pages()?
> > 
> It ignores many things.

My concern is that such a subtle modification to the global LRU:

1. Can break the age property of elements in the LRU (we now have mixed
ages in the LRU)
2. Can potentially impact lumpy reclaim, since we've mixed LRU pages
from the memory controller into the global LRU

-- 
	Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  5:11         ` KAMEZAWA Hiroyuki
@ 2009-03-31  6:07           ` Balbir Singh
  0 siblings, 0 replies; 41+ messages in thread
From: Balbir Singh @ 2009-03-31  6:07 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 14:11:40]:

> On Tue, 31 Mar 2009 10:31:43 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 09:06:07]:
> > 
> > > On Sat, 28 Mar 2009 23:57:47 +0530
> > > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > > 
> > > > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > > > 
> > > > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > > > 
> > > > > > ==brief test result==
> > > > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > > > >   A.  soft limit=300M
> > > > > >   B.  no soft limit
> > > > > > 
> > > > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > > > >   sleeps after allocating memory and no memory refernce after it.
> > > > > >   Run make -j 6 and compile the kernel.
> > > > > > 
> > > > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > > > 
> > > > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > > > >
> > > > > 
> > > > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > > > 
> > > > > Here is what I see with
> > > > 
> > > > I meant to say, Here is what I see with my patches (v7)
> > > > 
> > > 
> > > your malloc program is like this ?
> > > 
> > > int main(int argc, char *argv[])
> > > {
> > >     c = malloc(1G);
> > >     memset(c, 0, 1G);
> > >     getc();
> > > }
> > >
> > 
> > Very similar, instead of memset, we go integer by integer and set it
> > to 0, do two loops of touching and wait for user input before exiting.
> >  
> Why two loops of touching ? has special meanings ?

The number of loops is configurable and can be used to keep pages
active. The default is two loops. It has no special meaning in the
test scenario described.

-- 
	Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  5:05         ` KAMEZAWA Hiroyuki
  2009-03-31  5:18           ` KAMEZAWA Hiroyuki
@ 2009-03-31  6:10           ` Balbir Singh
  2009-03-31  6:28             ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-03-31  6:10 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 14:05:02]:

> On Tue, 31 Mar 2009 10:30:55 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 08:55:38]:
> > 
> > > On Sat, 28 Mar 2009 23:57:47 +0530
> > > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > > 
> > > > * Balbir Singh <balbir@linux.vnet.ibm.com> [2009-03-28 23:41:00]:
> > > > 
> > > > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:
> > > > > 
> > > > > > ==brief test result==
> > > > > > On 2CPU/1.6GB bytes machine. create group A and B
> > > > > >   A.  soft limit=300M
> > > > > >   B.  no soft limit
> > > > > > 
> > > > > >   Run a malloc() program on B and allcoate 1G of memory. The program just
> > > > > >   sleeps after allocating memory and no memory refernce after it.
> > > > > >   Run make -j 6 and compile the kernel.
> > > > > > 
> > > > > >   When vm.swappiness = 60  => 60MB of memory are swapped out from B.
> > > > > >   When vm.swappiness = 10  => 1MB of memory are swapped out from B    
> > > > > > 
> > > > > >   If no soft limit, 350MB of swap out will happen from B.(swapiness=60)
> > > > > >
> > > > > 
> > > > > I ran the same tests, booted the machine with mem=1700M and maxcpus=2
> > > > > 
> > > > > Here is what I see with
> > > > 
> > > > I meant to say, Here is what I see with my patches (v7)
> > > > 
> > > Hmm, I saw 250MB of swap out ;) As I reported before.
> > 
> > Swapout for A? For A it is expected, but for B it is not. How many
> > nodes do you have on your machine? Any fake numa nodes?
> > 
> Of course, from B.
>

I asked because I see A has a swapout of 350 MB, which is expected
since it is way over its soft limit.
 
> Nothing special boot options. My test was on VMware 2cpus/1.6GB memory.
> 
> I wonder why swapout can be 0 on your test. Do you add some extra hooks to
> kswapd ?
>

Nope, no special hooks in kswapd. B never enters the RB-tree and thus
never hits the memcg soft limit reclaim path. kswapd can reclaim from
it, but it grows back quickly. At some point, memcg soft limit reclaim
hits A and reclaims memory from it, allowing B to run without any
problems. I am talking about the state at the end of the experiment.

 

-- 
	Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH 7/8] memcg soft limit LRU reorder
  2009-03-31  6:06       ` Balbir Singh
@ 2009-03-31  6:19         ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  6:19 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 11:36:07 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 09:00:23]:
> 
> > On Mon, 30 Mar 2009 13:22:46 +0530
> > Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > 
> > > * KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 14:12:25]:
> > > 
> > > > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > > > 
> > > > This patch adds a function to change the LRU order of pages in global LRU
> > > > under control of memcg's victim of soft limit.
> > > > 
> > > > FILE and ANON victim is divided and LRU rotation will be done independently.
> > > > (memcg which only includes FILE cache or ANON can exists.)
> > > > 
> > > > The routine finds specfied number of pages from memcg's LRU and
> > > > move it to top of global LRU. They will be the first target of shrink_xxx_list.
> > > 
> > > This seems to be the core of the patch, but I don't like this very
> > > much. Moving LRU pages of the mem cgroup seems very subtle, why can't
> > > we directly use try_to_free_mem_cgroup_pages()?
> > > 
> > It ignores many things.
> 
> My concern is that such subtle modification to the global LRU 
> 
Okay, that is probably everyone's concern.

> 1. Can break the age property of elements in the LRU (we have mixed
> ages now in the LRU)

We have to break (change) some of it, anyway.
I think this kind of LRU-swapping/reordering technique is a popular way to
give the LRU a hint, and reordering is one of the least invasive options.
It doesn't affect the global LRU other than the order of pages, and all
statistics are updated in a sane way.


> 2. Can potentially impact lumpy reclaim, since we've mix LRU pages
> from the memory controlelr into the global LRU?
> 

I can't quite catch what you are asking... but I think there is no influence
on lumpy reclaim. It gathers victim pages from the neighborhood of a page
which should be reclaimed. Hmm?


Thanks,
-Kame


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  6:10           ` Balbir Singh
@ 2009-03-31  6:28             ` KAMEZAWA Hiroyuki
  2009-03-31  6:49               ` Balbir Singh
  0 siblings, 1 reply; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  6:28 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 11:40:10 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> > > Swapout for A? For A it is expected, but for B it is not. How many
> > > nodes do you have on your machine? Any fake numa nodes?
> > > 
> > Of course, from B.
> >
> 
> I asked because I see A have a swapout of 350 MB, which is expected
> since it is way over its soft limit.
>  
gcc doesn't use that much RSS... ld, maybe?

> > Nothing special boot options. My test was on VMware 2cpus/1.6GB memory.
> > 
> > I wonder why swapout can be 0 on your test. Do you add some extra hooks to
> > kswapd ?
> >
> 
> Nope.. no special hooks to kswapd. B never enters the RB-Tree and thus
> never hits the memcg soft limit reclaim path. kswapd can reclaim from
> it, but it grows back quickly.
Why does it grow back? The tasks in B sleep, don't they?

>  At some point, memcg soft limit reclaim
> hits A and reclaims memory from it, allowing B to run without any
> problems. I am talking about the state at the end of the experiment.
> 
Considering LRU rotation (ACTIVE->INACTIVE), pages in group B never go back
to the ACTIVE list and can be the first candidates for swap-out via kswapd.

Hmm... does kswapd not run at all?

(Or 1700MB was too much.)

Thanks,
-Kame



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  6:28             ` KAMEZAWA Hiroyuki
@ 2009-03-31  6:49               ` Balbir Singh
  2009-03-31  6:56                 ` KAMEZAWA Hiroyuki
  2009-03-31  6:58                 ` KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 41+ messages in thread
From: Balbir Singh @ 2009-03-31  6:49 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-31 15:28:43]:

> On Tue, 31 Mar 2009 11:40:10 +0530
> Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> 
> > > > Swapout for A? For A it is expected, but for B it is not. How many
> > > > nodes do you have on your machine? Any fake numa nodes?
> > > > 
> > > Of course, from B.
> > >
> > 
> > I asked because I see A have a swapout of 350 MB, which is expected
> > since it is way over its soft limit.
> >  
> gcc doesn't use so much RSS..ld ?

Yes, the ld step consumes a lot of memory; depending on file size and
the number of parallel tasks, memory consumption does go up.

> 
> > > Nothing special boot options. My test was on VMware 2cpus/1.6GB memory.
> > > 
> > > I wonder why swapout can be 0 on your test. Do you add some extra hooks to
> > > kswapd ?
> > >
> > 
> > Nope.. no special hooks to kswapd. B never enters the RB-Tree and thus
> > never hits the memcg soft limit reclaim path. kswapd can reclaim from
> > it, but it grows back quickly.
> Why grows back ? tasks in B sleeps ?

Since B continuously consumes memory.

> 
> >  At some point, memcg soft limit reclaim
> > hits A and reclaims memory from it, allowing B to run without any
> > problems. I am talking about the state at the end of the experiment.
> > 
> Considering LRU rotation (ACTIVE->INACTIVE), pages in group B never goes back
> to ACTIVE list and can be the first candidates for swap-out via kswapd.
> 
> Hmm....kswapd doesn't work at all ?
> 
> (or 1700MB was too much.)
>

No, 1700MB is not too much, since we reclaim from A towards the end
when ld runs. I need to investigate more and look at the watermarks;
maybe soft limit reclaim reclaims enough and/or the watermarks are
not very high. I use fake NUMA nodes as well.
 
> Thanks,
> -Kame
> 
> 

-- 
	Balbir

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  6:49               ` Balbir Singh
@ 2009-03-31  6:56                 ` KAMEZAWA Hiroyuki
  2009-03-31  6:58                 ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  6:56 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 12:19:02 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:

> > 
> > > > Nothing special boot options. My test was on VMware 2cpus/1.6GB memory.
> > > > 
> > > > I wonder why swapout can be 0 on your test. Do you add some extra hooks to
> > > > kswapd ?
> > > >
> > > 
> > > Nope.. no special hooks to kswapd. B never enters the RB-Tree and thus
> > > never hits the memcg soft limit reclaim path. kswapd can reclaim from
> > > it, but it grows back quickly.
> > Why grows back ? tasks in B sleeps ?
> 
> Since B continuously consumes memory
> 
It doesn't sleep?

In my test:
 1. malloc 1GB, touch it all, and sleep in B. Wait until the memory usage in B
    goes up to 1024MB. This never wakes up until 3.
 2. Run make in group A.
 3. Kill the malloc program.

Then why does it continuously consume memory?

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-31  6:49               ` Balbir Singh
  2009-03-31  6:56                 ` KAMEZAWA Hiroyuki
@ 2009-03-31  6:58                 ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  6:58 UTC (permalink / raw)
  To: balbir; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

On Tue, 31 Mar 2009 12:19:02 +0530
Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
> > 
> > >  At some point, memcg soft limit reclaim
> > > hits A and reclaims memory from it, allowing B to run without any
> > > problems. I am talking about the state at the end of the experiment.
> > > 
> > Considering LRU rotation (ACTIVE->INACTIVE), pages in group B never goes back
> > to ACTIVE list and can be the first candidates for swap-out via kswapd.
> > 
> > Hmm....kswapd doesn't work at all ?
> > 
> > (or 1700MB was too much.)
> >
> 
> No 1700MB is not too much, since we reclaim from A towards the end
> when ld runs. I need to investigate more and look at the watermarks,
> may be soft limit reclaim reclaims enough and/or the watermarks are
> not very high. I use fake NUMA nodes as well.
>  
When talking about XXMB of swap, +100MB is a lot ;)

-Kame


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1
  2009-03-27  5:09 ` [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
  2009-03-27  5:14   ` KAMEZAWA Hiroyuki
@ 2009-03-31  8:18   ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-03-31  8:18 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: linux-kernel, linux-mm, balbir, kosaki.motohiro, nishimura

On Fri, 27 Mar 2009 14:09:23 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> memcg's reclaim routine is designed to ignore locality and placement, and
> so the inactive_anon_is_low() function doesn't take "zone" as an argument.
> 
> In a later soft limit patch, we use "zone" as an argument.
> 
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> Index: mmotm-2.6.29-Mar23/mm/memcontrol.c
> ===================================================================
> --- mmotm-2.6.29-Mar23.orig/mm/memcontrol.c
> +++ mmotm-2.6.29-Mar23/mm/memcontrol.c
> @@ -561,15 +561,28 @@ void mem_cgroup_record_reclaim_priority(
>  	spin_unlock(&mem->reclaim_param_lock);
>  }
>  
> -static int calc_inactive_ratio(struct mem_cgroup *memcg, unsigned long *present_pages)
> +static int calc_inactive_ratio(struct mem_cgroup *memcg,
> +			       unsigned long *present_pages,
> +			       struct zone *z)
>  {
>  	unsigned long active;
>  	unsigned long inactive;
>  	unsigned long gb;
>  	unsigned long inactive_ratio;
>  
> -	inactive = mem_cgroup_get_local_zonestat(memcg, LRU_INACTIVE_ANON);
> -	active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_ANON);
> +	if (!z) {
> +		inactive = mem_cgroup_get_local_zonestat(memcg,
> +							 LRU_INACTIVE_ANON);
> +		active = mem_cgroup_get_local_zonestat(memcg, LRU_ACTIVE_ANON);
> +	} else {
> +		int nid = z->zone_pgdat->node_id;
> +		int zid = zone_idx(z);
> +		struct mem_cgroup_per_zone *mz;
> +
> +		mz = mem_cgroup_zoneinfo(memcg, nid, zid);
> +		inactive = MEM_CGROUP_ZSTAT(mz, LRU_INACTIVE_ANON);
> +		active = MEM_CGROUP_ZSTAT(mz, LRU_ACTIVE_ANON);
> +	}
>  
>  	gb = (inactive + active) >> (30 - PAGE_SHIFT);
>  	if (gb)
> @@ -585,14 +598,14 @@ static int calc_inactive_ratio(struct me
>  	return inactive_ratio;
>  }
>  
> -int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg)
> +int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *z)
>  {
>  	unsigned long active;
>  	unsigned long inactive;
>  	unsigned long present_pages[2];
>  	unsigned long inactive_ratio;
>  
> -	inactive_ratio = calc_inactive_ratio(memcg, present_pages);
> +	inactive_ratio = calc_inactive_ratio(memcg, present_pages, NULL);

The last argument should be "z", not NULL...

Seems the posted version is a bit old... OMG, sorry.

I'm now adding bugfix etc..

-Kame



* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
                   ` (10 preceding siblings ...)
  2009-03-29 13:01 ` Balbir Singh
@ 2009-04-01 14:42 ` Balbir Singh
  2009-04-01 15:11   ` KAMEZAWA Hiroyuki
  11 siblings, 1 reply; 41+ messages in thread
From: Balbir Singh @ 2009-04-01 14:42 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki; +Cc: linux-kernel, linux-mm, kosaki.motohiro, nishimura

* KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> [2009-03-27 13:59:33]:

> Hi,
> 
> Memory cgroup's soft limit feature is a feature to tell global LRU 
> "please reclaim from this memcg at memory shortage".
> 
> Both Balbir's version and mine have been proposed.
> This is a new one (so it restarts from v1), and it is very young.
> 
> While testing soft limits, my dilemmas were the following:
> 
>  - additional scanning cost is needed if the implementation is naive (unavoidable?)
> 
>  - the Inactive/Active rotation scheme of the global LRU will be broken.
> 
>  - the File/Anon reclaim ratio scheme of the global LRU will be broken.
>     - vm.swappiness will be ignored.
> 
>  - if using memcg's memory reclaim routine,
>     - shrink_slab() will never be called.
>     - stale SwapCache has no chance to be reclaimed (stale SwapCache means
>       pages read in but never used.)
>     - a memcg can have no memory in a zone.
>     - a memcg can have no Anon memory.
>     - lumpy reclaim is not done.
> 
> 
> This patch avoids using memcg's existing reclaim routine and instead
> just gives "hints" to the global LRU. The patch has been briefly tested
> and shows good results to me. (But maybe not to you; please blame me.)
> 
> The major characteristics are:
>  - a memcg is inserted into the softlimit-queue at charge() if its usage
>    exceeds the soft limit.
>  - the softlimit-queue is a priority queue; priority is determined by how
>    much the usage exceeds the soft limit.
>  - memcg's soft limit hooks are called by shrink_xxx_list() to give hints.
>  - behavior is affected by vm.swappiness, and the LRU scan rate is
>    determined by the global LRU's status.
> 
> I'm sorry that I tend not to give enough explanation; please ask me.
> There will be many discussion points, anyway. As usual, I'm not in a hurry.
> 
> 
> ==brief test result==
> On a 2-CPU/1.6GB machine, create groups A and B:
>   A.  soft limit=300M
>   B.  no soft limit
> 
>   Run a malloc() program in B and allocate 1G of memory. The program just
>   sleeps after allocating the memory, with no memory references after that.
>   Then run make -j 6 and compile the kernel.
> 
>   When vm.swappiness = 60  => 60MB of memory is swapped out from B.
>   When vm.swappiness = 10  => 1MB of memory is swapped out from B.
> 
>   With no soft limit, 350MB is swapped out from B (swappiness=60).
>

I did some brief functionality tests and the results are far better
than the previous versions of the patch. Both my v7 (with some minor
changes) and this patchset seem to do well functionally. Time to do
some more exhaustive tests. Any results from your end? 

-- 
	Balbir


* Re: [RFC][PATCH] memcg soft limit (yet another new design) v1
  2009-04-01 14:42 ` Balbir Singh
@ 2009-04-01 15:11   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-04-01 15:11 UTC (permalink / raw)
  To: balbir
  Cc: KAMEZAWA Hiroyuki, linux-kernel, linux-mm, kosaki.motohiro, nishimura

Balbir Singh wrote:
>> ==brief test result==
>> On a 2-CPU/1.6GB machine, create groups A and B:
>>   A.  soft limit=300M
>>   B.  no soft limit
>>
>>   Run a malloc() program in B and allocate 1G of memory. The program
>>   just sleeps after allocating the memory, with no memory references
>>   after that. Then run make -j 6 and compile the kernel.
>>
>>   When vm.swappiness = 60  => 60MB of memory is swapped out from B.
>>   When vm.swappiness = 10  => 1MB of memory is swapped out from B.
>>
>>   With no soft limit, 350MB is swapped out from B (swappiness=60).
>>
>
> I did some brief functionality tests and the results are far better
> than the previous versions of the patch. Both my v7 (with some minor
> changes) and this patchset seem to do well functionally. Time to do
> some more exhaustive tests. Any results from your end?
>
Glad to hear that.

It shows good results under several simple tests after fixing
inactive_anon_is_low(). But it needed some fixes for corner cases
(adding hooks to uncharge, to CPU hotplug, etc.), plus slimming the
code and tuning parameters to make more sense (or adding comments).

I wonder whether it makes sense to post v2 before the new mmotm.
(mmotm includes some fixes around memcg/vmscan.)
I'll continue testing (hopefully more complicated cases on a big machine).

Anyway, I often update a patch to v5 or beyond before posting the final one ;)

Thanks,
-Kame





end of thread [~2009-04-01 15:11 UTC]

Thread overview: 41+ messages
2009-03-27  4:59 [RFC][PATCH] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
2009-03-27  5:01 ` [RFC][PATCH 1/8] soft limit support in res_counter KAMEZAWA Hiroyuki
2009-03-27  5:03 ` [RFC][PATCH 2/8] soft limit framework in memcg KAMEZAWA Hiroyuki
2009-03-27  8:01   ` KAMEZAWA Hiroyuki
2009-03-29 17:22   ` Balbir Singh
2009-03-27  5:05 ` [RFC][PATCH 3/8] trigger for updating soft limit information KAMEZAWA Hiroyuki
2009-03-27  5:06 ` [RFC][PATCH 4/8] memcg soft limit priority array queue KAMEZAWA Hiroyuki
2009-03-29 16:56   ` Balbir Singh
2009-03-30 23:58     ` KAMEZAWA Hiroyuki
2009-03-27  5:09 ` [RFC][PATCH 5/8] memcg soft limit (yet another new design) v1 KAMEZAWA Hiroyuki
2009-03-27  5:14   ` KAMEZAWA Hiroyuki
2009-03-31  8:18   ` KAMEZAWA Hiroyuki
2009-03-27  5:11 ` [RFC][PATCH 6/8] soft limit victim select KAMEZAWA Hiroyuki
2009-03-27  5:12 ` [RFC][PATCH 7/8] memcg soft limit LRU reorder KAMEZAWA Hiroyuki
2009-03-30  7:52   ` Balbir Singh
2009-03-31  0:00     ` KAMEZAWA Hiroyuki
2009-03-31  6:06       ` Balbir Singh
2009-03-31  6:19         ` KAMEZAWA Hiroyuki
2009-03-27  5:13 ` [RFC][PATCH 8/8] extends soft limit event filter KAMEZAWA Hiroyuki
2009-03-28  8:23 ` [RFC][PATCH] memcg soft limit (yet another new design) v1 Balbir Singh
2009-03-28 16:10   ` KAMEZAWA Hiroyuki
2009-03-28 18:11 ` Balbir Singh
2009-03-28 18:27   ` Balbir Singh
2009-03-30 23:55     ` KAMEZAWA Hiroyuki
2009-03-31  5:00       ` Balbir Singh
2009-03-31  5:05         ` KAMEZAWA Hiroyuki
2009-03-31  5:18           ` KAMEZAWA Hiroyuki
2009-03-31  6:10           ` Balbir Singh
2009-03-31  6:28             ` KAMEZAWA Hiroyuki
2009-03-31  6:49               ` Balbir Singh
2009-03-31  6:56                 ` KAMEZAWA Hiroyuki
2009-03-31  6:58                 ` KAMEZAWA Hiroyuki
2009-03-31  0:06     ` KAMEZAWA Hiroyuki
2009-03-31  5:01       ` Balbir Singh
2009-03-31  5:11         ` KAMEZAWA Hiroyuki
2009-03-31  6:07           ` Balbir Singh
2009-03-30 23:54   ` KAMEZAWA Hiroyuki
2009-03-29 13:01 ` Balbir Singh
2009-03-30 23:57   ` KAMEZAWA Hiroyuki
2009-04-01 14:42 ` Balbir Singh
2009-04-01 15:11   ` KAMEZAWA Hiroyuki
