All of lore.kernel.org
 help / color / mirror / Atom feed
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Johannes Weiner <jweiner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <bsingharora@gmail.com>,
	Andrew Brestic <abrestic@google.com>,
	Ying Han <yinghan@google.com>, Michal Hocko <mhocko@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] Revert "memcg: add memory.vmscan_stat"
Date: Tue, 30 Aug 2011 10:12:33 +0900	[thread overview]
Message-ID: <20110830101233.ae416284.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <20110829155113.GA21661@redhat.com>

On Mon, 29 Aug 2011 17:51:13 +0200
Johannes Weiner <jweiner@redhat.com> wrote:

> On Tue, Aug 09, 2011 at 08:33:45AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Mon, 8 Aug 2011 14:43:33 +0200
> > Johannes Weiner <jweiner@redhat.com> wrote:
> > 
> > > On Fri, Jul 22, 2011 at 05:15:40PM +0900, KAMEZAWA Hiroyuki wrote:
> > > > +When under_hierarchy is added in the tail, the number indicates the
> > > > +total memcg scan of its children and itself.
> > > 
> > > In your implementation, statistics are only accounted to the memcg
> > > triggering the limit and the respectively scanned memcgs.
> > > 
> > > Consider the following setup:
> > > 
> > >         A
> > >        / \
> > >       B   C
> > >      /
> > >     D
> > > 
> > > If D tries to charge but hits the limit of A, then B's hierarchy
> > > counters do not reflect the reclaim activity resulting in D.
> > > 
> > yes, as I expected.
> 
> Andrew,
> 
> with a flawed design, the author unwilling to fix it, and two NAKs,
> can we please revert this before the release?
> 

How about this ?
==
Now, vmscan_stat's hierarchy counter just counts scan data which
is caused by the owner of limits. Then, it's not 'hierarchical'
as other parts of memcg does.

For example, Assuming following hierarchy

	A
       /
      B
     /
    C

When B,C, is scanned because of A's limit, vmscan_stat's
hierarchy accounting does
   A's hierarchy scan = A'scan + B'scan + C'scan
   B's hierarchy scan = 0
   C's hierarchy scan = 0
This first design was because the author considered C's
scan is caused by A. But considering interface compatibility,
following is natural.

  A's hierarchy scan = A'scan + B'scan + C'scan
  B's hierarchy scan = B'scan + C'scan
  C's hierarchy scan = C'scan

This patch changes counting implementation.

Suggested-by: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

Index: mmotm-Aug29/mm/memcontrol.c
===================================================================
--- mmotm-Aug29.orig/mm/memcontrol.c
+++ mmotm-Aug29/mm/memcontrol.c
@@ -229,7 +229,7 @@ enum {
 struct scanstat {
 	spinlock_t	lock;
 	unsigned long	stats[NR_SCAN_CONTEXT][NR_SCANSTATS];
-	unsigned long	rootstats[NR_SCAN_CONTEXT][NR_SCANSTATS];
+	unsigned long	hierarchy_stats[NR_SCAN_CONTEXT][NR_SCANSTATS];
 };
 
 const char *scanstat_string[NR_SCANSTATS] = {
@@ -1701,6 +1701,7 @@ static void __mem_cgroup_record_scanstat
 static void mem_cgroup_record_scanstat(struct memcg_scanrecord *rec)
 {
 	struct mem_cgroup *memcg;
+	struct cgroup *cgroup;
 	int context = rec->context;
 
 	if (context >= NR_SCAN_CONTEXT)
@@ -1710,11 +1711,18 @@ static void mem_cgroup_record_scanstat(s
 	spin_lock(&memcg->scanstat.lock);
 	__mem_cgroup_record_scanstat(memcg->scanstat.stats[context], rec);
 	spin_unlock(&memcg->scanstat.lock);
-
-	memcg = rec->root;
-	spin_lock(&memcg->scanstat.lock);
-	__mem_cgroup_record_scanstat(memcg->scanstat.rootstats[context], rec);
-	spin_unlock(&memcg->scanstat.lock);
+	cgroup = memcg->css.cgroup;
+	do {
+		spin_lock(&memcg->scanstat.lock);
+		__mem_cgroup_record_scanstat(
+			memcg->scanstat.hierarchy_stats[context], rec);
+		spin_unlock(&memcg->scanstat.lock);
+		if (!cgroup->parent)
+			break;
+		cgroup = cgroup->parent;
+		memcg = mem_cgroup_from_cont(cgroup);
+	} while (memcg->use_hierarchy && memcg != rec->root);
+	return;
 }
 
 /*
@@ -4733,14 +4741,14 @@ static int mem_cgroup_vmscan_stat_read(s
 		strcat(string, SCANSTAT_WORD_LIMIT);
 		strcat(string, SCANSTAT_WORD_HIERARCHY);
 		cb->fill(cb,
-			string, memcg->scanstat.rootstats[SCAN_BY_LIMIT][i]);
+		    string, memcg->scanstat.hierarchy_stats[SCAN_BY_LIMIT][i]);
 	}
 	for (i = 0; i < NR_SCANSTATS; i++) {
 		strcpy(string, scanstat_string[i]);
 		strcat(string, SCANSTAT_WORD_SYSTEM);
 		strcat(string, SCANSTAT_WORD_HIERARCHY);
 		cb->fill(cb,
-			string, memcg->scanstat.rootstats[SCAN_BY_SYSTEM][i]);
+		    string, memcg->scanstat.hierarchy_stats[SCAN_BY_SYSTEM][i]);
 	}
 	return 0;
 }
@@ -4752,8 +4760,8 @@ static int mem_cgroup_reset_vmscan_stat(
 
 	spin_lock(&memcg->scanstat.lock);
 	memset(&memcg->scanstat.stats, 0, sizeof(memcg->scanstat.stats));
-	memset(&memcg->scanstat.rootstats,
-		0, sizeof(memcg->scanstat.rootstats));
+	memset(&memcg->scanstat.hierarchy_stats,
+		0, sizeof(memcg->scanstat.hierarchy_stats));
 	spin_unlock(&memcg->scanstat.lock);
 	return 0;
 }







WARNING: multiple messages have this Message-ID (diff)
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
To: Johannes Weiner <jweiner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>,
	Balbir Singh <bsingharora@gmail.com>,
	Andrew Brestic <abrestic@google.com>,
	Ying Han <yinghan@google.com>, Michal Hocko <mhocko@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [patch] Revert "memcg: add memory.vmscan_stat"
Date: Tue, 30 Aug 2011 10:12:33 +0900	[thread overview]
Message-ID: <20110830101233.ae416284.kamezawa.hiroyu@jp.fujitsu.com> (raw)
In-Reply-To: <20110829155113.GA21661@redhat.com>

On Mon, 29 Aug 2011 17:51:13 +0200
Johannes Weiner <jweiner@redhat.com> wrote:

> On Tue, Aug 09, 2011 at 08:33:45AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Mon, 8 Aug 2011 14:43:33 +0200
> > Johannes Weiner <jweiner@redhat.com> wrote:
> > 
> > > On Fri, Jul 22, 2011 at 05:15:40PM +0900, KAMEZAWA Hiroyuki wrote:
> > > > +When under_hierarchy is added in the tail, the number indicates the
> > > > +total memcg scan of its children and itself.
> > > 
> > > In your implementation, statistics are only accounted to the memcg
> > > triggering the limit and the respectively scanned memcgs.
> > > 
> > > Consider the following setup:
> > > 
> > >         A
> > >        / \
> > >       B   C
> > >      /
> > >     D
> > > 
> > > If D tries to charge but hits the limit of A, then B's hierarchy
> > > counters do not reflect the reclaim activity resulting in D.
> > > 
> > yes, as I expected.
> 
> Andrew,
> 
> with a flawed design, the author unwilling to fix it, and two NAKs,
> can we please revert this before the release?
> 

How about this ?
==
Now, vmscan_stat's hierarchy counter just counts scan data which
is caused by the owner of limits. Then, it's not 'hierarchical'
as other parts of memcg does.

For example, Assuming following hierarchy

	A
       /
      B
     /
    C

When B,C, is scanned because of A's limit, vmscan_stat's
hierarchy accounting does
   A's hierarchy scan = A'scan + B'scan + C'scan
   B's hierarchy scan = 0
   C's hierarchy scan = 0
This first design was because the author considered C's
scan is caused by A. But considering interface compatibility,
following is natural.

  A's hierarchy scan = A'scan + B'scan + C'scan
  B's hierarchy scan = B'scan + C'scan
  C's hierarchy scan = C'scan

This patch changes counting implementation.

Suggested-by: Johannes Weiner <jweiner@redhat.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 mm/memcontrol.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

Index: mmotm-Aug29/mm/memcontrol.c
===================================================================
--- mmotm-Aug29.orig/mm/memcontrol.c
+++ mmotm-Aug29/mm/memcontrol.c
@@ -229,7 +229,7 @@ enum {
 struct scanstat {
 	spinlock_t	lock;
 	unsigned long	stats[NR_SCAN_CONTEXT][NR_SCANSTATS];
-	unsigned long	rootstats[NR_SCAN_CONTEXT][NR_SCANSTATS];
+	unsigned long	hierarchy_stats[NR_SCAN_CONTEXT][NR_SCANSTATS];
 };
 
 const char *scanstat_string[NR_SCANSTATS] = {
@@ -1701,6 +1701,7 @@ static void __mem_cgroup_record_scanstat
 static void mem_cgroup_record_scanstat(struct memcg_scanrecord *rec)
 {
 	struct mem_cgroup *memcg;
+	struct cgroup *cgroup;
 	int context = rec->context;
 
 	if (context >= NR_SCAN_CONTEXT)
@@ -1710,11 +1711,18 @@ static void mem_cgroup_record_scanstat(s
 	spin_lock(&memcg->scanstat.lock);
 	__mem_cgroup_record_scanstat(memcg->scanstat.stats[context], rec);
 	spin_unlock(&memcg->scanstat.lock);
-
-	memcg = rec->root;
-	spin_lock(&memcg->scanstat.lock);
-	__mem_cgroup_record_scanstat(memcg->scanstat.rootstats[context], rec);
-	spin_unlock(&memcg->scanstat.lock);
+	cgroup = memcg->css.cgroup;
+	do {
+		spin_lock(&memcg->scanstat.lock);
+		__mem_cgroup_record_scanstat(
+			memcg->scanstat.hierarchy_stats[context], rec);
+		spin_unlock(&memcg->scanstat.lock);
+		if (!cgroup->parent)
+			break;
+		cgroup = cgroup->parent;
+		memcg = mem_cgroup_from_cont(cgroup);
+	} while (memcg->use_hierarchy && memcg != rec->root);
+	return;
 }
 
 /*
@@ -4733,14 +4741,14 @@ static int mem_cgroup_vmscan_stat_read(s
 		strcat(string, SCANSTAT_WORD_LIMIT);
 		strcat(string, SCANSTAT_WORD_HIERARCHY);
 		cb->fill(cb,
-			string, memcg->scanstat.rootstats[SCAN_BY_LIMIT][i]);
+		    string, memcg->scanstat.hierarchy_stats[SCAN_BY_LIMIT][i]);
 	}
 	for (i = 0; i < NR_SCANSTATS; i++) {
 		strcpy(string, scanstat_string[i]);
 		strcat(string, SCANSTAT_WORD_SYSTEM);
 		strcat(string, SCANSTAT_WORD_HIERARCHY);
 		cb->fill(cb,
-			string, memcg->scanstat.rootstats[SCAN_BY_SYSTEM][i]);
+		    string, memcg->scanstat.hierarchy_stats[SCAN_BY_SYSTEM][i]);
 	}
 	return 0;
 }
@@ -4752,8 +4760,8 @@ static int mem_cgroup_reset_vmscan_stat(
 
 	spin_lock(&memcg->scanstat.lock);
 	memset(&memcg->scanstat.stats, 0, sizeof(memcg->scanstat.stats));
-	memset(&memcg->scanstat.rootstats,
-		0, sizeof(memcg->scanstat.rootstats));
+	memset(&memcg->scanstat.hierarchy_stats,
+		0, sizeof(memcg->scanstat.hierarchy_stats));
 	spin_unlock(&memcg->scanstat.lock);
 	return 0;
 }






--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2011-08-30  1:20 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-07-22  8:15 [PATCH v3] memcg: add memory.vmscan_stat KAMEZAWA Hiroyuki
2011-07-22  8:15 ` KAMEZAWA Hiroyuki
2011-08-08 12:43 ` Johannes Weiner
2011-08-08 12:43   ` Johannes Weiner
2011-08-08 23:33   ` KAMEZAWA Hiroyuki
2011-08-08 23:33     ` KAMEZAWA Hiroyuki
2011-08-09  8:01     ` Johannes Weiner
2011-08-09  8:01       ` Johannes Weiner
2011-08-09  8:01       ` KAMEZAWA Hiroyuki
2011-08-09  8:01         ` KAMEZAWA Hiroyuki
2011-08-13  1:04         ` Ying Han
2011-08-13  1:04           ` Ying Han
2011-08-29 15:51     ` [patch] Revert "memcg: add memory.vmscan_stat" Johannes Weiner
2011-08-29 15:51       ` Johannes Weiner
2011-08-30  1:12       ` KAMEZAWA Hiroyuki [this message]
2011-08-30  1:12         ` KAMEZAWA Hiroyuki
2011-08-30  7:04         ` Johannes Weiner
2011-08-30  7:04           ` Johannes Weiner
2011-08-30  7:20           ` KAMEZAWA Hiroyuki
2011-08-30  7:20             ` KAMEZAWA Hiroyuki
2011-08-30  7:35             ` KAMEZAWA Hiroyuki
2011-08-30  7:35               ` KAMEZAWA Hiroyuki
2011-08-30  8:42             ` Johannes Weiner
2011-08-30  8:42               ` Johannes Weiner
2011-08-30  8:56               ` KAMEZAWA Hiroyuki
2011-08-30  8:56                 ` KAMEZAWA Hiroyuki
2011-08-30 10:17                 ` Johannes Weiner
2011-08-30 10:17                   ` Johannes Weiner
2011-08-30 10:34                   ` KAMEZAWA Hiroyuki
2011-08-30 10:34                     ` KAMEZAWA Hiroyuki
2011-08-30 11:03                     ` Johannes Weiner
2011-08-30 11:03                       ` Johannes Weiner
2011-08-30 23:38                       ` KAMEZAWA Hiroyuki
2011-08-30 23:38                         ` KAMEZAWA Hiroyuki
2011-08-30 10:38                   ` KAMEZAWA Hiroyuki
2011-08-30 10:38                     ` KAMEZAWA Hiroyuki
2011-08-30 11:32                     ` Johannes Weiner
2011-08-30 11:32                       ` Johannes Weiner
2011-08-30 23:29                       ` KAMEZAWA Hiroyuki
2011-08-30 23:29                         ` KAMEZAWA Hiroyuki
2011-08-31  6:23                         ` Johannes Weiner
2011-08-31  6:23                           ` Johannes Weiner
2011-08-31  6:30                           ` KAMEZAWA Hiroyuki
2011-08-31  6:30                             ` KAMEZAWA Hiroyuki
2011-08-31  8:33                             ` Johannes Weiner
2011-08-31  8:33                               ` Johannes Weiner
2011-09-01  6:05               ` Ying Han
2011-09-01  6:05                 ` Ying Han
2011-09-01  6:40                 ` Johannes Weiner
2011-09-01  6:40                   ` Johannes Weiner
2011-09-01  7:04                   ` Ying Han
2011-09-01  7:04                     ` Ying Han
2011-09-01  8:27                     ` Johannes Weiner
2011-09-01  8:27                       ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110830101233.ae416284.kamezawa.hiroyu@jp.fujitsu.com \
    --to=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=abrestic@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=bsingharora@gmail.com \
    --cc=jweiner@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=nishimura@mxp.nes.nec.co.jp \
    --cc=yinghan@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.