* [patch 00/10] mm: memcg: charge/uncharge improvements v2
@ 2012-07-11 17:02 ` Johannes Weiner
  0 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

Hi,

This is the second version of the tiny charge/uncharge improvements,
with review feedback incorporated and acks added (thanks, guys).

changes:
o fixed the 03/10 PageSwapCache check in end_migration(), spotted by Kame
o dropped the v1 03/11 shmem patch in favor of Hugh's cleanup
o included default group charging comment fix in 07/10, spotted by Wanpeng

 include/linux/memcontrol.h |   11 +--
 mm/memcontrol.c            |  207 +++++++++++++++++++++++---------------------
 mm/migrate.c               |   27 ++-----
 mm/swapfile.c              |    3 +-
 4 files changed, 119 insertions(+), 129 deletions(-)


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

Compaction (and page migration in general) can currently be hindered
by pages owned by memory cgroups that are at their limits and
unreclaimable.

The reason is that the replacement page is being charged against the
limit while the page being replaced is also still charged.  But this
seems unnecessary, given that only one of the two pages will still be
in use after migration finishes.

This patch changes the memcg migration sequence so that the
replacement page is not charged.  Whatever page is still in use after
successful or failed migration gets to keep the charge of the page
that was going to be replaced.
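
As a rough sketch of the resulting call sequence (only an
illustration; the authoritative changes are in the diff below), the
charge simply stays with whichever page survives:

        mem_cgroup_prepare_migration(page, newpage, &memcg);
        /*
         * newpage is committed to the same memcg, but no res_counter
         * charge is taken for it; the old page keeps its charge.
         */

        /* ... copy page contents, remap or restore the ptes ... */

        mem_cgroup_end_migration(memcg, page, newpage, rc == 0);
        /*
         * Uncharges whichever of the two pages ended up unused,
         * again without touching the res_counter, so the surviving
         * page inherits the original charge.
         */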

The replacement page will still show up temporarily in the rss/cache
statistics; this can be fixed in a later patch, as it is less urgent.

Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 include/linux/memcontrol.h |   11 +++----
 mm/memcontrol.c            |   67 +++++++++++++++++++++++--------------------
 mm/migrate.c               |   27 ++++--------------
 3 files changed, 47 insertions(+), 58 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 5a3ee64..8d9489f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -98,9 +98,9 @@ int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
 
 extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);
 
-extern int
-mem_cgroup_prepare_migration(struct page *page,
-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask);
+extern void
+mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
+			     struct mem_cgroup **memcgp);
 extern void mem_cgroup_end_migration(struct mem_cgroup *memcg,
 	struct page *oldpage, struct page *newpage, bool migration_ok);
 
@@ -276,11 +276,10 @@ static inline struct cgroup_subsys_state
 	return NULL;
 }
 
-static inline int
+static inline void
 mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
-	struct mem_cgroup **memcgp, gfp_t gfp_mask)
+			     struct mem_cgroup **memcgp)
 {
-	return 0;
 }
 
 static inline void mem_cgroup_end_migration(struct mem_cgroup *memcg,
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e8ddc00..12ee2de 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2977,7 +2977,8 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
  * uncharge if !page_mapped(page)
  */
 static struct mem_cgroup *
-__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
+__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
+			     bool end_migration)
 {
 	struct mem_cgroup *memcg = NULL;
 	unsigned int nr_pages = 1;
@@ -3021,7 +3022,16 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 		/* fallthrough */
 	case MEM_CGROUP_CHARGE_TYPE_DROP:
 		/* See mem_cgroup_prepare_migration() */
-		if (page_mapped(page) || PageCgroupMigration(pc))
+		if (page_mapped(page))
+			goto unlock_out;
+		/*
+		 * Pages under migration may not be uncharged.  But
+		 * end_migration() /must/ be the one uncharging the
+		 * unused post-migration page and so it has to call
+		 * here with the migration bit still set.  See the
+		 * res_counter handling below.
+		 */
+		if (!end_migration && PageCgroupMigration(pc))
 			goto unlock_out;
 		break;
 	case MEM_CGROUP_CHARGE_TYPE_SWAPOUT:
@@ -3055,7 +3065,12 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
 		mem_cgroup_swap_statistics(memcg, true);
 		mem_cgroup_get(memcg);
 	}
-	if (!mem_cgroup_is_root(memcg))
+	/*
+	 * Migration does not charge the res_counter for the
+	 * replacement page, so leave it alone when phasing out the
+	 * page that is unused after the migration.
+	 */
+	if (!end_migration && !mem_cgroup_is_root(memcg))
 		mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
 
 	return memcg;
@@ -3071,14 +3086,14 @@ void mem_cgroup_uncharge_page(struct page *page)
 	if (page_mapped(page))
 		return;
 	VM_BUG_ON(page->mapping && !PageAnon(page));
-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON);
+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
 }
 
 void mem_cgroup_uncharge_cache_page(struct page *page)
 {
 	VM_BUG_ON(page_mapped(page));
 	VM_BUG_ON(page->mapping);
-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE, false);
 }
 
 /*
@@ -3142,7 +3157,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
 	if (!swapout) /* this was a swap cache but the swap is unused ! */
 		ctype = MEM_CGROUP_CHARGE_TYPE_DROP;
 
-	memcg = __mem_cgroup_uncharge_common(page, ctype);
+	memcg = __mem_cgroup_uncharge_common(page, ctype, false);
 
 	/*
 	 * record memcg information,  if swapout && memcg != NULL,
@@ -3232,19 +3247,18 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
  * Before starting migration, account PAGE_SIZE to mem_cgroup that the old
  * page belongs to.
  */
-int mem_cgroup_prepare_migration(struct page *page,
-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask)
+void mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
+				  struct mem_cgroup **memcgp)
 {
 	struct mem_cgroup *memcg = NULL;
 	struct page_cgroup *pc;
 	enum charge_type ctype;
-	int ret = 0;
 
 	*memcgp = NULL;
 
 	VM_BUG_ON(PageTransHuge(page));
 	if (mem_cgroup_disabled())
-		return 0;
+		return;
 
 	pc = lookup_page_cgroup(page);
 	lock_page_cgroup(pc);
@@ -3289,24 +3303,9 @@ int mem_cgroup_prepare_migration(struct page *page,
 	 * we return here.
 	 */
 	if (!memcg)
-		return 0;
+		return;
 
 	*memcgp = memcg;
-	ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, memcgp, false);
-	css_put(&memcg->css);/* drop extra refcnt */
-	if (ret) {
-		if (PageAnon(page)) {
-			lock_page_cgroup(pc);
-			ClearPageCgroupMigration(pc);
-			unlock_page_cgroup(pc);
-			/*
-			 * The old page may be fully unmapped while we kept it.
-			 */
-			mem_cgroup_uncharge_page(page);
-		}
-		/* we'll need to revisit this error code (we have -EINTR) */
-		return -ENOMEM;
-	}
 	/*
 	 * We charge new page before it's used/mapped. So, even if unlock_page()
 	 * is called before end_migration, we can catch all events on this new
@@ -3319,8 +3318,12 @@ int mem_cgroup_prepare_migration(struct page *page,
 		ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
 	else
 		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
+	/*
+	 * The page is committed to the memcg, but it's not actually
+	 * charged to the res_counter since we plan on replacing the
+	 * old one and only one page is going to be left afterwards.
+	 */
 	__mem_cgroup_commit_charge(memcg, newpage, 1, ctype, false);
-	return ret;
 }
 
 /* remove redundant charge if migration failed*/
@@ -3342,6 +3345,12 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
 		used = newpage;
 		unused = oldpage;
 	}
+	anon = PageAnon(used);
+	__mem_cgroup_uncharge_common(unused,
+		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
+		     : MEM_CGROUP_CHARGE_TYPE_CACHE,
+		true);
+	css_put(&memcg->css);
 	/*
 	 * We disallowed uncharge of pages under migration because mapcount
 	 * of the page goes down to zero, temporarly.
@@ -3351,10 +3360,6 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
 	lock_page_cgroup(pc);
 	ClearPageCgroupMigration(pc);
 	unlock_page_cgroup(pc);
-	anon = PageAnon(used);
-	__mem_cgroup_uncharge_common(unused,
-		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
-		     : MEM_CGROUP_CHARGE_TYPE_CACHE);
 
 	/*
 	 * If a page is a file cache, radix-tree replacement is very atomic
diff --git a/mm/migrate.c b/mm/migrate.c
index 8137aea..aa06bf4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -687,7 +687,6 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 {
 	int rc = -EAGAIN;
 	int remap_swapcache = 1;
-	int charge = 0;
 	struct mem_cgroup *mem;
 	struct anon_vma *anon_vma = NULL;
 
@@ -729,12 +728,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	}
 
 	/* charge against new page */
-	charge = mem_cgroup_prepare_migration(page, newpage, &mem, GFP_KERNEL);
-	if (charge == -ENOMEM) {
-		rc = -ENOMEM;
-		goto unlock;
-	}
-	BUG_ON(charge);
+	mem_cgroup_prepare_migration(page, newpage, &mem);
 
 	if (PageWriteback(page)) {
 		/*
@@ -824,8 +818,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		put_anon_vma(anon_vma);
 
 uncharge:
-	if (!charge)
-		mem_cgroup_end_migration(mem, page, newpage, rc == 0);
+	mem_cgroup_end_migration(mem, page, newpage, rc == 0);
 unlock:
 	unlock_page(page);
 out:
@@ -1519,10 +1512,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
 {
 	struct page *oldpage = page, *newpage;
 	struct address_space *mapping = page_mapping(page);
-	struct mem_cgroup *mcg;
+	struct mem_cgroup *memcg;
 	unsigned int gfp;
 	int rc = 0;
-	int charge = -ENOMEM;
 
 	VM_BUG_ON(!PageLocked(page));
 	VM_BUG_ON(page_mapcount(page));
@@ -1556,12 +1548,7 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
 	if (!trylock_page(newpage))
 		BUG();		/* new page should be unlocked!!! */
 
-	// XXX hnaz, is this right?
-	charge = mem_cgroup_prepare_migration(page, newpage, &mcg, gfp);
-	if (charge == -ENOMEM) {
-		rc = charge;
-		goto out;
-	}
+	mem_cgroup_prepare_migration(page, newpage, &memcg);
 
 	newpage->index = page->index;
 	newpage->mapping = page->mapping;
@@ -1581,11 +1568,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
 		page = newpage;
 	}
 
+	mem_cgroup_end_migration(memcg, oldpage, newpage, !rc);
 out:
-	if (!charge)
-		mem_cgroup_end_migration(mcg, oldpage, newpage, !rc);
-
-       if (oldpage != page)
+	if (oldpage != page)
                put_page(oldpage);
 
 	if (rc) {
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 02/10] mm: swapfile: clean up unuse_pte race handling
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

The conditional call to mem_cgroup_cancel_charge_swapin() is a
leftover from when the function would continue to re-establish the
page even after mem_cgroup_try_charge_swapin() had failed.  Since
commit 85d9fc8 "memcg: fix refcnt handling at swapoff", the condition
is always true when this code is reached.
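
For reference, a sketch of the race path once the leftover check is
gone (this just mirrors the hunk below):

        pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
        if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
                /*
                 * Somebody else already handled the swap entry at
                 * this address; the charge taken earlier is
                 * unconditionally cancelled.
                 */
                mem_cgroup_cancel_charge_swapin(memcg);
                ret = 0;
                goto out;
        }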

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/swapfile.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 64408be..75881ca 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -845,8 +845,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (unlikely(!pte_same(*pte, swp_entry_to_pte(entry)))) {
-		if (ret > 0)
-			mem_cgroup_cancel_charge_swapin(memcg);
+		mem_cgroup_cancel_charge_swapin(memcg);
 		ret = 0;
 		goto out;
 	}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 03/10] mm: memcg: push down PageSwapCache check into uncharge entry functions
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

Not all uncharge paths need to check whether the page is swapcache;
some of them know for sure that it is not.

Push the check down into all callsites of uncharge_common() so that
the subsequent patch that removes some of these checks is more
obvious.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   18 ++++++++++++------
 1 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 12ee2de..fb8d525 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2988,8 +2988,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
 	if (mem_cgroup_disabled())
 		return NULL;
 
-	if (PageSwapCache(page))
-		return NULL;
+	VM_BUG_ON(PageSwapCache(page));
 
 	if (PageTransHuge(page)) {
 		nr_pages <<= compound_order(page);
@@ -3086,6 +3085,8 @@ void mem_cgroup_uncharge_page(struct page *page)
 	if (page_mapped(page))
 		return;
 	VM_BUG_ON(page->mapping && !PageAnon(page));
+	if (PageSwapCache(page))
+		return;
 	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
 }
 
@@ -3093,6 +3094,8 @@ void mem_cgroup_uncharge_cache_page(struct page *page)
 {
 	VM_BUG_ON(page_mapped(page));
 	VM_BUG_ON(page->mapping);
+	if (PageSwapCache(page))
+		return;
 	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE, false);
 }
 
@@ -3157,6 +3160,8 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
 	if (!swapout) /* this was a swap cache but the swap is unused ! */
 		ctype = MEM_CGROUP_CHARGE_TYPE_DROP;
 
+	if (PageSwapCache(page))
+		return;
 	memcg = __mem_cgroup_uncharge_common(page, ctype, false);
 
 	/*
@@ -3346,10 +3351,11 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
 		unused = oldpage;
 	}
 	anon = PageAnon(used);
-	__mem_cgroup_uncharge_common(unused,
-		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
-		     : MEM_CGROUP_CHARGE_TYPE_CACHE,
-		true);
+	if (!PageSwapCache(unused))
+		__mem_cgroup_uncharge_common(unused,
+					     anon ? MEM_CGROUP_CHARGE_TYPE_ANON
+					     : MEM_CGROUP_CHARGE_TYPE_CACHE,
+					     true);
 	css_put(&memcg->css);
 	/*
 	 * We disallowed uncharge of pages under migration because mapcount
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 04/10] mm: memcg: only check for PageSwapCache when uncharging anon
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

Only anon pages that are uncharged when their last page table mapping
vanishes may still be in swapcache.

When shmem pages, file pages, swap-freed anon pages, or just-migrated
pages are uncharged, they are known for sure not to be in swapcache.
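
As a sketch of the end state (assuming the checks introduced in the
previous patch), the swapcache check remains only in the anon uncharge
entry point:

        void mem_cgroup_uncharge_page(struct page *page)
        {
                /* mapped/disabled early-exit checks omitted */
                if (PageSwapCache(page))
                        return;
                __mem_cgroup_uncharge_common(page,
                                MEM_CGROUP_CHARGE_TYPE_ANON, false);
        }

while the cache, swapcache and end-of-migration uncharge paths call
__mem_cgroup_uncharge_common() directly, which VM_BUG_ON()s swapcache
pages.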

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   13 ++++---------
 1 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fb8d525..a5c0693 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3094,8 +3094,6 @@ void mem_cgroup_uncharge_cache_page(struct page *page)
 {
 	VM_BUG_ON(page_mapped(page));
 	VM_BUG_ON(page->mapping);
-	if (PageSwapCache(page))
-		return;
 	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE, false);
 }
 
@@ -3160,8 +3158,6 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
 	if (!swapout) /* this was a swap cache but the swap is unused ! */
 		ctype = MEM_CGROUP_CHARGE_TYPE_DROP;
 
-	if (PageSwapCache(page))
-		return;
 	memcg = __mem_cgroup_uncharge_common(page, ctype, false);
 
 	/*
@@ -3351,11 +3347,10 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
 		unused = oldpage;
 	}
 	anon = PageAnon(used);
-	if (!PageSwapCache(unused))
-		__mem_cgroup_uncharge_common(unused,
-					     anon ? MEM_CGROUP_CHARGE_TYPE_ANON
-					     : MEM_CGROUP_CHARGE_TYPE_CACHE,
-					     true);
+	__mem_cgroup_uncharge_common(unused,
+				     anon ? MEM_CGROUP_CHARGE_TYPE_ANON
+				     : MEM_CGROUP_CHARGE_TYPE_CACHE,
+				     true);
 	css_put(&memcg->css);
 	/*
 	 * We disallowed uncharge of pages under migration because mapcount
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 05/10] mm: memcg: move swapin charge functions above callsites
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

Charging cache pages may require swapin in the shmem case.  Save the
forward declaration and just move the swapin functions above the cache
charging functions.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   68 +++++++++++++++++++++++++-----------------------------
 1 files changed, 32 insertions(+), 36 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a5c0693..081780b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2797,37 +2797,6 @@ int mem_cgroup_newpage_charge(struct page *page,
 					MEM_CGROUP_CHARGE_TYPE_ANON);
 }
 
-static void
-__mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *ptr,
-					enum charge_type ctype);
-
-int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
-				gfp_t gfp_mask)
-{
-	struct mem_cgroup *memcg = NULL;
-	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
-	int ret;
-
-	if (mem_cgroup_disabled())
-		return 0;
-	if (PageCompound(page))
-		return 0;
-
-	if (unlikely(!mm))
-		mm = &init_mm;
-	if (!page_is_file_cache(page))
-		type = MEM_CGROUP_CHARGE_TYPE_SHMEM;
-
-	if (!PageSwapCache(page))
-		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
-	else { /* page is swapcache/shmem */
-		ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
-		if (!ret)
-			__mem_cgroup_commit_charge_swapin(page, memcg, type);
-	}
-	return ret;
-}
-
 /*
  * While swap-in, try_charge -> commit or cancel, the page is locked.
  * And when try_charge() successfully returns, one refcnt to memcg without
@@ -2874,6 +2843,15 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 	return ret;
 }
 
+void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *memcg)
+{
+	if (mem_cgroup_disabled())
+		return;
+	if (!memcg)
+		return;
+	__mem_cgroup_cancel_charge(memcg, 1);
+}
+
 static void
 __mem_cgroup_commit_charge_swapin(struct page *page, struct mem_cgroup *memcg,
 					enum charge_type ctype)
@@ -2911,13 +2889,31 @@ void mem_cgroup_commit_charge_swapin(struct page *page,
 					  MEM_CGROUP_CHARGE_TYPE_ANON);
 }
 
-void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *memcg)
+int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
+				gfp_t gfp_mask)
 {
+	struct mem_cgroup *memcg = NULL;
+	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
+	int ret;
+
 	if (mem_cgroup_disabled())
-		return;
-	if (!memcg)
-		return;
-	__mem_cgroup_cancel_charge(memcg, 1);
+		return 0;
+	if (PageCompound(page))
+		return 0;
+
+	if (unlikely(!mm))
+		mm = &init_mm;
+	if (!page_is_file_cache(page))
+		type = MEM_CGROUP_CHARGE_TYPE_SHMEM;
+
+	if (!PageSwapCache(page))
+		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
+	else { /* page is swapcache/shmem */
+		ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
+		if (!ret)
+			__mem_cgroup_commit_charge_swapin(page, memcg, type);
+	}
+	return ret;
 }
 
 static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 06/10] mm: memcg: remove unneeded shmem charge type
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

shmem page charges have not needed a separate charge type to
distinguish them from regular file pages since commit 08e552c 'memcg:
synchronized LRU'.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   11 +----------
 1 files changed, 1 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 081780b..2c7d164c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -379,7 +379,6 @@ static bool move_file(void)
 enum charge_type {
 	MEM_CGROUP_CHARGE_TYPE_CACHE = 0,
 	MEM_CGROUP_CHARGE_TYPE_ANON,
-	MEM_CGROUP_CHARGE_TYPE_SHMEM,	/* used by page migration of shmem */
 	MEM_CGROUP_CHARGE_TYPE_SWAPOUT,	/* for accounting swapcache */
 	MEM_CGROUP_CHARGE_TYPE_DROP,	/* a page was unused swap cache */
 	NR_CHARGE_TYPE,
@@ -2903,8 +2902,6 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 
 	if (unlikely(!mm))
 		mm = &init_mm;
-	if (!page_is_file_cache(page))
-		type = MEM_CGROUP_CHARGE_TYPE_SHMEM;
 
 	if (!PageSwapCache(page))
 		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
@@ -3311,10 +3308,8 @@ void mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
 	 */
 	if (PageAnon(page))
 		ctype = MEM_CGROUP_CHARGE_TYPE_ANON;
-	else if (page_is_file_cache(page))
-		ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
 	else
-		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
+		ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
 	/*
 	 * The page is committed to the memcg, but it's not actually
 	 * charged to the res_counter since we plan on replacing the
@@ -3408,10 +3403,6 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
 	 */
 	if (!memcg)
 		return;
-
-	if (PageSwapBacked(oldpage))
-		type = MEM_CGROUP_CHARGE_TYPE_SHMEM;
-
 	/*
 	 * Even if newpage->mapping was NULL before starting replacement,
 	 * the newpage may be on LRU(or pagevec for LRU) already. We lock
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 07/10] mm: memcg: remove needless !mm fixup to init_mm when charging
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

It does not matter to __mem_cgroup_try_charge() whether the passed mm
is NULL or init_mm; it will charge the root memcg in either case.
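
In other words (a sketch, not part of the patch), for a pagecache
charge without an mm these two calls now behave the same:

        __mem_cgroup_try_charge(NULL, gfp_mask, 1, &memcg, false);
        __mem_cgroup_try_charge(&init_mm, gfp_mask, 1, &memcg, false);

Both end up charging the root memcg, so the !mm fixups to init_mm in
the swapin and cache charge paths can simply go away.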

Also fix up the comment in __mem_cgroup_try_charge() that claimed the
init_mm would be charged when no mm was passed.  It's not really
incorrect, but confusing.  Clarify that the root memcg is charged in
this case.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |    7 +------
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2c7d164c..c6bcaaa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2334,7 +2334,7 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 	 * We always charge the cgroup the mm_struct belongs to.
 	 * The mm_struct's mem_cgroup changes on task migration if the
 	 * thread group leader migrates. It's possible that mm is not
-	 * set, if so charge the init_mm (happens for pagecache usage).
+	 * set, if so charge the root memcg (happens for pagecache usage).
 	 */
 	if (!*ptr && !mm)
 		*ptr = root_mem_cgroup;
@@ -2834,8 +2834,6 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 		ret = 0;
 	return ret;
 charge_cur_mm:
-	if (unlikely(!mm))
-		mm = &init_mm;
 	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
 	if (ret == -EINTR)
 		ret = 0;
@@ -2900,9 +2898,6 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 	if (PageCompound(page))
 		return 0;
 
-	if (unlikely(!mm))
-		mm = &init_mm;
-
 	if (!PageSwapCache(page))
 		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
 	else { /* page is swapcache/shmem */
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 08/10] mm: memcg: split swapin charge function into private and public part
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

When shmem is charged upon swapin, it does not need to check twice
whether the memory controller is enabled.

Also, shmem pages do not have to be checked for everything that
regular anon pages have to be checked for.  Let shmem use the internal
version directly, which allows future patches to move around checks
that are only required when swapping in anon pages.
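
The resulting split, in sketch form (it mirrors the hunks below): the
public entry point keeps the mem_cgroup_disabled() check, while the
swapcache/shmem branch of mem_cgroup_cache_charge(), which has already
performed that check, calls the internal helper directly:

        ret = __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
        if (!ret)
                __mem_cgroup_commit_charge_swapin(page, memcg, type);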

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   24 +++++++++++++++---------
 1 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c6bcaaa..36e6d73 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2802,18 +2802,14 @@ int mem_cgroup_newpage_charge(struct page *page,
  * struct page_cgroup is acquired. This refcnt will be consumed by
  * "commit()" or removed by "cancel()"
  */
-int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
-				 struct page *page,
-				 gfp_t mask, struct mem_cgroup **memcgp)
+static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
+					  struct page *page,
+					  gfp_t mask,
+					  struct mem_cgroup **memcgp)
 {
 	struct mem_cgroup *memcg;
 	int ret;
 
-	*memcgp = NULL;
-
-	if (mem_cgroup_disabled())
-		return 0;
-
 	if (!do_swap_account)
 		goto charge_cur_mm;
 	/*
@@ -2840,6 +2836,15 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 	return ret;
 }
 
+int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
+				 gfp_t gfp_mask, struct mem_cgroup **memcgp)
+{
+	*memcgp = NULL;
+	if (mem_cgroup_disabled())
+		return 0;
+	return __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, memcgp);
+}
+
 void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *memcg)
 {
 	if (mem_cgroup_disabled())
@@ -2901,7 +2906,8 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 	if (!PageSwapCache(page))
 		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
 	else { /* page is swapcache/shmem */
-		ret = mem_cgroup_try_charge_swapin(mm, page, gfp_mask, &memcg);
+		ret = __mem_cgroup_try_charge_swapin(mm, page,
+						     gfp_mask, &memcg);
 		if (!ret)
 			__mem_cgroup_commit_charge_swapin(page, memcg, type);
 	}
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 09/10] mm: memcg: only check swap cache pages for repeated charging
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

Only anon and shmem pages in the swap cache can see repeated charge
attempts, from every swap pte fault or from shmem_unuse().  No other
pages require checking PageCgroupUsed().

Charging pages in the swap cache is also serialized by the page lock,
and since both the try_charge and commit_charge are called under the
same page lock section, the PageCgroupUsed() check might as well
happen before the counter charging, let alone reclaim.
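
As a rough illustration (simplified pseudocode, not the actual
do_swap_page() code), the whole swapin charge sequence sits inside a
single page lock section:

	lock_page(page);	/* also serializes swap cache removal */
	...
	ret = mem_cgroup_try_charge_swapin(mm, page, GFP_KERNEL, &memcg);
	/* with this patch, an already-used page_cgroup bails out here,
	   before any res_counter charging or reclaim is attempted */
	...
	mem_cgroup_commit_charge_swapin(page, memcg);
	...
	unlock_page(page);	/* the USED bit cannot change in between */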

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 36e6d73..9433bff 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2539,11 +2539,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
 	bool anon;
 
 	lock_page_cgroup(pc);
-	if (unlikely(PageCgroupUsed(pc))) {
-		unlock_page_cgroup(pc);
-		__mem_cgroup_cancel_charge(memcg, nr_pages);
-		return;
-	}
+	VM_BUG_ON(PageCgroupUsed(pc));
 	/*
 	 * we don't need page_cgroup_lock about tail pages, becase they are not
 	 * accessed by any other context at this point.
@@ -2808,8 +2804,19 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 					  struct mem_cgroup **memcgp)
 {
 	struct mem_cgroup *memcg;
+	struct page_cgroup *pc;
 	int ret;
 
+	pc = lookup_page_cgroup(page);
+	/*
+	 * Every swap fault against a single page tries to charge the
+	 * page, bail as early as possible.  shmem_unuse() encounters
+	 * already charged pages, too.  The USED bit is protected by
+	 * the page lock, which serializes swap cache removal, which
+	 * in turn serializes uncharging.
+	 */
+	if (PageCgroupUsed(pc))
+		return 0;
 	if (!do_swap_account)
 		goto charge_cur_mm;
 	/*
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [patch 10/10] mm: memcg: only check anon swapin page charges for swap cache
  2012-07-11 17:02 ` Johannes Weiner
@ 2012-07-11 17:02   ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-11 17:02 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

shmem knows for sure that the page is in the swap cache when attempting
to charge it, because the cache charge entry function checks for that.
Only anon pages may already have been removed from the swap cache by
the time their swapin is charged.

Adjust the comment, though: '4969c11 mm: fix swapin race condition'
added a stable PageSwapCache check under the page lock in
do_swap_page() before calling the memory controller, so it's
unuse_pte()'s pte_same() test that may fail.
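
As an illustrative summary (not a literal excerpt from the code), the
two entry points end up as follows:

/*
 * Sketch of the two entry points after this change:
 *
 *   shmem swapin -> mem_cgroup_cache_charge()
 *     the page is known to be in the swap cache (the entry function
 *     checks), so it calls __mem_cgroup_try_charge_swapin() directly.
 *
 *   anon swapin -> mem_cgroup_try_charge_swapin()
 *     the page may already have left the swap cache (racing fault or
 *     swapoff), so the public wrapper handles the !PageSwapCache()
 *     case itself before calling __mem_cgroup_try_charge_swapin().
 */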

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9433bff..ffd9323 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2819,14 +2819,6 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 		return 0;
 	if (!do_swap_account)
 		goto charge_cur_mm;
-	/*
-	 * A racing thread's fault, or swapoff, may have already updated
-	 * the pte, and even removed page from swap cache: in those cases
-	 * do_swap_page()'s pte_same() test will fail; but there's also a
-	 * KSM case which does need to charge the page.
-	 */
-	if (!PageSwapCache(page))
-		goto charge_cur_mm;
 	memcg = try_get_mem_cgroup_from_page(page);
 	if (!memcg)
 		goto charge_cur_mm;
@@ -2849,6 +2841,20 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
 	*memcgp = NULL;
 	if (mem_cgroup_disabled())
 		return 0;
+	/*
+	 * A racing thread's fault, or swapoff, may have already
+	 * updated the pte, and even removed page from swap cache: in
+	 * those cases unuse_pte()'s pte_same() test will fail; but
+	 * there's also a KSM case which does need to charge the page.
+	 */
+	if (!PageSwapCache(page)) {
+		int ret;
+
+		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, memcgp, true);
+		if (ret == -EINTR)
+			ret = 0;
+		return ret;
+	}
 	return __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, memcgp);
 }
 
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits
  2012-07-11 17:02   ` Johannes Weiner
@ 2012-07-12  8:54     ` Wanpeng Li
  -1 siblings, 0 replies; 34+ messages in thread
From: Wanpeng Li @ 2012-07-12  8:54 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins,
	David Rientjes, Wanpeng Li, linux-mm, cgroups, linux-kernel

On Wed, Jul 11, 2012 at 07:02:13PM +0200, Johannes Weiner wrote:
>Compaction (and page migration in general) can currently be hindered
>through pages being owned by memory cgroups that are at their limits
>and unreclaimable.
>
>The reason is that the replacement page is being charged against the
>limit while the page being replaced is also still charged.  But this
>seems unnecessary, given that only one of the two pages will still be
>in use after migration finishes.
>
>This patch changes the memcg migration sequence so that the
>replacement page is not charged.  Whatever page is still in use after
>successful or failed migration gets to keep the charge of the page
>that was going to be replaced.
>
>The replacement page will still show up temporarily in the rss/cache
>statistics, this can be fixed in a later patch as it's less urgent.
>

So I want to know whether mem_cgroup_wait_acct_move still makes sense
after this patch is merged; if the answer is no, I will send a patch
to remove it.

>Reported-by: David Rientjes <rientjes@google.com>
>Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>Acked-by: Michal Hocko <mhocko@suse.cz>
>---
> include/linux/memcontrol.h |   11 +++----
> mm/memcontrol.c            |   67 +++++++++++++++++++++++--------------------
> mm/migrate.c               |   27 ++++--------------
> 3 files changed, 47 insertions(+), 58 deletions(-)
>
>diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>index 5a3ee64..8d9489f 100644
>--- a/include/linux/memcontrol.h
>+++ b/include/linux/memcontrol.h
>@@ -98,9 +98,9 @@ int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
> 
> extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);
> 
>-extern int
>-mem_cgroup_prepare_migration(struct page *page,
>-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask);
>+extern void
>+mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>+			     struct mem_cgroup **memcgp);
> extern void mem_cgroup_end_migration(struct mem_cgroup *memcg,
> 	struct page *oldpage, struct page *newpage, bool migration_ok);
> 
>@@ -276,11 +276,10 @@ static inline struct cgroup_subsys_state
> 	return NULL;
> }
> 
>-static inline int
>+static inline void
> mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>-	struct mem_cgroup **memcgp, gfp_t gfp_mask)
>+			     struct mem_cgroup **memcgp)
> {
>-	return 0;
> }
> 
> static inline void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>index e8ddc00..12ee2de 100644
>--- a/mm/memcontrol.c
>+++ b/mm/memcontrol.c
>@@ -2977,7 +2977,8 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
>  * uncharge if !page_mapped(page)
>  */
> static struct mem_cgroup *
>-__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>+__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
>+			     bool end_migration)
> {
> 	struct mem_cgroup *memcg = NULL;
> 	unsigned int nr_pages = 1;
>@@ -3021,7 +3022,16 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
> 		/* fallthrough */
> 	case MEM_CGROUP_CHARGE_TYPE_DROP:
> 		/* See mem_cgroup_prepare_migration() */
>-		if (page_mapped(page) || PageCgroupMigration(pc))
>+		if (page_mapped(page))
>+			goto unlock_out;
>+		/*
>+		 * Pages under migration may not be uncharged.  But
>+		 * end_migration() /must/ be the one uncharging the
>+		 * unused post-migration page and so it has to call
>+		 * here with the migration bit still set.  See the
>+		 * res_counter handling below.
>+		 */
>+		if (!end_migration && PageCgroupMigration(pc))
> 			goto unlock_out;
> 		break;
> 	case MEM_CGROUP_CHARGE_TYPE_SWAPOUT:
>@@ -3055,7 +3065,12 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
> 		mem_cgroup_swap_statistics(memcg, true);
> 		mem_cgroup_get(memcg);
> 	}
>-	if (!mem_cgroup_is_root(memcg))
>+	/*
>+	 * Migration does not charge the res_counter for the
>+	 * replacement page, so leave it alone when phasing out the
>+	 * page that is unused after the migration.
>+	 */
>+	if (!end_migration && !mem_cgroup_is_root(memcg))
> 		mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
> 
> 	return memcg;
>@@ -3071,14 +3086,14 @@ void mem_cgroup_uncharge_page(struct page *page)
> 	if (page_mapped(page))
> 		return;
> 	VM_BUG_ON(page->mapping && !PageAnon(page));
>-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON);
>+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
> }
> 
> void mem_cgroup_uncharge_cache_page(struct page *page)
> {
> 	VM_BUG_ON(page_mapped(page));
> 	VM_BUG_ON(page->mapping);
>-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
>+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE, false);
> }
> 
> /*
>@@ -3142,7 +3157,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
> 	if (!swapout) /* this was a swap cache but the swap is unused ! */
> 		ctype = MEM_CGROUP_CHARGE_TYPE_DROP;
> 
>-	memcg = __mem_cgroup_uncharge_common(page, ctype);
>+	memcg = __mem_cgroup_uncharge_common(page, ctype, false);
> 
> 	/*
> 	 * record memcg information,  if swapout && memcg != NULL,
>@@ -3232,19 +3247,18 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
>  * Before starting migration, account PAGE_SIZE to mem_cgroup that the old
>  * page belongs to.
>  */
>-int mem_cgroup_prepare_migration(struct page *page,
>-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask)
>+void mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>+				  struct mem_cgroup **memcgp)
> {
> 	struct mem_cgroup *memcg = NULL;
> 	struct page_cgroup *pc;
> 	enum charge_type ctype;
>-	int ret = 0;
> 
> 	*memcgp = NULL;
> 
> 	VM_BUG_ON(PageTransHuge(page));
> 	if (mem_cgroup_disabled())
>-		return 0;
>+		return;
> 
> 	pc = lookup_page_cgroup(page);
> 	lock_page_cgroup(pc);
>@@ -3289,24 +3303,9 @@ int mem_cgroup_prepare_migration(struct page *page,
> 	 * we return here.
> 	 */
> 	if (!memcg)
>-		return 0;
>+		return;
> 
> 	*memcgp = memcg;
>-	ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, memcgp, false);
>-	css_put(&memcg->css);/* drop extra refcnt */
>-	if (ret) {
>-		if (PageAnon(page)) {
>-			lock_page_cgroup(pc);
>-			ClearPageCgroupMigration(pc);
>-			unlock_page_cgroup(pc);
>-			/*
>-			 * The old page may be fully unmapped while we kept it.
>-			 */
>-			mem_cgroup_uncharge_page(page);
>-		}
>-		/* we'll need to revisit this error code (we have -EINTR) */
>-		return -ENOMEM;
>-	}
> 	/*
> 	 * We charge new page before it's used/mapped. So, even if unlock_page()
> 	 * is called before end_migration, we can catch all events on this new
>@@ -3319,8 +3318,12 @@ int mem_cgroup_prepare_migration(struct page *page,
> 		ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
> 	else
> 		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
>+	/*
>+	 * The page is committed to the memcg, but it's not actually
>+	 * charged to the res_counter since we plan on replacing the
>+	 * old one and only one page is going to be left afterwards.
>+	 */
> 	__mem_cgroup_commit_charge(memcg, newpage, 1, ctype, false);
>-	return ret;
> }
> 
> /* remove redundant charge if migration failed*/
>@@ -3342,6 +3345,12 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
> 		used = newpage;
> 		unused = oldpage;
> 	}
>+	anon = PageAnon(used);
>+	__mem_cgroup_uncharge_common(unused,
>+		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
>+		     : MEM_CGROUP_CHARGE_TYPE_CACHE,
>+		true);
>+	css_put(&memcg->css);
> 	/*
> 	 * We disallowed uncharge of pages under migration because mapcount
> 	 * of the page goes down to zero, temporarly.
>@@ -3351,10 +3360,6 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
> 	lock_page_cgroup(pc);
> 	ClearPageCgroupMigration(pc);
> 	unlock_page_cgroup(pc);
>-	anon = PageAnon(used);
>-	__mem_cgroup_uncharge_common(unused,
>-		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
>-		     : MEM_CGROUP_CHARGE_TYPE_CACHE);
> 
> 	/*
> 	 * If a page is a file cache, radix-tree replacement is very atomic
>diff --git a/mm/migrate.c b/mm/migrate.c
>index 8137aea..aa06bf4 100644
>--- a/mm/migrate.c
>+++ b/mm/migrate.c
>@@ -687,7 +687,6 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> {
> 	int rc = -EAGAIN;
> 	int remap_swapcache = 1;
>-	int charge = 0;
> 	struct mem_cgroup *mem;
> 	struct anon_vma *anon_vma = NULL;
> 
>@@ -729,12 +728,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> 	}
> 
> 	/* charge against new page */
>-	charge = mem_cgroup_prepare_migration(page, newpage, &mem, GFP_KERNEL);
>-	if (charge == -ENOMEM) {
>-		rc = -ENOMEM;
>-		goto unlock;
>-	}
>-	BUG_ON(charge);
>+	mem_cgroup_prepare_migration(page, newpage, &mem);
> 
> 	if (PageWriteback(page)) {
> 		/*
>@@ -824,8 +818,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> 		put_anon_vma(anon_vma);
> 
> uncharge:
>-	if (!charge)
>-		mem_cgroup_end_migration(mem, page, newpage, rc == 0);
>+	mem_cgroup_end_migration(mem, page, newpage, rc == 0);
> unlock:
> 	unlock_page(page);
> out:
>@@ -1519,10 +1512,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
> {
> 	struct page *oldpage = page, *newpage;
> 	struct address_space *mapping = page_mapping(page);
>-	struct mem_cgroup *mcg;
>+	struct mem_cgroup *memcg;
> 	unsigned int gfp;
> 	int rc = 0;
>-	int charge = -ENOMEM;
> 
> 	VM_BUG_ON(!PageLocked(page));
> 	VM_BUG_ON(page_mapcount(page));
>@@ -1556,12 +1548,7 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
> 	if (!trylock_page(newpage))
> 		BUG();		/* new page should be unlocked!!! */
> 
>-	// XXX hnaz, is this right?
>-	charge = mem_cgroup_prepare_migration(page, newpage, &mcg, gfp);
>-	if (charge == -ENOMEM) {
>-		rc = charge;
>-		goto out;
>-	}
>+	mem_cgroup_prepare_migration(page, newpage, &memcg);
> 
> 	newpage->index = page->index;
> 	newpage->mapping = page->mapping;
>@@ -1581,11 +1568,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
> 		page = newpage;
> 	}
> 
>+	mem_cgroup_end_migration(memcg, oldpage, newpage, !rc);
> out:
>-	if (!charge)
>-		mem_cgroup_end_migration(mcg, oldpage, newpage, !rc);
>-
>-       if (oldpage != page)
>+	if (oldpage != page)
>                put_page(oldpage);
> 
> 	if (rc) {
>-- 
>1.7.7.6

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits
  2012-07-12  8:54     ` Wanpeng Li
@ 2012-07-12  9:10       ` Wanpeng Li
  -1 siblings, 0 replies; 34+ messages in thread
From: Wanpeng Li @ 2012-07-12  9:10 UTC (permalink / raw)
  To: Johannes Weiner, KAMEZAWA Hiroyuki, Michal Hocko
  Cc: Andrew Morton, Hugh Dickins, David Rientjes, Wanpeng Li,
	linux-mm, cgroups, linux-kernel

On Thu, Jul 12, 2012 at 04:54:07PM +0800, Wanpeng Li wrote:
>On Wed, Jul 11, 2012 at 07:02:13PM +0200, Johannes Weiner wrote:
>>Compaction (and page migration in general) can currently be hindered
>>through pages being owned by memory cgroups that are at their limits
>>and unreclaimable.
>>
>>The reason is that the replacement page is being charged against the
>>limit while the page being replaced is also still charged.  But this
>>seems unnecessary, given that only one of the two pages will still be
>>in use after migration finishes.
>>
>>This patch changes the memcg migration sequence so that the
>>replacement page is not charged.  Whatever page is still in use after
>>successful or failed migration gets to keep the charge of the page
>>that was going to be replaced.
>>
>>The replacement page will still show up temporarily in the rss/cache
>>statistics, this can be fixed in a later patch as it's less urgent.
>>
>
>So I want to know whether mem_cgroup_wait_acct_move still makes sense
>after this patch is merged; if the answer is no, I will send a patch
>to remove it.

And if it still makes sense, I want to change the check in
mem_cgroup_do_charge:

if (mem_cgroup_wait_acct_move(mem_over_limit))
	return CHARGE_RETRY;

=>

if (mem_cgroup_wait_acct_move(mem_over_limit) &&
    mem_cgroup_margin(mem_over_limit) >= nr_pages)
	return CHARGE_RETRY;

mem_cgroup_reclaim can reclaim some pages, but the mem_cgroup_reclaim
function has several exit conditions:

total += try_to_free_mem_cgroup_pages(memcg, gfp_mask, noswap);
if (total && (flags & MEM_CGROUP_RECLAIM_SHRINK))
	break;

and 

if (mem_cgroup_margin(memcg))
	break;

So mem_cgroup_reclaim may not reclaim enough pages (>= nr_pages); in
that case we should go to mem_cgroup_handle_oom instead of returning
CHARGE_RETRY.

Hopefully you can verify whether my idea makes sense.


>>Reported-by: David Rientjes <rientjes@google.com>
>>Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>>Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>>Acked-by: Michal Hocko <mhocko@suse.cz>
>>---
>> include/linux/memcontrol.h |   11 +++----
>> mm/memcontrol.c            |   67 +++++++++++++++++++++++--------------------
>> mm/migrate.c               |   27 ++++--------------
>> 3 files changed, 47 insertions(+), 58 deletions(-)
>>
>>diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>>index 5a3ee64..8d9489f 100644
>>--- a/include/linux/memcontrol.h
>>+++ b/include/linux/memcontrol.h
>>@@ -98,9 +98,9 @@ int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
>> 
>> extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);
>> 
>>-extern int
>>-mem_cgroup_prepare_migration(struct page *page,
>>-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask);
>>+extern void
>>+mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>>+			     struct mem_cgroup **memcgp);
>> extern void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>> 	struct page *oldpage, struct page *newpage, bool migration_ok);
>> 
>>@@ -276,11 +276,10 @@ static inline struct cgroup_subsys_state
>> 	return NULL;
>> }
>> 
>>-static inline int
>>+static inline void
>> mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>>-	struct mem_cgroup **memcgp, gfp_t gfp_mask)
>>+			     struct mem_cgroup **memcgp)
>> {
>>-	return 0;
>> }
>> 
>> static inline void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>>diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>index e8ddc00..12ee2de 100644
>>--- a/mm/memcontrol.c
>>+++ b/mm/memcontrol.c
>>@@ -2977,7 +2977,8 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
>>  * uncharge if !page_mapped(page)
>>  */
>> static struct mem_cgroup *
>>-__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>>+__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
>>+			     bool end_migration)
>> {
>> 	struct mem_cgroup *memcg = NULL;
>> 	unsigned int nr_pages = 1;
>>@@ -3021,7 +3022,16 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>> 		/* fallthrough */
>> 	case MEM_CGROUP_CHARGE_TYPE_DROP:
>> 		/* See mem_cgroup_prepare_migration() */
>>-		if (page_mapped(page) || PageCgroupMigration(pc))
>>+		if (page_mapped(page))
>>+			goto unlock_out;
>>+		/*
>>+		 * Pages under migration may not be uncharged.  But
>>+		 * end_migration() /must/ be the one uncharging the
>>+		 * unused post-migration page and so it has to call
>>+		 * here with the migration bit still set.  See the
>>+		 * res_counter handling below.
>>+		 */
>>+		if (!end_migration && PageCgroupMigration(pc))
>> 			goto unlock_out;
>> 		break;
>> 	case MEM_CGROUP_CHARGE_TYPE_SWAPOUT:
>>@@ -3055,7 +3065,12 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>> 		mem_cgroup_swap_statistics(memcg, true);
>> 		mem_cgroup_get(memcg);
>> 	}
>>-	if (!mem_cgroup_is_root(memcg))
>>+	/*
>>+	 * Migration does not charge the res_counter for the
>>+	 * replacement page, so leave it alone when phasing out the
>>+	 * page that is unused after the migration.
>>+	 */
>>+	if (!end_migration && !mem_cgroup_is_root(memcg))
>> 		mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
>> 
>> 	return memcg;
>>@@ -3071,14 +3086,14 @@ void mem_cgroup_uncharge_page(struct page *page)
>> 	if (page_mapped(page))
>> 		return;
>> 	VM_BUG_ON(page->mapping && !PageAnon(page));
>>-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON);
>>+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
>> }
>> 
>> void mem_cgroup_uncharge_cache_page(struct page *page)
>> {
>> 	VM_BUG_ON(page_mapped(page));
>> 	VM_BUG_ON(page->mapping);
>>-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
>>+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE, false);
>> }
>> 
>> /*
>>@@ -3142,7 +3157,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
>> 	if (!swapout) /* this was a swap cache but the swap is unused ! */
>> 		ctype = MEM_CGROUP_CHARGE_TYPE_DROP;
>> 
>>-	memcg = __mem_cgroup_uncharge_common(page, ctype);
>>+	memcg = __mem_cgroup_uncharge_common(page, ctype, false);
>> 
>> 	/*
>> 	 * record memcg information,  if swapout && memcg != NULL,
>>@@ -3232,19 +3247,18 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
>>  * Before starting migration, account PAGE_SIZE to mem_cgroup that the old
>>  * page belongs to.
>>  */
>>-int mem_cgroup_prepare_migration(struct page *page,
>>-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask)
>>+void mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>>+				  struct mem_cgroup **memcgp)
>> {
>> 	struct mem_cgroup *memcg = NULL;
>> 	struct page_cgroup *pc;
>> 	enum charge_type ctype;
>>-	int ret = 0;
>> 
>> 	*memcgp = NULL;
>> 
>> 	VM_BUG_ON(PageTransHuge(page));
>> 	if (mem_cgroup_disabled())
>>-		return 0;
>>+		return;
>> 
>> 	pc = lookup_page_cgroup(page);
>> 	lock_page_cgroup(pc);
>>@@ -3289,24 +3303,9 @@ int mem_cgroup_prepare_migration(struct page *page,
>> 	 * we return here.
>> 	 */
>> 	if (!memcg)
>>-		return 0;
>>+		return;
>> 
>> 	*memcgp = memcg;
>>-	ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, memcgp, false);
>>-	css_put(&memcg->css);/* drop extra refcnt */
>>-	if (ret) {
>>-		if (PageAnon(page)) {
>>-			lock_page_cgroup(pc);
>>-			ClearPageCgroupMigration(pc);
>>-			unlock_page_cgroup(pc);
>>-			/*
>>-			 * The old page may be fully unmapped while we kept it.
>>-			 */
>>-			mem_cgroup_uncharge_page(page);
>>-		}
>>-		/* we'll need to revisit this error code (we have -EINTR) */
>>-		return -ENOMEM;
>>-	}
>> 	/*
>> 	 * We charge new page before it's used/mapped. So, even if unlock_page()
>> 	 * is called before end_migration, we can catch all events on this new
>>@@ -3319,8 +3318,12 @@ int mem_cgroup_prepare_migration(struct page *page,
>> 		ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
>> 	else
>> 		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
>>+	/*
>>+	 * The page is committed to the memcg, but it's not actually
>>+	 * charged to the res_counter since we plan on replacing the
>>+	 * old one and only one page is going to be left afterwards.
>>+	 */
>> 	__mem_cgroup_commit_charge(memcg, newpage, 1, ctype, false);
>>-	return ret;
>> }
>> 
>> /* remove redundant charge if migration failed*/
>>@@ -3342,6 +3345,12 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>> 		used = newpage;
>> 		unused = oldpage;
>> 	}
>>+	anon = PageAnon(used);
>>+	__mem_cgroup_uncharge_common(unused,
>>+		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
>>+		     : MEM_CGROUP_CHARGE_TYPE_CACHE,
>>+		true);
>>+	css_put(&memcg->css);
>> 	/*
>> 	 * We disallowed uncharge of pages under migration because mapcount
>> 	 * of the page goes down to zero, temporarly.
>>@@ -3351,10 +3360,6 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>> 	lock_page_cgroup(pc);
>> 	ClearPageCgroupMigration(pc);
>> 	unlock_page_cgroup(pc);
>>-	anon = PageAnon(used);
>>-	__mem_cgroup_uncharge_common(unused,
>>-		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
>>-		     : MEM_CGROUP_CHARGE_TYPE_CACHE);
>> 
>> 	/*
>> 	 * If a page is a file cache, radix-tree replacement is very atomic
>>diff --git a/mm/migrate.c b/mm/migrate.c
>>index 8137aea..aa06bf4 100644
>>--- a/mm/migrate.c
>>+++ b/mm/migrate.c
>>@@ -687,7 +687,6 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> {
>> 	int rc = -EAGAIN;
>> 	int remap_swapcache = 1;
>>-	int charge = 0;
>> 	struct mem_cgroup *mem;
>> 	struct anon_vma *anon_vma = NULL;
>> 
>>@@ -729,12 +728,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> 	}
>> 
>> 	/* charge against new page */
>>-	charge = mem_cgroup_prepare_migration(page, newpage, &mem, GFP_KERNEL);
>>-	if (charge == -ENOMEM) {
>>-		rc = -ENOMEM;
>>-		goto unlock;
>>-	}
>>-	BUG_ON(charge);
>>+	mem_cgroup_prepare_migration(page, newpage, &mem);
>> 
>> 	if (PageWriteback(page)) {
>> 		/*
>>@@ -824,8 +818,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> 		put_anon_vma(anon_vma);
>> 
>> uncharge:
>>-	if (!charge)
>>-		mem_cgroup_end_migration(mem, page, newpage, rc == 0);
>>+	mem_cgroup_end_migration(mem, page, newpage, rc == 0);
>> unlock:
>> 	unlock_page(page);
>> out:
>>@@ -1519,10 +1512,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
>> {
>> 	struct page *oldpage = page, *newpage;
>> 	struct address_space *mapping = page_mapping(page);
>>-	struct mem_cgroup *mcg;
>>+	struct mem_cgroup *memcg;
>> 	unsigned int gfp;
>> 	int rc = 0;
>>-	int charge = -ENOMEM;
>> 
>> 	VM_BUG_ON(!PageLocked(page));
>> 	VM_BUG_ON(page_mapcount(page));
>>@@ -1556,12 +1548,7 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
>> 	if (!trylock_page(newpage))
>> 		BUG();		/* new page should be unlocked!!! */
>> 
>>-	// XXX hnaz, is this right?
>>-	charge = mem_cgroup_prepare_migration(page, newpage, &mcg, gfp);
>>-	if (charge == -ENOMEM) {
>>-		rc = charge;
>>-		goto out;
>>-	}
>>+	mem_cgroup_prepare_migration(page, newpage, &memcg);
>> 
>> 	newpage->index = page->index;
>> 	newpage->mapping = page->mapping;
>>@@ -1581,11 +1568,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
>> 		page = newpage;
>> 	}
>> 
>>+	mem_cgroup_end_migration(memcg, oldpage, newpage, !rc);
>> out:
>>-	if (!charge)
>>-		mem_cgroup_end_migration(mcg, oldpage, newpage, !rc);
>>-
>>-       if (oldpage != page)
>>+	if (oldpage != page)
>>                put_page(oldpage);
>> 
>> 	if (rc) {
>>-- 
>>1.7.7.6

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits
@ 2012-07-12  9:10       ` Wanpeng Li
  0 siblings, 0 replies; 34+ messages in thread
From: Wanpeng Li @ 2012-07-12  9:10 UTC (permalink / raw)
  To: Johannes Weiner, KAMEZAWA Hiroyuki, Michal Hocko
  Cc: Andrew Morton, Hugh Dickins, David Rientjes, Wanpeng Li,
	linux-mm, cgroups, linux-kernel

On Thu, Jul 12, 2012 at 04:54:07PM +0800, Wanpeng Li wrote:
>On Wed, Jul 11, 2012 at 07:02:13PM +0200, Johannes Weiner wrote:
>>Compaction (and page migration in general) can currently be hindered
>>through pages being owned by memory cgroups that are at their limits
>>and unreclaimable.
>>
>>The reason is that the replacement page is being charged against the
>>limit while the page being replaced is also still charged.  But this
>>seems unnecessary, given that only one of the two pages will still be
>>in use after migration finishes.
>>
>>This patch changes the memcg migration sequence so that the
>>replacement page is not charged.  Whatever page is still in use after
>>successful or failed migration gets to keep the charge of the page
>>that was going to be replaced.
>>
>>The replacement page will still show up temporarily in the rss/cache
>>statistics, this can be fixed in a later patch as it's less urgent.
>>
>
>So I want to know after this patch be merged if mem_cgroup_wait_acct_move
>still make sense, if the answer is no, I will send a patch to remove it.

And if this still make sense, I want to change check in
mem_cgroup_do_charge:

if (mem_cgroup_wait_acct_move(mem_over_limit))
	return CHARGE_RETRY;

=>

if (mem_cgroup_wait_acct_move(mem_over_limit) && 
                       mem_cgroup_margin(mem_over_limit) >= nr_pages)
	return CHARGE_RETRY;

Since mem_cgroup_relcaim can reclaim some pages, but in
mem_cgroup_reclaim function there are some exit condition:

total += try_to_free_mem_cgroup_pages(memcg, gfp_mask, noswap);
if(total && (flag & MEM_CGROUP_RECLAIM_SHRINK))
	break;

and 

if (mem_cgroup_margin(memcg))
	break;

So maybe mem_cgroup_reclaim not reclaim enough pages >= nr_pages, this
time we should go to mem_cgroup_handle_oom instead of return
CHARGE_RETRY.

Hopefully, you can verify if my idea make sense.


>>Reported-by: David Rientjes <rientjes@google.com>
>>Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>>Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>>Acked-by: Michal Hocko <mhocko@suse.cz>
>>---
>> include/linux/memcontrol.h |   11 +++----
>> mm/memcontrol.c            |   67 +++++++++++++++++++++++--------------------
>> mm/migrate.c               |   27 ++++--------------
>> 3 files changed, 47 insertions(+), 58 deletions(-)
>>
>>diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>>index 5a3ee64..8d9489f 100644
>>--- a/include/linux/memcontrol.h
>>+++ b/include/linux/memcontrol.h
>>@@ -98,9 +98,9 @@ int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
>> 
>> extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *memcg);
>> 
>>-extern int
>>-mem_cgroup_prepare_migration(struct page *page,
>>-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask);
>>+extern void
>>+mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>>+			     struct mem_cgroup **memcgp);
>> extern void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>> 	struct page *oldpage, struct page *newpage, bool migration_ok);
>> 
>>@@ -276,11 +276,10 @@ static inline struct cgroup_subsys_state
>> 	return NULL;
>> }
>> 
>>-static inline int
>>+static inline void
>> mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>>-	struct mem_cgroup **memcgp, gfp_t gfp_mask)
>>+			     struct mem_cgroup **memcgp)
>> {
>>-	return 0;
>> }
>> 
>> static inline void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>>diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>index e8ddc00..12ee2de 100644
>>--- a/mm/memcontrol.c
>>+++ b/mm/memcontrol.c
>>@@ -2977,7 +2977,8 @@ static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
>>  * uncharge if !page_mapped(page)
>>  */
>> static struct mem_cgroup *
>>-__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>>+__mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype,
>>+			     bool end_migration)
>> {
>> 	struct mem_cgroup *memcg = NULL;
>> 	unsigned int nr_pages = 1;
>>@@ -3021,7 +3022,16 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>> 		/* fallthrough */
>> 	case MEM_CGROUP_CHARGE_TYPE_DROP:
>> 		/* See mem_cgroup_prepare_migration() */
>>-		if (page_mapped(page) || PageCgroupMigration(pc))
>>+		if (page_mapped(page))
>>+			goto unlock_out;
>>+		/*
>>+		 * Pages under migration may not be uncharged.  But
>>+		 * end_migration() /must/ be the one uncharging the
>>+		 * unused post-migration page and so it has to call
>>+		 * here with the migration bit still set.  See the
>>+		 * res_counter handling below.
>>+		 */
>>+		if (!end_migration && PageCgroupMigration(pc))
>> 			goto unlock_out;
>> 		break;
>> 	case MEM_CGROUP_CHARGE_TYPE_SWAPOUT:
>>@@ -3055,7 +3065,12 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
>> 		mem_cgroup_swap_statistics(memcg, true);
>> 		mem_cgroup_get(memcg);
>> 	}
>>-	if (!mem_cgroup_is_root(memcg))
>>+	/*
>>+	 * Migration does not charge the res_counter for the
>>+	 * replacement page, so leave it alone when phasing out the
>>+	 * page that is unused after the migration.
>>+	 */
>>+	if (!end_migration && !mem_cgroup_is_root(memcg))
>> 		mem_cgroup_do_uncharge(memcg, nr_pages, ctype);
>> 
>> 	return memcg;
>>@@ -3071,14 +3086,14 @@ void mem_cgroup_uncharge_page(struct page *page)
>> 	if (page_mapped(page))
>> 		return;
>> 	VM_BUG_ON(page->mapping && !PageAnon(page));
>>-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON);
>>+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON, false);
>> }
>> 
>> void mem_cgroup_uncharge_cache_page(struct page *page)
>> {
>> 	VM_BUG_ON(page_mapped(page));
>> 	VM_BUG_ON(page->mapping);
>>-	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE);
>>+	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_CACHE, false);
>> }
>> 
>> /*
>>@@ -3142,7 +3157,7 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
>> 	if (!swapout) /* this was a swap cache but the swap is unused ! */
>> 		ctype = MEM_CGROUP_CHARGE_TYPE_DROP;
>> 
>>-	memcg = __mem_cgroup_uncharge_common(page, ctype);
>>+	memcg = __mem_cgroup_uncharge_common(page, ctype, false);
>> 
>> 	/*
>> 	 * record memcg information,  if swapout && memcg != NULL,
>>@@ -3232,19 +3247,18 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
>>  * Before starting migration, account PAGE_SIZE to mem_cgroup that the old
>>  * page belongs to.
>>  */
>>-int mem_cgroup_prepare_migration(struct page *page,
>>-	struct page *newpage, struct mem_cgroup **memcgp, gfp_t gfp_mask)
>>+void mem_cgroup_prepare_migration(struct page *page, struct page *newpage,
>>+				  struct mem_cgroup **memcgp)
>> {
>> 	struct mem_cgroup *memcg = NULL;
>> 	struct page_cgroup *pc;
>> 	enum charge_type ctype;
>>-	int ret = 0;
>> 
>> 	*memcgp = NULL;
>> 
>> 	VM_BUG_ON(PageTransHuge(page));
>> 	if (mem_cgroup_disabled())
>>-		return 0;
>>+		return;
>> 
>> 	pc = lookup_page_cgroup(page);
>> 	lock_page_cgroup(pc);
>>@@ -3289,24 +3303,9 @@ int mem_cgroup_prepare_migration(struct page *page,
>> 	 * we return here.
>> 	 */
>> 	if (!memcg)
>>-		return 0;
>>+		return;
>> 
>> 	*memcgp = memcg;
>>-	ret = __mem_cgroup_try_charge(NULL, gfp_mask, 1, memcgp, false);
>>-	css_put(&memcg->css);/* drop extra refcnt */
>>-	if (ret) {
>>-		if (PageAnon(page)) {
>>-			lock_page_cgroup(pc);
>>-			ClearPageCgroupMigration(pc);
>>-			unlock_page_cgroup(pc);
>>-			/*
>>-			 * The old page may be fully unmapped while we kept it.
>>-			 */
>>-			mem_cgroup_uncharge_page(page);
>>-		}
>>-		/* we'll need to revisit this error code (we have -EINTR) */
>>-		return -ENOMEM;
>>-	}
>> 	/*
>> 	 * We charge new page before it's used/mapped. So, even if unlock_page()
>> 	 * is called before end_migration, we can catch all events on this new
>>@@ -3319,8 +3318,12 @@ int mem_cgroup_prepare_migration(struct page *page,
>> 		ctype = MEM_CGROUP_CHARGE_TYPE_CACHE;
>> 	else
>> 		ctype = MEM_CGROUP_CHARGE_TYPE_SHMEM;
>>+	/*
>>+	 * The page is committed to the memcg, but it's not actually
>>+	 * charged to the res_counter since we plan on replacing the
>>+	 * old one and only one page is going to be left afterwards.
>>+	 */
>> 	__mem_cgroup_commit_charge(memcg, newpage, 1, ctype, false);
>>-	return ret;
>> }
>> 
>> /* remove redundant charge if migration failed*/
>>@@ -3342,6 +3345,12 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>> 		used = newpage;
>> 		unused = oldpage;
>> 	}
>>+	anon = PageAnon(used);
>>+	__mem_cgroup_uncharge_common(unused,
>>+		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
>>+		     : MEM_CGROUP_CHARGE_TYPE_CACHE,
>>+		true);
>>+	css_put(&memcg->css);
>> 	/*
>> 	 * We disallowed uncharge of pages under migration because mapcount
>> 	 * of the page goes down to zero, temporarly.
>>@@ -3351,10 +3360,6 @@ void mem_cgroup_end_migration(struct mem_cgroup *memcg,
>> 	lock_page_cgroup(pc);
>> 	ClearPageCgroupMigration(pc);
>> 	unlock_page_cgroup(pc);
>>-	anon = PageAnon(used);
>>-	__mem_cgroup_uncharge_common(unused,
>>-		anon ? MEM_CGROUP_CHARGE_TYPE_ANON
>>-		     : MEM_CGROUP_CHARGE_TYPE_CACHE);
>> 
>> 	/*
>> 	 * If a page is a file cache, radix-tree replacement is very atomic
>>diff --git a/mm/migrate.c b/mm/migrate.c
>>index 8137aea..aa06bf4 100644
>>--- a/mm/migrate.c
>>+++ b/mm/migrate.c
>>@@ -687,7 +687,6 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> {
>> 	int rc = -EAGAIN;
>> 	int remap_swapcache = 1;
>>-	int charge = 0;
>> 	struct mem_cgroup *mem;
>> 	struct anon_vma *anon_vma = NULL;
>> 
>>@@ -729,12 +728,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> 	}
>> 
>> 	/* charge against new page */
>>-	charge = mem_cgroup_prepare_migration(page, newpage, &mem, GFP_KERNEL);
>>-	if (charge == -ENOMEM) {
>>-		rc = -ENOMEM;
>>-		goto unlock;
>>-	}
>>-	BUG_ON(charge);
>>+	mem_cgroup_prepare_migration(page, newpage, &mem);
>> 
>> 	if (PageWriteback(page)) {
>> 		/*
>>@@ -824,8 +818,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> 		put_anon_vma(anon_vma);
>> 
>> uncharge:
>>-	if (!charge)
>>-		mem_cgroup_end_migration(mem, page, newpage, rc == 0);
>>+	mem_cgroup_end_migration(mem, page, newpage, rc == 0);
>> unlock:
>> 	unlock_page(page);
>> out:
>>@@ -1519,10 +1512,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
>> {
>> 	struct page *oldpage = page, *newpage;
>> 	struct address_space *mapping = page_mapping(page);
>>-	struct mem_cgroup *mcg;
>>+	struct mem_cgroup *memcg;
>> 	unsigned int gfp;
>> 	int rc = 0;
>>-	int charge = -ENOMEM;
>> 
>> 	VM_BUG_ON(!PageLocked(page));
>> 	VM_BUG_ON(page_mapcount(page));
>>@@ -1556,12 +1548,7 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
>> 	if (!trylock_page(newpage))
>> 		BUG();		/* new page should be unlocked!!! */
>> 
>>-	// XXX hnaz, is this right?
>>-	charge = mem_cgroup_prepare_migration(page, newpage, &mcg, gfp);
>>-	if (charge == -ENOMEM) {
>>-		rc = charge;
>>-		goto out;
>>-	}
>>+	mem_cgroup_prepare_migration(page, newpage, &memcg);
>> 
>> 	newpage->index = page->index;
>> 	newpage->mapping = page->mapping;
>>@@ -1581,11 +1568,9 @@ migrate_misplaced_page(struct page *page, struct mm_struct *mm, int node)
>> 		page = newpage;
>> 	}
>> 
>>+	mem_cgroup_end_migration(memcg, oldpage, newpage, !rc);
>> out:
>>-	if (!charge)
>>-		mem_cgroup_end_migration(mcg, oldpage, newpage, !rc);
>>-
>>-       if (oldpage != page)
>>+	if (oldpage != page)
>>                put_page(oldpage);
>> 
>> 	if (rc) {
>>-- 
>>1.7.7.6

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits
  2012-07-12  8:54     ` Wanpeng Li
@ 2012-07-12  9:12       ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-12  9:12 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Michal Hocko, Hugh Dickins,
	David Rientjes, linux-mm, cgroups, linux-kernel

On Thu, Jul 12, 2012 at 04:54:07PM +0800, Wanpeng Li wrote:
> On Wed, Jul 11, 2012 at 07:02:13PM +0200, Johannes Weiner wrote:
> >Compaction (and page migration in general) can currently be hindered
> >through pages being owned by memory cgroups that are at their limits
> >and unreclaimable.
> >
> >The reason is that the replacement page is being charged against the
> >limit while the page being replaced is also still charged.  But this
> >seems unnecessary, given that only one of the two pages will still be
> >in use after migration finishes.
> >
> >This patch changes the memcg migration sequence so that the
> >replacement page is not charged.  Whatever page is still in use after
> >successful or failed migration gets to keep the charge of the page
> >that was going to be replaced.
> >
> >The replacement page will still show up temporarily in the rss/cache
> >statistics; this can be fixed in a later patch, as it's less urgent.
> 
> So I want to know whether mem_cgroup_wait_acct_move still makes sense after
> this patch is merged; if the answer is no, I will send a patch to remove it.

This change is about migrating a charge from one physical page to
another; account moving is about migrating charges between groups.
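
For readers skimming the thread, here is a minimal sketch of the caller-side
sequence this patch establishes, condensed from the mm/migrate.c hunk quoted
above (error handling and locking are elided, so treat it as an illustration
rather than the literal kernel code):

	struct mem_cgroup *memcg;
	int rc;

	/*
	 * Commit the old page's charge to newpage.  No new charge is made
	 * against the res_counter, so this can no longer fail and the
	 * function now returns void.
	 */
	mem_cgroup_prepare_migration(page, newpage, &memcg);

	/* ... migrate the page contents and mappings, setting rc ... */

	/*
	 * Uncharge whichever page ends up unused (oldpage on success,
	 * newpage on failure) and drop the memcg reference held across
	 * the migration (the css_put() added to end_migration above).
	 */
	mem_cgroup_end_migration(memcg, page, newpage, rc == 0);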

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits
  2012-07-12  9:10       ` Wanpeng Li
@ 2012-07-12  9:42         ` Johannes Weiner
  -1 siblings, 0 replies; 34+ messages in thread
From: Johannes Weiner @ 2012-07-12  9:42 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: KAMEZAWA Hiroyuki, Michal Hocko, Andrew Morton, Hugh Dickins,
	David Rientjes, linux-mm, cgroups, linux-kernel

On Thu, Jul 12, 2012 at 05:10:43PM +0800, Wanpeng Li wrote:
> On Thu, Jul 12, 2012 at 04:54:07PM +0800, Wanpeng Li wrote:
> >On Wed, Jul 11, 2012 at 07:02:13PM +0200, Johannes Weiner wrote:
> >>Compaction (and page migration in general) can currently be hindered
> >>through pages being owned by memory cgroups that are at their limits
> >>and unreclaimable.
> >>
> >>The reason is that the replacement page is being charged against the
> >>limit while the page being replaced is also still charged.  But this
> >>seems unnecessary, given that only one of the two pages will still be
> >>in use after migration finishes.
> >>
> >>This patch changes the memcg migration sequence so that the
> >>replacement page is not charged.  Whatever page is still in use after
> >>successful or failed migration gets to keep the charge of the page
> >>that was going to be replaced.
> >>
> >>The replacement page will still show up temporarily in the rss/cache
> >>statistics; this can be fixed in a later patch, as it's less urgent.
> >>
> >
> >So I want to know whether mem_cgroup_wait_acct_move still makes sense after
> >this patch is merged; if the answer is no, I will send a patch to remove it.
> 
> And if it does still make sense, I want to change the check in
> mem_cgroup_do_charge:
> 
> if (mem_cgroup_wait_acct_move(mem_over_limit))
> 	return CHARGE_RETRY;
> 
> =>
> 
> if (mem_cgroup_wait_acct_move(mem_over_limit) && 
>                        mem_cgroup_margin(mem_over_limit) >= nr_pages)
> 	return CHARGE_RETRY;
> 
> mem_cgroup_reclaim can reclaim some pages, but the function has some exit
> conditions:
> 
> total += try_to_free_mem_cgroup_pages(memcg, gfp_mask, noswap);
> if(total && (flag & MEM_CGROUP_RECLAIM_SHRINK))
> 	break;
> 
> and 
> 
> if (mem_cgroup_margin(memcg))
> 	break;
> 
> So mem_cgroup_reclaim may not reclaim enough pages (>= nr_pages); in that
> case we should go to mem_cgroup_handle_oom instead of returning
> CHARGE_RETRY.
> 
> Hopefully you can verify whether my idea makes sense.

Sorry, but this is a waste of your time, my time, and that of
everybody else in this thread.

I will ignore any subsequent proposals from you unless they start with
a coherent description of an actual problem.  Something that has
impact on userspace, or significant impact on kernel development.

If there is a bug I don't see in your description above, then please
explain how it affects userspace.  If the code or comments are cryptic
and can be simplified or clarified, please explain.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [patch 03/10] mm: memcg: push down PageSwapCache check into uncharge entry functions
  2012-07-11 17:02   ` Johannes Weiner
  (?)
@ 2012-07-19  9:57     ` Kamezawa Hiroyuki
  -1 siblings, 0 replies; 34+ messages in thread
From: Kamezawa Hiroyuki @ 2012-07-19  9:57 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Michal Hocko, Hugh Dickins, David Rientjes,
	Wanpeng Li, linux-mm, cgroups, linux-kernel

(2012/07/12 2:02), Johannes Weiner wrote:
> Not all uncharge paths need to check if the page is swapcache; some of
> them can know for sure.
> 
> Push down the check into all callsites of uncharge_common() so that
> the patch that removes some of them is more obvious.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
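
As a rough illustration of the shape of this change (the affected function
bodies are not quoted in this thread, so the snippet below is only an
approximation with abbreviated argument lists, not the literal
mm/memcontrol.c code):

	/* before: the common helper filtered swap-cache pages for every caller */
	if (PageSwapCache(page))
		return NULL;		/* in __mem_cgroup_uncharge_common() */

	/*
	 * after: each uncharge entry point that can actually see swap-cache
	 * pages performs the check itself before calling the common helper,
	 * e.g. in mem_cgroup_uncharge_page():
	 */
	if (PageSwapCache(page))
		return;
	__mem_cgroup_uncharge_common(page, MEM_CGROUP_CHARGE_TYPE_ANON);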


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2012-07-19 10:01 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-11 17:02 [patch 00/10] mm: memcg: charge/uncharge improvements v2 Johannes Weiner
2012-07-11 17:02 ` Johannes Weiner
2012-07-11 17:02 ` Johannes Weiner
2012-07-11 17:02 ` [patch 01/10] mm: memcg: fix compaction/migration failing due to memcg limits Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-12  8:54   ` Wanpeng Li
2012-07-12  8:54     ` Wanpeng Li
2012-07-12  9:10     ` Wanpeng Li
2012-07-12  9:10       ` Wanpeng Li
2012-07-12  9:42       ` Johannes Weiner
2012-07-12  9:42         ` Johannes Weiner
2012-07-12  9:12     ` Johannes Weiner
2012-07-12  9:12       ` Johannes Weiner
2012-07-11 17:02 ` [patch 02/10] mm: swapfile: clean up unuse_pte race handling Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 03/10] mm: memcg: push down PageSwapCache check into uncharge entry functions Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-19  9:57   ` Kamezawa Hiroyuki
2012-07-19  9:57     ` Kamezawa Hiroyuki
2012-07-19  9:57     ` Kamezawa Hiroyuki
2012-07-11 17:02 ` [patch 04/10] mm: memcg: only check for PageSwapCache when uncharging anon Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 05/10] mm: memcg: move swapin charge functions above callsites Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 06/10] mm: memcg: remove unneeded shmem charge type Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 07/10] mm: memcg: remove needless !mm fixup to init_mm when charging Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 08/10] mm: memcg: split swapin charge function into private and public part Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 09/10] mm: memcg: only check swap cache pages for repeated charging Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
2012-07-11 17:02 ` [patch 10/10] mm: memcg: only check anon swapin page charges for swap cache Johannes Weiner
2012-07-11 17:02   ` Johannes Weiner
