* [patch 0/8] memcg: charge path cleanups
@ 2014-03-12  1:28 ` Johannes Weiner
  0 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

Hi Andrew,

here are some cleanups and refactorings of the memcg charge path
for 3.15, from Michal and me.

 mm/memcontrol.c | 319 +++++++++++++++++++-----------------------------------
 1 file changed, 112 insertions(+), 207 deletions(-)


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [patch 1/8] mm: memcg: remove unnecessary preemption disabling
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

lock_page_cgroup() disables preemption; remove the explicit preemption
disabling from code paths that hold this lock.

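For context, a minimal sketch of why this is safe, assuming the
current page_cgroup locking scheme (names as in 3.14; lock_page_cgroup()
is a bit spinlock, and bit_spin_lock() disables preemption before
taking the bit):

	static inline void lock_page_cgroup(struct page_cgroup *pc)
	{
		/* bit_spin_lock() calls preempt_disable() internally */
		bit_spin_lock(PCG_LOCK, &pc->flags);
	}

Any section bracketed by lock_page_cgroup()/unlock_page_cgroup() thus
already runs with preemption disabled, which is all that the
non-atomic __this_cpu_*() accessors require.
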
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5b6b0039f725..393864c162ac 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -921,8 +921,6 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 					 struct page *page,
 					 bool anon, int nr_pages)
 {
-	preempt_disable();
-
 	/*
 	 * Here, RSS means 'mapped anon' and anon's SwapCache. Shmem/tmpfs is
 	 * counted as CACHE even if it's on ANON LRU.
@@ -947,8 +945,6 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 	}
 
 	__this_cpu_add(memcg->stat->nr_page_events, nr_pages);
-
-	preempt_enable();
 }
 
 unsigned long
@@ -3780,17 +3776,14 @@ void mem_cgroup_split_huge_fixup(struct page *head)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline
-void mem_cgroup_move_account_page_stat(struct mem_cgroup *from,
-					struct mem_cgroup *to,
-					unsigned int nr_pages,
-					enum mem_cgroup_stat_index idx)
+static void mem_cgroup_move_account_page_stat(struct mem_cgroup *from,
+					      struct mem_cgroup *to,
+					      unsigned int nr_pages,
+					      enum mem_cgroup_stat_index idx)
 {
 	/* Update stat data for mem_cgroup */
-	preempt_disable();
 	__this_cpu_sub(from->stat->count[idx], nr_pages);
 	__this_cpu_add(to->stat->count[idx], nr_pages);
-	preempt_enable();
 }
 
 /**
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 2/8] mm: memcg: remove mem_cgroup_move_account_page_stat()
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

mem_cgroup_move_account_page_stat() used to disable preemption and run
sanity checks, but by now it only takes a number out of one percpu
counter and puts it into another.  Do this directly in the callsites
and save the indirection.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c | 28 ++++++++++++----------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 393864c162ac..5abdfab957ad 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3776,16 +3776,6 @@ void mem_cgroup_split_huge_fixup(struct page *head)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static void mem_cgroup_move_account_page_stat(struct mem_cgroup *from,
-					      struct mem_cgroup *to,
-					      unsigned int nr_pages,
-					      enum mem_cgroup_stat_index idx)
-{
-	/* Update stat data for mem_cgroup */
-	__this_cpu_sub(from->stat->count[idx], nr_pages);
-	__this_cpu_add(to->stat->count[idx], nr_pages);
-}
-
 /**
  * mem_cgroup_move_account - move account of the page
  * @page: the page
@@ -3831,13 +3821,19 @@ static int mem_cgroup_move_account(struct page *page,
 
 	move_lock_mem_cgroup(from, &flags);
 
-	if (!anon && page_mapped(page))
-		mem_cgroup_move_account_page_stat(from, to, nr_pages,
-			MEM_CGROUP_STAT_FILE_MAPPED);
+	if (!anon && page_mapped(page)) {
+		__this_cpu_sub(from->stat->count[MEM_CGROUP_STAT_FILE_MAPPED],
+			       nr_pages);
+		__this_cpu_add(to->stat->count[MEM_CGROUP_STAT_FILE_MAPPED],
+			       nr_pages);
+	}
 
-	if (PageWriteback(page))
-		mem_cgroup_move_account_page_stat(from, to, nr_pages,
-			MEM_CGROUP_STAT_WRITEBACK);
+	if (PageWriteback(page)) {
+		__this_cpu_sub(from->stat->count[MEM_CGROUP_STAT_WRITEBACK],
+			       nr_pages);
+		__this_cpu_add(to->stat->count[MEM_CGROUP_STAT_WRITEBACK],
+			       nr_pages);
+	}
 
 	mem_cgroup_charge_statistics(from, page, anon, -nr_pages);
 
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 3/8] mm: memcg: inline mem_cgroup_charge_common()
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

mem_cgroup_charge_common() is used by both cache and anon pages, but
most of its body only applies to anon pages and the remainder is not
worth having in a separate function.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 40 ++++++++++++++++------------------------
 1 file changed, 16 insertions(+), 24 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5abdfab957ad..cfdb9c385d8d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3919,20 +3919,21 @@ out:
 	return ret;
 }
 
-/*
- * Charge the memory controller for page usage.
- * Return
- * 0 if the charge was successful
- * < 0 if the cgroup is over its limit
- */
-static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
-				gfp_t gfp_mask, enum charge_type ctype)
+int mem_cgroup_newpage_charge(struct page *page,
+			      struct mm_struct *mm, gfp_t gfp_mask)
 {
 	struct mem_cgroup *memcg = NULL;
 	unsigned int nr_pages = 1;
 	bool oom = true;
 	int ret;
 
+	if (mem_cgroup_disabled())
+		return 0;
+
+	VM_BUG_ON_PAGE(page_mapped(page), page);
+	VM_BUG_ON_PAGE(page->mapping && !PageAnon(page), page);
+	VM_BUG_ON(!mm);
+
 	if (PageTransHuge(page)) {
 		nr_pages <<= compound_order(page);
 		VM_BUG_ON_PAGE(!PageTransHuge(page), page);
@@ -3946,22 +3947,11 @@ static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
 	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
 	if (ret == -ENOMEM)
 		return ret;
-	__mem_cgroup_commit_charge(memcg, page, nr_pages, ctype, false);
+	__mem_cgroup_commit_charge(memcg, page, nr_pages,
+				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
 	return 0;
 }
 
-int mem_cgroup_newpage_charge(struct page *page,
-			      struct mm_struct *mm, gfp_t gfp_mask)
-{
-	if (mem_cgroup_disabled())
-		return 0;
-	VM_BUG_ON_PAGE(page_mapped(page), page);
-	VM_BUG_ON_PAGE(page->mapping && !PageAnon(page), page);
-	VM_BUG_ON(!mm);
-	return mem_cgroup_charge_common(page, mm, gfp_mask,
-					MEM_CGROUP_CHARGE_TYPE_ANON);
-}
-
 /*
  * While swap-in, try_charge -> commit or cancel, the page is locked.
  * And when try_charge() successfully returns, one refcnt to memcg without
@@ -4079,9 +4069,11 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 	if (PageCompound(page))
 		return 0;
 
-	if (!PageSwapCache(page))
-		ret = mem_cgroup_charge_common(page, mm, gfp_mask, type);
-	else { /* page is swapcache/shmem */
+	if (!PageSwapCache(page)) {
+		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
+		if (ret != -ENOMEM)
+			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
+	} else { /* page is swapcache/shmem */
 		ret = __mem_cgroup_try_charge_swapin(mm, page,
 						     gfp_mask, &memcg);
 		if (!ret)
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 4/8] mm: memcg: push !mm handling out to page cache charge function
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

Only page cache charges can happen without an mm context, so push this
special case out of the inner core and into the cache charge function.

An ancient comment explains that the mm can also be NULL in case the
task is currently being migrated, but that is not actually true in the
current code, so just remove it.

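For illustration, a sketch of the kind of mm-less charge this is
about (the call chain is an assumption based on the boot-time disk
probing example, not taken from the patch):

	/*
	 * Partition scanning at boot runs from a context where
	 * current->mm is NULL:
	 *
	 *   read_dev_sector()
	 *     -> do_read_cache_page()
	 *       -> add_to_page_cache_lru()
	 *         -> mem_cgroup_cache_charge(page, current->mm, gfp)
	 *                                          ^^^^^^^^^^^ NULL
	 */

Such charges go to root_mem_cgroup, and after this patch that decision
is made in mem_cgroup_cache_charge() rather than deep inside the
try_charge core.
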
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cfdb9c385d8d..c40186cf22ad 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2737,15 +2737,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 
 	if (gfp_mask & __GFP_NOFAIL)
 		oom = false;
-
-	/*
-	 * We always charge the cgroup the mm_struct belongs to.
-	 * The mm_struct's mem_cgroup changes on task migration if the
-	 * thread group leader migrates. It's possible that mm is not
-	 * set, if so charge the root memcg (happens for pagecache usage).
-	 */
-	if (!*ptr && !mm)
-		*ptr = root_mem_cgroup;
 again:
 	if (*ptr) { /* css should be a valid one */
 		memcg = *ptr;
@@ -4070,6 +4061,12 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 		return 0;
 
 	if (!PageSwapCache(page)) {
+		/*
+		 * Page cache insertions can happen without an actual
+		 * task context, e.g. during disk probing on boot.
+		 */
+		if (!mm)
+			memcg = root_mem_cgroup;
 		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
 		if (ret != -ENOMEM)
 			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 5/8] memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

Callers pass either an mm that has been established under task lock,
or a verified current->mm, which means the task can't be exiting.

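The two remaining caller patterns look roughly like this (sketches
taken from task_in_mem_cgroup() and __memcg_kmem_newpage_charge()):

	/* 1) mm pinned under task lock, so the owner cannot exit: */
	p = find_lock_task_mm(task);
	if (p) {
		curr = try_get_mem_cgroup_from_mm(p->mm);
		task_unlock(p);
	}

	/* 2) current->mm verified beforehand: */
	if (!current->mm || current->memcg_kmem_skip_account)
		return true;
	memcg = try_get_mem_cgroup_from_mm(current->mm);

In both cases the mm is valid and its owner alive, so the removed !mm
check could never trigger.
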
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c40186cf22ad..1780e66ec61e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1075,13 +1075,6 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
 {
 	struct mem_cgroup *memcg = NULL;
 
-	if (!mm)
-		return NULL;
-	/*
-	 * Because we have no locks, mm->owner's may be being moved to other
-	 * cgroup. We use css_tryget() here even if this looks
-	 * pessimistic (rather than adding locks here).
-	 */
 	rcu_read_lock();
 	do {
 		memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 6/8] memcg: get_mem_cgroup_from_mm()
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

Instead of returning NULL from try_get_mem_cgroup_from_mm() when the
mm owner is exiting, just return root_mem_cgroup.  This makes sense
for all callsites and removes the manual fallback that some of them
had to implement.

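As a usage sketch, a caller like __memcg_kmem_newpage_charge() goes
from

	memcg = try_get_mem_cgroup_from_mm(current->mm);
	if (unlikely(!memcg))	/* mm owner raced with exit */
		return true;

to simply

	memcg = get_mem_cgroup_from_mm(current->mm);	/* never NULL */

with the exiting-owner case transparently charged to root_mem_cgroup.
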
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.cz>
---
 mm/memcontrol.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 1780e66ec61e..cc7f3ca3ef34 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1071,7 +1071,7 @@ struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
 	return mem_cgroup_from_css(task_css(p, mem_cgroup_subsys_id));
 }
 
-struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
+struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
 {
 	struct mem_cgroup *memcg = NULL;
 
@@ -1079,7 +1079,7 @@ struct mem_cgroup *try_get_mem_cgroup_from_mm(struct mm_struct *mm)
 	do {
 		memcg = mem_cgroup_from_task(rcu_dereference(mm->owner));
 		if (unlikely(!memcg))
-			break;
+			memcg = root_mem_cgroup;
 	} while (!css_tryget(&memcg->css));
 	rcu_read_unlock();
 	return memcg;
@@ -1475,7 +1475,7 @@ bool task_in_mem_cgroup(struct task_struct *task,
 
 	p = find_lock_task_mm(task);
 	if (p) {
-		curr = try_get_mem_cgroup_from_mm(p->mm);
+		curr = get_mem_cgroup_from_mm(p->mm);
 		task_unlock(p);
 	} else {
 		/*
@@ -1489,8 +1489,6 @@ bool task_in_mem_cgroup(struct task_struct *task,
 			css_get(&curr->css);
 		rcu_read_unlock();
 	}
-	if (!curr)
-		return false;
 	/*
 	 * We should check use_hierarchy of "memcg" not "curr". Because checking
 	 * use_hierarchy of "curr" here make this function true if hierarchy is
@@ -3649,15 +3647,7 @@ __memcg_kmem_newpage_charge(gfp_t gfp, struct mem_cgroup **_memcg, int order)
 	if (!current->mm || current->memcg_kmem_skip_account)
 		return true;
 
-	memcg = try_get_mem_cgroup_from_mm(current->mm);
-
-	/*
-	 * very rare case described in mem_cgroup_from_task. Unfortunately there
-	 * isn't much we can do without complicating this too much, and it would
-	 * be gfp-dependent anyway. Just let it go
-	 */
-	if (unlikely(!memcg))
-		return true;
+	memcg = get_mem_cgroup_from_mm(current->mm);
 
 	if (!memcg_can_account_kmem(memcg)) {
 		css_put(&memcg->css);
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 7/8] memcg: do not replicate get_mem_cgroup_from_mm in __mem_cgroup_try_charge
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

From: Michal Hocko <mhocko@suse.cz>

__mem_cgroup_try_charge() duplicates get_mem_cgroup_from_mm() for
charges that come in without a memcg.  The only reason seems to be a
tiny optimization: css_tryget() is skipped when the charge can be
consumed from the percpu stock.  But css_tryget() has become very
cheap since it was reworked to use per-cpu reference counting, so the
optimization no longer buys us anything.

Drop the code duplication to make the code more readable.

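For reference, the css_tryget() fast path since the conversion to
percpu_ref is approximately this (a sketch of the 3.14-era helper,
modulo details):

	static inline bool css_tryget(struct cgroup_subsys_state *css)
	{
		if (css->flags & CSS_ROOT)
			return true;
		/* fast path is a this_cpu increment, no shared cacheline */
		return percpu_ref_tryget(&css->refcnt);
	}
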
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 50 ++++++--------------------------------------------
 1 file changed, 6 insertions(+), 44 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cc7f3ca3ef34..4f7192bfa5fa 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2731,52 +2731,14 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 again:
 	if (*ptr) { /* css should be a valid one */
 		memcg = *ptr;
-		if (mem_cgroup_is_root(memcg))
-			goto done;
-		if (consume_stock(memcg, nr_pages))
-			goto done;
 		css_get(&memcg->css);
 	} else {
-		struct task_struct *p;
-
-		rcu_read_lock();
-		p = rcu_dereference(mm->owner);
-		/*
-		 * Because we don't have task_lock(), "p" can exit.
-		 * In that case, "memcg" can point to root or p can be NULL with
-		 * race with swapoff. Then, we have small risk of mis-accouning.
-		 * But such kind of mis-account by race always happens because
-		 * we don't have cgroup_mutex(). It's overkill and we allo that
-		 * small race, here.
-		 * (*) swapoff at el will charge against mm-struct not against
-		 * task-struct. So, mm->owner can be NULL.
-		 */
-		memcg = mem_cgroup_from_task(p);
-		if (!memcg)
-			memcg = root_mem_cgroup;
-		if (mem_cgroup_is_root(memcg)) {
-			rcu_read_unlock();
-			goto done;
-		}
-		if (consume_stock(memcg, nr_pages)) {
-			/*
-			 * It seems dagerous to access memcg without css_get().
-			 * But considering how consume_stok works, it's not
-			 * necessary. If consume_stock success, some charges
-			 * from this memcg are cached on this cpu. So, we
-			 * don't need to call css_get()/css_tryget() before
-			 * calling consume_stock().
-			 */
-			rcu_read_unlock();
-			goto done;
-		}
-		/* after here, we may be blocked. we need to get refcnt */
-		if (!css_tryget(&memcg->css)) {
-			rcu_read_unlock();
-			goto again;
-		}
-		rcu_read_unlock();
+		memcg = get_mem_cgroup_from_mm(mm);
 	}
+	if (mem_cgroup_is_root(memcg))
+		goto done;
+	if (consume_stock(memcg, nr_pages))
+		goto done;
 
 	do {
 		bool invoke_oom = oom && !nr_oom_retries;
@@ -2812,8 +2774,8 @@ again:
 
 	if (batch > nr_pages)
 		refill_stock(memcg, batch - nr_pages);
-	css_put(&memcg->css);
 done:
+	css_put(&memcg->css);
 	*ptr = memcg;
 	return 0;
 nomem:
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 8/8] memcg: sanitize __mem_cgroup_try_charge() call protocol
  2014-03-12  1:28 ` Johannes Weiner
@ 2014-03-12  1:28   ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

Some callsites pass a memcg directly, some callsites pass an mm that
first has to be translated to a memcg.  This makes for a terrible
function interface.

Just push the mm-to-memcg translation into the respective callsites
and always pass a memcg to mem_cgroup_try_charge().

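The resulting protocol, common to all mm-based callsites (a sketch
distilled from the hunks below):

	memcg = get_mem_cgroup_from_mm(mm);
	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
	css_put(&memcg->css);
	if (ret == -EINTR)		/* charge was bypassed to root */
		memcg = root_mem_cgroup;
	else if (ret)			/* -ENOMEM */
		return ret;
	__mem_cgroup_commit_charge(memcg, page, nr_pages, type, false);
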
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 184 +++++++++++++++++++++++++-------------------------------
 1 file changed, 83 insertions(+), 101 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4f7192bfa5fa..876598b4505b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2609,7 +2609,7 @@ static int memcg_cpu_hotplug_callback(struct notifier_block *nb,
 }
 
 
-/* See __mem_cgroup_try_charge() for details */
+/* See mem_cgroup_try_charge() for details */
 enum {
 	CHARGE_OK,		/* success */
 	CHARGE_RETRY,		/* need to retry but retry is not bad */
@@ -2682,45 +2682,34 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	return CHARGE_NOMEM;
 }
 
-/*
- * __mem_cgroup_try_charge() does
- * 1. detect memcg to be charged against from passed *mm and *ptr,
- * 2. update res_counter
- * 3. call memory reclaim if necessary.
- *
- * In some special case, if the task is fatal, fatal_signal_pending() or
- * has TIF_MEMDIE, this function returns -EINTR while writing root_mem_cgroup
- * to *ptr. There are two reasons for this. 1: fatal threads should quit as soon
- * as possible without any hazards. 2: all pages should have a valid
- * pc->mem_cgroup. If mm is NULL and the caller doesn't pass a valid memcg
- * pointer, that is treated as a charge to root_mem_cgroup.
- *
- * So __mem_cgroup_try_charge() will return
- *  0       ...  on success, filling *ptr with a valid memcg pointer.
- *  -ENOMEM ...  charge failure because of resource limits.
- *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
+/**
+ * mem_cgroup_try_charge - try charging a memcg
+ * @memcg: memcg to charge
+ * @nr_pages: number of pages to charge
+ * @oom: trigger OOM if reclaim fails
  *
- * Unlike the exported interface, an "oom" parameter is added. if oom==true,
- * the oom-killer can be invoked.
+ * Returns 0 if @memcg was charged successfully, -EINTR if the charge
+ * was bypassed to root_mem_cgroup, and -ENOMEM if the charge failed.
  */
-static int __mem_cgroup_try_charge(struct mm_struct *mm,
-				   gfp_t gfp_mask,
-				   unsigned int nr_pages,
-				   struct mem_cgroup **ptr,
-				   bool oom)
+static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
+				 gfp_t gfp_mask,
+				 unsigned int nr_pages,
+				 bool oom)
 {
 	unsigned int batch = max(CHARGE_BATCH, nr_pages);
 	int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
-	struct mem_cgroup *memcg = NULL;
 	int ret;
 
+	if (mem_cgroup_is_root(memcg))
+		goto done;
 	/*
-	 * Unlike gloval-vm's OOM-kill, we're not in memory shortage
-	 * in system level. So, allow to go ahead dying process in addition to
-	 * MEMDIE process.
+	 * Unlike in global OOM situations, memcg is not in a physical
+	 * memory shortage.  Allow dying and OOM-killed tasks to
+	 * bypass the last charges so that they can exit quickly and
+	 * free their memory.
 	 */
-	if (unlikely(test_thread_flag(TIF_MEMDIE)
-		     || fatal_signal_pending(current)))
+	if (unlikely(test_thread_flag(TIF_MEMDIE) ||
+		     fatal_signal_pending(current)))
 		goto bypass;
 
 	if (unlikely(task_in_memcg_oom(current)))
@@ -2729,14 +2718,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 	if (gfp_mask & __GFP_NOFAIL)
 		oom = false;
 again:
-	if (*ptr) { /* css should be a valid one */
-		memcg = *ptr;
-		css_get(&memcg->css);
-	} else {
-		memcg = get_mem_cgroup_from_mm(mm);
-	}
-	if (mem_cgroup_is_root(memcg))
-		goto done;
 	if (consume_stock(memcg, nr_pages))
 		goto done;
 
@@ -2744,10 +2725,8 @@ again:
 		bool invoke_oom = oom && !nr_oom_retries;
 
 		/* If killed, bypass charge */
-		if (fatal_signal_pending(current)) {
-			css_put(&memcg->css);
+		if (fatal_signal_pending(current))
 			goto bypass;
-		}
 
 		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch,
 					   nr_pages, invoke_oom);
@@ -2756,17 +2735,12 @@ again:
 			break;
 		case CHARGE_RETRY: /* not in OOM situation but retry */
 			batch = nr_pages;
-			css_put(&memcg->css);
-			memcg = NULL;
 			goto again;
 		case CHARGE_WOULDBLOCK: /* !__GFP_WAIT */
-			css_put(&memcg->css);
 			goto nomem;
 		case CHARGE_NOMEM: /* OOM routine works */
-			if (!oom || invoke_oom) {
-				css_put(&memcg->css);
+			if (!oom || invoke_oom)
 				goto nomem;
-			}
 			nr_oom_retries--;
 			break;
 		}
@@ -2775,16 +2749,11 @@ again:
 	if (batch > nr_pages)
 		refill_stock(memcg, batch - nr_pages);
 done:
-	css_put(&memcg->css);
-	*ptr = memcg;
 	return 0;
 nomem:
-	if (!(gfp_mask & __GFP_NOFAIL)) {
-		*ptr = NULL;
+	if (!(gfp_mask & __GFP_NOFAIL))
 		return -ENOMEM;
-	}
 bypass:
-	*ptr = root_mem_cgroup;
 	return -EINTR;
 }
 
@@ -2983,20 +2952,17 @@ static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
 static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 {
 	struct res_counter *fail_res;
-	struct mem_cgroup *_memcg;
 	int ret = 0;
 
 	ret = res_counter_charge(&memcg->kmem, size, &fail_res);
 	if (ret)
 		return ret;
 
-	_memcg = memcg;
-	ret = __mem_cgroup_try_charge(NULL, gfp, size >> PAGE_SHIFT,
-				      &_memcg, oom_gfp_allowed(gfp));
-
+	ret = mem_cgroup_try_charge(memcg, gfp, size >> PAGE_SHIFT,
+				    oom_gfp_allowed(gfp));
 	if (ret == -EINTR)  {
 		/*
-		 * __mem_cgroup_try_charge() chosed to bypass to root due to
+		 * mem_cgroup_try_charge() chosed to bypass to root due to
 		 * OOM kill or fatal signal.  Since our only options are to
 		 * either fail the allocation or charge it to this cgroup, do
 		 * it as a temporary condition. But we can't fail. From a
@@ -3006,7 +2972,7 @@ static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 		 *
 		 * This condition will only trigger if the task entered
 		 * memcg_charge_kmem in a sane state, but was OOM-killed during
-		 * __mem_cgroup_try_charge() above. Tasks that were already
+		 * mem_cgroup_try_charge() above. Tasks that were already
 		 * dying when the allocation triggers should have been already
 		 * directed to the root cgroup in memcontrol.h
 		 */
@@ -3858,8 +3824,8 @@ out:
 int mem_cgroup_newpage_charge(struct page *page,
 			      struct mm_struct *mm, gfp_t gfp_mask)
 {
-	struct mem_cgroup *memcg = NULL;
 	unsigned int nr_pages = 1;
+	struct mem_cgroup *memcg;
 	bool oom = true;
 	int ret;
 
@@ -3880,8 +3846,12 @@ int mem_cgroup_newpage_charge(struct page *page,
 		oom = false;
 	}
 
-	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
-	if (ret == -ENOMEM)
+	memcg = get_mem_cgroup_from_mm(mm);
+	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
+	css_put(&memcg->css);
+	if (ret == -EINTR)
+		memcg = root_mem_cgroup;
+	else if (ret)
 		return ret;
 	__mem_cgroup_commit_charge(memcg, page, nr_pages,
 				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
@@ -3899,7 +3869,7 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 					  gfp_t mask,
 					  struct mem_cgroup **memcgp)
 {
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg = NULL;
 	struct page_cgroup *pc;
 	int ret;
 
@@ -3912,31 +3882,29 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 	 * in turn serializes uncharging.
 	 */
 	if (PageCgroupUsed(pc))
-		return 0;
-	if (!do_swap_account)
-		goto charge_cur_mm;
-	memcg = try_get_mem_cgroup_from_page(page);
+		goto out;
+	if (do_swap_account)
+		memcg = try_get_mem_cgroup_from_page(page);
 	if (!memcg)
-		goto charge_cur_mm;
-	*memcgp = memcg;
-	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
+		memcg = get_mem_cgroup_from_mm(mm);
+	ret = mem_cgroup_try_charge(memcg, mask, 1, true);
 	css_put(&memcg->css);
 	if (ret == -EINTR)
-		ret = 0;
-	return ret;
-charge_cur_mm:
-	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
-	if (ret == -EINTR)
-		ret = 0;
-	return ret;
+		memcg = root_mem_cgroup;
+	else if (ret)
+		return ret;
+out:
+	*memcgp = memcg;
+	return 0;
 }
 
 int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
 				 gfp_t gfp_mask, struct mem_cgroup **memcgp)
 {
-	*memcgp = NULL;
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled()) {
+		*memcgp = NULL;
 		return 0;
+	}
 	/*
 	 * A racing thread's fault, or swapoff, may have already
 	 * updated the pte, and even removed page from swap cache: in
@@ -3944,12 +3912,18 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
 	 * there's also a KSM case which does need to charge the page.
 	 */
 	if (!PageSwapCache(page)) {
+		struct mem_cgroup *memcg;
 		int ret;
 
-		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, memcgp, true);
+		memcg = get_mem_cgroup_from_mm(mm);
+		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
+		css_put(&memcg->css);
 		if (ret == -EINTR)
-			ret = 0;
-		return ret;
+			memcg = root_mem_cgroup;
+		else if (ret)
+			return ret;
+		*memcgp = memcg;
+		return 0;
 	}
 	return __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, memcgp);
 }
@@ -3996,8 +3970,8 @@ void mem_cgroup_commit_charge_swapin(struct page *page,
 int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 				gfp_t gfp_mask)
 {
-	struct mem_cgroup *memcg = NULL;
 	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
+	struct mem_cgroup *memcg;
 	int ret;
 
 	if (mem_cgroup_disabled())
@@ -4005,23 +3979,32 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 	if (PageCompound(page))
 		return 0;
 
-	if (!PageSwapCache(page)) {
-		/*
-		 * Page cache insertions can happen without an actual
-		 * task context, e.g. during disk probing on boot.
-		 */
-		if (!mm)
-			memcg = root_mem_cgroup;
-		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
-		if (ret != -ENOMEM)
-			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
-	} else { /* page is swapcache/shmem */
+	if (PageSwapCache(page)) { /* shmem */
 		ret = __mem_cgroup_try_charge_swapin(mm, page,
 						     gfp_mask, &memcg);
-		if (!ret)
-			__mem_cgroup_commit_charge_swapin(page, memcg, type);
+		if (ret)
+			return ret;
+		__mem_cgroup_commit_charge_swapin(page, memcg, type);
+		return 0;
 	}
-	return ret;
+
+	/*
+	 * Page cache insertions can happen without an actual mm
+	 * context, e.g. during disk probing on boot.
+	 */
+	if (unlikely(!mm))
+		memcg = root_mem_cgroup;
+	else {
+		memcg = get_mem_cgroup_from_mm(mm);
+		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
+		css_put(&memcg->css);
+		if (ret == -EINTR)
+			memcg = root_mem_cgroup;
+		else if (ret)
+			return ret;
+	}
+	__mem_cgroup_commit_charge(memcg, page, 1, type, false);
+	return 0;
 }
 
 static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
@@ -6635,8 +6618,7 @@ one_by_one:
 			batch_count = PRECHARGE_COUNT_AT_ONCE;
 			cond_resched();
 		}
-		ret = __mem_cgroup_try_charge(NULL,
-					GFP_KERNEL, 1, &memcg, false);
+		ret = mem_cgroup_try_charge(memcg, GFP_KERNEL, 1, false);
 		if (ret)
 			/* mem_cgroup_clear_mc() will do uncharge later */
 			return ret;
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [patch 8/8] memcg: sanitize __mem_cgroup_try_charge() call protocol
@ 2014-03-12  1:28   ` Johannes Weiner
  0 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12  1:28 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Michal Hocko, linux-mm, cgroups, linux-kernel

Some callsites pass a memcg directly, some callsites pass a mm that
first has to be translated to an mm.  This makes for a terrible
function interface.

Just push the mm-to-memcg translation into the respective callsites
and always pass a memcg to mem_cgroup_try_charge().

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 184 +++++++++++++++++++++++++-------------------------------
 1 file changed, 83 insertions(+), 101 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4f7192bfa5fa..876598b4505b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2609,7 +2609,7 @@ static int memcg_cpu_hotplug_callback(struct notifier_block *nb,
 }
 
 
-/* See __mem_cgroup_try_charge() for details */
+/* See mem_cgroup_try_charge() for details */
 enum {
 	CHARGE_OK,		/* success */
 	CHARGE_RETRY,		/* need to retry but retry is not bad */
@@ -2682,45 +2682,34 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	return CHARGE_NOMEM;
 }
 
-/*
- * __mem_cgroup_try_charge() does
- * 1. detect memcg to be charged against from passed *mm and *ptr,
- * 2. update res_counter
- * 3. call memory reclaim if necessary.
- *
- * In some special case, if the task is fatal, fatal_signal_pending() or
- * has TIF_MEMDIE, this function returns -EINTR while writing root_mem_cgroup
- * to *ptr. There are two reasons for this. 1: fatal threads should quit as soon
- * as possible without any hazards. 2: all pages should have a valid
- * pc->mem_cgroup. If mm is NULL and the caller doesn't pass a valid memcg
- * pointer, that is treated as a charge to root_mem_cgroup.
- *
- * So __mem_cgroup_try_charge() will return
- *  0       ...  on success, filling *ptr with a valid memcg pointer.
- *  -ENOMEM ...  charge failure because of resource limits.
- *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
+/**
+ * mem_cgroup_try_charge - try charging a memcg
+ * @memcg: memcg to charge
+ * @nr_pages: number of pages to charge
+ * @oom: trigger OOM if reclaim fails
  *
- * Unlike the exported interface, an "oom" parameter is added. if oom==true,
- * the oom-killer can be invoked.
+ * Returns 0 if @memcg was charged successfully, -EINTR if the charge
+ * was bypassed to root_mem_cgroup, and -ENOMEM if the charge failed.
  */
-static int __mem_cgroup_try_charge(struct mm_struct *mm,
-				   gfp_t gfp_mask,
-				   unsigned int nr_pages,
-				   struct mem_cgroup **ptr,
-				   bool oom)
+static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
+				 gfp_t gfp_mask,
+				 unsigned int nr_pages,
+				 bool oom)
 {
 	unsigned int batch = max(CHARGE_BATCH, nr_pages);
 	int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
-	struct mem_cgroup *memcg = NULL;
 	int ret;
 
+	if (mem_cgroup_is_root(memcg))
+		goto done;
 	/*
-	 * Unlike gloval-vm's OOM-kill, we're not in memory shortage
-	 * in system level. So, allow to go ahead dying process in addition to
-	 * MEMDIE process.
+	 * Unlike in global OOM situations, memcg is not in a physical
+	 * memory shortage.  Allow dying and OOM-killed tasks to
+	 * bypass the last charges so that they can exit quickly and
+	 * free their memory.
 	 */
-	if (unlikely(test_thread_flag(TIF_MEMDIE)
-		     || fatal_signal_pending(current)))
+	if (unlikely(test_thread_flag(TIF_MEMDIE) ||
+		     fatal_signal_pending(current)))
 		goto bypass;
 
 	if (unlikely(task_in_memcg_oom(current)))
@@ -2729,14 +2718,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 	if (gfp_mask & __GFP_NOFAIL)
 		oom = false;
 again:
-	if (*ptr) { /* css should be a valid one */
-		memcg = *ptr;
-		css_get(&memcg->css);
-	} else {
-		memcg = get_mem_cgroup_from_mm(mm);
-	}
-	if (mem_cgroup_is_root(memcg))
-		goto done;
 	if (consume_stock(memcg, nr_pages))
 		goto done;
 
@@ -2744,10 +2725,8 @@ again:
 		bool invoke_oom = oom && !nr_oom_retries;
 
 		/* If killed, bypass charge */
-		if (fatal_signal_pending(current)) {
-			css_put(&memcg->css);
+		if (fatal_signal_pending(current))
 			goto bypass;
-		}
 
 		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch,
 					   nr_pages, invoke_oom);
@@ -2756,17 +2735,12 @@ again:
 			break;
 		case CHARGE_RETRY: /* not in OOM situation but retry */
 			batch = nr_pages;
-			css_put(&memcg->css);
-			memcg = NULL;
 			goto again;
 		case CHARGE_WOULDBLOCK: /* !__GFP_WAIT */
-			css_put(&memcg->css);
 			goto nomem;
 		case CHARGE_NOMEM: /* OOM routine works */
-			if (!oom || invoke_oom) {
-				css_put(&memcg->css);
+			if (!oom || invoke_oom)
 				goto nomem;
-			}
 			nr_oom_retries--;
 			break;
 		}
@@ -2775,16 +2749,11 @@ again:
 	if (batch > nr_pages)
 		refill_stock(memcg, batch - nr_pages);
 done:
-	css_put(&memcg->css);
-	*ptr = memcg;
 	return 0;
 nomem:
-	if (!(gfp_mask & __GFP_NOFAIL)) {
-		*ptr = NULL;
+	if (!(gfp_mask & __GFP_NOFAIL))
 		return -ENOMEM;
-	}
 bypass:
-	*ptr = root_mem_cgroup;
 	return -EINTR;
 }
 
@@ -2983,20 +2952,17 @@ static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
 static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 {
 	struct res_counter *fail_res;
-	struct mem_cgroup *_memcg;
 	int ret = 0;
 
 	ret = res_counter_charge(&memcg->kmem, size, &fail_res);
 	if (ret)
 		return ret;
 
-	_memcg = memcg;
-	ret = __mem_cgroup_try_charge(NULL, gfp, size >> PAGE_SHIFT,
-				      &_memcg, oom_gfp_allowed(gfp));
-
+	ret = mem_cgroup_try_charge(memcg, gfp, size >> PAGE_SHIFT,
+				    oom_gfp_allowed(gfp));
 	if (ret == -EINTR)  {
 		/*
-		 * __mem_cgroup_try_charge() chosed to bypass to root due to
+		 * mem_cgroup_try_charge() chosed to bypass to root due to
 		 * OOM kill or fatal signal.  Since our only options are to
 		 * either fail the allocation or charge it to this cgroup, do
 		 * it as a temporary condition. But we can't fail. From a
@@ -3006,7 +2972,7 @@ static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 		 *
 		 * This condition will only trigger if the task entered
 		 * memcg_charge_kmem in a sane state, but was OOM-killed during
-		 * __mem_cgroup_try_charge() above. Tasks that were already
+		 * mem_cgroup_try_charge() above. Tasks that were already
 		 * dying when the allocation triggers should have been already
 		 * directed to the root cgroup in memcontrol.h
 		 */
@@ -3858,8 +3824,8 @@ out:
 int mem_cgroup_newpage_charge(struct page *page,
 			      struct mm_struct *mm, gfp_t gfp_mask)
 {
-	struct mem_cgroup *memcg = NULL;
 	unsigned int nr_pages = 1;
+	struct mem_cgroup *memcg;
 	bool oom = true;
 	int ret;
 
@@ -3880,8 +3846,12 @@ int mem_cgroup_newpage_charge(struct page *page,
 		oom = false;
 	}
 
-	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
-	if (ret == -ENOMEM)
+	memcg = get_mem_cgroup_from_mm(mm);
+	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
+	css_put(&memcg->css);
+	if (ret == -EINTR)
+		memcg = root_mem_cgroup;
+	else if (ret)
 		return ret;
 	__mem_cgroup_commit_charge(memcg, page, nr_pages,
 				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
@@ -3899,7 +3869,7 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 					  gfp_t mask,
 					  struct mem_cgroup **memcgp)
 {
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg = NULL;
 	struct page_cgroup *pc;
 	int ret;
 
@@ -3912,31 +3882,29 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
 	 * in turn serializes uncharging.
 	 */
 	if (PageCgroupUsed(pc))
-		return 0;
-	if (!do_swap_account)
-		goto charge_cur_mm;
-	memcg = try_get_mem_cgroup_from_page(page);
+		goto out;
+	if (do_swap_account)
+		memcg = try_get_mem_cgroup_from_page(page);
 	if (!memcg)
-		goto charge_cur_mm;
-	*memcgp = memcg;
-	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
+		memcg = get_mem_cgroup_from_mm(mm);
+	ret = mem_cgroup_try_charge(memcg, mask, 1, true);
 	css_put(&memcg->css);
 	if (ret == -EINTR)
-		ret = 0;
-	return ret;
-charge_cur_mm:
-	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
-	if (ret == -EINTR)
-		ret = 0;
-	return ret;
+		memcg = root_mem_cgroup;
+	else if (ret)
+		return ret;
+out:
+	*memcgp = memcg;
+	return 0;
 }
 
 int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
 				 gfp_t gfp_mask, struct mem_cgroup **memcgp)
 {
-	*memcgp = NULL;
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled()) {
+		*memcgp = NULL;
 		return 0;
+	}
 	/*
 	 * A racing thread's fault, or swapoff, may have already
 	 * updated the pte, and even removed page from swap cache: in
@@ -3944,12 +3912,18 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
 	 * there's also a KSM case which does need to charge the page.
 	 */
 	if (!PageSwapCache(page)) {
+		struct mem_cgroup *memcg;
 		int ret;
 
-		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, memcgp, true);
+		memcg = get_mem_cgroup_from_mm(mm);
+		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
+		css_put(&memcg->css);
 		if (ret == -EINTR)
-			ret = 0;
-		return ret;
+			memcg = root_mem_cgroup;
+		else if (ret)
+			return ret;
+		*memcgp = memcg;
+		return 0;
 	}
 	return __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, memcgp);
 }
@@ -3996,8 +3970,8 @@ void mem_cgroup_commit_charge_swapin(struct page *page,
 int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 				gfp_t gfp_mask)
 {
-	struct mem_cgroup *memcg = NULL;
 	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
+	struct mem_cgroup *memcg;
 	int ret;
 
 	if (mem_cgroup_disabled())
@@ -4005,23 +3979,32 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 	if (PageCompound(page))
 		return 0;
 
-	if (!PageSwapCache(page)) {
-		/*
-		 * Page cache insertions can happen without an actual
-		 * task context, e.g. during disk probing on boot.
-		 */
-		if (!mm)
-			memcg = root_mem_cgroup;
-		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
-		if (ret != -ENOMEM)
-			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
-	} else { /* page is swapcache/shmem */
+	if (PageSwapCache(page)) { /* shmem */
 		ret = __mem_cgroup_try_charge_swapin(mm, page,
 						     gfp_mask, &memcg);
-		if (!ret)
-			__mem_cgroup_commit_charge_swapin(page, memcg, type);
+		if (ret)
+			return ret;
+		__mem_cgroup_commit_charge_swapin(page, memcg, type);
+		return 0;
 	}
-	return ret;
+
+	/*
+	 * Page cache insertions can happen without an actual mm
+	 * context, e.g. during disk probing on boot.
+	 */
+	if (unlikely(!mm))
+		memcg = root_mem_cgroup;
+	else {
+		memcg = get_mem_cgroup_from_mm(mm);
+		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
+		css_put(&memcg->css);
+		if (ret == -EINTR)
+			memcg = root_mem_cgroup;
+		else if (ret)
+			return ret;
+	}
+	__mem_cgroup_commit_charge(memcg, page, 1, type, false);
+	return 0;
 }
 
 static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
@@ -6635,8 +6618,7 @@ one_by_one:
 			batch_count = PRECHARGE_COUNT_AT_ONCE;
 			cond_resched();
 		}
-		ret = __mem_cgroup_try_charge(NULL,
-					GFP_KERNEL, 1, &memcg, false);
+		ret = mem_cgroup_try_charge(memcg, GFP_KERNEL, 1, false);
 		if (ret)
 			/* mem_cgroup_clear_mc() will do uncharge later */
 			return ret;
-- 
1.9.0

^ permalink raw reply related	[flat|nested] 43+ messages in thread
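
The return-code protocol that each callsite above now open-codes -- 0 on
success, -EINTR when the charge was bypassed to root_mem_cgroup, -ENOMEM
on hard failure -- can be modeled in a few lines of plain C.  The sketch
below is an illustrative stand-in, not kernel code: try_charge() merely
plays the role of mem_cgroup_try_charge(), and strings stand in for
memcg pointers.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for mem_cgroup_try_charge(); not the kernel function. */
static int try_charge(bool dying, bool over_limit)
{
	if (dying)
		return -EINTR;		/* charge bypassed to root */
	if (over_limit)
		return -ENOMEM;		/* reclaim and OOM gave up */
	return 0;
}

/* The caller-side pattern each charge site in the diff repeats. */
static const char *charge_target(bool dying, bool over_limit)
{
	int ret = try_charge(dying, over_limit);

	if (ret == -EINTR)
		return "root_mem_cgroup";	/* bypass case */
	else if (ret)
		return NULL;			/* propagate -ENOMEM */
	return "requested memcg";
}

int main(void)
{
	const char *t = charge_target(false, false);

	printf("%s\n", t);				/* requested memcg */
	printf("%s\n", charge_target(true, false));	/* root_mem_cgroup */
	t = charge_target(false, true);
	printf("%s\n", t ? t : "charge failed");	/* charge failed */
	return 0;
}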

* Re: [patch 3/8] mm: memcg: inline mem_cgroup_charge_common()
  2014-03-12  1:28   ` Johannes Weiner
@ 2014-03-12 12:52     ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 12:52 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Tue 11-03-14 21:28:29, Johannes Weiner wrote:
[...]
> @@ -3919,20 +3919,21 @@ out:
>  	return ret;
>  }
>  
> -/*
> - * Charge the memory controller for page usage.
> - * Return
> - * 0 if the charge was successful
> - * < 0 if the cgroup is over its limit
> - */
> -static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
> -				gfp_t gfp_mask, enum charge_type ctype)
> +int mem_cgroup_newpage_charge(struct page *page,
> +			      struct mm_struct *mm, gfp_t gfp_mask)

s/mem_cgroup_newpage_charge/mem_cgroup_anon_charge/ ?

Would that be a better name? The patch would be bigger, but the name
would be more apparent...

Other than that I am good with this. With (preferably) or without the
rename:
Acked-by: Michal Hocko <mhocko@suse.cz>

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch 4/8] mm: memcg: push !mm handling out to page cache charge function
  2014-03-12  1:28   ` Johannes Weiner
  (?)
@ 2014-03-12 13:11     ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 13:11 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Tue 11-03-14 21:28:30, Johannes Weiner wrote:
[...]
> @@ -4070,6 +4061,12 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
>  		return 0;
>  
>  	if (!PageSwapCache(page)) {
> +		/*
> +		 * Page cache insertions can happen without an actual
> +		 * task context, e.g. during disk probing on boot.

We read a page cache during disk probing? I have tried to find such a
code path but failed. Could you point me to such a path, please?
I thought that such probing is done from udev context but I am not
familiar with this area TBH.

Thanks!

> +		 */
> +		if (!mm)
> +			memcg = root_mem_cgroup;
>  		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
>  		if (ret != -ENOMEM)
>  			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
> -- 
> 1.9.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread
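
Whatever the exact boot-time path turns out to be, the rule the quoted
hunk implements is simple: with no mm to resolve a memcg from, the page
cache charge is accounted to the root group.  A minimal stand-alone C
model of just that rule (all names illustrative, nothing here is a
kernel API):

#include <stdbool.h>
#include <stdio.h>

/* Model of the !mm fallback: a charge issued without task context
 * (mm == NULL in the kernel) lands in the root group. */
static const char *cache_charge_target(bool have_mm)
{
	if (!have_mm)
		return "root_mem_cgroup";	/* no task context */
	return "memcg resolved from mm";
}

int main(void)
{
	printf("%s\n", cache_charge_target(false));	/* root_mem_cgroup */
	printf("%s\n", cache_charge_target(true));
	return 0;
}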

* Re: [patch 8/8] memcg: sanitize __mem_cgroup_try_charge() call protocol
  2014-03-12  1:28   ` Johannes Weiner
  (?)
@ 2014-03-12 14:01     ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 14:01 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Tue 11-03-14 21:28:34, Johannes Weiner wrote:
> Some callsites pass a memcg directly, some callsites pass an mm that
> first has to be translated to a memcg.  This makes for a terrible
> function interface.
> 
> Just push the mm-to-memcg translation into the respective callsites
> and always pass a memcg to mem_cgroup_try_charge().
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

   text    data     bss     dec     hex filename
  39435    5916    4192   49543    c187 mm/memcontrol.o.after
  40466    5916    4192   50574    c58e mm/memcontrol.o.before

1K down, very nice. But we can shave off an additional ~300B if we add
the common mm charging helper I suggested before:

   text    data     bss     dec     hex filename
  39100    5916    4192   49208    c038 mm/memcontrol.o.mm

commit 7aa420bc051849d85dcf5a091f3619c6b8e33cfb
Author: Michal Hocko <mhocko@suse.cz>
Date:   Wed Mar 12 14:59:06 2014 +0100

    add charge mm helper

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2d7aa3e784d9..67e01b27a021 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2757,6 +2757,35 @@ bypass:
 	return -EINTR;
 }
 
+/**
+ * mem_cgroup_try_charge_mm - try charging a mm
+ * @mm: mm_struct to charge
+ * @nr_pages: number of pages to charge
+ * @oom: trigger OOM if reclaim fails
+ *
+ * Returns the charged mem_cgroup associated with the given mm_struct or
+ * NULL if the charge failed.
+ */
+static struct mem_cgroup *mem_cgroup_try_charge_mm(struct mm_struct *mm,
+				 gfp_t gfp_mask,
+				 unsigned int nr_pages,
+				 bool oom)
+
+{
+	struct mem_cgroup *memcg;
+	int ret;
+
+	memcg = get_mem_cgroup_from_mm(mm);
+	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
+	css_put(&memcg->css);
+	if (ret == -EINTR)
+		memcg = root_mem_cgroup;
+	else if (ret)
+		memcg = NULL;
+
+	return memcg;
+}
+
 /*
  * Somemtimes we have to undo a charge we got by try_charge().
  * This function is for that and do uncharge, put css's refcnt.
@@ -3828,7 +3857,6 @@ int mem_cgroup_newpage_charge(struct page *page,
 	unsigned int nr_pages = 1;
 	struct mem_cgroup *memcg;
 	bool oom = true;
-	int ret;
 
 	if (mem_cgroup_disabled())
 		return 0;
@@ -3847,13 +3875,9 @@ int mem_cgroup_newpage_charge(struct page *page,
 		oom = false;
 	}
 
-	memcg = get_mem_cgroup_from_mm(mm);
-	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
-	css_put(&memcg->css);
-	if (ret == -EINTR)
-		memcg = root_mem_cgroup;
-	else if (ret)
-		return ret;
+	memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, nr_pages, oom);
+	if (!memcg)
+		return -ENOMEM;
 	__mem_cgroup_commit_charge(memcg, page, nr_pages,
 				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
 	return 0;
@@ -3914,15 +3938,10 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
 	 */
 	if (!PageSwapCache(page)) {
 		struct mem_cgroup *memcg;
-		int ret;
 
-		memcg = get_mem_cgroup_from_mm(mm);
-		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
-		css_put(&memcg->css);
-		if (ret == -EINTR)
-			memcg = root_mem_cgroup;
-		else if (ret)
-			return ret;
+		memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, 1, true);
+		if (!memcg)
+			return -ENOMEM;
 		*memcgp = memcg;
 		return 0;
 	}
@@ -3996,13 +4015,9 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
 	if (unlikely(!mm))
 		memcg = root_mem_cgroup;
 	else {
-		memcg = get_mem_cgroup_from_mm(mm);
-		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
-		css_put(&memcg->css);
-		if (ret == -EINTR)
-			memcg = root_mem_cgroup;
-		else if (ret)
-			return ret;
+		memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, 1, true);
+		if (!memcg)
+			return -ENOMEM;
 	}
 	__mem_cgroup_commit_charge(memcg, page, 1, type, false);
 	return 0;

Anyway, ack to your patch as is. The above can be posted as a separate
patch or folded in, as you prefer.

Acked-by: Michal Hocko <mhocko@suse.cz>

> ---
>  mm/memcontrol.c | 184 +++++++++++++++++++++++++-------------------------------
>  1 file changed, 83 insertions(+), 101 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4f7192bfa5fa..876598b4505b 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2609,7 +2609,7 @@ static int memcg_cpu_hotplug_callback(struct notifier_block *nb,
>  }
>  
>  
> -/* See __mem_cgroup_try_charge() for details */
> +/* See mem_cgroup_try_charge() for details */
>  enum {
>  	CHARGE_OK,		/* success */
>  	CHARGE_RETRY,		/* need to retry but retry is not bad */
> @@ -2682,45 +2682,34 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	return CHARGE_NOMEM;
>  }
>  
> -/*
> - * __mem_cgroup_try_charge() does
> - * 1. detect memcg to be charged against from passed *mm and *ptr,
> - * 2. update res_counter
> - * 3. call memory reclaim if necessary.
> - *
> - * In some special case, if the task is fatal, fatal_signal_pending() or
> - * has TIF_MEMDIE, this function returns -EINTR while writing root_mem_cgroup
> - * to *ptr. There are two reasons for this. 1: fatal threads should quit as soon
> - * as possible without any hazards. 2: all pages should have a valid
> - * pc->mem_cgroup. If mm is NULL and the caller doesn't pass a valid memcg
> - * pointer, that is treated as a charge to root_mem_cgroup.
> - *
> - * So __mem_cgroup_try_charge() will return
> - *  0       ...  on success, filling *ptr with a valid memcg pointer.
> - *  -ENOMEM ...  charge failure because of resource limits.
> - *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
> +/**
> + * mem_cgroup_try_charge - try charging a memcg
> + * @memcg: memcg to charge
> + * @nr_pages: number of pages to charge
> + * @oom: trigger OOM if reclaim fails
>   *
> - * Unlike the exported interface, an "oom" parameter is added. if oom==true,
> - * the oom-killer can be invoked.
> + * Returns 0 if @memcg was charged successfully, -EINTR if the charge
> + * was bypassed to root_mem_cgroup, and -ENOMEM if the charge failed.
>   */
> -static int __mem_cgroup_try_charge(struct mm_struct *mm,
> -				   gfp_t gfp_mask,
> -				   unsigned int nr_pages,
> -				   struct mem_cgroup **ptr,
> -				   bool oom)
> +static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
> +				 gfp_t gfp_mask,
> +				 unsigned int nr_pages,
> +				 bool oom)
>  {
>  	unsigned int batch = max(CHARGE_BATCH, nr_pages);
>  	int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
> -	struct mem_cgroup *memcg = NULL;
>  	int ret;
>  
> +	if (mem_cgroup_is_root(memcg))
> +		goto done;
>  	/*
> -	 * Unlike gloval-vm's OOM-kill, we're not in memory shortage
> -	 * in system level. So, allow to go ahead dying process in addition to
> -	 * MEMDIE process.
> +	 * Unlike in global OOM situations, memcg is not in a physical
> +	 * memory shortage.  Allow dying and OOM-killed tasks to
> +	 * bypass the last charges so that they can exit quickly and
> +	 * free their memory.
>  	 */
> -	if (unlikely(test_thread_flag(TIF_MEMDIE)
> -		     || fatal_signal_pending(current)))
> +	if (unlikely(test_thread_flag(TIF_MEMDIE) ||
> +		     fatal_signal_pending(current)))
>  		goto bypass;
>  
>  	if (unlikely(task_in_memcg_oom(current)))
> @@ -2729,14 +2718,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
>  	if (gfp_mask & __GFP_NOFAIL)
>  		oom = false;
>  again:
> -	if (*ptr) { /* css should be a valid one */
> -		memcg = *ptr;
> -		css_get(&memcg->css);
> -	} else {
> -		memcg = get_mem_cgroup_from_mm(mm);
> -	}
> -	if (mem_cgroup_is_root(memcg))
> -		goto done;
>  	if (consume_stock(memcg, nr_pages))
>  		goto done;
>  
> @@ -2744,10 +2725,8 @@ again:
>  		bool invoke_oom = oom && !nr_oom_retries;
>  
>  		/* If killed, bypass charge */
> -		if (fatal_signal_pending(current)) {
> -			css_put(&memcg->css);
> +		if (fatal_signal_pending(current))
>  			goto bypass;
> -		}
>  
>  		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch,
>  					   nr_pages, invoke_oom);
> @@ -2756,17 +2735,12 @@ again:
>  			break;
>  		case CHARGE_RETRY: /* not in OOM situation but retry */
>  			batch = nr_pages;
> -			css_put(&memcg->css);
> -			memcg = NULL;
>  			goto again;
>  		case CHARGE_WOULDBLOCK: /* !__GFP_WAIT */
> -			css_put(&memcg->css);
>  			goto nomem;
>  		case CHARGE_NOMEM: /* OOM routine works */
> -			if (!oom || invoke_oom) {
> -				css_put(&memcg->css);
> +			if (!oom || invoke_oom)
>  				goto nomem;
> -			}
>  			nr_oom_retries--;
>  			break;
>  		}
> @@ -2775,16 +2749,11 @@ again:
>  	if (batch > nr_pages)
>  		refill_stock(memcg, batch - nr_pages);
>  done:
> -	css_put(&memcg->css);
> -	*ptr = memcg;
>  	return 0;
>  nomem:
> -	if (!(gfp_mask & __GFP_NOFAIL)) {
> -		*ptr = NULL;
> +	if (!(gfp_mask & __GFP_NOFAIL))
>  		return -ENOMEM;
> -	}
>  bypass:
> -	*ptr = root_mem_cgroup;
>  	return -EINTR;
>  }
>  
> @@ -2983,20 +2952,17 @@ static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
>  static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
>  {
>  	struct res_counter *fail_res;
> -	struct mem_cgroup *_memcg;
>  	int ret = 0;
>  
>  	ret = res_counter_charge(&memcg->kmem, size, &fail_res);
>  	if (ret)
>  		return ret;
>  
> -	_memcg = memcg;
> -	ret = __mem_cgroup_try_charge(NULL, gfp, size >> PAGE_SHIFT,
> -				      &_memcg, oom_gfp_allowed(gfp));
> -
> +	ret = mem_cgroup_try_charge(memcg, gfp, size >> PAGE_SHIFT,
> +				    oom_gfp_allowed(gfp));
>  	if (ret == -EINTR)  {
>  		/*
> -		 * __mem_cgroup_try_charge() chosed to bypass to root due to
> +		 * mem_cgroup_try_charge() chose to bypass to root due to
>  		 * OOM kill or fatal signal.  Since our only options are to
>  		 * either fail the allocation or charge it to this cgroup, do
>  		 * it as a temporary condition. But we can't fail. From a
> @@ -3006,7 +2972,7 @@ static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
>  		 *
>  		 * This condition will only trigger if the task entered
>  		 * memcg_charge_kmem in a sane state, but was OOM-killed during
> -		 * __mem_cgroup_try_charge() above. Tasks that were already
> +		 * mem_cgroup_try_charge() above. Tasks that were already
>  		 * dying when the allocation triggers should have been already
>  		 * directed to the root cgroup in memcontrol.h
>  		 */
> @@ -3858,8 +3824,8 @@ out:
>  int mem_cgroup_newpage_charge(struct page *page,
>  			      struct mm_struct *mm, gfp_t gfp_mask)
>  {
> -	struct mem_cgroup *memcg = NULL;
>  	unsigned int nr_pages = 1;
> +	struct mem_cgroup *memcg;
>  	bool oom = true;
>  	int ret;
>  
> @@ -3880,8 +3846,12 @@ int mem_cgroup_newpage_charge(struct page *page,
>  		oom = false;
>  	}
>  
> -	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
> -	if (ret == -ENOMEM)
> +	memcg = get_mem_cgroup_from_mm(mm);
> +	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
> +	css_put(&memcg->css);
> +	if (ret == -EINTR)
> +		memcg = root_mem_cgroup;
> +	else if (ret)
>  		return ret;
>  	__mem_cgroup_commit_charge(memcg, page, nr_pages,
>  				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
> @@ -3899,7 +3869,7 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
>  					  gfp_t mask,
>  					  struct mem_cgroup **memcgp)
>  {
> -	struct mem_cgroup *memcg;
> +	struct mem_cgroup *memcg = NULL;
>  	struct page_cgroup *pc;
>  	int ret;
>  
> @@ -3912,31 +3882,29 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
>  	 * in turn serializes uncharging.
>  	 */
>  	if (PageCgroupUsed(pc))
> -		return 0;
> -	if (!do_swap_account)
> -		goto charge_cur_mm;
> -	memcg = try_get_mem_cgroup_from_page(page);
> +		goto out;
> +	if (do_swap_account)
> +		memcg = try_get_mem_cgroup_from_page(page);
>  	if (!memcg)
> -		goto charge_cur_mm;
> -	*memcgp = memcg;
> -	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
> +		memcg = get_mem_cgroup_from_mm(mm);
> +	ret = mem_cgroup_try_charge(memcg, mask, 1, true);
>  	css_put(&memcg->css);
>  	if (ret == -EINTR)
> -		ret = 0;
> -	return ret;
> -charge_cur_mm:
> -	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
> -	if (ret == -EINTR)
> -		ret = 0;
> -	return ret;
> +		memcg = root_mem_cgroup;
> +	else if (ret)
> +		return ret;
> +out:
> +	*memcgp = memcg;
> +	return 0;
>  }
>  
>  int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
>  				 gfp_t gfp_mask, struct mem_cgroup **memcgp)
>  {
> -	*memcgp = NULL;
> -	if (mem_cgroup_disabled())
> +	if (mem_cgroup_disabled()) {
> +		*memcgp = NULL;
>  		return 0;
> +	}
>  	/*
>  	 * A racing thread's fault, or swapoff, may have already
>  	 * updated the pte, and even removed page from swap cache: in
> @@ -3944,12 +3912,18 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
>  	 * there's also a KSM case which does need to charge the page.
>  	 */
>  	if (!PageSwapCache(page)) {
> +		struct mem_cgroup *memcg;
>  		int ret;
>  
> -		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, memcgp, true);
> +		memcg = get_mem_cgroup_from_mm(mm);
> +		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
> +		css_put(&memcg->css);
>  		if (ret == -EINTR)
> -			ret = 0;
> -		return ret;
> +			memcg = root_mem_cgroup;
> +		else if (ret)
> +			return ret;
> +		*memcgp = memcg;
> +		return 0;
>  	}
>  	return __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, memcgp);
>  }
> @@ -3996,8 +3970,8 @@ void mem_cgroup_commit_charge_swapin(struct page *page,
>  int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
>  				gfp_t gfp_mask)
>  {
> -	struct mem_cgroup *memcg = NULL;
>  	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
> +	struct mem_cgroup *memcg;
>  	int ret;
>  
>  	if (mem_cgroup_disabled())
> @@ -4005,23 +3979,32 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
>  	if (PageCompound(page))
>  		return 0;
>  
> -	if (!PageSwapCache(page)) {
> -		/*
> -		 * Page cache insertions can happen without an actual
> -		 * task context, e.g. during disk probing on boot.
> -		 */
> -		if (!mm)
> -			memcg = root_mem_cgroup;
> -		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
> -		if (ret != -ENOMEM)
> -			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
> -	} else { /* page is swapcache/shmem */
> +	if (PageSwapCache(page)) { /* shmem */
>  		ret = __mem_cgroup_try_charge_swapin(mm, page,
>  						     gfp_mask, &memcg);
> -		if (!ret)
> -			__mem_cgroup_commit_charge_swapin(page, memcg, type);
> +		if (ret)
> +			return ret;
> +		__mem_cgroup_commit_charge_swapin(page, memcg, type);
> +		return 0;
>  	}
> -	return ret;
> +
> +	/*
> +	 * Page cache insertions can happen without an actual mm
> +	 * context, e.g. during disk probing on boot.
> +	 */
> +	if (unlikely(!mm))
> +		memcg = root_mem_cgroup;
> +	else {
> +		memcg = get_mem_cgroup_from_mm(mm);
> +		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
> +		css_put(&memcg->css);
> +		if (ret == -EINTR)
> +			memcg = root_mem_cgroup;
> +		else if (ret)
> +			return ret;
> +	}
> +	__mem_cgroup_commit_charge(memcg, page, 1, type, false);
> +	return 0;
>  }
>  
>  static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
> @@ -6635,8 +6618,7 @@ one_by_one:
>  			batch_count = PRECHARGE_COUNT_AT_ONCE;
>  			cond_resched();
>  		}
> -		ret = __mem_cgroup_try_charge(NULL,
> -					GFP_KERNEL, 1, &memcg, false);
> +		ret = mem_cgroup_try_charge(memcg, GFP_KERNEL, 1, false);
>  		if (ret)
>  			/* mem_cgroup_clear_mc() will do uncharge later */
>  			return ret;
> -- 
> 1.9.0
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 43+ messages in thread
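
The helper changes the contract from an errno to a pointer: callers get
back the group to commit the charge against, with the -EINTR-to-root
mapping folded into one place, or NULL on hard failure.  A stand-alone C
model of that contract (illustrative names, not kernel code):

#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for memcg pointers. */
static const char root_group[] = "root_mem_cgroup";
static const char mm_group[] = "memcg of the mm";

/* Model of mem_cgroup_try_charge(): 0, -EINTR (bypass), -ENOMEM. */
static int try_charge(bool dying, bool over_limit)
{
	if (dying)
		return -EINTR;
	return over_limit ? -ENOMEM : 0;
}

/* Model of mem_cgroup_try_charge_mm(): the -EINTR-to-root mapping
 * happens once here, so callers only need a NULL check. */
static const char *try_charge_mm(bool dying, bool over_limit)
{
	int ret = try_charge(dying, over_limit);

	if (ret == -EINTR)
		return root_group;	/* bypassed to root */
	if (ret)
		return NULL;		/* caller returns -ENOMEM */
	return mm_group;
}

int main(void)
{
	const char *g = try_charge_mm(false, false);

	printf("%s\n", g ? g : "failed");		/* memcg of the mm */
	printf("%s\n", try_charge_mm(true, false));	/* root_mem_cgroup */
	g = try_charge_mm(false, true);
	printf("%s\n", g ? g : "failed");		/* failed */
	return 0;
}

The size numbers above make the design point concrete: folding the mm
lookup and the bypass mapping into one helper removes the same
open-coded sequence from three callsites.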

* Re: [patch 8/8] memcg: sanitize __mem_cgroup_try_charge() call protocol
  2014-03-12 14:01     ` Michal Hocko
@ 2014-03-12 14:05       ` Michal Hocko
  -1 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 14:05 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Wed 12-03-14 15:01:38, Michal Hocko wrote:
> On Tue 11-03-14 21:28:34, Johannes Weiner wrote:
> > Some callsites pass a memcg directly, some callsites pass a mm that
> > first has to be translated to a memcg.  This makes for a terrible
> > function interface.
> > 
> > Just push the mm-to-memcg translation into the respective callsites
> > and always pass a memcg to mem_cgroup_try_charge().
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
>    text    data     bss     dec     hex filename
>   39435    5916    4192   49543    c187 mm/memcontrol.o.after
>   40466    5916    4192   50574    c58e mm/memcontrol.o.before

Just to prevent confusion: .before is from before the whole series is
applied, not just this patch.

Before _this_ patch, the size looks like this:
   text    data     bss     dec     hex filename
  39898    5916    4192   50006    c356 mm/memcontrol.o

> 1K down, very nice. But we can shave off an additional ~300B if we add
> the common mm charging helper I suggested before:
> 
>    text    data     bss     dec     hex filename
>   39100    5916    4192   49208    c038 mm/memcontrol.o.mm
> 
> commit 7aa420bc051849d85dcf5a091f3619c6b8e33cfb
> Author: Michal Hocko <mhocko@suse.cz>
> Date:   Wed Mar 12 14:59:06 2014 +0100
> 
>     add charge mm helper
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 2d7aa3e784d9..67e01b27a021 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2757,6 +2757,35 @@ bypass:
>  	return -EINTR;
>  }
>  
> +/**
> + * mem_cgroup_try_charge_mm - try charging a mm
> + * @mm: mm_struct to charge
> + * @gfp_mask: reclaim mode
> + * @nr_pages: number of pages to charge
> + * @oom: trigger OOM if reclaim fails
> + *
> + * Returns the charged mem_cgroup associated with the given mm_struct or
> + * NULL if the charge failed.
> + */
> +static struct mem_cgroup *mem_cgroup_try_charge_mm(struct mm_struct *mm,
> +				 gfp_t gfp_mask,
> +				 unsigned int nr_pages,
> +				 bool oom)
> +{
> +	struct mem_cgroup *memcg;
> +	int ret;
> +
> +	memcg = get_mem_cgroup_from_mm(mm);
> +	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
> +	css_put(&memcg->css);
> +	if (ret == -EINTR)
> +		memcg = root_mem_cgroup;
> +	else if (ret)
> +		memcg = NULL;
> +
> +	return memcg;
> +}
> +
>  /*
>   * Sometimes we have to undo a charge we got by try_charge().
>   * This function does that uncharge and puts the css's refcnt.
> @@ -3828,7 +3857,6 @@ int mem_cgroup_newpage_charge(struct page *page,
>  	unsigned int nr_pages = 1;
>  	struct mem_cgroup *memcg;
>  	bool oom = true;
> -	int ret;
>  
>  	if (mem_cgroup_disabled())
>  		return 0;
> @@ -3847,13 +3875,9 @@ int mem_cgroup_newpage_charge(struct page *page,
>  		oom = false;
>  	}
>  
> -	memcg = get_mem_cgroup_from_mm(mm);
> -	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
> -	css_put(&memcg->css);
> -	if (ret == -EINTR)
> -		memcg = root_mem_cgroup;
> -	else if (ret)
> -		return ret;
> +	memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, nr_pages, oom);
> +	if (!memcg)
> +		return -ENOMEM;
>  	__mem_cgroup_commit_charge(memcg, page, nr_pages,
>  				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
>  	return 0;
> @@ -3914,15 +3938,10 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
>  	 */
>  	if (!PageSwapCache(page)) {
>  		struct mem_cgroup *memcg;
> -		int ret;
>  
> -		memcg = get_mem_cgroup_from_mm(mm);
> -		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
> -		css_put(&memcg->css);
> -		if (ret == -EINTR)
> -			memcg = root_mem_cgroup;
> -		else if (ret)
> -			return ret;
> +		memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, 1, true);
> +		if (!memcg)
> +			return -ENOMEM;
>  		*memcgp = memcg;
>  		return 0;
>  	}
> @@ -3996,13 +4015,9 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
>  	if (unlikely(!mm))
>  		memcg = root_mem_cgroup;
>  	else {
> -		memcg = get_mem_cgroup_from_mm(mm);
> -		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
> -		css_put(&memcg->css);
> -		if (ret == -EINTR)
> -			memcg = root_mem_cgroup;
> -		else if (ret)
> -			return ret;
> +		memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, 1, true);
> +		if (!memcg)
> +			return -ENOMEM;
>  	}
>  	__mem_cgroup_commit_charge(memcg, page, 1, type, false);
>  	return 0;
> 
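> In other words, the three-outcome protocol of mem_cgroup_try_charge()
> (0 on success, -EINTR when the charge is bypassed to root_mem_cgroup,
> -ENOMEM on failure) folds into "a usable memcg or NULL" at every
> mm-based call site. Condensed from the hunks above, as a sketch only:
> 
> 	memcg = mem_cgroup_try_charge_mm(mm, gfp_mask, nr_pages, oom);
> 	if (!memcg)
> 		return -ENOMEM;
> 	__mem_cgroup_commit_charge(memcg, page, nr_pages, type, false);
> 	return 0;
> 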
> Anyway, to your patch as is. The above can be posted as a separate
> patch or folded in, as you prefer.
> 
> Acked-by: Michal Hocko <mhocko@suse.cz>
> 
> > ---
> >  mm/memcontrol.c | 184 +++++++++++++++++++++++++-------------------------------
> >  1 file changed, 83 insertions(+), 101 deletions(-)
> > 
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index 4f7192bfa5fa..876598b4505b 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2609,7 +2609,7 @@ static int memcg_cpu_hotplug_callback(struct notifier_block *nb,
> >  }
> >  
> >  
> > -/* See __mem_cgroup_try_charge() for details */
> > +/* See mem_cgroup_try_charge() for details */
> >  enum {
> >  	CHARGE_OK,		/* success */
> >  	CHARGE_RETRY,		/* need to retry but retry is not bad */
> > @@ -2682,45 +2682,34 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
> >  	return CHARGE_NOMEM;
> >  }
> >  
> > -/*
> > - * __mem_cgroup_try_charge() does
> > - * 1. detect memcg to be charged against from passed *mm and *ptr,
> > - * 2. update res_counter
> > - * 3. call memory reclaim if necessary.
> > - *
> > - * In some special case, if the task is fatal, fatal_signal_pending() or
> > - * has TIF_MEMDIE, this function returns -EINTR while writing root_mem_cgroup
> > - * to *ptr. There are two reasons for this. 1: fatal threads should quit as soon
> > - * as possible without any hazards. 2: all pages should have a valid
> > - * pc->mem_cgroup. If mm is NULL and the caller doesn't pass a valid memcg
> > - * pointer, that is treated as a charge to root_mem_cgroup.
> > - *
> > - * So __mem_cgroup_try_charge() will return
> > - *  0       ...  on success, filling *ptr with a valid memcg pointer.
> > - *  -ENOMEM ...  charge failure because of resource limits.
> > - *  -EINTR  ...  if thread is fatal. *ptr is filled with root_mem_cgroup.
> > +/**
> > + * mem_cgroup_try_charge - try charging a memcg
> > + * @memcg: memcg to charge
> > + * @gfp_mask: reclaim mode
> > + * @nr_pages: number of pages to charge
> > + * @oom: trigger OOM if reclaim fails
> >   *
> > - * Unlike the exported interface, an "oom" parameter is added. if oom==true,
> > - * the oom-killer can be invoked.
> > + * Returns 0 if @memcg was charged successfully, -EINTR if the charge
> > + * was bypassed to root_mem_cgroup, and -ENOMEM if the charge failed.
> >   */
> > -static int __mem_cgroup_try_charge(struct mm_struct *mm,
> > -				   gfp_t gfp_mask,
> > -				   unsigned int nr_pages,
> > -				   struct mem_cgroup **ptr,
> > -				   bool oom)
> > +static int mem_cgroup_try_charge(struct mem_cgroup *memcg,
> > +				 gfp_t gfp_mask,
> > +				 unsigned int nr_pages,
> > +				 bool oom)
> >  {
> >  	unsigned int batch = max(CHARGE_BATCH, nr_pages);
> >  	int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
> > -	struct mem_cgroup *memcg = NULL;
> >  	int ret;
> >  
> > +	if (mem_cgroup_is_root(memcg))
> > +		goto done;
> >  	/*
> > -	 * Unlike gloval-vm's OOM-kill, we're not in memory shortage
> > -	 * in system level. So, allow to go ahead dying process in addition to
> > -	 * MEMDIE process.
> > +	 * Unlike in global OOM situations, memcg is not in a physical
> > +	 * memory shortage.  Allow dying and OOM-killed tasks to
> > +	 * bypass the last charges so that they can exit quickly and
> > +	 * free their memory.
> >  	 */
> > -	if (unlikely(test_thread_flag(TIF_MEMDIE)
> > -		     || fatal_signal_pending(current)))
> > +	if (unlikely(test_thread_flag(TIF_MEMDIE) ||
> > +		     fatal_signal_pending(current)))
> >  		goto bypass;
> >  
> >  	if (unlikely(task_in_memcg_oom(current)))
> > @@ -2729,14 +2718,6 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
> >  	if (gfp_mask & __GFP_NOFAIL)
> >  		oom = false;
> >  again:
> > -	if (*ptr) { /* css should be a valid one */
> > -		memcg = *ptr;
> > -		css_get(&memcg->css);
> > -	} else {
> > -		memcg = get_mem_cgroup_from_mm(mm);
> > -	}
> > -	if (mem_cgroup_is_root(memcg))
> > -		goto done;
> >  	if (consume_stock(memcg, nr_pages))
> >  		goto done;
> >  
> > @@ -2744,10 +2725,8 @@ again:
> >  		bool invoke_oom = oom && !nr_oom_retries;
> >  
> >  		/* If killed, bypass charge */
> > -		if (fatal_signal_pending(current)) {
> > -			css_put(&memcg->css);
> > +		if (fatal_signal_pending(current))
> >  			goto bypass;
> > -		}
> >  
> >  		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch,
> >  					   nr_pages, invoke_oom);
> > @@ -2756,17 +2735,12 @@ again:
> >  			break;
> >  		case CHARGE_RETRY: /* not in OOM situation but retry */
> >  			batch = nr_pages;
> > -			css_put(&memcg->css);
> > -			memcg = NULL;
> >  			goto again;
> >  		case CHARGE_WOULDBLOCK: /* !__GFP_WAIT */
> > -			css_put(&memcg->css);
> >  			goto nomem;
> >  		case CHARGE_NOMEM: /* OOM routine works */
> > -			if (!oom || invoke_oom) {
> > -				css_put(&memcg->css);
> > +			if (!oom || invoke_oom)
> >  				goto nomem;
> > -			}
> >  			nr_oom_retries--;
> >  			break;
> >  		}
> > @@ -2775,16 +2749,11 @@ again:
> >  	if (batch > nr_pages)
> >  		refill_stock(memcg, batch - nr_pages);
> >  done:
> > -	css_put(&memcg->css);
> > -	*ptr = memcg;
> >  	return 0;
> >  nomem:
> > -	if (!(gfp_mask & __GFP_NOFAIL)) {
> > -		*ptr = NULL;
> > +	if (!(gfp_mask & __GFP_NOFAIL))
> >  		return -ENOMEM;
> > -	}
> >  bypass:
> > -	*ptr = root_mem_cgroup;
> >  	return -EINTR;
> >  }
> >  
> > @@ -2983,20 +2952,17 @@ static int mem_cgroup_slabinfo_read(struct seq_file *m, void *v)
> >  static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
> >  {
> >  	struct res_counter *fail_res;
> > -	struct mem_cgroup *_memcg;
> >  	int ret = 0;
> >  
> >  	ret = res_counter_charge(&memcg->kmem, size, &fail_res);
> >  	if (ret)
> >  		return ret;
> >  
> > -	_memcg = memcg;
> > -	ret = __mem_cgroup_try_charge(NULL, gfp, size >> PAGE_SHIFT,
> > -				      &_memcg, oom_gfp_allowed(gfp));
> > -
> > +	ret = mem_cgroup_try_charge(memcg, gfp, size >> PAGE_SHIFT,
> > +				    oom_gfp_allowed(gfp));
> >  	if (ret == -EINTR)  {
> >  		/*
> > -		 * __mem_cgroup_try_charge() chosed to bypass to root due to
> > +		 * mem_cgroup_try_charge() chose to bypass to root due to
> >  		 * OOM kill or fatal signal.  Since our only options are to
> >  		 * either fail the allocation or charge it to this cgroup, do
> >  		 * it as a temporary condition. But we can't fail. From a
> > @@ -3006,7 +2972,7 @@ static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
> >  		 *
> >  		 * This condition will only trigger if the task entered
> >  		 * memcg_charge_kmem in a sane state, but was OOM-killed during
> > -		 * __mem_cgroup_try_charge() above. Tasks that were already
> > +		 * mem_cgroup_try_charge() above. Tasks that were already
> >  		 * dying when the allocation triggers should have been already
> >  		 * directed to the root cgroup in memcontrol.h
> >  		 */
> > @@ -3858,8 +3824,8 @@ out:
> >  int mem_cgroup_newpage_charge(struct page *page,
> >  			      struct mm_struct *mm, gfp_t gfp_mask)
> >  {
> > -	struct mem_cgroup *memcg = NULL;
> >  	unsigned int nr_pages = 1;
> > +	struct mem_cgroup *memcg;
> >  	bool oom = true;
> >  	int ret;
> >  
> > @@ -3880,8 +3846,12 @@ int mem_cgroup_newpage_charge(struct page *page,
> >  		oom = false;
> >  	}
> >  
> > -	ret = __mem_cgroup_try_charge(mm, gfp_mask, nr_pages, &memcg, oom);
> > -	if (ret == -ENOMEM)
> > +	memcg = get_mem_cgroup_from_mm(mm);
> > +	ret = mem_cgroup_try_charge(memcg, gfp_mask, nr_pages, oom);
> > +	css_put(&memcg->css);
> > +	if (ret == -EINTR)
> > +		memcg = root_mem_cgroup;
> > +	else if (ret)
> >  		return ret;
> >  	__mem_cgroup_commit_charge(memcg, page, nr_pages,
> >  				   MEM_CGROUP_CHARGE_TYPE_ANON, false);
> > @@ -3899,7 +3869,7 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
> >  					  gfp_t mask,
> >  					  struct mem_cgroup **memcgp)
> >  {
> > -	struct mem_cgroup *memcg;
> > +	struct mem_cgroup *memcg = NULL;
> >  	struct page_cgroup *pc;
> >  	int ret;
> >  
> > @@ -3912,31 +3882,29 @@ static int __mem_cgroup_try_charge_swapin(struct mm_struct *mm,
> >  	 * in turn serializes uncharging.
> >  	 */
> >  	if (PageCgroupUsed(pc))
> > -		return 0;
> > -	if (!do_swap_account)
> > -		goto charge_cur_mm;
> > -	memcg = try_get_mem_cgroup_from_page(page);
> > +		goto out;
> > +	if (do_swap_account)
> > +		memcg = try_get_mem_cgroup_from_page(page);
> >  	if (!memcg)
> > -		goto charge_cur_mm;
> > -	*memcgp = memcg;
> > -	ret = __mem_cgroup_try_charge(NULL, mask, 1, memcgp, true);
> > +		memcg = get_mem_cgroup_from_mm(mm);
> > +	ret = mem_cgroup_try_charge(memcg, mask, 1, true);
> >  	css_put(&memcg->css);
> >  	if (ret == -EINTR)
> > -		ret = 0;
> > -	return ret;
> > -charge_cur_mm:
> > -	ret = __mem_cgroup_try_charge(mm, mask, 1, memcgp, true);
> > -	if (ret == -EINTR)
> > -		ret = 0;
> > -	return ret;
> > +		memcg = root_mem_cgroup;
> > +	else if (ret)
> > +		return ret;
> > +out:
> > +	*memcgp = memcg;
> > +	return 0;
> >  }
> >  
> >  int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
> >  				 gfp_t gfp_mask, struct mem_cgroup **memcgp)
> >  {
> > -	*memcgp = NULL;
> > -	if (mem_cgroup_disabled())
> > +	if (mem_cgroup_disabled()) {
> > +		*memcgp = NULL;
> >  		return 0;
> > +	}
> >  	/*
> >  	 * A racing thread's fault, or swapoff, may have already
> >  	 * updated the pte, and even removed page from swap cache: in
> > @@ -3944,12 +3912,18 @@ int mem_cgroup_try_charge_swapin(struct mm_struct *mm, struct page *page,
> >  	 * there's also a KSM case which does need to charge the page.
> >  	 */
> >  	if (!PageSwapCache(page)) {
> > +		struct mem_cgroup *memcg;
> >  		int ret;
> >  
> > -		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, memcgp, true);
> > +		memcg = get_mem_cgroup_from_mm(mm);
> > +		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
> > +		css_put(&memcg->css);
> >  		if (ret == -EINTR)
> > -			ret = 0;
> > -		return ret;
> > +			memcg = root_mem_cgroup;
> > +		else if (ret)
> > +			return ret;
> > +		*memcgp = memcg;
> > +		return 0;
> >  	}
> >  	return __mem_cgroup_try_charge_swapin(mm, page, gfp_mask, memcgp);
> >  }
> > @@ -3996,8 +3970,8 @@ void mem_cgroup_commit_charge_swapin(struct page *page,
> >  int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
> >  				gfp_t gfp_mask)
> >  {
> > -	struct mem_cgroup *memcg = NULL;
> >  	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
> > +	struct mem_cgroup *memcg;
> >  	int ret;
> >  
> >  	if (mem_cgroup_disabled())
> > @@ -4005,23 +3979,32 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
> >  	if (PageCompound(page))
> >  		return 0;
> >  
> > -	if (!PageSwapCache(page)) {
> > -		/*
> > -		 * Page cache insertions can happen without an actual
> > -		 * task context, e.g. during disk probing on boot.
> > -		 */
> > -		if (!mm)
> > -			memcg = root_mem_cgroup;
> > -		ret = __mem_cgroup_try_charge(mm, gfp_mask, 1, &memcg, true);
> > -		if (ret != -ENOMEM)
> > -			__mem_cgroup_commit_charge(memcg, page, 1, type, false);
> > -	} else { /* page is swapcache/shmem */
> > +	if (PageSwapCache(page)) { /* shmem */
> >  		ret = __mem_cgroup_try_charge_swapin(mm, page,
> >  						     gfp_mask, &memcg);
> > -		if (!ret)
> > -			__mem_cgroup_commit_charge_swapin(page, memcg, type);
> > +		if (ret)
> > +			return ret;
> > +		__mem_cgroup_commit_charge_swapin(page, memcg, type);
> > +		return 0;
> >  	}
> > -	return ret;
> > +
> > +	/*
> > +	 * Page cache insertions can happen without an actual mm
> > +	 * context, e.g. during disk probing on boot.
> > +	 */
> > +	if (unlikely(!mm))
> > +		memcg = root_mem_cgroup;
> > +	else {
> > +		memcg = get_mem_cgroup_from_mm(mm);
> > +		ret = mem_cgroup_try_charge(memcg, gfp_mask, 1, true);
> > +		css_put(&memcg->css);
> > +		if (ret == -EINTR)
> > +			memcg = root_mem_cgroup;
> > +		else if (ret)
> > +			return ret;
> > +	}
> > +	__mem_cgroup_commit_charge(memcg, page, 1, type, false);
> > +	return 0;
> >  }
> >  
> >  static void mem_cgroup_do_uncharge(struct mem_cgroup *memcg,
> > @@ -6635,8 +6618,7 @@ one_by_one:
> >  			batch_count = PRECHARGE_COUNT_AT_ONCE;
> >  			cond_resched();
> >  		}
> > -		ret = __mem_cgroup_try_charge(NULL,
> > -					GFP_KERNEL, 1, &memcg, false);
> > +		ret = mem_cgroup_try_charge(memcg, GFP_KERNEL, 1, false);
> >  		if (ret)
> >  			/* mem_cgroup_clear_mc() will do uncharge later */
> >  			return ret;
> > -- 
> > 1.9.0
> > 
> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch 3/8] mm: memcg: inline mem_cgroup_charge_common()
  2014-03-12 12:52     ` Michal Hocko
@ 2014-03-12 14:53       ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12 14:53 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Wed, Mar 12, 2014 at 01:52:13PM +0100, Michal Hocko wrote:
> On Tue 11-03-14 21:28:29, Johannes Weiner wrote:
> [...]
> > @@ -3919,20 +3919,21 @@ out:
> >  	return ret;
> >  }
> >  
> > -/*
> > - * Charge the memory controller for page usage.
> > - * Return
> > - * 0 if the charge was successful
> > - * < 0 if the cgroup is over its limit
> > - */
> > -static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
> > -				gfp_t gfp_mask, enum charge_type ctype)
> > +int mem_cgroup_newpage_charge(struct page *page,
> > +			      struct mm_struct *mm, gfp_t gfp_mask)
> 
> s/mem_cgroup_newpage_charge/mem_cgroup_anon_charge/ ?
> 
> Would that be a better name? The patch would be bigger, but the name
> more self-explanatory...

I wouldn't be opposed to fixing those names at all, but I think that
is out of the scope of this patch.  Want to send one?

mem_cgroup_charge_anon() would be a good name, but then we should also
rename mem_cgroup_cache_charge() to mem_cgroup_charge_file() to match.

Or charge_private() vs. charge_shared()...
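
For illustration, keeping the current argument lists, the exported
interface would then read (hypothetical prototypes, nothing has been
posted yet):

	int mem_cgroup_charge_anon(struct page *page,
				   struct mm_struct *mm, gfp_t gfp_mask);
	int mem_cgroup_charge_file(struct page *page,
				   struct mm_struct *mm, gfp_t gfp_mask);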

> Other than that I am good with this. With (preferably) or without the
> rename:
> Acked-by: Michal Hocko <mhocko@suse.cz>

Thanks!

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch 4/8] mm: memcg: push !mm handling out to page cache charge function
  2014-03-12 13:11     ` Michal Hocko
@ 2014-03-12 14:56       ` Johannes Weiner
  -1 siblings, 0 replies; 43+ messages in thread
From: Johannes Weiner @ 2014-03-12 14:56 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Wed, Mar 12, 2014 at 02:11:52PM +0100, Michal Hocko wrote:
> On Tue 11-03-14 21:28:30, Johannes Weiner wrote:
> [...]
> > @@ -4070,6 +4061,12 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
> >  		return 0;
> >  
> >  	if (!PageSwapCache(page)) {
> > +		/*
> > +		 * Page cache insertions can happen without an actual
> > +		 * task context, e.g. during disk probing on boot.
> 
> We read the page cache during disk probing? I have tried to find such a
> code path but failed. Could you point me to such a path, please?
> I thought such probing was done from udev context, but I am not
> familiar with this area TBH.

Yes, I tried to remove the !mm case entirely and hit the following
during boot:

[    1.869561] BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
[    1.869565] IP: [<ffffffff811369a2>] get_mem_cgroup_from_mm+0x32/0x80
[    1.869566] PGD 0
[    1.869567] Oops: 0000 [#1] SMP
[    1.869569] CPU: 3 PID: 65 Comm: kworker/u8:6 Not tainted 3.14.0-rc6-00007-g3856318f53a0-dirty #133
[    1.869569] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./H61M-DGS, BIOS P1.30 05/10/2012
[    1.869573] Workqueue: events_unbound async_run_entry_fn
[    1.869573] task: ffff8800ce82d3c0 ti: ffff8800ce8c6000 task.ti: ffff8800ce8c6000
[    1.869575] RIP: 0010:[<ffffffff811369a2>]  [<ffffffff811369a2>] get_mem_cgroup_from_mm+0x32/0x80
[    1.869576] RSP: 0000:ffff8800ce8c78f8  EFLAGS: 00010246
[    1.869576] RAX: 003fffc000000001 RBX: 0000000000000000 RCX: 0000000000000001
[    1.869577] RDX: 00000000000000d0 RSI: 0000000000000000 RDI: 0000000000000000
[    1.869577] RBP: ffff8800ce8c7908 R08: ffffffff81713232 R09: ffffea00033a1680
[    1.869578] R10: 0000000000001723 R11: ffffc90004e4dfff R12: 0000000000000000
[    1.869578] R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000000d0
[    1.869579] FS:  0000000000000000(0000) GS:ffff88021f380000(0000) knlGS:0000000000000000
[    1.869579] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.869580] CR2: 0000000000000320 CR3: 00000000017a5000 CR4: 00000000000407e0
[    1.869580] Stack:
[    1.869581]  0000000000000000 ffffea00033a1640 ffff8800ce8c7948 ffffffff8113a112
[    1.869582]  00000001ce8c7978 0000000000000000 ffffea00033a1640 00000000000200d0
[    1.869583]  0000000000000000 ffffffff81174520 ffff8800ce8c7970 ffffffff8113be0a
[    1.869583] Call Trace:
[    1.869586]  [<ffffffff8113a112>] mem_cgroup_charge_common+0x42/0xf0
[    1.869589]  [<ffffffff81174520>] ? blkdev_write_begin+0x30/0x30
[    1.869590]  [<ffffffff8113be0a>] mem_cgroup_cache_charge+0x7a/0xb0
[    1.869592] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[    1.869594]  [<ffffffff810db06d>] add_to_page_cache_locked+0x3d/0x150
[    1.869595]  [<ffffffff810db19a>] add_to_page_cache_lru+0x1a/0x40
[    1.869597]  [<ffffffff810dbdef>] do_read_cache_page+0x6f/0x1a0
[    1.869598]  [<ffffffff810dce79>] read_cache_page+0x19/0x30
[    1.869601]  [<ffffffff8123952d>] read_dev_sector+0x2d/0x90
[    1.869603]  [<ffffffff8123a21f>] read_lba+0xef/0x1a0
[    1.869604]  [<ffffffff8123a663>] ? find_valid_gpt+0xc3/0x640
[    1.869605]  [<ffffffff8123a681>] find_valid_gpt+0xe1/0x640
[    1.869607]  [<ffffffff81249e6b>] ? string.isra.4+0x3b/0xf0
[    1.869609]  [<ffffffff8123abe0>] ? find_valid_gpt+0x640/0x640
[    1.869610]  [<ffffffff8123ac56>] efi_partition+0x76/0x3f0
[    1.869611]  [<ffffffff8124aec4>] ? vsnprintf+0x1f4/0x610
[    1.869612]  [<ffffffff8124b799>] ? snprintf+0x39/0x40
[    1.869613]  [<ffffffff8123abe0>] ? find_valid_gpt+0x640/0x640
[    1.869615]  [<ffffffff812396c8>] check_partition+0x108/0x240
[    1.869616]  [<ffffffff81239264>] rescan_partitions+0xb4/0x2c0
[    1.869617]  [<ffffffff8117584c>] __blkdev_get+0x2dc/0x400
[    1.869618]  [<ffffffff81175b1d>] blkdev_get+0x1ad/0x320
[    1.869619] sd 1:0:0:0: [sdb] Write Protect is off
[    1.869621]  [<ffffffff81157603>] ? unlock_new_inode+0x43/0x70
[    1.869622] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[    1.869622]  [<ffffffff81174f66>] ? bdget+0x136/0x150
[    1.869624]  [<ffffffff81236b34>] add_disk+0x394/0x4a0
[    1.869627]  [<ffffffff8135b327>] sd_probe_async+0x127/0x1d0
[    1.869628]  [<ffffffff81065c87>] async_run_entry_fn+0x37/0x130
[    1.869629]  [<ffffffff810595fe>] process_one_work+0x16e/0x3e0
[    1.869630]  [<ffffffff81059991>] worker_thread+0x121/0x3a0
[    1.869631]  [<ffffffff81059870>] ? process_one_work+0x3e0/0x3e0
[    1.869633]  [<ffffffff810602c2>] kthread+0xd2/0xf0
[    1.869634] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[    1.869636]  [<ffffffff810601f0>] ? __kthread_parkme+0x70/0x70
[    1.869638]  [<ffffffff815dbaac>] ret_from_fork+0x7c/0xb0
[    1.869639]  [<ffffffff810601f0>] ? __kthread_parkme+0x70/0x70
[    1.869648] Code: 89 e5 41 54 49 89 fc 53 eb 21 0f 1f 80 00 00 00 00 f6 43 48 01 75 52 48 8b 43 18 a8 03 75 52 65 ff 00 b8 01 00 00 00 84 c0 75 3e <49> 8b 84 24 20 03 00 00 48 85 c0 74 10 48 8b 80 98 06 00 00 48
[    1.869650] RIP  [<ffffffff811369a2>] get_mem_cgroup_from_mm+0x32/0x80
[    1.869650]  RSP <ffff8800ce8c78f8>
[    1.869650] CR2: 0000000000000320
[    1.869653] ---[ end trace 4cda1f5484a90d6d ]---
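
To illustrate, a minimal sketch of the guard this trace argues for,
assuming the cache charge path simply falls back to the root cgroup
when there is no mm (helper name made up here; the actual hunk may
differ in detail):

/*
 * Sketch only: page cache insertions can happen without a task
 * context, e.g. during disk probing on boot, so a NULL mm must not
 * reach get_mem_cgroup_from_mm().  Fall back to the root cgroup.
 */
static struct mem_cgroup *cache_charge_memcg(struct mm_struct *mm)
{
	if (unlikely(!mm))	/* no task context, e.g. boot-time probing */
		return root_mem_cgroup;
	return get_mem_cgroup_from_mm(mm);
}

With that, get_mem_cgroup_from_mm() itself can assume a valid mm,
which is the point of pushing the !mm handling out to the page cache
charge function.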

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch 4/8] mm: memcg: push !mm handling out to page cache charge function
  2014-03-12 14:56       ` Johannes Weiner
@ 2014-03-12 15:01         ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 15:01 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Wed 12-03-14 10:56:11, Johannes Weiner wrote:
> On Wed, Mar 12, 2014 at 02:11:52PM +0100, Michal Hocko wrote:
> > On Tue 11-03-14 21:28:30, Johannes Weiner wrote:
> > [...]
> > > @@ -4070,6 +4061,12 @@ int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
> > >  		return 0;
> > >  
> > >  	if (!PageSwapCache(page)) {
> > > +		/*
> > > +		 * Page cache insertions can happen without an actual
> > > +		 * task context, e.g. during disk probing on boot.
> > 
> > We read the page cache during disk probing? I have tried to find such a
> > code path but failed. Could you point me to such a path, please?
> > I thought that such probing is done from udev context but I am not
> > familiar with this area TBH.
> 
> Yes, I tried to remove the !mm case entirely and hit the following
> during boot:

OK, I wonder why I haven't triggered that. Could you mention this path
in the changelog? It is really hard to find when jumping into the
code.

Anyway, thanks!

> [    1.869561] BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
> [    1.869565] IP: [<ffffffff811369a2>] get_mem_cgroup_from_mm+0x32/0x80
> [    1.869566] PGD 0
> [    1.869567] Oops: 0000 [#1] SMP
> [    1.869569] CPU: 3 PID: 65 Comm: kworker/u8:6 Not tainted 3.14.0-rc6-00007-g3856318f53a0-dirty #133
> [    1.869569] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./H61M-DGS, BIOS P1.30 05/10/2012
> [    1.869573] Workqueue: events_unbound async_run_entry_fn
> [    1.869573] task: ffff8800ce82d3c0 ti: ffff8800ce8c6000 task.ti: ffff8800ce8c6000
> [    1.869575] RIP: 0010:[<ffffffff811369a2>]  [<ffffffff811369a2>] get_mem_cgroup_from_mm+0x32/0x80
> [    1.869576] RSP: 0000:ffff8800ce8c78f8  EFLAGS: 00010246
> [    1.869576] RAX: 003fffc000000001 RBX: 0000000000000000 RCX: 0000000000000001
> [    1.869577] RDX: 00000000000000d0 RSI: 0000000000000000 RDI: 0000000000000000
> [    1.869577] RBP: ffff8800ce8c7908 R08: ffffffff81713232 R09: ffffea00033a1680
> [    1.869578] R10: 0000000000001723 R11: ffffc90004e4dfff R12: 0000000000000000
> [    1.869578] R13: 0000000000000001 R14: 0000000000000000 R15: 00000000000000d0
> [    1.869579] FS:  0000000000000000(0000) GS:ffff88021f380000(0000) knlGS:0000000000000000
> [    1.869579] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.869580] CR2: 0000000000000320 CR3: 00000000017a5000 CR4: 00000000000407e0
> [    1.869580] Stack:
> [    1.869581]  0000000000000000 ffffea00033a1640 ffff8800ce8c7948 ffffffff8113a112
> [    1.869582]  00000001ce8c7978 0000000000000000 ffffea00033a1640 00000000000200d0
> [    1.869583]  0000000000000000 ffffffff81174520 ffff8800ce8c7970 ffffffff8113be0a
> [    1.869583] Call Trace:
> [    1.869586]  [<ffffffff8113a112>] mem_cgroup_charge_common+0x42/0xf0
> [    1.869589]  [<ffffffff81174520>] ? blkdev_write_begin+0x30/0x30
> [    1.869590]  [<ffffffff8113be0a>] mem_cgroup_cache_charge+0x7a/0xb0
> [    1.869592] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
> [    1.869594]  [<ffffffff810db06d>] add_to_page_cache_locked+0x3d/0x150
> [    1.869595]  [<ffffffff810db19a>] add_to_page_cache_lru+0x1a/0x40
> [    1.869597]  [<ffffffff810dbdef>] do_read_cache_page+0x6f/0x1a0
> [    1.869598]  [<ffffffff810dce79>] read_cache_page+0x19/0x30
> [    1.869601]  [<ffffffff8123952d>] read_dev_sector+0x2d/0x90
> [    1.869603]  [<ffffffff8123a21f>] read_lba+0xef/0x1a0
> [    1.869604]  [<ffffffff8123a663>] ? find_valid_gpt+0xc3/0x640
> [    1.869605]  [<ffffffff8123a681>] find_valid_gpt+0xe1/0x640
> [    1.869607]  [<ffffffff81249e6b>] ? string.isra.4+0x3b/0xf0
> [    1.869609]  [<ffffffff8123abe0>] ? find_valid_gpt+0x640/0x640
> [    1.869610]  [<ffffffff8123ac56>] efi_partition+0x76/0x3f0
> [    1.869611]  [<ffffffff8124aec4>] ? vsnprintf+0x1f4/0x610
> [    1.869612]  [<ffffffff8124b799>] ? snprintf+0x39/0x40
> [    1.869613]  [<ffffffff8123abe0>] ? find_valid_gpt+0x640/0x640
> [    1.869615]  [<ffffffff812396c8>] check_partition+0x108/0x240
> [    1.869616]  [<ffffffff81239264>] rescan_partitions+0xb4/0x2c0
> [    1.869617]  [<ffffffff8117584c>] __blkdev_get+0x2dc/0x400
> [    1.869618]  [<ffffffff81175b1d>] blkdev_get+0x1ad/0x320
> [    1.869619] sd 1:0:0:0: [sdb] Write Protect is off
> [    1.869621]  [<ffffffff81157603>] ? unlock_new_inode+0x43/0x70
> [    1.869622] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> [    1.869622]  [<ffffffff81174f66>] ? bdget+0x136/0x150
> [    1.869624]  [<ffffffff81236b34>] add_disk+0x394/0x4a0
> [    1.869627]  [<ffffffff8135b327>] sd_probe_async+0x127/0x1d0
> [    1.869628]  [<ffffffff81065c87>] async_run_entry_fn+0x37/0x130
> [    1.869629]  [<ffffffff810595fe>] process_one_work+0x16e/0x3e0
> [    1.869630]  [<ffffffff81059991>] worker_thread+0x121/0x3a0
> [    1.869631]  [<ffffffff81059870>] ? process_one_work+0x3e0/0x3e0
> [    1.869633]  [<ffffffff810602c2>] kthread+0xd2/0xf0
> [    1.869634] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
> [    1.869636]  [<ffffffff810601f0>] ? __kthread_parkme+0x70/0x70
> [    1.869638]  [<ffffffff815dbaac>] ret_from_fork+0x7c/0xb0
> [    1.869639]  [<ffffffff810601f0>] ? __kthread_parkme+0x70/0x70
> [    1.869648] Code: 89 e5 41 54 49 89 fc 53 eb 21 0f 1f 80 00 00 00 00 f6 43 48 01 75 52 48 8b 43 18 a8 03 75 52 65 ff 00 b8 01 00 00 00 84 c0 75 3e <49> 8b 84 24 20 03 00 00 48 85 c0 74 10 48 8b 80 98 06 00 00 48
> [    1.869650] RIP  [<ffffffff811369a2>] get_mem_cgroup_from_mm+0x32/0x80
> [    1.869650]  RSP <ffff8800ce8c78f8>
> [    1.869650] CR2: 0000000000000320
> [    1.869653] ---[ end trace 4cda1f5484a90d6d ]---

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [patch 3/8] mm: memcg: inline mem_cgroup_charge_common()
  2014-03-12 14:53       ` Johannes Weiner
@ 2014-03-12 15:19         ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 15:19 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

On Wed 12-03-14 10:53:00, Johannes Weiner wrote:
> On Wed, Mar 12, 2014 at 01:52:13PM +0100, Michal Hocko wrote:
> > On Tue 11-03-14 21:28:29, Johannes Weiner wrote:
> > [...]
> > > @@ -3919,20 +3919,21 @@ out:
> > >  	return ret;
> > >  }
> > >  
> > > -/*
> > > - * Charge the memory controller for page usage.
> > > - * Return
> > > - * 0 if the charge was successful
> > > - * < 0 if the cgroup is over its limit
> > > - */
> > > -static int mem_cgroup_charge_common(struct page *page, struct mm_struct *mm,
> > > -				gfp_t gfp_mask, enum charge_type ctype)
> > > +int mem_cgroup_newpage_charge(struct page *page,
> > > +			      struct mm_struct *mm, gfp_t gfp_mask)
> > 
> > s/mem_cgroup_newpage_charge/mem_cgroup_anon_charge/ ?
> > 
> > Would be a better name? The patch would be bigger but the name more
> > apparent...
> 
> I wouldn't be opposed to fixing those names at all, but I think that
> is out of the scope of this patch.

OK.

> Want to send one?

will do

> mem_cgroup_charge_anon() would be a good name, but then we should also
> rename mem_cgroup_cache_charge() to mem_cgroup_charge_file() to match.

Yes, that sounds good to me.

> Or charge_private() vs. charge_shared()...

anon vs. file is easier to follow, but I do not have any preference here.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH] memcg: rename high level charging functions
  2014-03-12 14:53       ` Johannes Weiner
@ 2014-03-12 15:20         ` Michal Hocko
  0 siblings, 0 replies; 43+ messages in thread
From: Michal Hocko @ 2014-03-12 15:20 UTC (permalink / raw)
  To: Johannes Weiner; +Cc: Andrew Morton, linux-mm, cgroups, linux-kernel

mem_cgroup_newpage_charge() is used only for charging anonymous
memory, so rename it to mem_cgroup_charge_anon().

mem_cgroup_cache_charge() is used for file-backed memory, so rename
it to mem_cgroup_charge_file().

Signed-off-by: Michal Hocko <mhocko@suse.cz>
---
 Documentation/cgroups/memcg_test.txt | 4 ++--
 include/linux/memcontrol.h           | 8 ++++----
 mm/filemap.c                         | 2 +-
 mm/huge_memory.c                     | 8 ++++----
 mm/memcontrol.c                      | 4 ++--
 mm/memory.c                          | 6 +++---
 mm/shmem.c                           | 6 +++---
 7 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
index ce94a83a7d9a..80ac454704b8 100644
--- a/Documentation/cgroups/memcg_test.txt
+++ b/Documentation/cgroups/memcg_test.txt
@@ -24,7 +24,7 @@ Please note that implementation details can be changed.
 
    a page/swp_entry may be charged (usage += PAGE_SIZE) at
 
-	mem_cgroup_newpage_charge()
+	mem_cgroup_charge_anon()
 	  Called at new page fault and Copy-On-Write.
 
 	mem_cgroup_try_charge_swapin()
@@ -32,7 +32,7 @@ Please note that implementation details can be changed.
 	  Followed by charge-commit-cancel protocol. (With swap accounting)
 	  At commit, a charge recorded in swap_cgroup is removed.
 
-	mem_cgroup_cache_charge()
+	mem_cgroup_charge_file()
 	  Called at add_to_page_cache()
 
 	mem_cgroup_cache_charge_swapin()
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index abd0113b6620..b4e9c196949a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -65,7 +65,7 @@ struct mem_cgroup_reclaim_cookie {
  * (Of course, if memcg does memory allocation in future, GFP_KERNEL is sane.)
  */
 
-extern int mem_cgroup_newpage_charge(struct page *page, struct mm_struct *mm,
+extern int mem_cgroup_charge_anon(struct page *page, struct mm_struct *mm,
 				gfp_t gfp_mask);
 /* for swap handling */
 extern int mem_cgroup_try_charge_swapin(struct mm_struct *mm,
@@ -74,7 +74,7 @@ extern void mem_cgroup_commit_charge_swapin(struct page *page,
 					struct mem_cgroup *memcg);
 extern void mem_cgroup_cancel_charge_swapin(struct mem_cgroup *memcg);
 
-extern int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
+extern int mem_cgroup_charge_file(struct page *page, struct mm_struct *mm,
 					gfp_t gfp_mask);
 
 struct lruvec *mem_cgroup_zone_lruvec(struct zone *, struct mem_cgroup *);
@@ -234,13 +234,13 @@ void mem_cgroup_print_bad_page(struct page *page);
 #else /* CONFIG_MEMCG */
 struct mem_cgroup;
 
-static inline int mem_cgroup_newpage_charge(struct page *page,
+static inline int mem_cgroup_charge_anon(struct page *page,
 					struct mm_struct *mm, gfp_t gfp_mask)
 {
 	return 0;
 }
 
-static inline int mem_cgroup_cache_charge(struct page *page,
+static inline int mem_cgroup_charge_file(struct page *page,
 					struct mm_struct *mm, gfp_t gfp_mask)
 {
 	return 0;
diff --git a/mm/filemap.c b/mm/filemap.c
index 2d8af8796fed..a2e7b8ed7b74 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -562,7 +562,7 @@ static int __add_to_page_cache_locked(struct page *page,
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(PageSwapBacked(page), page);
 
-	error = mem_cgroup_cache_charge(page, current->mm,
+	error = mem_cgroup_charge_file(page, current->mm,
 					gfp_mask & GFP_RECLAIM_MASK);
 	if (error)
 		return error;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bbf3b3db8f27..335e2f59853b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -827,7 +827,7 @@ int do_huge_pmd_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		count_vm_event(THP_FAULT_FALLBACK);
 		return VM_FAULT_FALLBACK;
 	}
-	if (unlikely(mem_cgroup_newpage_charge(page, mm, GFP_KERNEL))) {
+	if (unlikely(mem_cgroup_charge_anon(page, mm, GFP_KERNEL))) {
 		put_page(page);
 		count_vm_event(THP_FAULT_FALLBACK);
 		return VM_FAULT_FALLBACK;
@@ -968,7 +968,7 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 					       __GFP_OTHER_NODE,
 					       vma, address, page_to_nid(page));
 		if (unlikely(!pages[i] ||
-			     mem_cgroup_newpage_charge(pages[i], mm,
+			     mem_cgroup_charge_anon(pages[i], mm,
 						       GFP_KERNEL))) {
 			if (pages[i])
 				put_page(pages[i]);
@@ -1101,7 +1101,7 @@ alloc:
 		goto out;
 	}
 
-	if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))) {
+	if (unlikely(mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))) {
 		put_page(new_page);
 		if (page) {
 			split_huge_page(page);
@@ -2363,7 +2363,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	if (!new_page)
 		return;
 
-	if (unlikely(mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL)))
+	if (unlikely(mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL)))
 		return;
 
 	/*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 67e01b27a021..d67650a67507 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3851,7 +3851,7 @@ out:
 	return ret;
 }
 
-int mem_cgroup_newpage_charge(struct page *page,
+int mem_cgroup_charge_anon(struct page *page,
 			      struct mm_struct *mm, gfp_t gfp_mask)
 {
 	unsigned int nr_pages = 1;
@@ -3987,7 +3987,7 @@ void mem_cgroup_commit_charge_swapin(struct page *page,
 					  MEM_CGROUP_CHARGE_TYPE_ANON);
 }
 
-int mem_cgroup_cache_charge(struct page *page, struct mm_struct *mm,
+int mem_cgroup_charge_file(struct page *page, struct mm_struct *mm,
 				gfp_t gfp_mask)
 {
 	enum charge_type type = MEM_CGROUP_CHARGE_TYPE_CACHE;
diff --git a/mm/memory.c b/mm/memory.c
index 548d97e3df91..5c57d1bbf3cf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2803,7 +2803,7 @@ gotten:
 	}
 	__SetPageUptodate(new_page);
 
-	if (mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL))
+	if (mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL))
 		goto oom_free_new;
 
 	mmun_start  = address & PAGE_MASK;
@@ -3256,7 +3256,7 @@ static int do_anonymous_page(struct mm_struct *mm, struct vm_area_struct *vma,
 	 */
 	__SetPageUptodate(page);
 
-	if (mem_cgroup_newpage_charge(page, mm, GFP_KERNEL))
+	if (mem_cgroup_charge_anon(page, mm, GFP_KERNEL))
 		goto oom_free_page;
 
 	entry = mk_pte(page, vma->vm_page_prot);
@@ -3384,7 +3384,7 @@ static int do_cow_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	if (!new_page)
 		return VM_FAULT_OOM;
 
-	if (mem_cgroup_newpage_charge(new_page, mm, GFP_KERNEL)) {
+	if (mem_cgroup_charge_anon(new_page, mm, GFP_KERNEL)) {
 		page_cache_release(new_page);
 		return VM_FAULT_OOM;
 	}
diff --git a/mm/shmem.c b/mm/shmem.c
index 7847ea0c0d30..0f0fca94b532 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -685,7 +685,7 @@ int shmem_unuse(swp_entry_t swap, struct page *page)
 	 * the shmem_swaplist_mutex which might hold up shmem_writepage().
 	 * Charged back to the user (not to caller) when swap account is used.
 	 */
-	error = mem_cgroup_cache_charge(page, current->mm, GFP_KERNEL);
+	error = mem_cgroup_charge_file(page, current->mm, GFP_KERNEL);
 	if (error)
 		goto out;
 	/* No radix_tree_preload: swap entry keeps a place for page in tree */
@@ -1082,7 +1082,7 @@ repeat:
 				goto failed;
 		}
 
-		error = mem_cgroup_cache_charge(page, current->mm,
+		error = mem_cgroup_charge_file(page, current->mm,
 						gfp & GFP_RECLAIM_MASK);
 		if (!error) {
 			error = shmem_add_to_page_cache(page, mapping, index,
@@ -1136,7 +1136,7 @@ repeat:
 
 		SetPageSwapBacked(page);
 		__set_page_locked(page);
-		error = mem_cgroup_cache_charge(page, current->mm,
+		error = mem_cgroup_charge_file(page, current->mm,
 						gfp & GFP_RECLAIM_MASK);
 		if (error)
 			goto decused;
-- 
1.9.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

end of thread

Thread overview: 43+ messages
2014-03-12  1:28 [patch 0/8] memcg: charge path cleanups Johannes Weiner
2014-03-12  1:28 ` [patch 1/8] mm: memcg: remove unnecessary preemption disabling Johannes Weiner
2014-03-12  1:28 ` [patch 2/8] mm: memcg: remove mem_cgroup_move_account_page_stat() Johannes Weiner
2014-03-12  1:28 ` [patch 3/8] mm: memcg: inline mem_cgroup_charge_common() Johannes Weiner
2014-03-12 12:52   ` Michal Hocko
2014-03-12 14:53     ` Johannes Weiner
2014-03-12 15:19       ` Michal Hocko
2014-03-12 15:20       ` [PATCH] memcg: rename high level charging functions Michal Hocko
2014-03-12  1:28 ` [patch 4/8] mm: memcg: push !mm handling out to page cache charge function Johannes Weiner
2014-03-12 13:11   ` Michal Hocko
2014-03-12 14:56     ` Johannes Weiner
2014-03-12 15:01       ` Michal Hocko
2014-03-12  1:28 ` [patch 5/8] memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm() Johannes Weiner
2014-03-12  1:28 ` [patch 6/8] memcg: get_mem_cgroup_from_mm() Johannes Weiner
2014-03-12  1:28 ` [patch 7/8] memcg: do not replicate get_mem_cgroup_from_mm in __mem_cgroup_try_charge Johannes Weiner
2014-03-12  1:28 ` [patch 8/8] memcg: sanitize __mem_cgroup_try_charge() call protocol Johannes Weiner
2014-03-12 14:01   ` Michal Hocko
2014-03-12 14:05     ` Michal Hocko
