linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov@parallels.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [patch 13/13] mm: memcontrol: rewrite uncharge API
Date: Wed, 16 Jul 2014 10:14:47 -0400	[thread overview]
Message-ID: <20140716141447.GY29639@cmpxchg.org> (raw)
In-Reply-To: <20140716133050.GA4644@nhori.redhat.com>

On Wed, Jul 16, 2014 at 09:30:50AM -0400, Naoya Horiguchi wrote:
> On Tue, Jul 15, 2014 at 05:48:43PM -0400, Johannes Weiner wrote:
> > On Tue, Jul 15, 2014 at 04:49:53PM -0400, Naoya Horiguchi wrote:
> > > I feel that these 2 messages have the same cause (just appear differently).
> > > __add_to_page_cache_locked() (and mem_cgroup_try_charge()) can be called
> > > for hugetlb, while we avoid calling mem_cgroup_migrate()/mem_cgroup_uncharge()
> > > for hugetlb. This seems to make page_cgroup of the hugepage inconsistent,
> > > and results in the bad page bug ("page dumped because: cgroup check failed").
> > > So maybe some more PageHuge check is necessary around the charging code.
> > 
> > This struck me as odd because I don't remember removing a PageHuge()
> > call in the charge path and wondered how it worked before my changes:
> > apparently it just checked PageCompound() in mem_cgroup_charge_file().
> > 
> > So it's not fallout of the new uncharge batching code, but was already
> > broken during the rewrite of the charge API because then hugetlb pages
> > entered the charging code.
> > 
> > Anyway, we don't have file-specific charging code anymore, and the
> > PageCompound() check would have required changing anyway for THP
> > cache.  So I guess the solution is checking PageHuge() in charge,
> > uncharge, and migrate for now.  Oh well.
> > 
> > How about this?
> 
> With tweaking a bit, this patch solved the problem, thanks!
> 
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 9c99d6868a5e..b61194273b56 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -564,9 +564,12 @@ static int __add_to_page_cache_locked(struct page *page,
> >  	VM_BUG_ON_PAGE(!PageLocked(page), page);
> >  	VM_BUG_ON_PAGE(PageSwapBacked(page), page);
> >  
> > -	error = mem_cgroup_try_charge(page, current->mm, gfp_mask, &memcg);
> > -	if (error)
> > -		return error;
> > +	if (!PageHuge(page)) {
> > +		error = mem_cgroup_try_charge(page, current->mm,
> > +					      gfp_mask, &memcg);
> > +		if (error)
> > +			return error;
> > +	}
> >  
> >  	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
> >  	if (error) {
> 
> We have mem_cgroup_commit_charge() later in __add_to_page_cache_locked(),
> so adding "if (!PageHuge(page))" for it is necessary too.

You are right.  Annotated them all now.

> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 7f5a42403fae..dabed2f08609 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -781,7 +781,8 @@ static int move_to_new_page(struct page *newpage, struct page *page,
> >  		if (!PageAnon(newpage))
> >  			newpage->mapping = NULL;
> >  	} else {
> > -		mem_cgroup_migrate(page, newpage, false);
> > +		if (!PageHuge(page))
> > +			mem_cgroup_migrate(page, newpage, false);

I deleted this again as it was a followup fix to hugepages getting
wrongfully charged as file cache.  They shouldn't be, and
mem_cgroup_migrate() checks whether the page is charged.

> >  		if (remap_swapcache)
> >  			remove_migration_ptes(page, newpage);
> >  		if (!PageAnon(page))
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 3461f2f5be20..97b6ec132398 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -62,12 +62,12 @@ static void __page_cache_release(struct page *page)
> >  		del_page_from_lru_list(page, lruvec, page_off_lru(page));
> >  		spin_unlock_irqrestore(&zone->lru_lock, flags);
> >  	}
> > -	mem_cgroup_uncharge(page);
> >  }
> >  
> >  static void __put_single_page(struct page *page)
> >  {
> >  	__page_cache_release(page);
> > +	mem_cgroup_uncharge_page(page);
> 
> My kernel is based on mmotm-2014-07-09-17-08, where mem_cgroup_uncharge_page()
> does not exist any more. Maybe mem_cgroup_uncharge(page) seems correct.

Sorry, I should have build tested.  The name is still reflex...

> >  	free_hot_cold_page(page, false);
> >  }
> >  
> > @@ -75,7 +75,10 @@ static void __put_compound_page(struct page *page)
> >  {
> >  	compound_page_dtor *dtor;
> >  
> > -	__page_cache_release(page);
> > +	if (!PageHuge(page)) {
> > +		__page_cache_release(page);
> > +		mem_cgroup_uncharge_page(page);

I reverted all these mm/swap.c changes again as well.  Instead,
mem_cgroup_uncharge() now does a preliminary check if the page is
charged before it touches page->lru.

That should be much more robust: now the vetting whether a page is
valid for memcg happens at charge time only, all other operations
check first if a page is charged before doing anything else to it.

These two places should be the only ones that need fixing then:

diff --git a/mm/filemap.c b/mm/filemap.c
index 9c99d6868a5e..bfe0745a704d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -31,6 +31,7 @@
 #include <linux/security.h>
 #include <linux/cpuset.h>
 #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
+#include <linux/hugetlb.h>
 #include <linux/memcontrol.h>
 #include <linux/cleancache.h>
 #include <linux/rmap.h>
@@ -558,19 +559,24 @@ static int __add_to_page_cache_locked(struct page *page,
 				      pgoff_t offset, gfp_t gfp_mask,
 				      void **shadowp)
 {
+	int huge = PageHuge(page);
 	struct mem_cgroup *memcg;
 	int error;
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(PageSwapBacked(page), page);
 
-	error = mem_cgroup_try_charge(page, current->mm, gfp_mask, &memcg);
-	if (error)
-		return error;
+	if (!huge) {
+		error = mem_cgroup_try_charge(page, current->mm,
+					      gfp_mask, &memcg);
+		if (error)
+			return error;
+	}
 
 	error = radix_tree_maybe_preload(gfp_mask & ~__GFP_HIGHMEM);
 	if (error) {
-		mem_cgroup_cancel_charge(page, memcg);
+		if (!huge)
+			mem_cgroup_cancel_charge(page, memcg);
 		return error;
 	}
 
@@ -585,14 +591,16 @@ static int __add_to_page_cache_locked(struct page *page,
 		goto err_insert;
 	__inc_zone_page_state(page, NR_FILE_PAGES);
 	spin_unlock_irq(&mapping->tree_lock);
-	mem_cgroup_commit_charge(page, memcg, false);
+	if (!huge)
+		mem_cgroup_commit_charge(page, memcg, false);
 	trace_mm_filemap_add_to_page_cache(page);
 	return 0;
 err_insert:
 	page->mapping = NULL;
 	/* Leave page->index set: truncation relies upon it */
 	spin_unlock_irq(&mapping->tree_lock);
-	mem_cgroup_cancel_charge(page, memcg);
+	if (!huge)
+		mem_cgroup_cancel_charge(page, memcg);
 	page_cache_release(page);
 	return error;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 063080e35459..b5de5deddbfb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -6635,9 +6635,16 @@ static void uncharge_list(struct list_head *page_list)
  */
 void mem_cgroup_uncharge(struct page *page)
 {
+	struct page_cgroup *pc;
+
 	if (mem_cgroup_disabled())
 		return;
 
+	/* Don't touch page->lru of any random page, pre-check: */
+	pc = lookup_page_cgroup(page);
+	if (!PageCgroupUsed(pc))
+		return;
+
 	INIT_LIST_HEAD(&page->lru);
 	uncharge_list(&page->lru);
 }

  reply	other threads:[~2014-07-16 14:15 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-18 20:40 [patch 00/13] mm: memcontrol: naturalize charge lifetime v4 Johannes Weiner
2014-06-18 20:40 ` [patch 01/13] mm: memcontrol: fold mem_cgroup_do_charge() Johannes Weiner
2014-06-18 20:40 ` [patch 02/13] mm: memcontrol: rearrange charging fast path Johannes Weiner
2014-06-18 20:40 ` [patch 03/13] mm: memcontrol: reclaim at least once for __GFP_NORETRY Johannes Weiner
2014-06-18 20:40 ` [patch 04/13] mm: huge_memory: use GFP_TRANSHUGE when charging huge pages Johannes Weiner
2014-06-18 20:40 ` [patch 05/13] mm: memcontrol: retry reclaim for oom-disabled and __GFP_NOFAIL charges Johannes Weiner
2014-06-18 20:40 ` [patch 06/13] mm: memcontrol: remove explicit OOM parameter in charge path Johannes Weiner
2014-06-18 20:40 ` [patch 07/13] mm: memcontrol: simplify move precharge function Johannes Weiner
2014-06-18 20:40 ` [patch 08/13] mm: memcontrol: catch root bypass in move precharge Johannes Weiner
2014-06-18 20:40 ` [patch 09/13] mm: memcontrol: use root_mem_cgroup res_counter Johannes Weiner
2014-06-18 20:40 ` [patch 10/13] mm: memcontrol: remove ordering between pc->mem_cgroup and PageCgroupUsed Johannes Weiner
2014-06-18 20:40 ` [patch 11/13] mm: memcontrol: do not acquire page_cgroup lock for kmem pages Johannes Weiner
2014-06-18 20:40 ` [patch 12/13] mm: memcontrol: rewrite charge API Johannes Weiner
2014-06-23  6:15   ` Uwe Kleine-König
2014-06-23  9:30     ` Michal Hocko
2014-06-23  9:42       ` Uwe Kleine-König
2014-07-14 15:04   ` Michal Hocko
2014-07-14 17:13     ` Johannes Weiner
2014-07-14 18:43       ` Michal Hocko
2014-06-18 20:40 ` [patch 13/13] mm: memcontrol: rewrite uncharge API Johannes Weiner
2014-06-20 16:36   ` [PATCH -mm] memcg: mem_cgroup_charge_statistics needs preempt_disable Michal Hocko
2014-06-23  4:16     ` Johannes Weiner
2014-06-21  0:34   ` [patch 13/13] mm: memcontrol: rewrite uncharge API Sasha Levin
2014-06-21  0:56     ` Andrew Morton
2014-06-21  1:03       ` Sasha Levin
2014-07-15  8:25   ` Michal Hocko
2014-07-15 12:19     ` Michal Hocko
2014-07-18  7:12       ` Michal Hocko
2014-07-18 14:45         ` Johannes Weiner
2014-07-18 15:12           ` Miklos Szeredi
2014-07-19 17:39             ` Johannes Weiner
2014-07-22 15:08               ` Michal Hocko
2014-07-22 15:44                 ` Miklos Szeredi
2014-07-23 14:38                   ` Michal Hocko
2014-07-23 15:06                     ` Johannes Weiner
2014-07-23 15:19                       ` Michal Hocko
2014-07-23 15:36                         ` Johannes Weiner
2014-07-23 18:08                       ` Miklos Szeredi
2014-07-23 21:02                         ` Johannes Weiner
2014-07-24  8:46                           ` Michal Hocko
2014-07-24  9:02                             ` Michal Hocko
2014-07-25 15:26                               ` Johannes Weiner
2014-07-25 15:43                                 ` Michal Hocko
2014-07-25 17:34                                   ` Johannes Weiner
2014-07-15 14:23     ` Michal Hocko
2014-07-15 15:09       ` Johannes Weiner
2014-07-15 15:18         ` Michal Hocko
2014-07-15 15:46           ` Johannes Weiner
2014-07-15 15:56             ` Michal Hocko
2014-07-15 15:55   ` Naoya Horiguchi
2014-07-15 16:07     ` Michal Hocko
2014-07-15 17:34       ` Johannes Weiner
2014-07-15 18:21         ` Michal Hocko
2014-07-15 18:43         ` Naoya Horiguchi
2014-07-15 19:04           ` Johannes Weiner
2014-07-15 20:49             ` Naoya Horiguchi
2014-07-15 21:48               ` Johannes Weiner
2014-07-16  7:55                 ` Michal Hocko
2014-07-16 13:30                 ` Naoya Horiguchi
2014-07-16 14:14                   ` Johannes Weiner [this message]
2014-07-16 14:57                     ` Naoya Horiguchi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140716141447.GY29639@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=tj@kernel.org \
    --cc=vdavydov@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).