linux-kernel.vger.kernel.org archive mirror
From: Johannes Weiner <hannes@cmpxchg.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>, Tejun Heo <tj@kernel.org>,
	Vladimir Davydov <vdavydov@parallels.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [patch 12/13] mm: memcontrol: rewrite charge API
Date: Mon, 14 Jul 2014 13:13:24 -0400
Message-ID: <20140714171324.GQ29639@cmpxchg.org>
In-Reply-To: <20140714150446.GD30713@dhcp22.suse.cz>

On Mon, Jul 14, 2014 at 05:04:46PM +0200, Michal Hocko wrote:
> Hi,
> I've finally managed to untangle myself from internal stuff...
> 
> On Wed 18-06-14 16:40:44, Johannes Weiner wrote:
> > The memcg charge API charges pages before they are rmapped - i.e. have
> > an actual "type" - and so every callsite needs its own set of charge
> > and uncharge functions to know what type is being operated on.  Worse,
> > uncharge has to happen from a context that is still type-specific,
> > rather than at the end of the page's lifetime with exclusive access,
> > and so requires a lot of synchronization.
> > 
> > Rewrite the charge API to provide a generic set of try_charge(),
> > commit_charge() and cancel_charge() transaction operations, much like
> > what's currently done for swap-in:
> > 
> >   mem_cgroup_try_charge() attempts to reserve a charge, reclaiming
> >   pages from the memcg if necessary.
> > 
> >   mem_cgroup_commit_charge() commits the page to the charge once it
> >   has a valid page->mapping and PageAnon() reliably tells the type.
> > 
> >   mem_cgroup_cancel_charge() aborts the transaction.
> > 
> > This reduces the charge API and enables subsequent patches to
> > drastically simplify uncharging.
> > 
> > As pages need to be committed after rmap is established but before
> > they are added to the LRU, page_add_new_anon_rmap() must stop doing
> > LRU additions again.  Revive lru_cache_add_active_or_unevictable().
> 
> I think it would make more sense to do
> lru_cache_add_active_or_unevictable in a separate patch for easier
> review. Too late, though...
> 
> A few comments below.
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> The patch looks correct but the code is quite tricky so I hope I didn't
> miss anything.
> 
> Acked-by: Michal Hocko <mhocko@suse.cz>

Thanks!

> > @@ -54,28 +54,11 @@ struct mem_cgroup_reclaim_cookie {
> >  };
> >  
> >  #ifdef CONFIG_MEMCG
> > -/*
> > - * All "charge" functions with gfp_mask should use GFP_KERNEL or
> > - * (gfp_mask & GFP_RECLAIM_MASK). In current implementatin, memcg doesn't
> > - * alloc memory but reclaims memory from all available zones. So, "where I want
> > - * memory from" bits of gfp_mask has no meaning. So any bits of that field is
> > - * available but adding a rule is better. charge functions' gfp_mask should
> > - * be set to GFP_KERNEL or gfp_mask & GFP_RECLAIM_MASK for avoiding ambiguous
> > - * codes.
> > - * (Of course, if memcg does memory allocation in future, GFP_KERNEL is sane.)
> > - */
> 
> I think we should slightly modify the comment but the primary idea
> should stay there. What about the following?
> /*
>  * Although memcg charge functions do not allocate any memory, they
>  * still take a GFP mask to control the reclaim process (therefore
>  * gfp_mask & GFP_RECLAIM_MASK is expected):
>  * - GFP_KERNEL should be used for the general charge path without any
>  *   constraints on reclaim.
>  * - __GFP_WAIT should be cleared for atomic contexts.
>  * - __GFP_NORETRY should be set for charges which might fail rather than
>  *   spend too much time reclaiming.
>  * - __GFP_NOFAIL should be set for charges which cannot fail.
>  */

What *is* the primary idea here?

Taking any kind of gfp mask and interpreting the bits that pertain to
you is done in a lot of places already, and there really is no need to
duplicate the documentation and risk it getting stale and misleading.
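
To make "interpreting the bits that pertain to you" concrete, here is a
rough userspace sketch. Everything in it is an assumption made for
illustration: the DEMO_* flag values, the demo_policy structure and the
retry count of 5 are hypothetical and are not the kernel's gfp
definitions or the memcg implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for illustration only; NOT the kernel's gfp values. */
#define DEMO_WAIT	(1u << 0)	/* caller may block while reclaiming */
#define DEMO_NORETRY	(1u << 1)	/* fail rather than reclaim for long */
#define DEMO_NOFAIL	(1u << 2)	/* the charge is not allowed to fail */
#define DEMO_ZONE_HINT	(1u << 3)	/* placement hint, meaningless to a charge */

/* The only bits this charge path looks at. */
#define DEMO_RECLAIM_MASK	(DEMO_WAIT | DEMO_NORETRY | DEMO_NOFAIL)

#define DEMO_DEFAULT_RETRIES	5	/* arbitrary value for the sketch */

struct demo_policy {
	bool may_block;
	bool must_succeed;
	int max_reclaim_rounds;
};

/* Translate a caller-supplied mask into reclaim behavior. */
static struct demo_policy demo_charge_policy(unsigned int mask)
{
	struct demo_policy p;

	mask &= DEMO_RECLAIM_MASK;	/* drop everything that is not ours */

	p.may_block = (mask & DEMO_WAIT) != 0;
	p.must_succeed = (mask & DEMO_NOFAIL) != 0;
	if (!p.may_block)
		p.max_reclaim_rounds = 0;	/* atomic context: no reclaim */
	else if (mask & DEMO_NORETRY)
		p.max_reclaim_rounds = 1;	/* try once, then give up */
	else
		p.max_reclaim_rounds = DEMO_DEFAULT_RETRIES;
	return p;
}
```

A caller passing DEMO_WAIT | DEMO_ZONE_HINT gets the same policy as one
passing DEMO_WAIT alone, since the placement hint is masked off before
any bit is examined.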

> > @@ -948,6 +951,7 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
> >  					struct page *page,
> >  					unsigned long haddr)
> >  {
> > +	struct mem_cgroup *memcg;
> >  	spinlock_t *ptl;
> >  	pgtable_t pgtable;
> >  	pmd_t _pmd;
> > @@ -968,20 +972,21 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
> >  					       __GFP_OTHER_NODE,
> >  					       vma, address, page_to_nid(page));
> >  		if (unlikely(!pages[i] ||
> > -			     mem_cgroup_charge_anon(pages[i], mm,
> > -						       GFP_KERNEL))) {
> > +			     mem_cgroup_try_charge(pages[i], mm, GFP_KERNEL,
> > +						   &memcg))) {
> >  			if (pages[i])
> >  				put_page(pages[i]);
> > -			mem_cgroup_uncharge_start();
> >  			while (--i >= 0) {
> > -				mem_cgroup_uncharge_page(pages[i]);
> > +				memcg = (void *)page_private(pages[i]);
> 
> Hmm, OK the memcg couldn't go away even if mm owner has left it because
> the charge is already there and the page is not on LRU so the
> mem_cgroup_css_free will wait until we uncharge it or put to LRU.

Yep, res_counter charges have always pinned the memcg.  We already
used this exact protocol and relied on the same lifetime rules for
swapin charging.
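
The unwind protocol in the hunk above can be modeled in userspace
roughly as follows. This is a sketch under stated assumptions, not the
patch itself: struct demo_memcg, struct demo_page, demo_reserve() and
demo_unreserve() are hypothetical stand-ins, and the usage counter only
imitates the way an outstanding charge keeps the group alive.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct demo_memcg {
	long usage;	/* outstanding charges; nonzero means "pinned" */
	long limit;
};

struct demo_page {
	uintptr_t private;	/* plays the role of page_private() */
};

/* Reserve one charge and report which group took it. */
static int demo_reserve(struct demo_memcg *cg, struct demo_memcg **memcgp)
{
	if (cg->usage >= cg->limit)
		return -1;
	cg->usage++;
	*memcgp = cg;
	return 0;
}

/* Give back a reservation that was never committed. */
static void demo_unreserve(struct demo_memcg *memcg)
{
	memcg->usage--;
}

/*
 * Charge nr pages one by one.  Each successful reservation is stashed
 * in the page's private field so that a later failure can walk back
 * and cancel exactly what was reserved.
 */
static int demo_charge_batch(struct demo_memcg *cg,
			     struct demo_page *pages, int nr)
{
	struct demo_memcg *memcg;
	int i;

	for (i = 0; i < nr; i++) {
		if (demo_reserve(cg, &memcg)) {
			while (--i >= 0) {
				memcg = (struct demo_memcg *)pages[i].private;
				pages[i].private = 0;
				demo_unreserve(memcg);
			}
			return -1;
		}
		pages[i].private = (uintptr_t)memcg;
	}
	return 0;
}
```

With a limit of 3 and a batch of 4 pages, the fourth reservation fails
and the loop cancels the first three, leaving usage at 0. The stashed
pointer stays valid during the unwind because the reservation it refers
to has not been released yet.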

> > +/**
> > + * mem_cgroup_commit_charge - commit a page charge
> > + * @page: page to charge
> > + * @memcg: memcg to charge the page to
> > + * @lrucare: page might be on LRU already
> > + *
> > + * Finalize a charge transaction started by mem_cgroup_try_charge(),
> > + * after page->mapping has been set up.  This must happen atomically
> > + * as part of the page instantiation, i.e. under the page table lock
> > + * for anonymous pages, under the page lock for page and swap cache.
> > + *
> > + * In addition, the page must not be on the LRU during the commit, to
> > + * prevent racing with task migration.  If it might be, use @lrucare.
> > + *
> > + * Use mem_cgroup_cancel_charge() to cancel the transaction instead.
> > + */
> > +void mem_cgroup_commit_charge(struct page *page, struct mem_cgroup *memcg,
> > +			      bool lrucare)
> 
> I think we should be explicit that this is only required for LRU pages.
> kmem doesn't have to finalize the transaction.

The function itself only applies to user/LRU pages.  kmem has its own
separate API for charge/commit/cancel/uncharge.
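
For readers following the thread, the try/commit/cancel transaction
from the changelog quoted at the top can be modeled in userspace along
these lines. It is a minimal sketch with hypothetical names
(demo_try_charge() and friends, struct demo_page, the map_fails
parameter); it leaves out reclaim, locking and the LRU, and only shows
that the page type is needed at commit time, not when the charge is
reserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct demo_memcg {
	long usage, limit;
	long nr_anon, nr_file;	/* per-type statistics */
};

struct demo_page {
	struct demo_memcg *memcg;	/* owner, set at commit time */
	bool mapped;			/* "page->mapping is valid" */
	bool anon;			/* only meaningful once mapped */
};

/* Step 1: reserve a charge; no page type is needed yet. */
static int demo_try_charge(struct demo_memcg *cg, struct demo_memcg **memcgp)
{
	if (cg->usage >= cg->limit)
		return -1;	/* the real code would try reclaim first */
	cg->usage++;
	*memcgp = cg;
	return 0;
}

/* Step 2a: bind the page to the group once its type is known. */
static void demo_commit_charge(struct demo_page *page, struct demo_memcg *memcg)
{
	assert(page->mapped && page->memcg == NULL);
	page->memcg = memcg;
	if (page->anon)
		memcg->nr_anon++;
	else
		memcg->nr_file++;
}

/* Step 2b: abort the transaction and release the reservation. */
static void demo_cancel_charge(struct demo_memcg *memcg)
{
	memcg->usage--;
}

/* A fault-handler-like caller; map_fails simulates a lost race. */
static int demo_instantiate(struct demo_memcg *cg, struct demo_page *page,
			    bool anon, bool map_fails)
{
	struct demo_memcg *memcg;

	if (demo_try_charge(cg, &memcg))
		return -1;
	if (map_fails) {
		demo_cancel_charge(memcg);
		return -1;
	}
	page->mapped = true;
	page->anon = anon;
	demo_commit_charge(page, memcg);
	return 0;
}
```

Because the type is recorded only in the commit step, a single
reservation function serves both anonymous and file pages in this model.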
