From: Muchun Song <songmuchun@bytedance.com>
To: Dan Schatzberg <schatzberg.dan@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>,
Zefan Li <lizefan.x@bytedance.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Hugh Dickins <hughd@google.com>,
Shakeel Butt <shakeelb@google.com>, Roman Gushchin <guro@fb.com>,
Yang Shi <shy828301@gmail.com>,
Alex Shi <alex.shi@linux.alibaba.com>,
Alexander Duyck <alexander.h.duyck@linux.intel.com>,
Yafang Shao <laoar.shao@gmail.com>,
Wei Yang <richard.weiyang@gmail.com>,
"open list:BLOCK LAYER" <linux-block@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>,
"open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
"open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
Chris Down <chris@chrisdown.name>
Subject: Re: [External] [PATCH 2/3] mm: Charge active memcg when no mm is set
Date: Tue, 30 Mar 2021 00:13:01 +0800 [thread overview]
Message-ID: <CAMZfGtVT85+1_Bu9LBaG6DUsr1kYsepQ-1-K7BDD0Wn3L+BQgg@mail.gmail.com> (raw)
In-Reply-To: <20210329144829.1834347-3-schatzberg.dan@gmail.com>
On Mon, Mar 29, 2021 at 10:49 PM Dan Schatzberg
<schatzberg.dan@gmail.com> wrote:
>
> set_active_memcg() worked for kernel allocations but was silently
> ignored for user pages.
>
> This patch establishes a precedence order for who gets charged:
>
> 1. If there is a memcg associated with the page already, that memcg is
> charged. This happens during swapin.
>
> 2. If an explicit mm is passed, mm->memcg is charged. This happens
> during page faults, which can be triggered in remote VMs (eg gup).
>
> 3. Otherwise consult the current process context. If there is an
> active_memcg, use that. Otherwise, current->mm->memcg.
>
> Previously, if a NULL mm was passed to mem_cgroup_charge (case 3) it
> would always charge the root cgroup. Now it looks up the active_memcg
> first (falling back to charging the root cgroup if not set).
>
> Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Tejun Heo <tj@kernel.org>
> Acked-by: Chris Down <chris@chrisdown.name>
> Reviewed-by: Shakeel Butt <shakeelb@google.com>
> ---
> mm/filemap.c | 2 +-
> mm/memcontrol.c | 72 ++++++++++++++++++++++++++++---------------------
> mm/shmem.c | 4 +--
> 3 files changed, 44 insertions(+), 34 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index eeeb8e2cc36a..63fd980e863a 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -872,7 +872,7 @@ noinline int __add_to_page_cache_locked(struct page *page,
> page->index = offset;
>
> if (!huge) {
> - error = mem_cgroup_charge(page, current->mm, gfp);
> + error = mem_cgroup_charge(page, NULL, gfp);
> if (error)
> goto error;
> charged = true;
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 668d1d7c2645..adc618814fd2 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -884,13 +884,38 @@ struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p)
> }
> EXPORT_SYMBOL(mem_cgroup_from_task);
>
> +static __always_inline struct mem_cgroup *active_memcg(void)
> +{
> + if (in_interrupt())
> + return this_cpu_read(int_active_memcg);
> + else
> + return current->active_memcg;
> +}
> +
> +static __always_inline struct mem_cgroup *get_active_memcg(void)
> +{
> + struct mem_cgroup *memcg;
> +
> + rcu_read_lock();
> + memcg = active_memcg();
> + /* remote memcg must hold a ref. */
> + if (memcg && WARN_ON_ONCE(!css_tryget(&memcg->css)))
> + memcg = root_mem_cgroup;
> + rcu_read_unlock();
> +
> + return memcg;
> +}
This function is already removed since the patchset below.
Use obj_cgroup APIs to charge kmem pages
https://lore.kernel.org/patchwork/cover/1399132/
I also suggest not reintroducing get_active_memcg.
There is only one user of it, just inline it into
get_mem_cgroup_from_mm(). Actually we don’t
need get_active_memcg() either.
> +
> /**
> * get_mem_cgroup_from_mm: Obtain a reference on given mm_struct's memcg.
> * @mm: mm from which memcg should be extracted. It can be NULL.
> *
> - * Obtain a reference on mm->memcg and returns it if successful. Otherwise
> - * root_mem_cgroup is returned. However if mem_cgroup is disabled, NULL is
> - * returned.
> + * Obtain a reference on mm->memcg and returns it if successful. If mm
> + * is NULL, then the memcg is chosen as follows:
> + * 1) The active memcg, if set.
> + * 2) current->mm->memcg, if available
> + * 3) root memcg
> + * If mem_cgroup is disabled, NULL is returned.
> */
> struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
> {
> @@ -899,13 +924,19 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
> if (mem_cgroup_disabled())
> return NULL;
>
> + /*
> + * Page cache insertions can happen without an
> + * actual mm context, e.g. during disk probing
> + * on boot, loopback IO, acct() writes etc.
> + */
> + if (unlikely(!mm)) {
> + if (unlikely(active_memcg()))
> + return get_active_memcg();
Since remote memcg must hold a reference, we do not
need to do something like get_active_memcg() does.
Just use css_get to obtain a ref, it is simpler. Just
Like below.
+ if (unlikely(!mm)) {
+ memcg = active_memcg();
+ if (unlikely(memcg)) {
+ /* remote memcg must hold a ref. */
+ css_get(memcg);
+ return memcg;
+ }
Thanks.
> + mm = current->mm;
> + }
> +
> rcu_read_lock();
> do {
> - /*
> - * Page cache insertions can happen withou an
> - * actual mm context, e.g. during disk probing
> - * on boot, loopback IO, acct() writes etc.
> - */
> if (unlikely(!mm))
> memcg = root_mem_cgroup;
> else {
> @@ -919,28 +950,6 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
> }
> EXPORT_SYMBOL(get_mem_cgroup_from_mm);
>
> -static __always_inline struct mem_cgroup *active_memcg(void)
> -{
> - if (in_interrupt())
> - return this_cpu_read(int_active_memcg);
> - else
> - return current->active_memcg;
> -}
> -
> -static __always_inline struct mem_cgroup *get_active_memcg(void)
> -{
> - struct mem_cgroup *memcg;
> -
> - rcu_read_lock();
> - memcg = active_memcg();
> - /* remote memcg must hold a ref. */
> - if (memcg && WARN_ON_ONCE(!css_tryget(&memcg->css)))
> - memcg = root_mem_cgroup;
> - rcu_read_unlock();
> -
> - return memcg;
> -}
> -
> static __always_inline bool memcg_kmem_bypass(void)
> {
> /* Allow remote memcg charging from any context. */
> @@ -6549,7 +6558,8 @@ static int __mem_cgroup_charge(struct page *page, struct mem_cgroup *memcg,
> * @gfp_mask: reclaim mode
> *
> * Try to charge @page to the memcg that @mm belongs to, reclaiming
> - * pages according to @gfp_mask if necessary.
> + * pages according to @gfp_mask if necessary. if @mm is NULL, try to
> + * charge to the active memcg.
> *
> * Do not use this for pages allocated for swapin.
> *
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 78ab81a62b29..7c09276125d5 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1694,7 +1694,7 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
> {
> struct address_space *mapping = inode->i_mapping;
> struct shmem_inode_info *info = SHMEM_I(inode);
> - struct mm_struct *charge_mm = vma ? vma->vm_mm : current->mm;
> + struct mm_struct *charge_mm = vma ? vma->vm_mm : NULL;
> struct page *page;
> swp_entry_t swap;
> int error;
> @@ -1815,7 +1815,7 @@ static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
> }
>
> sbinfo = SHMEM_SB(inode->i_sb);
> - charge_mm = vma ? vma->vm_mm : current->mm;
> + charge_mm = vma ? vma->vm_mm : NULL;
>
> page = pagecache_get_page(mapping, index,
> FGP_ENTRY | FGP_HEAD | FGP_LOCK, 0);
> --
> 2.30.2
>
next prev parent reply other threads:[~2021-03-29 16:14 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-29 14:48 [PATCH V11 0/3] Charge loop device i/o to issuing cgroup Dan Schatzberg
2021-03-29 14:48 ` [PATCH 1/3] loop: Use worker per cgroup instead of kworker Dan Schatzberg
2021-03-29 14:48 ` [PATCH 2/3] mm: Charge active memcg when no mm is set Dan Schatzberg
2021-03-29 16:13 ` Muchun Song [this message]
2021-03-29 20:59 ` [External] " Shakeel Butt
2021-03-29 14:48 ` [PATCH 3/3] loop: Charge i/o to mem and blk cg Dan Schatzberg
2021-04-02 19:16 [PATCH V12 0/3] Charge loop device i/o to issuing cgroup Dan Schatzberg
2021-04-02 19:16 ` [PATCH 2/3] mm: Charge active memcg when no mm is set Dan Schatzberg
2021-04-03 5:47 ` [External] " Muchun Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAMZfGtVT85+1_Bu9LBaG6DUsr1kYsepQ-1-K7BDD0Wn3L+BQgg@mail.gmail.com \
--to=songmuchun@bytedance.com \
--cc=akpm@linux-foundation.org \
--cc=alex.shi@linux.alibaba.com \
--cc=alexander.h.duyck@linux.intel.com \
--cc=axboe@kernel.dk \
--cc=cgroups@vger.kernel.org \
--cc=chris@chrisdown.name \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=laoar.shao@gmail.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lizefan.x@bytedance.com \
--cc=mhocko@kernel.org \
--cc=richard.weiyang@gmail.com \
--cc=schatzberg.dan@gmail.com \
--cc=shakeelb@google.com \
--cc=shy828301@gmail.com \
--cc=tj@kernel.org \
--cc=vdavydov.dev@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).