From: Muchun Song <songmuchun@bytedance.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Xiongchun duan <duanxiongchun@bytedance.com>,
	fam.zheng@bytedance.com
Subject: Re: [External] Re: [PATCH] mm: memcontrol: fix root_mem_cgroup charging
Date: Wed, 21 Apr 2021 21:39:03 +0800
Message-ID: <CAMZfGtUOXyK3RDZ+P0GaO4p-P0XatFB8ZbmXEFvfet1HSrdFog@mail.gmail.com>
In-Reply-To: <YIAimEdqpen3/38Z@dhcp22.suse.cz>

On Wed, Apr 21, 2021 at 9:03 PM Michal Hocko <mhocko@suse.com> wrote:
>
> On Wed 21-04-21 17:50:06, Muchun Song wrote:
> > On Wed, Apr 21, 2021 at 3:34 PM Michal Hocko <mhocko@suse.com> wrote:
> > >
> > > On Wed 21-04-21 14:26:44, Muchun Song wrote:
> > > > The below scenario can cause the page counters of the root_mem_cgroup
> > > > to be out of balance.
> > > >
> > > > CPU0:                                   CPU1:
> > > >
> > > > objcg = get_obj_cgroup_from_current()
> > > > obj_cgroup_charge_pages(objcg)
> > > >                                         memcg_reparent_objcgs()
> > > >                                             // reparent to root_mem_cgroup
> > > >                                             WRITE_ONCE(iter->memcg, parent)
> > > >     // memcg == root_mem_cgroup
> > > >     memcg = get_mem_cgroup_from_objcg(objcg)
> > > >     // do not charge to the root_mem_cgroup
> > > >     try_charge(memcg)
> > > >
> > > > obj_cgroup_uncharge_pages(objcg)
> > > >     memcg = get_mem_cgroup_from_objcg(objcg)
> > > >     // uncharge from the root_mem_cgroup
> > > >     page_counter_uncharge(&memcg->memory)
> > > >
> > > > This can cause the page counter to be less than the actual value.
> > > > We do not display the value (mem_cgroup_usage), so there
> > > > shouldn't be any actual problem, but there is a WARN_ON_ONCE in
> > > > page_counter_cancel(). Who knows if it will trigger? So it
> > > > is better to fix it.
> > >
> > > The changelog doesn't explain the fix and why you have chosen to charge
> > > kmem objects to root memcg and left all other try_charge users intact.
> >
> > The object cgroup is special (because its pages can be reparented). Only
> > the users of the objcg APIs need to be fixed.
> >
> > > The reason is likely that those are not reparented now but that just
> > > adds an inconsistency.
> > >
> > > Is there any reason you haven't simply matched obj_cgroup_uncharge_pages
> > > to check for the root memcg and bail out early?
> >
> > Because obj_cgroup_uncharge_pages() uncharges pages from the
> > root memcg unconditionally. Why? Because some pages can be
> > reparented to the root memcg, and to keep the root memcg's page
> > counter correct we have to uncharge those pages from the
> > root memcg. So we do not check whether the page belongs to
> > the root memcg when it is uncharged.
>
> I am not sure I follow. Let me ask differently. Wouldn't you
> achieve the same if you simply didn't uncharge root memcg in
> obj_cgroup_charge_pages?

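I'm afraid not. Some pages must be uncharged from the root memcg
(they were charged to a child memcg and later reparented), while
others must not be (they were never charged, because the objcg
already pointed to the root memcg at charge time). By the time we
uncharge, both kinds belong to the root memcg, so we cannot
distinguish between the two.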

I believe Roman is very familiar with this mechanism (objcg APIs).

Hi Roman,

Any thoughts on this?

>
> Btw. which tree is this patch based on? The current linux-next doesn't
> uncharge from memcg->memory inside obj_cgroup_uncharge_pages (nor does
> the Linus tree).

Sorry, I should have given more details. The uncharge happens
indirectly, through the per-cpu stock:

obj_cgroup_uncharge_pages
  refill_stock->drain_stock
    page_counter_uncharge  // uncharging is here

Thanks.

> --
> Michal Hocko
> SUSE Labs
