linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Muchun Song <songmuchun@bytedance.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@kernel.org>,
	 Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>,
	 Vladimir Davydov <vdavydov.dev@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	 Linux Memory Management List <linux-mm@kvack.org>,
	Xiongchun duan <duanxiongchun@bytedance.com>
Subject: Re: [External] Re: [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages
Date: Fri, 12 Mar 2021 17:22:55 +0800	[thread overview]
Message-ID: <CAMZfGtUqTBJ56eEj5CiFbHGMMaopP9k1Tq94R+M=W6P0HF83_A@mail.gmail.com> (raw)
In-Reply-To: <YElCxDzVgBBLAQhJ@cmpxchg.org>

On Thu, Mar 11, 2021 at 6:05 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> Hello Munchun,
>
> On Tue, Mar 09, 2021 at 06:07:16PM +0800, Muchun Song wrote:
> > @@ -6806,11 +6823,23 @@ static inline void uncharge_gather_clear(struct uncharge_gather *ug)
> >  static void uncharge_batch(const struct uncharge_gather *ug)
> >  {
> >       unsigned long flags;
> > +     unsigned long nr_pages;
> >
> > -     if (!mem_cgroup_is_root(ug->memcg)) {
> > -             page_counter_uncharge(&ug->memcg->memory, ug->nr_pages);
> > +     /*
> > +      * The kmem pages can be reparented to the root memcg, in
> > +      * order to prevent the memory counter of root memcg from
> > +      * increasing indefinitely. We should decrease the memory
> > +      * counter when unchange.
> > +      */
> > +     if (mem_cgroup_is_root(ug->memcg))
> > +             nr_pages = ug->nr_kmem;
> > +     else
> > +             nr_pages = ug->nr_pages;
>
> Correct or not, I find this unreadable. We're uncharging nr_kmem on
> the root, and nr_pages against leaf groups?
>
> It implies several things that might not be immediately obvious to the
> reader of this function. Namely, that nr_kmem is a subset of nr_pages.
> Or that we don't *want* to account LRU pages for the root cgroup.
>
> The old code followed a very simple pattern: the root memcg's page
> counters aren't touched.
>
> This is no longer true: we modify them depending on very specific
> circumstances. But that's too clever for the stupid uncharge_batch()
> which is only supposed to flush a number of accumulators into their
> corresponding page counters.
>
> This distinction really needs to be moved down to uncharge_page() now.

OK. I will rework the code here to make it readable.

>
> > @@ -6828,7 +6857,7 @@ static void uncharge_batch(const struct uncharge_gather *ug)
> >
> >  static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> >  {
> > -     unsigned long nr_pages;
> > +     unsigned long nr_pages, nr_kmem;
> >       struct mem_cgroup *memcg;
> >
> >       VM_BUG_ON_PAGE(PageLRU(page), page);
> > @@ -6836,34 +6865,44 @@ static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> >       if (!page_memcg_charged(page))
> >               return;
> >
> > +     nr_pages = compound_nr(page);
> >       /*
> >        * Nobody should be changing or seriously looking at
> > -      * page memcg at this point, we have fully exclusive
> > -      * access to the page.
> > +      * page memcg or objcg at this point, we have fully
> > +      * exclusive access to the page.
> >        */
> > -     memcg = page_memcg_check(page);
> > +     if (PageMemcgKmem(page)) {
> > +             struct obj_cgroup *objcg;
> > +
> > +             objcg = page_objcg(page);
> > +             memcg = obj_cgroup_memcg_get(objcg);
> > +
> > +             page->memcg_data = 0;
> > +             obj_cgroup_put(objcg);
> > +             nr_kmem = nr_pages;
> > +     } else {
> > +             memcg = page_memcg(page);
> > +             page->memcg_data = 0;
> > +             nr_kmem = 0;
> > +     }
>
> Why is all this moved above the uncharge_batch() call?

Before calling obj_cgroup_put(), we need set page->memcg_data
to zero. So I move "page->memcg_data = 0" to here.

>
> It separates the pointer manipulations from the refcounting, which
> makes the code very difficult to follow.
>
> > +
> >       if (ug->memcg != memcg) {
> >               if (ug->memcg) {
> >                       uncharge_batch(ug);
> >                       uncharge_gather_clear(ug);
> >               }
> >               ug->memcg = memcg;
> > +             ug->dummy_page = page;
>
> Why this change?

Just like ug->memcg, we do not need to set
ug->dummy_page in every loop.


>
> >               /* pairs with css_put in uncharge_batch */
> >               css_get(&ug->memcg->css);
> >       }
> >
> > -     nr_pages = compound_nr(page);
> >       ug->nr_pages += nr_pages;
> > +     ug->nr_kmem += nr_kmem;
> > +     ug->pgpgout += !nr_kmem;
>
> Oof.
>
> Yes, this pgpgout line is an equivalent transformation for counting
> LRU compound pages. But unless you already know that, it's completely
> impossible to understand what the intent here is.
>
> Please avoid clever tricks like this. If you need to check whether the
> page is kmem, test PageMemcgKmem() instead of abusing the counters as
> boolean flags. This is supposed to be read by human beings, too.

Got it.

>
> > -     if (PageMemcgKmem(page))
> > -             ug->nr_kmem += nr_pages;
> > -     else
> > -             ug->pgpgout++;
> > -
> > -     ug->dummy_page = page;
> > -     page->memcg_data = 0;
> > -     css_put(&ug->memcg->css);
> > +     css_put(&memcg->css);
>
> Sorry, these two functions are no longer readable after your changes.
>
> Please retain the following sequence as discrete steps:
>
> 1. look up memcg from the page
> 2. flush existing batch if memcg changed
> 3. add page's various counts to the current batch
> 4. clear page->memcg and decrease the referece count to whatever it was pointing to

Got it.

>
> And as per above, step 3. is where we should check whether to uncharge
> the memcg's page counter at all:
>
>         if (PageMemcgKmem(page)) {
>                 ug->nr_pages += nr_pages;
>                 ug->nr_kmem += nr_pages;
>         } else {
>                 /* LRU pages aren't accounted at the root level */
>                 if (!mem_cgroup_is_root(memcg))
>                         ug->nr_pages += nr_pages;
>                 ug->pgpgout++;
>         }
>
> In fact, it might be a good idea to rename ug->nr_pages to
> ug->nr_memory to highlight how it maps to the page_counter.

I will rework the code in the next version. Thanks for your
suggestions.


  reply	other threads:[~2021-03-12  9:23 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-09 10:07 [PATCH v3 0/4] Use obj_cgroup APIs to charge kmem pages Muchun Song
2021-03-09 10:07 ` [PATCH v3 1/4] mm: memcontrol: introduce obj_cgroup_{un}charge_pages Muchun Song
2021-03-11 12:30   ` Johannes Weiner
2021-03-11 18:56   ` Shakeel Butt
2021-03-09 10:07 ` [PATCH v3 2/4] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page Muchun Song
2021-03-10 19:57   ` Roman Gushchin
2021-03-11  6:45     ` [External] " Muchun Song
2021-03-11 13:12   ` Johannes Weiner
2021-03-12  7:14     ` [External] " Muchun Song
2021-03-12 19:23       ` Johannes Weiner
2021-03-12 22:42         ` Shakeel Butt
2021-03-12 23:07           ` Johannes Weiner
2021-03-12 23:18             ` Shakeel Butt
2021-03-14 13:56         ` Muchun Song
2021-03-12  3:22   ` Shakeel Butt
2021-03-12  5:02     ` [External] " Muchun Song
2021-03-09 10:07 ` [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages Muchun Song
2021-03-10 19:53   ` Roman Gushchin
2021-03-11  6:50     ` [External] " Muchun Song
2021-03-10 22:05   ` Johannes Weiner
2021-03-12  9:22     ` Muchun Song [this message]
2021-03-12 15:59       ` [External] " Johannes Weiner
2021-03-12 16:46         ` Muchun Song
2021-03-11 17:48   ` kernel test robot
2021-03-09 10:07 ` [PATCH v3 4/4] mm: memcontrol: move PageMemcgKmem to the scope of CONFIG_MEMCG_KMEM Muchun Song
2021-03-10 19:30   ` Roman Gushchin
2021-03-12  3:26   ` Shakeel Butt
2021-03-12 19:24   ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZfGtUqTBJ56eEj5CiFbHGMMaopP9k1Tq94R+M=W6P0HF83_A@mail.gmail.com' \
    --to=songmuchun@bytedance.com \
    --cc=akpm@linux-foundation.org \
    --cc=duanxiongchun@bytedance.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=shakeelb@google.com \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).