All of lore.kernel.org
 help / color / mirror / Atom feed
From: Muchun Song <songmuchun@bytedance.com>
To: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Xiongchun duan <duanxiongchun@bytedance.com>
Subject: Re: [External] Re: [PATCH v3 2/4] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page
Date: Thu, 11 Mar 2021 14:45:54 +0800	[thread overview]
Message-ID: <CAMZfGtXbTOeRub-vQTA4CTYODEuZYoEBiQtaekeOLL=6j=eqbA@mail.gmail.com> (raw)
In-Reply-To: <YEkkvuIZ/0+LD/9s@carbon.dhcp.thefacebook.com>

On Thu, Mar 11, 2021 at 3:58 AM Roman Gushchin <guro@fb.com> wrote:
>
> On Tue, Mar 09, 2021 at 06:07:15PM +0800, Muchun Song wrote:
> > We want to reuse the obj_cgroup APIs to charge the kmem pages.
> > If we do that, we should store an object cgroup pointer to
> > page->memcg_data for the kmem pages.
> >
> > Finally, page->memcg_data can have 3 different meanings.
> >
> >   1) For the slab pages, page->memcg_data points to an object cgroups
> >      vector.
> >
> >   2) For the kmem pages (exclude the slab pages), page->memcg_data
> >      points to an object cgroup.
> >
> >   3) For the user pages (e.g. the LRU pages), page->memcg_data points
> >      to a memory cgroup.
> >
> > Currently we always get the memory cgroup associated with a page via
> > page_memcg() or page_memcg_rcu(). page_memcg_check() is special, it
> > has to be used in cases when it's not known if a page has an
> > associated memory cgroup pointer or an object cgroups vector. Because
> > the page->memcg_data of the kmem page is not pointing to a memory
> > cgroup in the later patch, the page_memcg() and page_memcg_rcu()
> > cannot be applicable for the kmem pages. In this patch, make
> > page_memcg() and page_memcg_rcu() no longer apply to the kmem pages.
> > We do not change the behavior of the page_memcg_check(), it is also
> > applicable for the kmem pages.
> >
> > In the end, there are 3 helpers to get the memcg associated with a page.
> > Usage is as follows.
> >
> >   1) Get the memory cgroup associated with a non-kmem page (e.g. the LRU
> >      pages).
> >
> >      - page_memcg()
> >      - page_memcg_rcu()
> >
> >   2) Get the memory cgroup associated with a page. It has to be used in
> >      cases when it's not known if a page has an associated memory cgroup
> >      pointer or an object cgroups vector. Returns NULL for slab pages or
> >      uncharged pages. Otherwise, returns memory cgroup for charged pages
> >      (e.g. the kmem pages, the LRU pages).
> >
> >      - page_memcg_check()
> >
> > In some place, we use page_memcg() to check whether the page is charged.
> > Now introduce page_memcg_charged() helper to do that.
> >
> > This is a preparation for reparenting the kmem pages.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > ---
> >  include/linux/memcontrol.h | 33 +++++++++++++++++++++++++++------
> >  mm/memcontrol.c            | 23 +++++++++++++----------
> >  mm/page_alloc.c            |  4 ++--
> >  3 files changed, 42 insertions(+), 18 deletions(-)
> >
> > diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> > index e6dc793d587d..83cbcdcfcc92 100644
> > --- a/include/linux/memcontrol.h
> > +++ b/include/linux/memcontrol.h
> > @@ -358,14 +358,26 @@ enum page_memcg_data_flags {
> >
> >  #define MEMCG_DATA_FLAGS_MASK (__NR_MEMCG_DATA_FLAGS - 1)
> >
> > +/* Return true for charged page, otherwise false. */
> > +static inline bool page_memcg_charged(struct page *page)
> > +{
> > +     unsigned long memcg_data = page->memcg_data;
> > +
> > +     VM_BUG_ON_PAGE(PageSlab(page), page);
> > +     VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page);
> > +
> > +     return !!memcg_data;
> > +}
> > +
> >  /*
> > - * page_memcg - get the memory cgroup associated with a page
> > + * page_memcg - get the memory cgroup associated with a non-kmem page
> >   * @page: a pointer to the page struct
> >   *
> >   * Returns a pointer to the memory cgroup associated with the page,
> >   * or NULL. This function assumes that the page is known to have a
> >   * proper memory cgroup pointer. It's not safe to call this function
> > - * against some type of pages, e.g. slab pages or ex-slab pages.
> > + * against some type of pages, e.g. slab pages, kmem pages or ex-slab
> > + * pages.
> >   *
> >   * Any of the following ensures page and memcg binding stability:
> >   * - the page lock
> > @@ -378,27 +390,31 @@ static inline struct mem_cgroup *page_memcg(struct page *page)
> >       unsigned long memcg_data = page->memcg_data;
> >
> >       VM_BUG_ON_PAGE(PageSlab(page), page);
> > +     VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page);
> >       VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_OBJCGS, page);
> >
> >       return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
> >  }
> >
> >  /*
> > - * page_memcg_rcu - locklessly get the memory cgroup associated with a page
> > + * page_memcg_rcu - locklessly get the memory cgroup associated with a non-kmem page
> >   * @page: a pointer to the page struct
> >   *
> >   * Returns a pointer to the memory cgroup associated with the page,
> >   * or NULL. This function assumes that the page is known to have a
> >   * proper memory cgroup pointer. It's not safe to call this function
> > - * against some type of pages, e.g. slab pages or ex-slab pages.
> > + * against some type of pages, e.g. slab pages, kmem pages or ex-slab
> > + * pages.
> >   */
> >  static inline struct mem_cgroup *page_memcg_rcu(struct page *page)
> >  {
> > +     unsigned long memcg_data = READ_ONCE(page->memcg_data);
> > +
> >       VM_BUG_ON_PAGE(PageSlab(page), page);
> > +     VM_BUG_ON_PAGE(memcg_data & MEMCG_DATA_KMEM, page);
> >       WARN_ON_ONCE(!rcu_read_lock_held());
> >
> > -     return (struct mem_cgroup *)(READ_ONCE(page->memcg_data) &
> > -                                  ~MEMCG_DATA_FLAGS_MASK);
> > +     return (struct mem_cgroup *)(memcg_data & ~MEMCG_DATA_FLAGS_MASK);
> >  }
> >
> >  /*
> > @@ -1072,6 +1088,11 @@ void mem_cgroup_split_huge_fixup(struct page *head);
> >
> >  struct mem_cgroup;
> >
> > +static inline bool page_memcg_charged(struct page *page)
> > +{
> > +     return false;
> > +}
> > +
> >  static inline struct mem_cgroup *page_memcg(struct page *page)
> >  {
> >       return NULL;
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index fc22da9805fb..e1dc73ceb98a 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -855,10 +855,11 @@ void __mod_lruvec_page_state(struct page *page, enum node_stat_item idx,
> >                            int val)
> >  {
> >       struct page *head = compound_head(page); /* rmap on tail pages */
> > -     struct mem_cgroup *memcg = page_memcg(head);
> > +     struct mem_cgroup *memcg;
> >       pg_data_t *pgdat = page_pgdat(page);
> >       struct lruvec *lruvec;
> >
> > +     memcg = page_memcg_check(head);
>
> In general, this and the next patch look good to me (aside from some small things,
> commented separately).
>
> But I wonder if it's better to have two separate versions of __mod_lruvec_page_state()
> for kmem and non-kmem pages, rather then rely on PageMemcgKmem flag. It's a hot path,
> so if we can have fewer conditions here, that would be nice.
> I take a brief look (and could be wrong), but it seems like we know in advance
> which version should be used.

Right. The user should know whether the page is kmem.
IIUC,  __mod_lruvec_kmem_state is the version of the
kmem. It is enough to reuse it. And make __mod_lruvec_page_state()
only suitable for non-kmem page.

>
> Thanks!
>
> >       /* Untracked pages have no memcg, no lruvec. Update only the node */
> >       if (!memcg) {
> >               __mod_node_page_state(pgdat, idx, val);
> > @@ -3166,12 +3167,13 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
> >   */
> >  void __memcg_kmem_uncharge_page(struct page *page, int order)
> >  {
> > -     struct mem_cgroup *memcg = page_memcg(page);
> > +     struct mem_cgroup *memcg;
> >       unsigned int nr_pages = 1 << order;
> >
> > -     if (!memcg)
> > +     if (!page_memcg_charged(page))
> >               return;
> >
> > +     memcg = page_memcg_check(page);
> >       VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
> >       __memcg_kmem_uncharge(memcg, nr_pages);
> >       page->memcg_data = 0;
> > @@ -6827,24 +6829,25 @@ static void uncharge_batch(const struct uncharge_gather *ug)
> >  static void uncharge_page(struct page *page, struct uncharge_gather *ug)
> >  {
> >       unsigned long nr_pages;
> > +     struct mem_cgroup *memcg;
> >
> >       VM_BUG_ON_PAGE(PageLRU(page), page);
> >
> > -     if (!page_memcg(page))
> > +     if (!page_memcg_charged(page))
> >               return;
> >
> >       /*
> >        * Nobody should be changing or seriously looking at
> > -      * page_memcg(page) at this point, we have fully
> > -      * exclusive access to the page.
> > +      * page memcg at this point, we have fully exclusive
> > +      * access to the page.
> >        */
> > -
> > -     if (ug->memcg != page_memcg(page)) {
> > +     memcg = page_memcg_check(page);
> > +     if (ug->memcg != memcg) {
> >               if (ug->memcg) {
> >                       uncharge_batch(ug);
> >                       uncharge_gather_clear(ug);
> >               }
> > -             ug->memcg = page_memcg(page);
> > +             ug->memcg = memcg;
> >
> >               /* pairs with css_put in uncharge_batch */
> >               css_get(&ug->memcg->css);
> > @@ -6877,7 +6880,7 @@ void mem_cgroup_uncharge(struct page *page)
> >               return;
> >
> >       /* Don't touch page->lru of any random page, pre-check: */
> > -     if (!page_memcg(page))
> > +     if (!page_memcg_charged(page))
> >               return;
> >
> >       uncharge_gather_clear(&ug);
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index f10966e3b4a5..bcb58ae15e24 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -1124,7 +1124,7 @@ static inline bool page_expected_state(struct page *page,
> >       if (unlikely((unsigned long)page->mapping |
> >                       page_ref_count(page) |
> >  #ifdef CONFIG_MEMCG
> > -                     (unsigned long)page_memcg(page) |
> > +                     page_memcg_charged(page) |
> >  #endif
> >                       (page->flags & check_flags)))
> >               return false;
> > @@ -1149,7 +1149,7 @@ static const char *page_bad_reason(struct page *page, unsigned long flags)
> >                       bad_reason = "PAGE_FLAGS_CHECK_AT_FREE flag(s) set";
> >       }
> >  #ifdef CONFIG_MEMCG
> > -     if (unlikely(page_memcg(page)))
> > +     if (unlikely(page_memcg_charged(page)))
> >               bad_reason = "page still charged to cgroup";
> >  #endif
> >       return bad_reason;
> > --
> > 2.11.0
> >

  reply	other threads:[~2021-03-11  6:47 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-09 10:07 [PATCH v3 0/4] Use obj_cgroup APIs to charge kmem pages Muchun Song
2021-03-09 10:07 ` [PATCH v3 1/4] mm: memcontrol: introduce obj_cgroup_{un}charge_pages Muchun Song
2021-03-11 12:30   ` Johannes Weiner
2021-03-11 18:56   ` Shakeel Butt
2021-03-11 18:56     ` Shakeel Butt
2021-03-09 10:07 ` [PATCH v3 2/4] mm: memcontrol: make page_memcg{_rcu} only applicable for non-kmem page Muchun Song
2021-03-10 19:57   ` Roman Gushchin
2021-03-11  6:45     ` Muchun Song [this message]
2021-03-11  6:45       ` [External] " Muchun Song
2021-03-11 13:12   ` Johannes Weiner
2021-03-12  7:14     ` [External] " Muchun Song
2021-03-12  7:14       ` Muchun Song
2021-03-12 19:23       ` Johannes Weiner
2021-03-12 22:42         ` Shakeel Butt
2021-03-12 22:42           ` Shakeel Butt
2021-03-12 23:07           ` Johannes Weiner
2021-03-12 23:18             ` Shakeel Butt
2021-03-12 23:18               ` Shakeel Butt
2021-03-14 13:56         ` Muchun Song
2021-03-14 13:56           ` Muchun Song
2021-03-12  3:22   ` Shakeel Butt
2021-03-12  3:22     ` Shakeel Butt
2021-03-12  5:02     ` [External] " Muchun Song
2021-03-12  5:02       ` Muchun Song
2021-03-09 10:07 ` [PATCH v3 3/4] mm: memcontrol: use obj_cgroup APIs to charge kmem pages Muchun Song
2021-03-10 19:53   ` Roman Gushchin
2021-03-11  6:50     ` [External] " Muchun Song
2021-03-11  6:50       ` Muchun Song
2021-03-10 22:05   ` Johannes Weiner
2021-03-12  9:22     ` [External] " Muchun Song
2021-03-12  9:22       ` Muchun Song
2021-03-12 15:59       ` Johannes Weiner
2021-03-12 16:46         ` Muchun Song
2021-03-12 16:46           ` Muchun Song
2021-03-11 17:48   ` kernel test robot
2021-03-11 17:48     ` kernel test robot
2021-03-09 10:07 ` [PATCH v3 4/4] mm: memcontrol: move PageMemcgKmem to the scope of CONFIG_MEMCG_KMEM Muchun Song
2021-03-10 19:30   ` Roman Gushchin
2021-03-12  3:26   ` Shakeel Butt
2021-03-12  3:26     ` Shakeel Butt
2021-03-12 19:24   ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMZfGtXbTOeRub-vQTA4CTYODEuZYoEBiQtaekeOLL=6j=eqbA@mail.gmail.com' \
    --to=songmuchun@bytedance.com \
    --cc=akpm@linux-foundation.org \
    --cc=duanxiongchun@bytedance.com \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=shakeelb@google.com \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.