From: Muchun Song <songmuchun@bytedance.com>
To: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>, Li Zefan <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>, Cgroups <cgroups@vger.kernel.org>,
	linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, kernel test robot <lkp@intel.com>
Subject: Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2
Date: Tue, 15 Sep 2020 00:54:50 +0800	[thread overview]
Message-ID: <CAMZfGtXoBrFioh=FqRA82ZRSt=2oW=ie8BgZE0hAvtCOBRMXiw@mail.gmail.com> (raw)
In-Reply-To: <CALvZod7VH3NDwBXrY9w95pUY7DV+R-b_chBHuygmwH_bhpULkQ@mail.gmail.com>

On Tue, Sep 15, 2020 at 12:07 AM Shakeel Butt <shakeelb@google.com> wrote:
>
> On Sun, Sep 13, 2020 at 12:01 AM Muchun Song <songmuchun@bytedance.com> wrote:
> >
> > In cgroup v1, we have a numa_stat interface. This is useful for
> > providing visibility into the NUMA locality information within a
> > memcg, since pages are allowed to be allocated from any physical
> > node. One of the use cases is evaluating application performance by
> > combining this information with the application's CPU allocation.
> > Cgroup v2 does not have this interface, so this patch adds it.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Suggested-by: Shakeel Butt <shakeelb@google.com>
> > Reported-by: kernel test robot <lkp@intel.com>
> > ---
> [snip]
> > +
> > +static struct numa_stat numa_stats[] = {
> > +       { "anon", PAGE_SIZE, NR_ANON_MAPPED },
> > +       { "file", PAGE_SIZE, NR_FILE_PAGES },
> > +       { "kernel_stack", 1024, NR_KERNEL_STACK_KB },
> > +       { "shmem", PAGE_SIZE, NR_SHMEM },
> > +       { "file_mapped", PAGE_SIZE, NR_FILE_MAPPED },
> > +       { "file_dirty", PAGE_SIZE, NR_FILE_DIRTY },
> > +       { "file_writeback", PAGE_SIZE, NR_WRITEBACK },
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +       /*
> > +        * The ratio will be initialized in numa_stats_init(), because
> > +        * on some architectures (e.g. powerpc) the HPAGE_PMD_SIZE
> > +        * macro is not a compile-time constant.
> > +        */
> > +       { "anon_thp", 0, NR_ANON_THPS },
> > +#endif
> > +       { "inactive_anon", PAGE_SIZE, NR_INACTIVE_ANON },
> > +       { "active_anon", PAGE_SIZE, NR_ACTIVE_ANON },
> > +       { "inactive_file", PAGE_SIZE, NR_INACTIVE_FILE },
> > +       { "active_file", PAGE_SIZE, NR_ACTIVE_FILE },
> > +       { "unevictable", PAGE_SIZE, NR_UNEVICTABLE },
> > +       { "slab_reclaimable", 1, NR_SLAB_RECLAIMABLE_B },
> > +       { "slab_unreclaimable", 1, NR_SLAB_UNRECLAIMABLE_B },
> > +};
> > +
> > +static int __init numa_stats_init(void)
> > +{
> > +       int i;
> > +
> > +       for (i = 0; i < ARRAY_SIZE(numa_stats); i++) {
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +               if (numa_stats[i].idx == NR_ANON_THPS)
> > +                       numa_stats[i].ratio = HPAGE_PMD_SIZE;
> > +#endif
> > +       }
>
> The for loop seems excessive but I don't really have a good alternative.

Yeah, I don't have a good alternative either. numa_stats is only
initialized once, so it shouldn't be a problem :).
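
For completeness, the obvious loop-free variant would hard-code the
array slot, which is exactly the kind of fragility the loop avoids.
A minimal sketch (NUMA_STAT_ANON_THP is hypothetical, not part of the
patch, and would have to be kept in sync with the initializer order
by hand):

	#ifdef CONFIG_TRANSPARENT_HUGEPAGE
	/* Hypothetical index of the "anon_thp" entry in numa_stats[];
	 * breaks silently if the initializer order above changes. */
	#define NUMA_STAT_ANON_THP	7

	static int __init numa_stats_init(void)
	{
		numa_stats[NUMA_STAT_ANON_THP].ratio = HPAGE_PMD_SIZE;
		return 0;
	}
	pure_initcall(numa_stats_init);
	#endif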

>
> > +
> > +       return 0;
> > +}
> > +pure_initcall(numa_stats_init);
> > +
> > +static unsigned long memcg_node_page_state(struct mem_cgroup *memcg,
> > +                                          unsigned int nid,
> > +                                          enum node_stat_item idx)
> > +{
> > +       VM_BUG_ON(nid >= nr_node_ids);
> > +       return lruvec_page_state(mem_cgroup_lruvec(memcg, NODE_DATA(nid)), idx);
> > +}
> > +
> > +static const char *memory_numa_stat_format(struct mem_cgroup *memcg)
> > +{
> > +       int i;
> > +       struct seq_buf s;
> > +
> > +       /* Reserve a byte for the trailing null */
> > +       seq_buf_init(&s, kmalloc(PAGE_SIZE, GFP_KERNEL), PAGE_SIZE - 1);
> > +       if (!s.buffer)
> > +               return NULL;
> > +
> > +       for (i = 0; i < ARRAY_SIZE(numa_stats); i++) {
> > +               int nid;
> > +
> > +               seq_buf_printf(&s, "%s", numa_stats[i].name);
> > +               for_each_node_state(nid, N_MEMORY) {
> > +                       u64 size;
> > +
> > +                       size = memcg_node_page_state(memcg, nid,
> > +                                                    numa_stats[i].idx);
> > +                       size *= numa_stats[i].ratio;
> > +                       seq_buf_printf(&s, " N%d=%llu", nid, size);
> > +               }
> > +               seq_buf_putc(&s, '\n');
> > +       }
> > +
> > +       /* The above should easily fit into one page */
> > +       if (WARN_ON_ONCE(seq_buf_putc(&s, '\0')))
> > +               s.buffer[PAGE_SIZE - 1] = '\0';
>
> I think you should follow Michal's recommendation at
> http://lkml.kernel.org/r/20200914115724.GO16999@dhcp22.suse.cz

This case is different: seq_buf_putc(&s, '\n') will not add a trailing
'\0'. We would have to use seq_buf_puts(&s, "\n") for that.
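
To make the difference concrete, a minimal sketch (behavior as of the
current seq_buf implementation, where seq_buf_puts() copies the
string's trailing NUL but does not count it toward the length):

	seq_buf_putc(&s, '\n');	/* appends the single byte '\n'; no terminator */
	seq_buf_puts(&s, "\n");	/* appends '\n' plus an uncounted trailing '\0' */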


-- 
Yours,
Muchun

Thread overview: 23+ messages

2020-09-13  7:00 [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2 Muchun Song
2020-09-13 17:09 ` Chris Down
2020-09-14  3:10   ` [External] " Muchun Song
2020-09-14  3:18     ` Zefan Li
2020-09-14  3:28       ` Muchun Song
2020-09-14 16:07 ` Shakeel Butt
2020-09-14 16:54   ` Muchun Song [this message]
2020-09-14 22:57     ` Shakeel Butt
2020-09-15  2:46       ` Muchun Song
2020-09-14 19:06 ` Randy Dunlap
2020-09-15  2:44   ` [External] " Muchun Song