From: Muchun Song <songmuchun@bytedance.com>
To: Randy Dunlap <rdunlap@infradead.org>
Cc: tj@kernel.org, Zefan Li <lizefan@huawei.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	corbet@lwn.net, Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeelb@google.com>, Roman Gushchin <guro@fb.com>,
	Cgroups <cgroups@vger.kernel.org>,
	linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	kernel test robot <lkp@intel.com>
Subject: Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2
Date: Tue, 15 Sep 2020 10:44:01 +0800
Message-ID: <CAMZfGtVvi5uY7iDAfWVVzaAy8YmfM9-UJ60p=aCw59Q=KKS-Vw@mail.gmail.com>
In-Reply-To: <8387344f-0e43-9b6e-068d-b2c45bbda1de@infradead.org>

On Tue, Sep 15, 2020 at 3:07 AM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> On 9/13/20 12:00 AM, Muchun Song wrote:
> > In the cgroup v1, we have a numa_stat interface. This is useful for
> > providing visibility into the NUMA locality information within a
> > memcg since the pages are allowed to be allocated from any physical
> > node. One of the use cases is evaluating application performance by
> > combining this information with the application's CPU allocation.
> > But the cgroup v2 does not. So this patch adds the missing information.
> >
> > Signed-off-by: Muchun Song <songmuchun@bytedance.com>
> > Suggested-by: Shakeel Butt <shakeelb@google.com>
> > Reported-by: kernel test robot <lkp@intel.com>
> > ---
> >  changelog in v3:
> >  1. Fix compiler error on powerpc architecture reported by kernel test robot.
> >  2. Fix a typo from "anno" to "anon".
> >
> >  changelog in v2:
> >  1. Add memory.numa_stat interface in cgroup v2.
> >
> >  Documentation/admin-guide/cgroup-v2.rst |  72 ++++++++++++++++
> >  mm/memcontrol.c                         | 107 ++++++++++++++++++++++++
> >  2 files changed, 179 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
> > index 6be43781ec7f..92207f0012e4 100644
> > --- a/Documentation/admin-guide/cgroup-v2.rst
> > +++ b/Documentation/admin-guide/cgroup-v2.rst
> > @@ -1368,6 +1368,78 @@ PAGE_SIZE multiple when read back.
> >               collapsing an existing range of pages. This counter is not
> >               present when CONFIG_TRANSPARENT_HUGEPAGE is not set.
> >
> > +  memory.numa_stat
> > +     A read-only flat-keyed file which exists on non-root cgroups.
> > +
> > +     This breaks down the cgroup's memory footprint into different
> > +     types of memory, type-specific details, and other information
> > +     per node on the state of the memory management system.
> > +
> > +     This is useful for providing visibility into the numa locality
>
> capitalize acronyms, please:                             NUMA

OK, I will do that. Thanks.

>
>
> > +     information within a memcg since the pages are allowed to be
> > +     allocated from any physical node. One of the use cases is evaluating
> > +     application performance by combining this information with the
> > +     application's CPU allocation.
> > +
> > +     All memory amounts are in bytes.
> > +
> > +     The output format of memory.numa_stat is::
> > +
> > +       type N0=<node 0 pages> N1=<node 1 pages> ...
>
> Now I'm confused.  5 lines above here it says "All memory amounts are in bytes"
> but these appear to be in pages. Which is it?  and what size pages if that matters?

Sorry. It's my mistake. I will fix it.

>
> Is it like this?
>           type N0=<bytes in node 0 pages> N1=<bytes in node 1 pages> ...

Thanks.
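
To make the key-based lookup concrete, here is a minimal sketch of a
userspace reader for the "type N0=... N1=..." layout quoted above. It
assumes each value is a plain integer (bytes, if the documentation is
fixed that way). The function name parse_numa_stat and the sample text
are hypothetical and not part of the patch.

```python
def parse_numa_stat(text):
    """Parse memory.numa_stat-style text into {type: {node_id: value}}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        key, per_node = fields[0], {}
        for field in fields[1:]:
            node, sep, value = field.partition("=")
            # Accept only fields shaped like N<id>=<integer>.
            if sep and node.startswith("N") and node[1:].isdigit():
                per_node[int(node[1:])] = int(value)
        stats[key] = per_node
    return stats


# Made-up sample in the documented layout, not real kernel output.
SAMPLE = """\
anon N0=4096 N1=8192
file N0=12288 N1=0
"""

stats = parse_numa_stat(SAMPLE)
# Look values up by key, never by line position.
print(stats["anon"][1])
```

Because the result is keyed by type name, new entries showing up in the
middle of the file would not affect existing lookups.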

>
>
>
> > +     The entries are ordered to be human readable, and new entries
> > +     can show up in the middle. Don't rely on items remaining in a
> > +     fixed position; use the keys to look up specific values!
> > +
> > +       anon
> > +             Amount of memory per node used in anonymous mappings such
> > +             as brk(), sbrk(), and mmap(MAP_ANONYMOUS)
> > +
> > +       file
> > +             Amount of memory per node used to cache filesystem data,
> > +             including tmpfs and shared memory.
> > +
> > +       kernel_stack
> > +             Amount of memory per node allocated to kernel stacks.
> > +
> > +       shmem
> > +             Amount of cached filesystem data per node that is swap-backed,
> > +             such as tmpfs, shm segments, shared anonymous mmap()s
> > +
> > +       file_mapped
> > +             Amount of cached filesystem data per node mapped with mmap()
> > +
> > +       file_dirty
> > +             Amount of cached filesystem data per node that was modified but
> > +             not yet written back to disk
> > +
> > +       file_writeback
> > +             Amount of cached filesystem data per node that was modified and
> > +             is currently being written back to disk
> > +
> > +       anon_thp
> > +             Amount of memory per node used in anonymous mappings backed by
> > +             transparent hugepages
> > +
> > +       inactive_anon, active_anon, inactive_file, active_file, unevictable
> > +             Amount of memory, swap-backed and filesystem-backed,
> > +             per node on the internal memory management lists used
> > +             by the page reclaim algorithm.
> > +
> > +             As these represent internal list state (eg. shmem pages are on anon
>
>                                                          e.g.

Thanks.

>
> > +             memory management lists), inactive_foo + active_foo may not be equal to
> > +             the value for the foo counter, since the foo counter is type-based, not
> > +             list-based.
> > +
> > +       slab_reclaimable
> > +             Amount of memory per node used for storing in-kernel data
> > +             structures which might be reclaimed, such as dentries and
> > +             inodes.
> > +
> > +       slab_unreclaimable
> > +             Amount of memory per node used for storing in-kernel data
> > +             structures which cannot be reclaimed on memory pressure.
>
> Some of the descriptions above end with a '.' and some do not. Please be consistent.

Will do that.
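
As a rough illustration of the use case in the description (relating
per-node memory to where the application's CPUs are), the share of one
counter that sits on each node can be computed from per-node values.
This is only a sketch: the node_share helper, the dict layout, and the
sample numbers are assumptions, not something the patch provides.

```python
def node_share(per_node):
    """Return {node_id: fraction of the total} for one counter's per-node values."""
    total = sum(per_node.values())
    if total == 0:
        # Nothing charged on any node; avoid dividing by zero.
        return {node: 0.0 for node in per_node}
    return {node: value / total for node, value in per_node.items()}


# Hypothetical per-node byte counts for the "anon" key.
anon = {0: 3 * 4096, 1: 1 * 4096}
share = node_share(anon)
print(share)
```

A low share on the node where the application's CPUs run would suggest
that much of its memory is being accessed remotely.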

>
> > +
> >    memory.swap.current
> >       A read-only single value file which exists on non-root
> >       cgroups.
>
>
> thanks.
> --
> ~Randy
>


-- 
Yours,
Muchun
