All of lore.kernel.org
 help / color / mirror / Atom feed
From: Roman Gushchin <guro@fb.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: <linux-mm@kvack.org>, Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	<linux-fsdevel@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<kernel-team@fb.com>, Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
Date: Wed, 11 Apr 2018 14:56:31 +0100	[thread overview]
Message-ID: <20180411135624.GA24260@castle.DHCP.thefacebook.com> (raw)
In-Reply-To: <08524819-14ef-81d0-fa90-d7af13c6b9d5@suse.cz>

On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
> [+CC linux-api]
> 
> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
> > This patch introduces a concept of indirectly reclaimable memory
> > and adds the corresponding memory counter and /proc/vmstat item.
> > 
> > Indirectly reclaimable memory is any sort of memory, used by
> > the kernel (except of reclaimable slabs), which is actually
> > reclaimable, i.e. will be released under memory pressure.
> > 
> > The counter is in bytes, as it's not always possible to
> > count such objects in pages. The name contains BYTES
> > by analogy to NR_KERNEL_STACK_KB.
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-mm@kvack.org
> > Cc: kernel-team@fb.com
> 
> Hmm, looks like I'm late and this user-visible API change was just
> merged. But it's for rc1, so we can still change it, hopefully?
> 
> One problem I see with the counter is that it's in bytes, but among
> counters that use pages, and the name doesn't indicate it.

Here I just followed "nr_kernel_stack" path, which is measured in kB,
but this is not mentioned in the field name.

> Then, I don't
> see why users should care about the "indirectly" part, as that's just an
> implementation detail. It is reclaimable and that's what matters, right?
> (I also wanted to complain about lack of Documentation/... update, but
> looks like there's no general file about vmstat, ugh)

I agree, that it's a bit weird, and it's probably better to not expose
it at all; but this is how all vm counters work. We do expose them all
in /proc/vmstat. A good number of them is useless until you are not a
mm developer, so it's arguable more "debug info" rather than "api".
It's definitely not a reason to make them messy.
Does "nr_indirectly_reclaimable_bytes" look better to you?

> 
> I also kind of liked the idea from v1 rfc posting that there would be a
> separate set of reclaimable kmalloc-X caches for these kind of
> allocations. Besides accounting, it should also help reduce memory
> fragmentation. The right variant of cache would be detected via
> __GFP_RECLAIMABLE.

Well, the downside is that we have to introduce X new caches
just for this particular problem. I'm not strictly against the idea,
but not convinced that it's much better.

> 
> With that in mind, can we at least for now put the (manually maintained)
> byte counter in a variable that's not directly exposed via /proc/vmstat,
> and then when printing nr_slab_reclaimable, simply add the value
> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> subtract the same value. This way we would be simply making the existing
> counters more precise, in line with their semantics.

Idk, I don't like the idea of adding a counter outside of the vm counters
infrastructure, and I definitely wouldn't touch the exposed
nr_slab_reclaimable and nr_slab_unreclaimable fields.
We do have some stats in /proc/slabinfo, /proc/meminfo and /sys/kernel/slab
and I think that we should keep it consistent.

Thanks!

> 
> Thoughts?
> Vlastimil
> 
> > ---
> >  include/linux/mmzone.h | 1 +
> >  mm/vmstat.c            | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index e09fe563d5dc..15e783f29e21 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -180,6 +180,7 @@ enum node_stat_item {
> >  	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
> >  	NR_DIRTIED,		/* page dirtyings since bootup */
> >  	NR_WRITTEN,		/* page writings since bootup */
> > +	NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */
> >  	NR_VM_NODE_STAT_ITEMS
> >  };
> >  
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index 40b2db6db6b1..b6b5684f31fe 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -1161,6 +1161,7 @@ const char * const vmstat_text[] = {
> >  	"nr_vmscan_immediate_reclaim",
> >  	"nr_dirtied",
> >  	"nr_written",
> > +	"nr_indirectly_reclaimable",
> >  
> >  	/* enum writeback_stat_item counters */
> >  	"nr_dirty_threshold",
> > 
> 

WARNING: multiple messages have this Message-ID (diff)
From: Roman Gushchin <guro@fb.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES
Date: Wed, 11 Apr 2018 14:56:31 +0100	[thread overview]
Message-ID: <20180411135624.GA24260@castle.DHCP.thefacebook.com> (raw)
In-Reply-To: <08524819-14ef-81d0-fa90-d7af13c6b9d5@suse.cz>

On Wed, Apr 11, 2018 at 03:16:08PM +0200, Vlastimil Babka wrote:
> [+CC linux-api]
> 
> On 03/05/2018 02:37 PM, Roman Gushchin wrote:
> > This patch introduces a concept of indirectly reclaimable memory
> > and adds the corresponding memory counter and /proc/vmstat item.
> > 
> > Indirectly reclaimable memory is any sort of memory, used by
> > the kernel (except of reclaimable slabs), which is actually
> > reclaimable, i.e. will be released under memory pressure.
> > 
> > The counter is in bytes, as it's not always possible to
> > count such objects in pages. The name contains BYTES
> > by analogy to NR_KERNEL_STACK_KB.
> > 
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Alexander Viro <viro@zeniv.linux.org.uk>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: linux-fsdevel@vger.kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: linux-mm@kvack.org
> > Cc: kernel-team@fb.com
> 
> Hmm, looks like I'm late and this user-visible API change was just
> merged. But it's for rc1, so we can still change it, hopefully?
> 
> One problem I see with the counter is that it's in bytes, but among
> counters that use pages, and the name doesn't indicate it.

Here I just followed "nr_kernel_stack" path, which is measured in kB,
but this is not mentioned in the field name.

> Then, I don't
> see why users should care about the "indirectly" part, as that's just an
> implementation detail. It is reclaimable and that's what matters, right?
> (I also wanted to complain about lack of Documentation/... update, but
> looks like there's no general file about vmstat, ugh)

I agree, that it's a bit weird, and it's probably better to not expose
it at all; but this is how all vm counters work. We do expose them all
in /proc/vmstat. A good number of them is useless until you are not a
mm developer, so it's arguable more "debug info" rather than "api".
It's definitely not a reason to make them messy.
Does "nr_indirectly_reclaimable_bytes" look better to you?

> 
> I also kind of liked the idea from v1 rfc posting that there would be a
> separate set of reclaimable kmalloc-X caches for these kind of
> allocations. Besides accounting, it should also help reduce memory
> fragmentation. The right variant of cache would be detected via
> __GFP_RECLAIMABLE.

Well, the downside is that we have to introduce X new caches
just for this particular problem. I'm not strictly against the idea,
but not convinced that it's much better.

> 
> With that in mind, can we at least for now put the (manually maintained)
> byte counter in a variable that's not directly exposed via /proc/vmstat,
> and then when printing nr_slab_reclaimable, simply add the value
> (divided by PAGE_SIZE), and when printing nr_slab_unreclaimable,
> subtract the same value. This way we would be simply making the existing
> counters more precise, in line with their semantics.

Idk, I don't like the idea of adding a counter outside of the vm counters
infrastructure, and I definitely wouldn't touch the exposed
nr_slab_reclaimable and nr_slab_unreclaimable fields.
We do have some stats in /proc/slabinfo, /proc/meminfo and /sys/kernel/slab
and I think that we should keep it consistent.

Thanks!

> 
> Thoughts?
> Vlastimil
> 
> > ---
> >  include/linux/mmzone.h | 1 +
> >  mm/vmstat.c            | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> > index e09fe563d5dc..15e783f29e21 100644
> > --- a/include/linux/mmzone.h
> > +++ b/include/linux/mmzone.h
> > @@ -180,6 +180,7 @@ enum node_stat_item {
> >  	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
> >  	NR_DIRTIED,		/* page dirtyings since bootup */
> >  	NR_WRITTEN,		/* page writings since bootup */
> > +	NR_INDIRECTLY_RECLAIMABLE_BYTES, /* measured in bytes */
> >  	NR_VM_NODE_STAT_ITEMS
> >  };
> >  
> > diff --git a/mm/vmstat.c b/mm/vmstat.c
> > index 40b2db6db6b1..b6b5684f31fe 100644
> > --- a/mm/vmstat.c
> > +++ b/mm/vmstat.c
> > @@ -1161,6 +1161,7 @@ const char * const vmstat_text[] = {
> >  	"nr_vmscan_immediate_reclaim",
> >  	"nr_dirtied",
> >  	"nr_written",
> > +	"nr_indirectly_reclaimable",
> >  
> >  	/* enum writeback_stat_item counters */
> >  	"nr_dirty_threshold",
> > 
> 

  reply	other threads:[~2018-04-11 13:57 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-05 13:37 [PATCH 0/3] indirectly reclaimable memory Roman Gushchin
2018-03-05 13:37 ` Roman Gushchin
2018-03-05 13:37 ` Roman Gushchin
2018-03-05 13:37 ` [PATCH 1/3] mm: introduce NR_INDIRECTLY_RECLAIMABLE_BYTES Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-04-11 13:16   ` Vlastimil Babka
2018-04-11 13:56     ` Roman Gushchin [this message]
2018-04-11 13:56       ` Roman Gushchin
2018-04-12  6:52       ` Vlastimil Babka
2018-04-12 11:52         ` Michal Hocko
2018-04-12 14:38           ` Roman Gushchin
2018-04-12 14:38             ` Roman Gushchin
2018-04-12 14:46             ` Michal Hocko
2018-04-12 14:57         ` Roman Gushchin
2018-04-12 14:57           ` Roman Gushchin
2018-04-13  6:59           ` Michal Hocko
2018-04-13 12:13           ` vinayak menon
2018-04-25  3:49             ` Vijayanand Jitta
2018-04-25 12:52               ` Roman Gushchin
2018-04-25 12:52                 ` Roman Gushchin
2018-04-25 15:47                 ` Vlastimil Babka
2018-04-25 16:48                   ` Roman Gushchin
2018-04-25 16:48                     ` Roman Gushchin
2018-04-25 17:02                     ` Vlastimil Babka
2018-04-25 17:23                       ` Roman Gushchin
2018-04-25 17:23                         ` Roman Gushchin
2018-04-25 15:55             ` Matthew Wilcox
2018-04-25 16:59               ` Vlastimil Babka
2018-03-05 13:37 ` [PATCH 2/3] mm: add indirectly reclaimable memory to MemAvailable Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-05 13:47   ` Roman Gushchin
2018-03-05 13:47     ` Roman Gushchin
2018-03-05 13:47     ` Roman Gushchin
2018-03-05 13:37 ` [PATCH 2/3] mm: treat indirectly reclaimable memory as available in MemAvailable Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-05 13:37 ` [PATCH 3/3] dcache: account external names as indirectly reclaimable memory Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-05 13:37   ` Roman Gushchin
2018-03-12 21:17   ` Al Viro
2018-03-12 22:36     ` Roman Gushchin
2018-03-12 22:36       ` Roman Gushchin
2018-03-13  0:45       ` Al Viro
2018-04-05 22:11         ` Andrew Morton
2018-04-06 10:32           ` Roman Gushchin
2018-04-06 10:32             ` Roman Gushchin
2018-04-13 13:35   ` Minchan Kim
2018-04-13 13:59     ` Michal Hocko
2018-04-13 14:20       ` Vlastimil Babka
2018-04-13 14:28         ` Michal Hocko
2018-04-13 14:37           ` Johannes Weiner
2018-04-16 11:41             ` Michal Hocko
2018-04-16 12:06               ` Vlastimil Babka
2018-04-16 12:27                 ` Michal Hocko
2018-04-16 19:57                   ` Vlastimil Babka
2018-04-17  6:44                     ` Michal Hocko
2018-04-16 13:09                 ` Matthew Wilcox
2018-04-17 11:24               ` Roman Gushchin
2018-04-17 11:24                 ` Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180411135624.GA24260@castle.DHCP.thefacebook.com \
    --to=guro@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@fb.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.