From: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
To: David Rientjes <rientjes@google.com>
Cc: "corbet@lwn.net" <corbet@lwn.net>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mchehab+samsung@kernel.org" <mchehab+samsung@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"jpoimboe@redhat.com" <jpoimboe@redhat.com>,
	"pawan.kumar.gupta@linux.intel.com" 
	<pawan.kumar.gupta@linux.intel.com>,
	"jgross@suse.com" <jgross@suse.com>,
	"oneukum@suse.com" <oneukum@suse.com>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: RE: [PATCH] mm/vmscan: add vm_swappiness configuration knobs
Date: Thu, 12 Mar 2020 12:48:22 +0000
Message-ID: <BL0PR02MB5601808F36BE202813E9D562E9FD0@BL0PR02MB5601.namprd02.prod.outlook.com>
In-Reply-To: <alpine.DEB.2.21.2003111227230.171292@chino.kir.corp.google.com>

On Wed, 11 Mar 2020, David Rientjes wrote:

> On Wed, 11 Mar 2020, Ivan Teterevkov wrote:
> 
> > This patch adds a couple of knobs:
> >
> > - The configuration option (CONFIG_VM_SWAPPINESS).
> > - The command line parameter (vm_swappiness).
> >
> > The default value is preserved, but now defined by CONFIG_VM_SWAPPINESS.
> >
> > Historically, the default swappiness is set to the well-known value 60,
> > and this works well for the majority of cases. The vm_swappiness is also
> > exposed as the kernel parameter that can be changed at runtime too, e.g.
> > with sysctl.
> >
> > This approach might not suit well some configurations, e.g. systemd-based
> > distros, where systemd is put in charge of the cgroup controllers,
> > including the memory one. In such cases, the default swappiness 60
> > is copied across the cgroup subtrees early at startup, when systemd
> > is arranging the slices for its services, before the sysctl.conf
> > or tmpfiles.d/*.conf changes are applied.
> >
> 
> Seems like something that can be fully handled by an initscript that would
> set the sysctl and then iterate the memcg hierarchy propagating the
> non-default value.  I don't think that's too much of an ask if userspace
> wants to manipulate the swappiness value.
> 

This is exactly what I'm trying to avoid: in some distros there is no way
to apply the configuration early enough. For example, in systemd-based
systems, systemd is the first process to start and it arranges the memcg
hierarchy the way it's configured to, but unfortunately it doesn't offer
a swappiness knob.

A script could iterate over the memcg tree later, but it would race with
whichever system entity is in charge of the memcg, because the
configuration can't be changed atomically. For example, the script could
be walking the tree and updating each memory.swappiness while systemd is
creating another slice or scope subtree.

> Or maybe we can be more clever: have memcg->swappiness store -1 by default
> unless it is changed by the user explicitly and then have
> mem_cgroup_swappiness() return vm_swappiness for this value.  If the user
> overwrites it, it's intended.
> 

Does that mean -1 would become a reference to vm_swappiness, or to the
parent's memory.swappiness? It sounds interesting, and if so it would
address my issues with the swappiness, but it would also change the
existing memcg behaviour: if the referred-to value changes, does the
memory.swappiness backed by -1 change as well?
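
To make the question concrete, here is a small userspace model of the two
readings. It is not kernel code: struct memcg_model, SWAPPINESS_UNSET and
both helpers are names I made up, and vm_swappiness here is just a local
stand-in for the real variable. In either reading the effective value of
an untouched group follows whatever it refers to:

```c
/* Hypothetical sentinel meaning "never set explicitly by the user". */
#define SWAPPINESS_UNSET (-1)

/* Stand-in for the global knob; not the kernel's variable. */
static int vm_swappiness = 60;

/* Simplified model of a memory cgroup. */
struct memcg_model {
	struct memcg_model *parent;	/* null pointer for the root group */
	int swappiness;			/* SWAPPINESS_UNSET or 0..100 */
};

/* Reading A: an unset group always follows the global vm_swappiness. */
static int swappiness_follow_global(const struct memcg_model *memcg)
{
	if (memcg->swappiness == SWAPPINESS_UNSET)
		return vm_swappiness;
	return memcg->swappiness;
}

/*
 * Reading B: an unset group follows its nearest ancestor that has an
 * explicit value, and falls back to vm_swappiness if there is none.
 */
static int swappiness_follow_parent(const struct memcg_model *memcg)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg->swappiness != SWAPPINESS_UNSET)
			return memcg->swappiness;
	return vm_swappiness;
}
```

With reading A, changing the sysctl later changes every group that was
never written to; with reading B, so does a write to any ancestor.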

> So there are a couple options here but I don't think one of them is to add
> a new config option or kernel command line option.
> 

vm_swappiness starts its life in the kernel, so why not provide it
with a simple "constructor" there?

> > One could run a script to traverse the cgroup trees later and set the
> > desired memory.swappiness individually in each occurrence when the runtime
> > is set up, but this would require some amount of work to implement
> > properly. Instead, why not set the default swappiness as early as possible?
> >
> > Signed-off-by: Ivan Teterevkov <ivan.teterevkov@nutanix.com>
> > ---
> >  .../admin-guide/kernel-parameters.txt         |  4 ++++
> >  mm/Kconfig                                    | 10 ++++++++
> >  mm/vmscan.c                                   | 24 ++++++++++++++++++-
> >  3 files changed, 37 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index c07815d230bc..5d54a4303522 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -5317,6 +5317,10 @@
> >  			  P	Enable page structure init time poisoning
> >  			  -	Disable all of the above options
> >
> > +	vm_swappiness=	[KNL]
> > +			Sets the default vm_swappiness.
> > +			Ranges from 0 to 100, the default value is 60.
> > +
> >  	vmalloc=nn[KMG]	[KNL,BOOT] Forces the vmalloc area to have an exact
> >  			size of <nn>. This can be used to increase the
> >  			minimum size (128MB on x86). It can also be used to
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index ab80933be65f..ec59c19e578e 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -739,4 +739,14 @@ config ARCH_HAS_HUGEPD
> >  config MAPPING_DIRTY_HELPERS
> >          bool
> >
> > +config VM_SWAPPINESS
> > +	int "Default memory swappiness"
> > +	default 60
> > +	range 0 100
> > +	help
> > +	  Sets the default vm_swappiness, that could be changed later
> > +	  in the runtime, e.g. kernel command line, sysctl, etc.
> > +
> > +	  Higher value means more swappy. Historically, defaults to 60.
> > +
> >  endmenu
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 876370565455..7d2d3550f698 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -163,7 +163,29 @@ struct scan_control {
> >  /*
> >   * From 0 .. 100.  Higher means more swappy.
> >   */
> > -int vm_swappiness = 60;
> > +int vm_swappiness = CONFIG_VM_SWAPPINESS;
> > +
> > +static int __init swappiness_cmdline(char *str)
> > +{
> > +	int val, err;
> > +
> > +	if (!str)
> > +		return -EINVAL;
> > +
> > +	err = kstrtoint(str, 10, &val);
> > +	if (err)
> > +		return -EINVAL;
> > +
> > +	if (val < 0 || val > 100)
> > +		return -EINVAL;
> > +
> > +	vm_swappiness = val;
> > +
> > +	return 0;
> > +}
> > +
> > +early_param("vm_swappiness", swappiness_cmdline);
> > +
> >  /*
> >   * The total number of pages which are beyond the high watermark within all
> >   * zones.
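
For anyone who wants to try the validation outside the kernel, a rough
userspace equivalent of swappiness_cmdline() is sketched below. It uses
strtol() as a stand-in for kstrtoint(), so corner cases may differ (for
example, strtol() accepts leading whitespace), and parse_swappiness is a
made-up name:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a decimal swappiness value in the range 0..100.
 * Returns 0 and stores the value in *out on success, -EINVAL otherwise.
 */
static int parse_swappiness(const char *str, int *out)
{
	char *end;
	long val;

	if (!str || !*str)
		return -EINVAL;

	errno = 0;
	val = strtol(str, &end, 10);

	/* Reject overflow, empty conversions and trailing garbage. */
	if (errno || end == str || *end != '\0')
		return -EINVAL;

	if (val < 0 || val > 100)
		return -EINVAL;

	*out = (int)val;
	return 0;
}
```

Booting with vm_swappiness=10 would then correspond to
parse_swappiness("10", &val) returning 0 with val set to 10, while
values such as "101" or "-1" are rejected with -EINVAL.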
