All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	ben.widawsky@intel.com, rientjes@google.com, cl@linux.com,
	alex.shi@linux.alibaba.com, dwagner@suse.de, tobin@kernel.org,
	akpm@linux-foundation.org, ying.huang@intel.com,
	dan.j.williams@intel.com, cai@lca.pw, stable@vger.kernel.org
Subject: Re: [RFC][PATCH 01/13] mm/vmscan: restore zone_reclaim_mode ABI
Date: Wed, 10 Feb 2021 10:42:28 +0100	[thread overview]
Message-ID: <20210210094222.GA27173@linux> (raw)
In-Reply-To: <20210126003412.59594AA9@viggo.jf.intel.com>

On Mon, Jan 25, 2021 at 04:34:13PM -0800, Dave Hansen wrote:
> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> I went to go add a new RECLAIM_* mode for the zone_reclaim_mode
> sysctl.  Like a good kernel developer, I also went to go update the
> documentation.  I noticed that the bits in the documentation didn't
> match the bits in the #defines.
> 
> The VM never explicitly checks the RECLAIM_ZONE bit.  The bit is,
> however implicitly checked when checking 'node_reclaim_mode==0'.
> The RECLAIM_ZONE #define was removed in a cleanup.  That, by itself
> is fine.
> 
> But, when the bit was removed (bit 0) the _other_ bit locations also
> got changed.  That's not OK because the bit values are documented to
> mean one specific thing and users surely rely on them meaning that one
> thing and not changing from kernel to kernel.  The end result is that
> if someone had a script that did:
> 
> 	sysctl vm.zone_reclaim_mode=1
> 
> This script would have gone from enalbing node reclaim for clean
> unmapped pages to writing out pages during node reclaim after the
> commit in question.  That's not great.
> 
> Put the bits back the way they were and add a comment so something
> like this is a bit harder to do again.  Update the documentation to
> make it clear that the first bit is ignored.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Fixes: 648b5cf368e0 ("mm/vmscan: remove unused RECLAIM_OFF/RECLAIM_ZONE")
> Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
> Acked-by: David Rientjes <rientjes@google.com>
> Acked-by: Christoph Lameter <cl@linux.com>
> Cc: Alex Shi <alex.shi@linux.alibaba.com>
> Cc: Daniel Wagner <dwagner@suse.de>
> Cc: "Tobin C. Harding" <tobin@kernel.org>
> Cc: Christoph Lameter <cl@linux.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Qian Cai <cai@lca.pw>
> Cc: Daniel Wagner <dwagner@suse.de>
> Cc: osalvador <osalvador@suse.de>
> Cc: stable@vger.kernel.org

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> 
> --
> 
> Changes from v2:
>  * Update description to indicate that bit0 was used for clean
>    unmapped page node reclaim.
> ---
> 
>  b/Documentation/admin-guide/sysctl/vm.rst |   10 +++++-----
>  b/mm/vmscan.c                             |    9 +++++++--
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff -puN Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-old-zone_reclaim_mode-abi Documentation/admin-guide/sysctl/vm.rst
> --- a/Documentation/admin-guide/sysctl/vm.rst~mm-vmscan-restore-old-zone_reclaim_mode-abi	2021-01-25 16:23:06.048866718 -0800
> +++ b/Documentation/admin-guide/sysctl/vm.rst	2021-01-25 16:23:06.056866718 -0800
> @@ -978,11 +978,11 @@ that benefit from having their data cach
>  left disabled as the caching effect is likely to be more important than
>  data locality.
>  
> -zone_reclaim may be enabled if it's known that the workload is partitioned
> -such that each partition fits within a NUMA node and that accessing remote
> -memory would cause a measurable performance reduction.  The page allocator
> -will then reclaim easily reusable pages (those page cache pages that are
> -currently not used) before allocating off node pages.
> +Consider enabling one or more zone_reclaim mode bits if it's known that the
> +workload is partitioned such that each partition fits within a NUMA node
> +and that accessing remote memory would cause a measurable performance
> +reduction.  The page allocator will take additional actions before
> +allocating off node pages.
>  
>  Allowing zone reclaim to write out pages stops processes that are
>  writing large amounts of data from dirtying pages on other nodes. Zone
> diff -puN mm/vmscan.c~mm-vmscan-restore-old-zone_reclaim_mode-abi mm/vmscan.c
> --- a/mm/vmscan.c~mm-vmscan-restore-old-zone_reclaim_mode-abi	2021-01-25 16:23:06.052866718 -0800
> +++ b/mm/vmscan.c	2021-01-25 16:23:06.057866718 -0800
> @@ -4086,8 +4086,13 @@ module_init(kswapd_init)
>   */
>  int node_reclaim_mode __read_mostly;
>  
> -#define RECLAIM_WRITE (1<<0)	/* Writeout pages during reclaim */
> -#define RECLAIM_UNMAP (1<<1)	/* Unmap pages during reclaim */
> +/*
> + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> + * ABI.  New bits are OK, but existing bits can never change.
> + */
> +#define RECLAIM_ZONE  (1<<0)   /* Run shrink_inactive_list on the zone */
> +#define RECLAIM_WRITE (1<<1)   /* Writeout pages during reclaim */
> +#define RECLAIM_UNMAP (1<<2)   /* Unmap pages during reclaim */
>  
>  /*
>   * Priority for NODE_RECLAIM. This determines the fraction of pages
> _
> 

-- 
Oscar Salvador
SUSE L3

  reply	other threads:[~2021-02-10  9:45 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-26  0:34 [RFC][PATCH 00/13] [v5] Migrate Pages in lieu of discard Dave Hansen
2021-01-26  0:34 ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 01/13] mm/vmscan: restore zone_reclaim_mode ABI Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-02-10  9:42   ` Oscar Salvador [this message]
2021-01-26  0:34 ` [RFC][PATCH 02/13] mm/vmscan: move RECLAIM* bits to uapi header Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-02-10  9:44   ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 03/13] mm/vmscan: replace implicit RECLAIM_ZONE checks with explicit checks Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-31  1:10   ` David Rientjes
2021-01-31  1:10     ` David Rientjes
2021-02-10  9:54   ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 04/13] mm/numa: node demotion data structure and lookup Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-31  1:19   ` David Rientjes
2021-01-31  1:19     ` David Rientjes
2021-02-01 17:49     ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 05/13] mm/numa: automatically generate node migration order Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-29 20:46   ` Yang Shi
2021-01-29 20:46     ` Yang Shi
2021-02-01 19:13     ` Dave Hansen
2021-02-02 11:43       ` Oscar Salvador
2021-02-02 17:46       ` Yang Shi
2021-02-02 17:46         ` Yang Shi
2021-02-03  0:43         ` Dave Hansen
2021-02-04  0:26           ` Yang Shi
2021-02-04  0:26             ` Yang Shi
2021-01-26  0:34 ` [RFC][PATCH 06/13] mm/migrate: update migration order during on hotplug events Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-29 20:59   ` Yang Shi
2021-01-29 20:59     ` Yang Shi
2021-02-02 11:42   ` Oscar Salvador
2021-02-09 23:45     ` Dave Hansen
2021-02-10  8:55       ` Oscar Salvador
2021-01-26  0:34 ` [RFC][PATCH 07/13] mm/migrate: make migrate_pages() return nr_succeeded Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-29 21:04   ` Yang Shi
2021-01-29 21:04     ` Yang Shi
2021-02-09 23:41     ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 08/13] mm/migrate: demote pages during reclaim Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-02-02 11:55   ` Oscar Salvador
2021-02-02 22:45     ` Yang Shi
2021-02-02 22:45       ` Yang Shi
2021-02-02 22:56       ` Dave Hansen
2021-02-02 18:22   ` Yang Shi
2021-02-02 18:22     ` Yang Shi
2021-02-02 18:34     ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 09/13] mm/vmscan: add page demotion counter Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 10/13] mm/vmscan: add helper for querying ability to age anonymous pages Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 11/13] mm/vmscan: Consider anonymous pages without swap Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-02-02 18:56   ` Yang Shi
2021-02-02 18:56     ` Yang Shi
2021-02-02 21:35     ` Dave Hansen
2021-02-02 22:35       ` Yang Shi
2021-02-02 22:35         ` Yang Shi
2021-01-26  0:34 ` [RFC][PATCH 12/13] mm/vmscan: never demote for memcg reclaim Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-26  0:34 ` [RFC][PATCH 13/13] mm/migrate: new zone_reclaim_mode to enable reclaim migration Dave Hansen
2021-01-26  0:34   ` Dave Hansen
2021-01-31  1:13 ` [RFC][PATCH 00/13] [v5] Migrate Pages in lieu of discard David Rientjes
2021-01-31  1:13   ` David Rientjes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210210094222.GA27173@linux \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@linux.alibaba.com \
    --cc=ben.widawsky@intel.com \
    --cc=cai@lca.pw \
    --cc=cl@linux.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dwagner@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    --cc=stable@vger.kernel.org \
    --cc=tobin@kernel.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.