All of lore.kernel.org
 help / color / mirror / Atom feed
From: Hasan Al Maruf <hasan3050@gmail.com>
To: ying.huang@intel.com
Cc: akpm@linux-foundation.org, dave.hansen@linux.intel.com,
	feng.tang@intel.com, hasanalmaruf@fb.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mgorman@suse.de, mgorman@techsingularity.net, mhocko@suse.com,
	osalvador@suse.de, peterz@infradead.org, riel@surriel.com,
	shakeelb@google.com, shy828301@gmail.com, weixugc@google.com,
	ziy@nvidia.com
Subject: Re: [PATCH -V10 RESEND 2/6] NUMA balancing: optimize page placement for memory tiering system
Date: Tue,  7 Dec 2021 01:36:39 -0500	[thread overview]
Message-ID: <20211207063639.83762-1-hasanalmaruf@fb.com> (raw)
In-Reply-To: <20211207022757.2523359-3-ying.huang@intel.com>

Hi Huang,

>+void set_numabalancing_state(bool enabled)
>+{
>+	if (enabled)
>+		sysctl_numa_balancing_mode = NUMA_BALANCING_NORMAL;
>+	else
>+		sysctl_numa_balancing_mode = NUMA_BALANCING_DISABLED;
>+	__set_numabalancing_state(enabled);
>+}
>+

One of the properties of optimized NUMA Balancing for tiered memory is we
are not going to scan top-tier nodes as promotion doesn't make sense there
(implemented in the next patch [3/6]). However, if a system has only
single memory node with CPU, does it make sense to run
`NUMA_BALANCING_NORMAL` mode there? What do you think about downgrading to
`NUMA_BALANCING_MEMORY_TIERING` mode if a user setup NUMA Balancing on
the default mode of `NUMA_BALANCING_NORMAL` on a single toptier memory
node?

>diff --git a/mm/vmscan.c b/mm/vmscan.c
>index c266e64d2f7e..5edb5dfa8900 100644
>--- a/mm/vmscan.c
>+++ b/mm/vmscan.c
>@@ -56,6 +56,7 @@
>
> #include <linux/swapops.h>
> #include <linux/balloon_compaction.h>
>+#include <linux/sched/sysctl.h>
>
> #include "internal.h"
>
>@@ -3919,6 +3920,12 @@ static bool pgdat_watermark_boosted(pg_data_t *pgdat, int highest_zoneidx)
> 	return false;
> }
>
>+/*
>+ * Keep the free pages on fast memory node a little more than the high
>+ * watermark to accommodate the promoted pages.
>+ */
>+#define NUMA_BALANCING_PROMOTE_WATERMARK	(10UL * 1024 * 1024 >> PAGE_SHIFT)
>+
> /*
>  * Returns true if there is an eligible zone balanced for the request order
>  * and highest_zoneidx
>@@ -3940,6 +3947,15 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
> 			continue;
>
> 		mark = high_wmark_pages(zone);
>+		if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING &&
>+		    numa_demotion_enabled &&
>+		    next_demotion_node(pgdat->node_id) != NUMA_NO_NODE) {
>+			unsigned long promote_mark;
>+
>+			promote_mark = min(NUMA_BALANCING_PROMOTE_WATERMARK,
>+					   pgdat->node_present_pages >> 6);
>+			mark += promote_mark;
>+		}
> 		if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
> 			return true;
> 	}

This can be moved to a different patch. I think, this patch [2/6] can be
splitted into two basic patches -- 1. NUMA Balancing interface for tiered
memory and 2. maintaining a headroom for promotion.

Instead of having a static value for `NUMA_BALANCING_PROMOTE_WATERMARK`
what about decoupling the allocation and reclamation and add a user-space
interface for controling them?

Do you think patch [2/5] and [3/5] of this series can be merged to your
current patchset?

https://lore.kernel.org/all/cover.1637778851.git.hasanalmaruf@fb.com/

Best,
Hasan

  reply	other threads:[~2021-12-07  6:36 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-07  2:27 [PATCH -V10 RESEND 0/6] NUMA balancing: optimize memory placement for memory tiering system Huang Ying
2021-12-07  2:27 ` [PATCH -V10 RESEND 1/6] NUMA Balancing: add page promotion counter Huang Ying
2021-12-07  6:05   ` Hasan Al Maruf
2021-12-08  2:16     ` Huang, Ying
2021-12-17  7:25   ` Baolin Wang
2021-12-07  2:27 ` [PATCH -V10 RESEND 2/6] NUMA balancing: optimize page placement for memory tiering system Huang Ying
2021-12-07  6:36   ` Hasan Al Maruf [this message]
2021-12-08  3:16     ` Huang, Ying
2021-12-17  7:35   ` Baolin Wang
2021-12-07  2:27 ` [PATCH -V10 RESEND 3/6] memory tiering: skip to scan fast memory Huang Ying
2021-12-17  7:41   ` Baolin Wang
2021-12-07  2:27 ` [PATCH -V10 RESEND 4/6] memory tiering: hot page selection with hint page fault latency Huang Ying
2021-12-07  2:27 ` [PATCH -V10 RESEND 5/6] memory tiering: rate limit NUMA migration throughput Huang Ying
2021-12-07  2:27 ` [PATCH -V10 RESEND 6/6] memory tiering: adjust hot threshold automatically Huang Ying
2022-01-12 16:10 ` [PATCH -V10 RESEND 0/6] NUMA balancing: optimize memory placement for memory tiering system Peter Zijlstra
2022-01-13  7:19   ` Huang, Ying
2022-01-13  9:49     ` Peter Zijlstra
2022-01-13 12:06       ` Huang, Ying
2022-01-13 13:00         ` Peter Zijlstra
2022-01-13 13:13           ` Huang, Ying
2022-01-13 14:24           ` Huang, Ying
2022-01-14  5:24             ` Huang, Ying

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20211207063639.83762-1-hasanalmaruf@fb.com \
    --to=hasan3050@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=feng.tang@intel.com \
    --cc=hasanalmaruf@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=osalvador@suse.de \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=shakeelb@google.com \
    --cc=shy828301@gmail.com \
    --cc=weixugc@google.com \
    --cc=ying.huang@intel.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.