linux-kernel.vger.kernel.org archive mirror
From: Mel Gorman <mgorman@suse.de>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Feng Tang <feng.tang@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Rik van Riel <riel@redhat.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>
Subject: Re: [RFC -V2 2/8] autonuma, memory tiering: Rate limit NUMA migration throughput
Date: Tue, 18 Feb 2020 08:57:21 +0000
Message-ID: <20200218085721.GC3420@suse.de>
In-Reply-To: <20200218082634.1596727-3-ying.huang@intel.com>

On Tue, Feb 18, 2020 at 04:26:28PM +0800, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
> 
> In autonuma memory tiering mode, hot PMEM (persistent memory) pages
> can be migrated to DRAM via autonuma.  But this migration incurs
> overhead, which can sometimes hurt workload performance.  To avoid
> disturbing the workload too much, the migration throughput should be
> rate-limited.
> 
> On the other hand, in some situations (for example, when some
> workloads exit), many DRAM pages become free, so pages of the
> other workloads can be migrated to DRAM.  To respond quickly to such
> workload changes, it's better to migrate pages faster.
> 
> To address the above two requirements, the following rate limit
> algorithm is used:
> 
> - If there is enough free memory in the DRAM node (that is, > high
>   watermark + 2 * rate limit pages), NUMA migration throughput is
>   not rate-limited, so that it can respond quickly to workload changes.
> 
> - Otherwise, count the number of pages that autonuma tries to migrate
>   to the DRAM node; if the count exceeds the limit specified by the
>   user, stop NUMA migration until the next second.
> 
> A new sysctl knob kernel.numa_balancing_rate_limit_mbps is added for
> the users to specify the limit.  If its value is 0, the default
> value (high watermark) will be used.
> 
> TODO: Add ABI document for new sysctl knob.
> 
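
For readers following the thread, the two rules quoted above can be
sketched in plain userspace C as below. This is only an illustration of
the description, not the code from the patch: the struct, the function
names, and the MB-to-pages conversion (which assumes a power-of-two page
size given as a shift) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-DRAM-node state; none of these names come from the patch. */
struct numa_rate_limit {
	uint64_t window_start_sec;	/* second in which the current window began */
	uint64_t nr_candidates;		/* pages we tried to migrate in this window */
	uint64_t limit_pages;		/* per-second page budget */
};

/* Convert a MB/s limit to pages per second (assumption: 1 MB = 2^20 bytes). */
static uint64_t mbps_to_pages(uint64_t mbps, unsigned int page_shift)
{
	return mbps << (20 - page_shift);
}

/* Return true if nr_pages may be migrated to the DRAM node right now. */
static bool numa_migration_allowed(struct numa_rate_limit *rl,
				   uint64_t now_sec,
				   uint64_t free_pages,
				   uint64_t high_wmark_pages,
				   uint64_t nr_pages)
{
	assert(rl != NULL);

	/* Rule 1: plenty of free DRAM, so do not rate-limit at all. */
	if (free_pages > high_wmark_pages + 2 * rl->limit_pages)
		return true;

	/* Start a fresh one-second window once the clock has moved on. */
	if (now_sec != rl->window_start_sec) {
		rl->window_start_sec = now_sec;
		rl->nr_candidates = 0;
	}

	/* Rule 2: count every attempt; refuse once the count exceeds the limit. */
	rl->nr_candidates += nr_pages;
	return rl->nr_candidates <= rl->limit_pages;
}
```

In this sketch, attempts made while free memory is plentiful are not
counted against the budget. Whether the real implementation counts them
is not stated in the description above.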

I very strongly suggest that this only be done as a last resort and with
supporting data as to why it is necessary. NUMA balancing did have rate
limiting at one point and it was removed when balancing was smart enough
to mostly do the right thing without rate limiting. I posted a series
that reconciled NUMA balancing with the CPU load balancer recently which
further reduced spurious and unnecessary migrations. I would not like
to see rate limiting reintroduced unless there is no other way of fixing
saturation of memory bandwidth due to NUMA balancing. Even if it's
needed as a stopgap while the feature is finalised, it should be
introduced late in the series explaining why it's temporarily necessary.

-- 
Mel Gorman
SUSE Labs

