From: Ingo Molnar <mingo@kernel.org>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Raghavendra K T <raghavendra.kt@amd.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Bharata B Rao <bharata@amd.com>, Ingo Molnar <mingo@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 6/6] sched/numa: Complete scanning of inactive VMAs when there is no alternative
Date: Tue, 10 Oct 2023 11:23:00 +0200
Message-ID: <ZSUX9NLa+DDjFLnZ@gmail.com>
In-Reply-To: <20231010083143.19593-7-mgorman@techsingularity.net>


* Mel Gorman <mgorman@techsingularity.net> wrote:

> On a 2-socket Cascade Lake test machine, the time to complete the
> workload is as follows;
> 
>                                                6.6.0-rc2              6.6.0-rc2
>                                      sched-numabtrace-v1 sched-numabselective-v1
> Min       elsp-NUMA01_THREADLOCAL      174.22 (   0.00%)      117.64 (  32.48%)
> Amean     elsp-NUMA01_THREADLOCAL      175.68 (   0.00%)      123.34 *  29.79%*
> Stddev    elsp-NUMA01_THREADLOCAL        1.20 (   0.00%)        4.06 (-238.20%)
> CoeffVar  elsp-NUMA01_THREADLOCAL        0.68 (   0.00%)        3.29 (-381.70%)
> Max       elsp-NUMA01_THREADLOCAL      177.18 (   0.00%)      128.03 (  27.74%)
> 
> The time to complete the workload is reduced by almost 30%
> 
>                             6.6.0-rc2               6.6.0-rc2
>                   sched-numabtrace-v1 sched-numabselective-v1
> Duration User                91201.80                63506.64
> Duration System               2015.53                 1819.78
> Duration Elapsed              1234.77                  868.37
> 
> In this specific case, system CPU time was not increased but it's not
> universally true.
> 
> From vmstat, the NUMA scanning and fault activity is as follows;
> 
>                                            6.6.0-rc2               6.6.0-rc2
>                                  sched-numabtrace-v1 sched-numabselective-v1
> Ops NUMA base-page range updates            64272.00             26374386.00
> Ops NUMA PTE updates                        36624.00                55538.00
> Ops NUMA PMD updates                           54.00                51404.00
> Ops NUMA hint faults                        15504.00                75786.00
> Ops NUMA hint local faults %                14860.00                56763.00
> Ops NUMA hint local percent                    95.85                   74.90
> Ops NUMA pages migrated                      1629.00              6469222.00
> 
> Both the number of PTE updates and hint faults is dramatically
> increased. While this is superficially unfortunate, it represents
> ranges that were simply skipped without the patch. As a result
> of the scanning and hinting faults, many more pages were also
> migrated but as the time to completion is reduced, the overhead
> is offset by the gain.
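
For anyone re-deriving the percentages in the quoted tables, a minimal sketch follows. The helper names are made up for illustration, and the inputs are simply the figures quoted above; it assumes the gain column is (baseline - patched) / baseline and that CoeffVar is stddev / mean, both in percent.

```python
def pct_reduction(baseline: float, patched: float) -> float:
    """Relative reduction versus the baseline, in percent."""
    return (baseline - patched) / baseline * 100.0


def coeff_var(stddev: float, mean: float) -> float:
    """Coefficient of variation, in percent."""
    return stddev / mean * 100.0


def local_percent(local_faults: float, hint_faults: float) -> float:
    """Share of NUMA hint faults that were node-local, in percent."""
    return local_faults / hint_faults * 100.0


if __name__ == "__main__":
    # Elapsed time for NUMA01_THREADLOCAL, taken from the quoted table.
    for name, base, new in [("Min", 174.22, 117.64),
                            ("Amean", 175.68, 123.34),
                            ("Max", 177.18, 128.03)]:
        print(f"{name:6s} {pct_reduction(base, new):6.2f}%")
    print(f"CoeffVar {coeff_var(1.20, 175.68):.2f} -> {coeff_var(4.06, 123.34):.2f}")
    print(f"Local%   {local_percent(14860, 15504):.2f} -> "
          f"{local_percent(56763, 75786):.2f}")
```

The Stddev delta is left out on purpose: the table rounds the standard deviations to two decimals, so that column cannot be reproduced exactly from the printed values.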

Nice! I've applied your series to tip:sched/core with a few non-functional 
edits to comment/changelog formatting/clarity.

Btw., was any previous analysis done on the size of the pids_active[] hash
and the hash collision rate?

64 (BITS_PER_LONG) feels a bit small, especially on larger machines running
threaded workloads, and the kmalloc of numab_state likely allocates a full
cacheline anyway, so we could double the hash size from 16 bytes (2x1 longs)
to 32 bytes (2x2 longs) with very little real cost, and still have a long
field left to spare?
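
To put rough numbers on the collision question, here is a back-of-the-envelope sketch, not kernel code. It assumes each thread's PID hashes uniformly and independently to one bit of the bitmap, which real PID allocation and the real hash may not satisfy; the function names are hypothetical. It estimates the chance that a thread which never touched a VMA lands on a bit already set by one of the threads that did, for a 64-bit and a 128-bit bitmap.

```python
def false_positive_rate(active_threads: int, bits: int) -> float:
    """Chance that an unrelated PID maps to an already-set bit.

    Assumption: each active thread sets one of `bits` bits uniformly at
    random, so a given bit stays clear with probability
    (1 - 1/bits) ** active_threads.
    """
    return 1.0 - (1.0 - 1.0 / bits) ** active_threads


def compare(thread_counts=(8, 32, 64, 128, 256), sizes=(64, 128)):
    """Return (threads, rate at sizes[0], rate at sizes[1], ...) rows."""
    return [(n, *[false_positive_rate(n, b) for b in sizes])
            for n in thread_counts]


if __name__ == "__main__":
    print("threads   64-bit  128-bit")
    for n, p64, p128 in compare():
        print(f"{n:7d}  {p64:6.1%}  {p128:7.1%}")
```

Under this model a 64-bit map is already mostly saturated once about 64 threads have touched the VMA, and doubling the width lowers the aliasing rate for every non-zero thread count. Measured collision rates on real workloads would still be needed to answer the question properly.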

Thanks,

	Ingo
