linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Mel Gorman <mgorman@techsingularity.net>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Aaron Lu <aaron.lu@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 4/5] mm/page_alloc: Free pages in a single pass during bulk free
Date: Wed, 16 Feb 2022 15:24:56 +0100	[thread overview]
Message-ID: <fdc14384-e3b7-921c-cc52-5992f3cbfd2c@suse.cz> (raw)
In-Reply-To: <20220215145111.27082-5-mgorman@techsingularity.net>

On 2/15/22 15:51, Mel Gorman wrote:
> free_pcppages_bulk() has taken two passes through the pcp lists since
> commit 0a5f4e5b4562 ("mm/free_pcppages_bulk: do not hold lock when picking
> pages to free") due to deferring the cost of selecting PCP lists until
> the zone lock is held. Now that list selection is simpler, the main
> cost during selection is bulkfree_pcp_prepare() which in the normal case
> is a simple check and prefetching. As the list manipulations have a
> cost of their own, go back to freeing pages in a single pass.
> 
> The series up to this point was evaluated using a trunc microbenchmark
> that truncates sparse files stored in the page cache (mmtests config
> config-io-trunc). Sparse files were used to limit filesystem interaction.
> 
> The results versus a revert of storing high-order pages in the PCP lists is
> 
> 1-socket Skylake
>                               5.17.0-rc3             5.17.0-rc3             5.17.0-rc3
>                                  vanilla    mm-reverthighpcp-v1r1     mm-highpcpopt-v1
> Min       elapsed      540.00 (   0.00%)      530.00 (   1.85%)      530.00 (   1.85%)
> Amean     elapsed      543.00 (   0.00%)      530.00 *   2.39%*      530.00 *   2.39%*
> Stddev    elapsed        4.83 (   0.00%)        0.00 ( 100.00%)        0.00 ( 100.00%)
> CoeffVar  elapsed        0.89 (   0.00%)        0.00 ( 100.00%)        0.00 ( 100.00%)
> Max       elapsed      550.00 (   0.00%)      530.00 (   3.64%)      530.00 (   3.64%)
> BAmean-50 elapsed      540.00 (   0.00%)      530.00 (   1.85%)      530.00 (   1.85%)
> BAmean-95 elapsed      542.22 (   0.00%)      530.00 (   2.25%)      530.00 (   2.25%)
> BAmean-99 elapsed      542.22 (   0.00%)      530.00 (   2.25%)      530.00 (   2.25%)
> 
> 2-socket CascadeLake
>                               5.17.0-rc3             5.17.0-rc3             5.17.0-rc3
>                                  vanilla    mm-reverthighpcp-v1       mm-highpcpopt-v1
> Min       elapsed      510.00 (   0.00%)      500.00 (   1.96%)      500.00 (   1.96%)
> Amean     elapsed      529.00 (   0.00%)      521.00 (   1.51%)      516.00 *   2.46%*
> Stddev    elapsed       16.63 (   0.00%)       12.87 (  22.64%)        9.66 (  41.92%)
> CoeffVar  elapsed        3.14 (   0.00%)        2.47 (  21.46%)        1.87 (  40.45%)
> Max       elapsed      550.00 (   0.00%)      540.00 (   1.82%)      530.00 (   3.64%)
> BAmean-50 elapsed      516.00 (   0.00%)      512.00 (   0.78%)      510.00 (   1.16%)
> BAmean-95 elapsed      526.67 (   0.00%)      518.89 (   1.48%)      514.44 (   2.32%)
> BAmean-99 elapsed      526.67 (   0.00%)      518.89 (   1.48%)      514.44 (   2.32%)
> 
> The original motivation for the multiple passes was will-it-scale
> page_fault1 using $nr_cpu processes.
> 
> 2-socket CascadeLake (40 cores, 80 CPUs HT enabled)
>                                                     5.17.0-rc3                 5.17.0-rc3
>                                                        vanilla         mm-highpcpopt-v1r4
> Hmean     page_fault1-processes-2        2694662.26 (   0.00%)      2696801.07 (   0.08%)
> Hmean     page_fault1-processes-5        6425819.34 (   0.00%)      6426573.21 (   0.01%)
> Hmean     page_fault1-processes-8        9642169.10 (   0.00%)      9647444.94 (   0.05%)
> Hmean     page_fault1-processes-12      12167502.10 (   0.00%)     12073323.10 *  -0.77%*
> Hmean     page_fault1-processes-21      15636859.03 (   0.00%)     15587449.50 *  -0.32%*
> Hmean     page_fault1-processes-30      25157348.61 (   0.00%)     25111707.15 *  -0.18%*
> Hmean     page_fault1-processes-48      27694013.85 (   0.00%)     27728568.63 (   0.12%)
> Hmean     page_fault1-processes-79      25928742.64 (   0.00%)     25920933.41 (  -0.03%) <---
> Hmean     page_fault1-processes-110     25730869.75 (   0.00%)     25695727.57 *  -0.14%*
> Hmean     page_fault1-processes-141     25626992.42 (   0.00%)     25675346.68 *   0.19%*
> Hmean     page_fault1-processes-172     25611651.35 (   0.00%)     25650940.14 *   0.15%*
> Hmean     page_fault1-processes-203     25577298.75 (   0.00%)     25584848.65 (   0.03%)
> Hmean     page_fault1-processes-234     25580686.07 (   0.00%)     25601794.52 *   0.08%*
> Hmean     page_fault1-processes-265     25570215.47 (   0.00%)     25553191.25 (  -0.07%)
> Hmean     page_fault1-processes-296     25549488.62 (   0.00%)     25530311.58 (  -0.08%)
> Hmean     page_fault1-processes-320     25555149.05 (   0.00%)     25585059.83 (   0.12%)
> 
> The differences are mostly within the noise and the difference close to
> $nr_cpus is negligible.
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>


Reviewed-by: Vlastimil Babka <vbabka@suse.cz>


  reply	other threads:[~2022-02-16 14:24 UTC|newest]

Thread overview: 12+ messages
2022-02-15 14:51 [PATCH 0/5] Follow-up on high-order PCP caching Mel Gorman
2022-02-15 14:51 ` [PATCH 1/5] mm/page_alloc: Fetch the correct pcp buddy during bulk free Mel Gorman
2022-02-16 11:16   ` Vlastimil Babka
2022-02-15 14:51 ` [PATCH 2/5] mm/page_alloc: Track range of active PCP lists " Mel Gorman
2022-02-16 12:02   ` Vlastimil Babka
2022-02-16 13:05     ` Mel Gorman
2022-02-15 14:51 ` [PATCH 3/5] mm/page_alloc: Simplify how many pages are selected per pcp list " Mel Gorman
2022-02-16 12:20   ` Vlastimil Babka
2022-02-15 14:51 ` [PATCH 4/5] mm/page_alloc: Free pages in a single pass " Mel Gorman
2022-02-16 14:24   ` Vlastimil Babka [this message]
2022-02-15 14:51 ` [PATCH 5/5] mm/page_alloc: Limit number of high-order pages on PCP " Mel Gorman
2022-02-16 14:37   ` Vlastimil Babka
