From: Kemi Wang <>
To: Andrew Morton <>,
	Michal Hocko <>,
	Mel Gorman <>,
	Johannes Weiner <>,
	Christopher Lameter <>
Cc: Dave <>,
	Andi Kleen <>,
	Jesper Dangaard Brouer <>,
	Ying Huang <>, Aaron Lu <>,
	Tim Chen <>, Linux MM <>,
	Linux Kernel <>,
	Kemi Wang <>
Subject: [PATCH v2 0/3] Separate NUMA statistics from zone statistics
Date: Thu, 24 Aug 2017 17:59:58 +0800
Message-ID: <>

Each page allocation updates a set of per-zone statistics with a call to
zone_statistics(). As discussed at the 2017 MM summit, these updates are a
substantial source of overhead in the page allocator, and the statistics
are very rarely consumed. Much of the overhead is cache bouncing: the zone
counters (the NUMA-associated counters) are updated in parallel during
multi-threaded page allocation (pointed out by Dave Hansen).

A link to the MM summit slides:

To mitigate this overhead, this patchset separates the NUMA statistics from
the zone statistics framework and updates the NUMA counter threshold to a
fixed size of MAX_U16 - 2 (65533), since a small threshold greatly
increases how often the local per-cpu counter is folded into the global
counter (suggested by Ying Huang). The rationale is that these statistics
counters, unlike other VM counters, don't need to be read often, so it is
not a problem to use a large threshold and make readers more expensive.
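
The threshold trade-off described above can be modelled in user space. The
following is only a minimal sketch, not the code from the patches: every
sketch_* / SKETCH_* name is hypothetical, the CPU count is fixed at 4 for
illustration, and no locking or atomics are shown. Each CPU bumps a cheap
local 16-bit delta and only touches the shared global counter once the delta
exceeds the threshold; a reader pays for that by summing the pending deltas.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout: a user-space model, not kernel code. */
#define SKETCH_NR_CPUS 4
/* Fixed threshold from the cover letter: MAX_U16 - 2 = 65533. */
#define SKETCH_NUMA_THRESHOLD (UINT16_MAX - 2)

struct sketch_numa_counter {
	uint64_t global;                 /* shared; touching it bounces cache lines */
	uint16_t local[SKETCH_NR_CPUS];  /* per-CPU pending deltas; cheap to update */
};

/* Hot path: bump the local delta, fold it into the global counter
 * only once it exceeds the threshold. */
static void sketch_numa_inc(struct sketch_numa_counter *c, int cpu,
			    uint16_t threshold)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	uint16_t v = ++c->local[cpu];
	if (v > threshold) {
		c->global += v;
		c->local[cpu] = 0;
	}
}

/* Slow path: a reader adds every CPU's pending delta to the global value. */
static uint64_t sketch_numa_read(const struct sketch_numa_counter *c)
{
	uint64_t sum = c->global;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += c->local[cpu];
	return sum;
}
```

In this model, a threshold of 4 makes ten increments on one CPU write the
global counter twice, while SKETCH_NUMA_THRESHOLD would leave all ten in the
local delta until a reader sums them.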

With this patchset, we see a 31.3% drop in CPU cycles (537-->369, see below)
per single page allocation and reclaim on Jesper's page_bench03
benchmark. Meanwhile, this patchset keeps the same style of virtual memory
statistics with little end-user-visible effect (the numa stats only move
to be shown after the zone page stats; see the first patch for details).

I ran an experiment of concurrent single page allocation and reclaim
using Jesper's page_bench03 benchmark on a 2-socket Broadwell-based server
(88 processors with 126G memory) with different pcp threshold sizes.

Benchmark provided by Jesper D Brouer (loop count increased to 10000000):

   Threshold   CPU cycles     Throughput (88 threads)
      32       799            241760478
      64       640            301628829
      125      537            358906028            <==> system default
      256      468            412397590
      512      428            450550704
      4096     399            482520943
      20000    394            489009617
      30000    395            488017817
      65533    369 (-31.3%)   521661345 (+45.3%)   <==> with this patchset
      N/A      342 (-36.3%)   562900157 (+56.8%)   <==> zone_statistics disabled
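
The percentages in the table are relative to the default-threshold row
(125: 537 cycles, 358906028 throughput). As a small sketch of that
arithmetic, with a hypothetical helper name that appears nowhere in the
patches:

```c
#include <assert.h>

/* Hypothetical helper: percent change of value relative to baseline. */
static double sketch_pct_change(double baseline, double value)
{
	assert(baseline != 0.0);
	return (value - baseline) / baseline * 100.0;
}
```

For example, sketch_pct_change(537, 369) is about -31.3 and
sketch_pct_change(358906028, 521661345) is about +45.3, matching the
"with this patchset" row.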

Kemi Wang (3):
  mm: Change the call sites of numa statistics items
  mm: Update NUMA counter threshold size
  mm: Consider the number in local CPUs when *reads* NUMA stats

 drivers/base/node.c    |  22 ++++---
 include/linux/mmzone.h |  24 +++++---
 include/linux/vmstat.h |  33 +++++++++++
 mm/page_alloc.c        |  10 ++--
 mm/vmstat.c            | 152 +++++++++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 217 insertions(+), 24 deletions(-)



Thread overview: 6+ messages
2017-08-24  9:59 Kemi Wang [this message]
2017-08-24  9:59 ` [PATCH v2 1/3] mm: Change the call sites of numa statistics items Kemi Wang
2017-08-24 10:00 ` [PATCH v2 2/3] mm: Update NUMA counter threshold size Kemi Wang
2017-08-24 10:00 ` [PATCH v2 3/3] mm: Consider the number in local CPUs when *reads* NUMA stats Kemi Wang
2017-08-25  8:04 ` [PATCH v2 0/3] Separate NUMA statistics from zone statistics Mel Gorman
2017-08-25 13:16   ` Jesper Dangaard Brouer
