linux-mm.kvack.org archive mirror
* [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support
@ 2021-10-08 16:19 Nicolas Saenz Julienne
  2021-10-08 16:19 ` [RFC 1/3] mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments Nicolas Saenz Julienne
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Nicolas Saenz Julienne @ 2021-10-08 16:19 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, vbabka, cl, paulmck, ppandit,
	Nicolas Saenz Julienne

This series replaces mm/page_alloc's per-cpu lists drain mechanism so that it
can be run remotely. Currently, only the local CPU is permitted to
change its per-cpu lists, and it's expected to do so, on-demand, whenever a
process demands it (by means of queueing a drain task on the local CPU). Most
systems will handle this promptly, but it'll cause problems for NOHZ_FULL CPUs
that can't take any sort of interruption without breaking their functional
guarantees (latency, bandwidth, etc...). Having a way for these processes to
remotely drain the lists themselves will make co-existing with isolated CPUs
possible, and comes with minimal performance[1]/memory cost to other users.

The new algorithm will atomically switch the pointer to the per-cpu lists and
use RCU to make sure it's not being used before draining them. 
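
In a nutshell, the drain path of patch 3 boils down to the sketch below
(heavily simplified: the real code filters by zone, skips CPUs whose lists are
empty and handles the force_all_cpus case; the function name is made up for
illustration):

	/* Rough sketch of the new drain, see __drain_all_pages() in patch 3. */
	static void remote_drain_sketch(struct zone *zone)
	{
		int cpu;

		mutex_lock(&pcpu_drain_mutex);

		/* 1) Swap each CPU's live lists ('lp') with its empty spare set. */
		for_each_online_cpu(cpu) {
			struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);

			pcp->drain = rcu_replace_pointer(pcp->lp, pcp->drain,
						mutex_is_locked(&pcpu_drain_mutex));
		}

		/* 2) Wait for all concurrent users of the old pointers to move on. */
		synchronize_rcu_expedited();

		/* 3) Free whatever is left on the now-unreferenced old lists. */
		for_each_online_cpu(cpu) {
			struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);

			if (pcp->drain->count)
				free_pcppages_bulk(zone, pcp->drain->count, pcp, pcp->drain);
		}

		mutex_unlock(&pcpu_drain_mutex);
	}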

I'm interested in any sort of feedback, but especially validating that the
approach is acceptable, and any tests/benchmarks you'd like to see run against
it. For now, I've been testing this successfully on both arm64 and x86_64
systems while forcing high memory pressure (i.e. forcing the
page_alloc's slow path).

Patches 1-2 serve as cleanups/preparation to make patch 3 easier to follow.

Here's my previous attempt at fixing this:
https://lkml.org/lkml/2021/9/21/599

[1] Proper performance numbers will be provided if the approach is deemed
    acceptable. That said, mm/page_alloc.c's fast paths only grow by an extra
    pointer indirection and a compiler barrier, which I think is unlikely to be
    measurable.

---

Nicolas Saenz Julienne (3):
  mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments
  mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly
  mm/page_alloc: Add remote draining support to per-cpu lists

 include/linux/mmzone.h |  24 +++++-
 mm/page_alloc.c        | 173 +++++++++++++++++++++--------------------
 mm/vmstat.c            |   6 +-
 3 files changed, 114 insertions(+), 89 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [RFC 1/3] mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments
  2021-10-08 16:19 [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Nicolas Saenz Julienne
@ 2021-10-08 16:19 ` Nicolas Saenz Julienne
  2021-10-08 16:19 ` [RFC 2/3] mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly Nicolas Saenz Julienne
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Nicolas Saenz Julienne @ 2021-10-08 16:19 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, vbabka, cl, paulmck, ppandit,
	Nicolas Saenz Julienne

Both users of __rmqueue_pcplist() use the same means to extract the
right list from their per-cpu lists: calculate the index based on the
page's migratetype and order. This data is already being passed to
__rmqueue_pcplist(), so centralize the list extraction process inside
the function.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
---
 mm/page_alloc.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b37435c274cf..dd89933503b4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3600,11 +3600,13 @@ static inline
 struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 			int migratetype,
 			unsigned int alloc_flags,
-			struct per_cpu_pages *pcp,
-			struct list_head *list)
+			struct per_cpu_pages *pcp)
 {
+	struct list_head *list;
 	struct page *page;
 
+	list = &pcp->lists[order_to_pindex(migratetype, order)];
+
 	do {
 		if (list_empty(list)) {
 			int batch = READ_ONCE(pcp->batch);
@@ -3643,7 +3645,6 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 			unsigned int alloc_flags)
 {
 	struct per_cpu_pages *pcp;
-	struct list_head *list;
 	struct page *page;
 	unsigned long flags;
 
@@ -3656,8 +3657,7 @@ static struct page *rmqueue_pcplist(struct zone *preferred_zone,
 	 */
 	pcp = this_cpu_ptr(zone->per_cpu_pageset);
 	pcp->free_factor >>= 1;
-	list = &pcp->lists[order_to_pindex(migratetype, order)];
-	page = __rmqueue_pcplist(zone, order, migratetype, alloc_flags, pcp, list);
+	page = __rmqueue_pcplist(zone, order, migratetype, alloc_flags, pcp);
 	local_unlock_irqrestore(&pagesets.lock, flags);
 	if (page) {
 		__count_zid_vm_events(PGALLOC, page_zonenum(page), 1);
@@ -5202,7 +5202,6 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	struct zone *zone;
 	struct zoneref *z;
 	struct per_cpu_pages *pcp;
-	struct list_head *pcp_list;
 	struct alloc_context ac;
 	gfp_t alloc_gfp;
 	unsigned int alloc_flags = ALLOC_WMARK_LOW;
@@ -5278,7 +5277,6 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 	/* Attempt the batch allocation */
 	local_lock_irqsave(&pagesets.lock, flags);
 	pcp = this_cpu_ptr(zone->per_cpu_pageset);
-	pcp_list = &pcp->lists[order_to_pindex(ac.migratetype, 0)];
 
 	while (nr_populated < nr_pages) {
 
@@ -5288,8 +5286,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
 			continue;
 		}
 
-		page = __rmqueue_pcplist(zone, 0, ac.migratetype, alloc_flags,
-								pcp, pcp_list);
+		page = __rmqueue_pcplist(zone, 0, ac.migratetype, alloc_flags, pcp);
 		if (unlikely(!page)) {
 			/* Try and get at least one page */
 			if (!nr_populated)
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC 2/3] mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly
  2021-10-08 16:19 [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Nicolas Saenz Julienne
  2021-10-08 16:19 ` [RFC 1/3] mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments Nicolas Saenz Julienne
@ 2021-10-08 16:19 ` Nicolas Saenz Julienne
  2021-10-08 16:19 ` [RFC 3/3] mm/page_alloc: Add remote draining support to per-cpu lists Nicolas Saenz Julienne
  2021-10-12 15:45 ` [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Vlastimil Babka
  3 siblings, 0 replies; 7+ messages in thread
From: Nicolas Saenz Julienne @ 2021-10-08 16:19 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, vbabka, cl, paulmck, ppandit,
	Nicolas Saenz Julienne

In preparation for adding remote pcplists drain support, let's bundle
'struct per_cpu_pages' list heads and page count into a new structure,
'struct pcplists', and have all code access it indirectly through a
pointer. It'll be used by upcoming patches, which will maintain multiple
versions of pcplists and switch the pointer atomically.

free_pcppages_bulk() also gains a new argument, since we want to avoid
dereferencing the pcplists pointer twice per critical section (delimited
by the pagesets local lock).

'struct pcplists' data is marked as __private, to make sure nobody
accesses it directly, except for the initialization code. Note that
'struct per_cpu_pages' is used during boot, when no allocation is
possible, which is why the lists are embedded in it rather than
allocated separately.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>
---
 include/linux/mmzone.h | 10 +++++--
 mm/page_alloc.c        | 66 +++++++++++++++++++++++++-----------------
 mm/vmstat.c            |  6 ++--
 3 files changed, 49 insertions(+), 33 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6a1d79d84675..fb023da9a181 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -358,7 +358,6 @@ enum zone_watermarks {
 
 /* Fields and list protected by pagesets local_lock in page_alloc.c */
 struct per_cpu_pages {
-	int count;		/* number of pages in the list */
 	int high;		/* high watermark, emptying needed */
 	int batch;		/* chunk size for buddy add/remove */
 	short free_factor;	/* batch scaling factor during free */
@@ -366,8 +365,13 @@ struct per_cpu_pages {
 	short expire;		/* When 0, remote pagesets are drained */
 #endif
 
-	/* Lists of pages, one per migrate type stored on the pcp-lists */
-	struct list_head lists[NR_PCP_LISTS];
+	struct pcplists *lp;
+	struct pcplists {
+		/* Number of pages in the lists */
+		int count;
+		/* Lists of pages, one per migrate type stored on the pcp-lists */
+		struct list_head lists[NR_PCP_LISTS];
+	} __private pcplists;
 };
 
 struct per_cpu_zonestat {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index dd89933503b4..842816f269da 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1438,7 +1438,8 @@ static inline void prefetch_buddy(struct page *page)
  * pinned" detection logic.
  */
 static void free_pcppages_bulk(struct zone *zone, int count,
-					struct per_cpu_pages *pcp)
+			       struct per_cpu_pages *pcp,
+			       struct pcplists *lp)
 {
 	int pindex = 0;
 	int batch_free = 0;
@@ -1453,7 +1454,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	 * Ensure proper count is passed which otherwise would stuck in the
 	 * below while (list_empty(list)) loop.
 	 */
-	count = min(pcp->count, count);
+	count = min(lp->count, count);
 	while (count > 0) {
 		struct list_head *list;
 
@@ -1468,7 +1469,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 			batch_free++;
 			if (++pindex == NR_PCP_LISTS)
 				pindex = 0;
-			list = &pcp->lists[pindex];
+			list = &lp->lists[pindex];
 		} while (list_empty(list));
 
 		/* This is the only non-empty list. Free them all. */
@@ -1508,7 +1509,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 			}
 		} while (count > 0 && --batch_free && !list_empty(list));
 	}
-	pcp->count -= nr_freed;
+	lp->count -= nr_freed;
 
 	/*
 	 * local_lock_irq held so equivalent to spin_lock_irqsave for
@@ -3069,14 +3070,16 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
  */
 void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
 {
+	struct pcplists *lp;
 	unsigned long flags;
 	int to_drain, batch;
 
 	local_lock_irqsave(&pagesets.lock, flags);
 	batch = READ_ONCE(pcp->batch);
-	to_drain = min(pcp->count, batch);
+	lp = pcp->lp;
+	to_drain = min(lp->count, batch);
 	if (to_drain > 0)
-		free_pcppages_bulk(zone, to_drain, pcp);
+		free_pcppages_bulk(zone, to_drain, pcp, lp);
 	local_unlock_irqrestore(&pagesets.lock, flags);
 }
 #endif
@@ -3092,12 +3095,14 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone)
 {
 	unsigned long flags;
 	struct per_cpu_pages *pcp;
+	struct pcplists *lp;
 
 	local_lock_irqsave(&pagesets.lock, flags);
 
 	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
-	if (pcp->count)
-		free_pcppages_bulk(zone, pcp->count, pcp);
+	lp = pcp->lp;
+	if (lp->count)
+		free_pcppages_bulk(zone, lp->count, pcp, lp);
 
 	local_unlock_irqrestore(&pagesets.lock, flags);
 }
@@ -3158,7 +3163,7 @@ static void drain_local_pages_wq(struct work_struct *work)
  *
  * drain_all_pages() is optimized to only execute on cpus where pcplists are
  * not empty. The check for non-emptiness can however race with a free to
- * pcplist that has not yet increased the pcp->count from 0 to 1. Callers
+ * pcplist that has not yet increased the lp->count from 0 to 1. Callers
  * that need the guarantee that every CPU has drained can disable the
  * optimizing racy check.
  */
@@ -3200,21 +3205,22 @@ static void __drain_all_pages(struct zone *zone, bool force_all_cpus)
 		struct per_cpu_pages *pcp;
 		struct zone *z;
 		bool has_pcps = false;
+		struct pcplists *lp;
 
 		if (force_all_cpus) {
 			/*
-			 * The pcp.count check is racy, some callers need a
+			 * The lp->count check is racy, some callers need a
 			 * guarantee that no cpu is missed.
 			 */
 			has_pcps = true;
 		} else if (zone) {
-			pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
-			if (pcp->count)
+			lp = per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp;
+			if (lp->count)
 				has_pcps = true;
 		} else {
 			for_each_populated_zone(z) {
-				pcp = per_cpu_ptr(z->per_cpu_pageset, cpu);
-				if (pcp->count) {
+				lp = per_cpu_ptr(z->per_cpu_pageset, cpu)->lp;
+				if (lp->count) {
 					has_pcps = true;
 					break;
 				}
@@ -3366,19 +3372,21 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn,
 {
 	struct zone *zone = page_zone(page);
 	struct per_cpu_pages *pcp;
+	struct pcplists *lp;
 	int high;
 	int pindex;
 
 	__count_vm_event(PGFREE);
 	pcp = this_cpu_ptr(zone->per_cpu_pageset);
+	lp = pcp->lp;
 	pindex = order_to_pindex(migratetype, order);
-	list_add(&page->lru, &pcp->lists[pindex]);
-	pcp->count += 1 << order;
+	list_add(&page->lru, &lp->lists[pindex]);
+	lp->count += 1 << order;
 	high = nr_pcp_high(pcp, zone);
-	if (pcp->count >= high) {
+	if (lp->count >= high) {
 		int batch = READ_ONCE(pcp->batch);
 
-		free_pcppages_bulk(zone, nr_pcp_free(pcp, high, batch), pcp);
+		free_pcppages_bulk(zone, nr_pcp_free(pcp, high, batch), pcp, lp);
 	}
 }
 
@@ -3603,9 +3611,11 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 			struct per_cpu_pages *pcp)
 {
 	struct list_head *list;
+	struct pcplists *lp;
 	struct page *page;
 
-	list = &pcp->lists[order_to_pindex(migratetype, order)];
+	lp = pcp->lp;
+	list = &lp->lists[order_to_pindex(migratetype, order)];
 
 	do {
 		if (list_empty(list)) {
@@ -3625,14 +3635,14 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 					batch, list,
 					migratetype, alloc_flags);
 
-			pcp->count += alloced << order;
+			lp->count += alloced << order;
 			if (unlikely(list_empty(list)))
 				return NULL;
 		}
 
 		page = list_first_entry(list, struct page, lru);
 		list_del(&page->lru);
-		pcp->count -= 1 << order;
+		lp->count -= 1 << order;
 	} while (check_new_pcp(page));
 
 	return page;
@@ -5877,7 +5887,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			continue;
 
 		for_each_online_cpu(cpu)
-			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->count;
+			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp->count;
 	}
 
 	printk("active_anon:%lu inactive_anon:%lu isolated_anon:%lu\n"
@@ -5971,7 +5981,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 
 		free_pcp = 0;
 		for_each_online_cpu(cpu)
-			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->count;
+			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp->count;
 
 		show_node(zone);
 		printk(KERN_CONT
@@ -6012,7 +6022,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			K(zone_page_state(zone, NR_MLOCK)),
 			K(zone_page_state(zone, NR_BOUNCE)),
 			K(free_pcp),
-			K(this_cpu_read(zone->per_cpu_pageset->count)),
+			K(this_cpu_read(zone->per_cpu_pageset)->lp->count),
 			K(zone_page_state(zone, NR_FREE_CMA_PAGES)));
 		printk("lowmem_reserve[]:");
 		for (i = 0; i < MAX_NR_ZONES; i++)
@@ -6848,7 +6858,7 @@ static int zone_highsize(struct zone *zone, int batch, int cpu_online)
 
 /*
  * pcp->high and pcp->batch values are related and generally batch is lower
- * than high. They are also related to pcp->count such that count is lower
+ * than high. They are also related to pcp->lp->count such that count is lower
  * than high, and as soon as it reaches high, the pcplist is flushed.
  *
  * However, guaranteeing these relations at all times would require e.g. write
@@ -6856,7 +6866,7 @@ static int zone_highsize(struct zone *zone, int batch, int cpu_online)
  * thus be prone to error and bad for performance. Thus the update only prevents
  * store tearing. Any new users of pcp->batch and pcp->high should ensure they
  * can cope with those fields changing asynchronously, and fully trust only the
- * pcp->count field on the local CPU with interrupts disabled.
+ * pcp->lp->count field on the local CPU with interrupts disabled.
  *
  * mutex_is_locked(&pcp_batch_high_lock) required when calling this function
  * outside of boot time (or some other assurance that no concurrent updaters
@@ -6876,8 +6886,10 @@ static void per_cpu_pages_init(struct per_cpu_pages *pcp, struct per_cpu_zonesta
 	memset(pcp, 0, sizeof(*pcp));
 	memset(pzstats, 0, sizeof(*pzstats));
 
+	pcp->lp = &ACCESS_PRIVATE(pcp, pcplists);
+
 	for (pindex = 0; pindex < NR_PCP_LISTS; pindex++)
-		INIT_LIST_HEAD(&pcp->lists[pindex]);
+		INIT_LIST_HEAD(&pcp->lp->lists[pindex]);
 
 	/*
 	 * Set batch and high values safe for a boot pageset. A true percpu
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 8ce2620344b2..5279d3f34e0b 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -856,7 +856,7 @@ static int refresh_cpu_vm_stats(bool do_pagesets)
 			 * if not then there is nothing to expire.
 			 */
 			if (!__this_cpu_read(pcp->expire) ||
-			       !__this_cpu_read(pcp->count))
+			       !this_cpu_ptr(pcp)->lp->count)
 				continue;
 
 			/*
@@ -870,7 +870,7 @@ static int refresh_cpu_vm_stats(bool do_pagesets)
 			if (__this_cpu_dec_return(pcp->expire))
 				continue;
 
-			if (__this_cpu_read(pcp->count)) {
+			if (this_cpu_ptr(pcp)->lp->count) {
 				drain_zone_pages(zone, this_cpu_ptr(pcp));
 				changes++;
 			}
@@ -1707,7 +1707,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 			   "\n              high:  %i"
 			   "\n              batch: %i",
 			   i,
-			   pcp->count,
+			   pcp->lp->count,
 			   pcp->high,
 			   pcp->batch);
 #ifdef CONFIG_SMP
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [RFC 3/3] mm/page_alloc: Add remote draining support to per-cpu lists
  2021-10-08 16:19 [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Nicolas Saenz Julienne
  2021-10-08 16:19 ` [RFC 1/3] mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments Nicolas Saenz Julienne
  2021-10-08 16:19 ` [RFC 2/3] mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly Nicolas Saenz Julienne
@ 2021-10-08 16:19 ` Nicolas Saenz Julienne
  2021-10-12 15:45 ` [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Vlastimil Babka
  3 siblings, 0 replies; 7+ messages in thread
From: Nicolas Saenz Julienne @ 2021-10-08 16:19 UTC (permalink / raw)
  To: akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, vbabka, cl, paulmck, ppandit,
	Nicolas Saenz Julienne

page_alloc.c's per-cpu page lists are currently protected using local
locks. While good for performance, this doesn't allow for remote access to
these structures. CPUs requiring a system-wide per-cpu list drain get
around this by scheduling drain work on all CPUs. That said, some setups,
such as systems with NOHZ_FULL CPUs, aren't well suited to this, as they
can't handle interruptions of any sort.

To mitigate this, replace the current draining mechanism with one that
allows remotely draining the lists in a lock-less manner. It leverages
the fact that the per-cpu page lists are accessed through indirection,
and that the pointer can be updated atomically. Upon draining we now:

 - Atomically switch the per-cpu lists pointers to ones pointing to
   empty lists.

 - Wait for a grace period so that all concurrent writers holding the
   old per-cpu lists pointer have finished updating them[1].

 - Remotely flush the old lists now that we know nobody holds a
   reference to them. Concurrent access to the drain process is
   protected by a mutex.

RCU guarantees atomicity both when dereferencing the per-cpu lists
pointer and when replacing it. It also checks for RCU critical
section/locking correctness, as all writers have to hold their per-cpu
pageset's local lock. Memory ordering on both pointers' data is
guaranteed by synchronize_rcu() and the 'pcpu_drain_mutex'. Also,
synchronize_rcu_expedited() is used to minimize hangs during low memory
situations.

Accesses to the pcplists like the ones in mm/vmstat.c don't require RCU
supervision since they can handle outdated data, but they do use
READ_ONCE() in order to avoid compiler weirdness and be explicit about
the concurrent nature of the pcplists pointer.

As a side effect of all this, we now have to promote the spin_lock() in
free_pcppages_bulk() to spin_lock_irqsave(), since not all of the function's
users enter with interrupts disabled.

Signed-off-by: Nicolas Saenz Julienne <nsaenzju@redhat.com>

[1] Note that whatever concurrent writers were doing, the result was
    going to be flushed anyway, as the old mechanism relied on disabling
    preemption for serialization; the per-cpu drain work was already
    stepping over whatever was being processed concurrently with the
    drain call.
---
 include/linux/mmzone.h |  18 ++++++-
 mm/page_alloc.c        | 114 ++++++++++++++++++++---------------------
 mm/vmstat.c            |   6 +--
 3 files changed, 75 insertions(+), 63 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index fb023da9a181..c112e7831c54 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -365,13 +365,27 @@ struct per_cpu_pages {
 	short expire;		/* When 0, remote pagesets are drained */
 #endif
 
-	struct pcplists *lp;
+	/*
+	 * Having two pcplists allows us to remotely flush them in a lock-less
+	 * manner: we atomically switch the 'lp' and 'drain' pointers, wait a
+	 * grace period to synchronize against concurrent users of 'lp', and
+	 * safely free whatever is left in 'drain'.
+	 *
+	 * All accesses to 'lp' are protected by local locks, which also serve
+	 * as RCU critical section delimiters. 'lp' should only be dereferenced
+	 * *once* per critical section.
+	 *
+	 * See mm/page_alloc.c's __drain_all_pages() for the bulk of the remote
+	 * drain implementation.
+	 */
+	struct pcplists __rcu *lp;
+	struct pcplists *drain;
 	struct pcplists {
 		/* Number of pages in the lists */
 		int count;
 		/* Lists of pages, one per migrate type stored on the pcp-lists */
 		struct list_head lists[NR_PCP_LISTS];
-	} __private pcplists;
+	} __private pcplists[2];
 };
 
 struct per_cpu_zonestat {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 842816f269da..d56d06dde66a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -147,13 +147,7 @@ DEFINE_PER_CPU(int, _numa_mem_);		/* Kernel "local memory" node */
 EXPORT_PER_CPU_SYMBOL(_numa_mem_);
 #endif
 
-/* work_structs for global per-cpu drains */
-struct pcpu_drain {
-	struct zone *zone;
-	struct work_struct work;
-};
 static DEFINE_MUTEX(pcpu_drain_mutex);
-static DEFINE_PER_CPU(struct pcpu_drain, pcpu_drain);
 
 #ifdef CONFIG_GCC_PLUGIN_LATENT_ENTROPY
 volatile unsigned long latent_entropy __latent_entropy;
@@ -1448,6 +1442,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	int prefetch_nr = READ_ONCE(pcp->batch);
 	bool isolated_pageblocks;
 	struct page *page, *tmp;
+	unsigned long flags;
 	LIST_HEAD(head);
 
 	/*
@@ -1511,11 +1506,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 	}
 	lp->count -= nr_freed;
 
-	/*
-	 * local_lock_irq held so equivalent to spin_lock_irqsave for
-	 * both PREEMPT_RT and non-PREEMPT_RT configurations.
-	 */
-	spin_lock(&zone->lock);
+	spin_lock_irqsave(&zone->lock, flags);
 	isolated_pageblocks = has_isolate_pageblock(zone);
 
 	/*
@@ -1538,7 +1529,7 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 		__free_one_page(page, page_to_pfn(page), zone, order, mt, FPI_NONE);
 		trace_mm_page_pcpu_drain(page, order, mt);
 	}
-	spin_unlock(&zone->lock);
+	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
 static void free_one_page(struct zone *zone,
@@ -3076,7 +3067,7 @@ void drain_zone_pages(struct zone *zone, struct per_cpu_pages *pcp)
 
 	local_lock_irqsave(&pagesets.lock, flags);
 	batch = READ_ONCE(pcp->batch);
-	lp = pcp->lp;
+	lp = rcu_dereference_check(pcp->lp, lockdep_is_held(this_cpu_ptr(&pagesets.lock)));
 	to_drain = min(lp->count, batch);
 	if (to_drain > 0)
 		free_pcppages_bulk(zone, to_drain, pcp, lp);
@@ -3100,7 +3091,7 @@ static void drain_pages_zone(unsigned int cpu, struct zone *zone)
 	local_lock_irqsave(&pagesets.lock, flags);
 
 	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
-	lp = pcp->lp;
+	lp = rcu_dereference_check(pcp->lp, lockdep_is_held(this_cpu_ptr(&pagesets.lock)));
 	if (lp->count)
 		free_pcppages_bulk(zone, lp->count, pcp, lp);
 
@@ -3139,24 +3130,6 @@ void drain_local_pages(struct zone *zone)
 		drain_pages(cpu);
 }
 
-static void drain_local_pages_wq(struct work_struct *work)
-{
-	struct pcpu_drain *drain;
-
-	drain = container_of(work, struct pcpu_drain, work);
-
-	/*
-	 * drain_all_pages doesn't use proper cpu hotplug protection so
-	 * we can race with cpu offline when the WQ can move this from
-	 * a cpu pinned worker to an unbound one. We can operate on a different
-	 * cpu which is alright but we also have to make sure to not move to
-	 * a different one.
-	 */
-	preempt_disable();
-	drain_local_pages(drain->zone);
-	preempt_enable();
-}
-
 /*
  * The implementation of drain_all_pages(), exposing an extra parameter to
  * drain on all cpus.
@@ -3169,6 +3142,8 @@ static void drain_local_pages_wq(struct work_struct *work)
  */
 static void __drain_all_pages(struct zone *zone, bool force_all_cpus)
 {
+	struct per_cpu_pages *pcp;
+	struct zone *z;
 	int cpu;
 
 	/*
@@ -3177,13 +3152,6 @@ static void __drain_all_pages(struct zone *zone, bool force_all_cpus)
 	 */
 	static cpumask_t cpus_with_pcps;
 
-	/*
-	 * Make sure nobody triggers this path before mm_percpu_wq is fully
-	 * initialized.
-	 */
-	if (WARN_ON_ONCE(!mm_percpu_wq))
-		return;
-
 	/*
 	 * Do not drain if one is already in progress unless it's specific to
 	 * a zone. Such callers are primarily CMA and memory hotplug and need
@@ -3202,8 +3170,6 @@ static void __drain_all_pages(struct zone *zone, bool force_all_cpus)
 	 * disables preemption as part of its processing
 	 */
 	for_each_online_cpu(cpu) {
-		struct per_cpu_pages *pcp;
-		struct zone *z;
 		bool has_pcps = false;
 		struct pcplists *lp;
 
@@ -3214,12 +3180,12 @@ static void __drain_all_pages(struct zone *zone, bool force_all_cpus)
 			 */
 			has_pcps = true;
 		} else if (zone) {
-			lp = per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp;
+			lp = READ_ONCE(per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp);
 			if (lp->count)
 				has_pcps = true;
 		} else {
 			for_each_populated_zone(z) {
-				lp = per_cpu_ptr(z->per_cpu_pageset, cpu)->lp;
+				lp = READ_ONCE(per_cpu_ptr(z->per_cpu_pageset, cpu)->lp);
 				if (lp->count) {
 					has_pcps = true;
 					break;
@@ -3233,16 +3199,37 @@ static void __drain_all_pages(struct zone *zone, bool force_all_cpus)
 			cpumask_clear_cpu(cpu, &cpus_with_pcps);
 	}
 
+	if (!force_all_cpus && cpumask_empty(&cpus_with_pcps))
+	       goto exit;
+
+	for_each_cpu(cpu, &cpus_with_pcps) {
+	       for_each_populated_zone(z) {
+		       if (zone && zone != z)
+			       continue;
+
+		       pcp = per_cpu_ptr(z->per_cpu_pageset, cpu);
+		       pcp->drain = rcu_replace_pointer(pcp->lp, pcp->drain,
+					       mutex_is_locked(&pcpu_drain_mutex));
+	       }
+	}
+
+	synchronize_rcu_expedited();
+
 	for_each_cpu(cpu, &cpus_with_pcps) {
-		struct pcpu_drain *drain = per_cpu_ptr(&pcpu_drain, cpu);
+		for_each_populated_zone(z) {
+			int count;
+
+			pcp = per_cpu_ptr(z->per_cpu_pageset, cpu);
+			count = pcp->drain->count;
+			if (!count)
+			       continue;
 
-		drain->zone = zone;
-		INIT_WORK(&drain->work, drain_local_pages_wq);
-		queue_work_on(cpu, mm_percpu_wq, &drain->work);
+			free_pcppages_bulk(z, count, pcp, pcp->drain);
+			VM_BUG_ON(pcp->drain->count);
+		}
 	}
-	for_each_cpu(cpu, &cpus_with_pcps)
-		flush_work(&per_cpu_ptr(&pcpu_drain, cpu)->work);
 
+exit:
 	mutex_unlock(&pcpu_drain_mutex);
 }
 
@@ -3378,7 +3365,7 @@ static void free_unref_page_commit(struct page *page, unsigned long pfn,
 
 	__count_vm_event(PGFREE);
 	pcp = this_cpu_ptr(zone->per_cpu_pageset);
-	lp = pcp->lp;
+	lp = rcu_dereference_check(pcp->lp, lockdep_is_held(this_cpu_ptr(&pagesets.lock)));
 	pindex = order_to_pindex(migratetype, order);
 	list_add(&page->lru, &lp->lists[pindex]);
 	lp->count += 1 << order;
@@ -3614,7 +3601,7 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 	struct pcplists *lp;
 	struct page *page;
 
-	lp = pcp->lp;
+	lp = rcu_dereference_check(pcp->lp, lockdep_is_held(this_cpu_ptr(&pagesets.lock)));
 	list = &lp->lists[order_to_pindex(migratetype, order)];
 
 	do {
@@ -5886,8 +5873,12 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 		if (show_mem_node_skip(filter, zone_to_nid(zone), nodemask))
 			continue;
 
-		for_each_online_cpu(cpu)
-			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp->count;
+		for_each_online_cpu(cpu) {
+			struct pcplists *lp;
+
+			lp = READ_ONCE(per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp);
+			free_pcp += lp->count;
+		}
 	}
 
 	printk("active_anon:%lu inactive_anon:%lu isolated_anon:%lu\n"
@@ -5980,8 +5971,12 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			continue;
 
 		free_pcp = 0;
-		for_each_online_cpu(cpu)
-			free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp->count;
+		for_each_online_cpu(cpu) {
+			struct pcplists *lp;
+
+			lp = READ_ONCE(per_cpu_ptr(zone->per_cpu_pageset, cpu)->lp);
+			free_pcp += lp->count;
+		}
 
 		show_node(zone);
 		printk(KERN_CONT
@@ -6022,7 +6017,7 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			K(zone_page_state(zone, NR_MLOCK)),
 			K(zone_page_state(zone, NR_BOUNCE)),
 			K(free_pcp),
-			K(this_cpu_read(zone->per_cpu_pageset)->lp->count),
+			K(READ_ONCE(this_cpu_ptr(zone->per_cpu_pageset)->lp)->count),
 			K(zone_page_state(zone, NR_FREE_CMA_PAGES)));
 		printk("lowmem_reserve[]:");
 		for (i = 0; i < MAX_NR_ZONES; i++)
@@ -6886,10 +6881,13 @@ static void per_cpu_pages_init(struct per_cpu_pages *pcp, struct per_cpu_zonesta
 	memset(pcp, 0, sizeof(*pcp));
 	memset(pzstats, 0, sizeof(*pzstats));
 
-	pcp->lp = &ACCESS_PRIVATE(pcp, pcplists);
+	pcp->lp = &ACCESS_PRIVATE(pcp, pcplists[0]);
+	pcp->drain = &ACCESS_PRIVATE(pcp, pcplists[1]);
 
-	for (pindex = 0; pindex < NR_PCP_LISTS; pindex++)
+	for (pindex = 0; pindex < NR_PCP_LISTS; pindex++) {
 		INIT_LIST_HEAD(&pcp->lp->lists[pindex]);
+		INIT_LIST_HEAD(&pcp->drain->lists[pindex]);
+	}
 
 	/*
 	 * Set batch and high values safe for a boot pageset. A true percpu
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 5279d3f34e0b..1ffa4fc64a4f 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -856,7 +856,7 @@ static int refresh_cpu_vm_stats(bool do_pagesets)
 			 * if not then there is nothing to expire.
 			 */
 			if (!__this_cpu_read(pcp->expire) ||
-			       !this_cpu_ptr(pcp)->lp->count)
+			       !READ_ONCE(this_cpu_ptr(pcp)->lp)->count)
 				continue;
 
 			/*
@@ -870,7 +870,7 @@ static int refresh_cpu_vm_stats(bool do_pagesets)
 			if (__this_cpu_dec_return(pcp->expire))
 				continue;
 
-			if (this_cpu_ptr(pcp)->lp->count) {
+			if (READ_ONCE(this_cpu_ptr(pcp)->lp)->count) {
 				drain_zone_pages(zone, this_cpu_ptr(pcp));
 				changes++;
 			}
@@ -1707,7 +1707,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 			   "\n              high:  %i"
 			   "\n              batch: %i",
 			   i,
-			   pcp->lp->count,
+			   READ_ONCE(pcp->lp)->count,
 			   pcp->high,
 			   pcp->batch);
 #ifdef CONFIG_SMP
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support
  2021-10-08 16:19 [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Nicolas Saenz Julienne
                   ` (2 preceding siblings ...)
  2021-10-08 16:19 ` [RFC 3/3] mm/page_alloc: Add remote draining support to per-cpu lists Nicolas Saenz Julienne
@ 2021-10-12 15:45 ` Vlastimil Babka
  2021-10-13 12:50   ` nsaenzju
  3 siblings, 1 reply; 7+ messages in thread
From: Vlastimil Babka @ 2021-10-12 15:45 UTC (permalink / raw)
  To: Nicolas Saenz Julienne, akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, cl, paulmck, ppandit

On 10/8/21 18:19, Nicolas Saenz Julienne wrote:
> This series replaces mm/page_alloc's per-cpu lists drain mechanism so that it
> can be run remotely. Currently, only the local CPU is permitted to
> change its per-cpu lists, and it's expected to do so, on-demand, whenever a
> process demands it (by means of queueing a drain task on the local CPU). Most
> systems will handle this promptly, but it'll cause problems for NOHZ_FULL CPUs
> that can't take any sort of interruption without breaking their functional
> guarantees (latency, bandwidth, etc...). Having a way for these processes to
> remotely drain the lists themselves will make co-existing with isolated CPUs
> possible, and comes with minimal performance[1]/memory cost to other users.
> 
> The new algorithm will atomically switch the pointer to the per-cpu lists and
> use RCU to make sure it's not being used before draining them. 
> 
> I'm interested in any sort of feedback, but especially validating that the
> approach is acceptable, and any tests/benchmarks you'd like to see run against

So let's consider the added alloc/free fast paths overhead:
- Patch 1 - __alloc_pages_bulk() used to determine pcp_list once, now it's
determined for each allocated page in __rmqueue_pcplist().
- Patch 2 - adds indirection from pcp->$foo to pcp->lp->$foo in each operation
- Patch 3
  - extra irqsave/irqrestore in free_pcppages_bulk (amortized)
  - rcu_dereference_check() in free_unref_page_commit() and __rmqueue_pcplist()

BTW - I'm not sure if the RCU usage is valid here.

The "read side" (normal operations) is using:
rcu_dereference_check(pcp->lp,
		lockdep_is_held(this_cpu_ptr(&pagesets.lock)));

where the lockdep parameter according to the comments for
rcu_dereference_check() means

"indicate to lockdep that foo->bar may only be dereferenced if either
rcu_read_lock() is held, or that the lock required to replace the bar struct
at foo->bar is held."

but you are not taking rcu_read_lock() and the "write side" (remote
draining) actually doesn't take pagesets.lock, so it's not true that the
"lock required to replace ... is held"? The write side uses
rcu_replace_pointer(...,
			mutex_is_locked(&pcpu_drain_mutex))
which is a different lock.

IOW, synchronize_rcu_expedited() AFAICS has nothing (no rcu_read_lock()) to
synchronize against? Might accidentally work on !RT thanks to disabled irqs,
but not sure about the RT lock semantics of the local_lock...

So back to overhead: if I'm correct above, we can assume that there would
also be rcu_read_lock() in the fast paths.

The alternative proposed by tglx was IIRC that there would be a spinlock on
each cpu, which would be mostly uncontended except when draining. Maybe an
uncontended spin lock/unlock would have lower overhead than all of the
above? It would be certainly simpler, so I would probably try that first and
see if it's acceptable?
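
For the sake of discussion, that could look roughly like the sketch below
(purely hypothetical, not part of this series; the 'lock' field is made up):

	/* Assume a new 'spinlock_t lock' field in struct per_cpu_pages. */

	/* Local alloc/free fast path: almost always uncontended. */
	spin_lock(&pcp->lock);
	/* ... add pages to / take pages from the pcp lists ... */
	spin_unlock(&pcp->lock);

	/* Remote drain: no per-CPU work items, no IPIs. */
	for_each_online_cpu(cpu) {
		struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);

		spin_lock(&pcp->lock);
		if (pcp->count)
			free_pcppages_bulk(zone, pcp->count, pcp);
		spin_unlock(&pcp->lock);
	}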

> it. For now, I've been testing this successfully on both arm64 and x86_64
> systems while forcing high memory pressure (i.e. forcing the
> page_alloc's slow path).
> 
> Patches 1-2 serve as cleanups/preparation to make patch 3 easier to follow.
> 
> Here's my previous attempt at fixing this:
> https://lkml.org/lkml/2021/9/21/599
> 
> [1] Proper performance numbers will be provided if the approach is deemed
>     acceptable. That said, mm/page_alloc.c's fast paths only grow by an extra
>     pointer indirection and a compiler barrier, which I think is unlikely to be
>     measurable.
> 
> ---
> 
> Nicolas Saenz Julienne (3):
>   mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments
>   mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly
>   mm/page_alloc: Add remote draining support to per-cpu lists
> 
>  include/linux/mmzone.h |  24 +++++-
>  mm/page_alloc.c        | 173 +++++++++++++++++++++--------------------
>  mm/vmstat.c            |   6 +-
>  3 files changed, 114 insertions(+), 89 deletions(-)
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support
  2021-10-12 15:45 ` [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Vlastimil Babka
@ 2021-10-13 12:50   ` nsaenzju
  2021-10-21  8:27     ` Vlastimil Babka
  0 siblings, 1 reply; 7+ messages in thread
From: nsaenzju @ 2021-10-13 12:50 UTC (permalink / raw)
  To: Vlastimil Babka, akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, cl, paulmck, ppandit

Hi Vlastimil, thanks for spending time on this.
Also, excuse me if I over explain things.

On Tue, 2021-10-12 at 17:45 +0200, Vlastimil Babka wrote:
> On 10/8/21 18:19, Nicolas Saenz Julienne wrote:
> > This series replaces mm/page_alloc's per-cpu lists drain mechanism so that it
> > can be run remotely. Currently, only the local CPU is permitted to
> > change its per-cpu lists, and it's expected to do so, on-demand, whenever a
> > process demands it (by means of queueing a drain task on the local CPU). Most
> > systems will handle this promptly, but it'll cause problems for NOHZ_FULL CPUs
> > that can't take any sort of interruption without breaking their functional
> > guarantees (latency, bandwidth, etc...). Having a way for these processes to
> > remotely drain the lists themselves will make co-existing with isolated CPUs
> > possible, and comes with minimal performance[1]/memory cost to other users.
> > 
> > The new algorithm will atomically switch the pointer to the per-cpu lists and
> > use RCU to make sure it's not being used before draining them. 
> > 
> > I'm interested in any sort of feedback, but especially validating that the
> > approach is acceptable, and any tests/benchmarks you'd like to see run against
> 
> So let's consider the added alloc/free fast paths overhead:
> - Patch 1 - __alloc_pages_bulk() used to determine pcp_list once, now it's
> determined for each allocated page in __rmqueue_pcplist().

This one I can avoid; I missed the performance aspect of it. I was aiming at
making the code bearable.

> - Patch 2 - adds indirection from pcp->$foo to pcp->lp->$foo in each operation
> - Patch 3
>   - extra irqsave/irqrestore in free_pcppages_bulk (amortized)
>   - rcu_dereference_check() in free_unref_page_commit() and __rmqueue_pcplist()

Yes.

> BTW - I'm not sure if the RCU usage is valid here.
>
> The "read side" (normal operations) is using:
> rcu_dereference_check(pcp->lp,
> 		lockdep_is_held(this_cpu_ptr(&pagesets.lock)));
> 
> where the lockdep parameter according to the comments for
> rcu_dereference_check() means
> 
> "indicate to lockdep that foo->bar may only be dereferenced if either
> rcu_read_lock() is held, or that the lock required to replace the bar struct
> at foo->bar is held."

You missed the "Could be used to" at the beginning of the sentence :). That
said, I believe this is similar to what I'm doing, only that the situation is
more complex.

> but you are not taking rcu_read_lock() 

I am taking the rcu_read_lock() implicitly; it's explained in 'struct
per_cpu_pages', and in more depth below.

> and the "write side" (remote draining) actually doesn't take pagesets.lock,
> so it's not true that the "lock required to replace ... is held"? The write
> side uses rcu_replace_pointer(..., mutex_is_locked(&pcpu_drain_mutex))
> which is a different lock.

The thing 'pagesets.lock' protects against is concurrent access to pcp->lp's
content, as opposed to its address. pcp->lp is dereferenced atomically, so no
need for locking on that operation.

The drain side never accesses pcp->lp's contents concurrently; it changes
pcp->lp's address and makes sure all CPUs are in sync with the new address
before clearing the stale data.

Just for the record, I think a better representation of what the 'check' in
rcu_dereference_check() means is:

 * Do an rcu_dereference(), but check that the conditions under which the
 * dereference will take place are correct.  Typically the conditions
 * indicate the various locking conditions that should be held at that
 * point. The check should return true if the conditions are satisfied.
 * An implicit check for being in an RCU read-side critical section
 * (rcu_read_lock()) is included.

So for the read side, that is, code reading pcp->lp's address and its contents,
the conditions to be met are: being in an RCU critical section, to make sure RCU
is keeping track of it, and holding 'pagesets.lock', to avoid concurrently
accessing pcp->lp's contents. The latter is achieved either by disabling local
irqs or disabling migration and getting a per-cpu rt_spinlock. Conveniently
these are actions that implicitly delimit an RCU critical section (see [1] and
[2]). So the 'pagesets.lock' check fully covers the read side locking/RCU
concerns.

On the write side, the drain has to make sure the pcp->lp address change is
atomic (this is achieved through rcu_replace_pointer()) and that pcp->drain is
emptied before a new pointer swap happens. So checking for pcpu_drain_mutex
being held is good enough.
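
Or, to put the pairing in code (condensed from patch 3; the real code also
checks counts before freeing):

	/* Read side (alloc/free fast paths), always on the local CPU: */
	local_lock_irqsave(&pagesets.lock, flags);	/* implies an RCU read side */
	lp = rcu_dereference_check(pcp->lp,
			lockdep_is_held(this_cpu_ptr(&pagesets.lock)));
	/* ... use lp->lists / lp->count, dereferencing 'lp' only this once ... */
	local_unlock_irqrestore(&pagesets.lock, flags);

	/* Write side (remote drain), serialized by 'pcpu_drain_mutex': */
	pcp->drain = rcu_replace_pointer(pcp->lp, pcp->drain,
			mutex_is_locked(&pcpu_drain_mutex));
	synchronize_rcu_expedited();	/* waits out readers still using the old lp */
	free_pcppages_bulk(zone, pcp->drain->count, pcp, pcp->drain);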

> IOW, synchronize_rcu_expedited() AFAICS has nothing (no rcu_read_lock()) to
> synchronize against? Might accidentally work on !RT thanks to disabled irqs,
> but not sure about the RT lock semantics of the local_lock...
>
> So back to overhead: if I'm correct above, we can assume that there would
> also be rcu_read_lock() in the fast paths.

As I explained above, no need.

> The alternative proposed by tglx was IIRC that there would be a spinlock on
> each cpu, which would be mostly uncontended except when draining. Maybe an
> uncontended spin lock/unlock would have lower overhead than all of the
> above? It would be certainly simpler, so I would probably try that first and
> see if it's acceptable?

You have a point here. I'll provide a performance rundown of both solutions.
This one is a bit more complex that's for sure.

Thanks!

[1] See rcu_read_lock()'s description: "synchronize_rcu() wait for regions of
    code with preemption disabled, including regions of code with interrupts or
    softirqs disabled."

[2] See kernel/locking/spinlock_rt.c: "The RT [spinlock] substitutions
    explicitly disable migration and take rcu_read_lock() across the lock held
    section."

-- 
Nicolás Sáenz



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support
  2021-10-13 12:50   ` nsaenzju
@ 2021-10-21  8:27     ` Vlastimil Babka
  0 siblings, 0 replies; 7+ messages in thread
From: Vlastimil Babka @ 2021-10-21  8:27 UTC (permalink / raw)
  To: nsaenzju, akpm
  Cc: linux-kernel, linux-mm, frederic, tglx, peterz, mtosatti, nilal,
	mgorman, linux-rt-users, cl, paulmck, ppandit

On 10/13/21 14:50, nsaenzju@redhat.com wrote:
> Hi Vlastimil, thanks for spending time on this.
> Also, excuse me if I over explain things.

Hi, thanks for spending time on the explanation :) It was very useful.

...

> 
>> and the "write side" (remote draining) actually doesn't take pagesets.lock,
>> so it's not true that the "lock required to replace ... is held"? The write
>> side uses rcu_replace_pointer(..., mutex_is_locked(&pcpu_drain_mutex))
>> which is a different lock.
> 
> The thing 'pagesets.lock' protects against is concurrent access to pcp->lp's
> content, as opposed to its address. pcp->lp is dereferenced atomically, so no
> need for locking on that operation.
> 
> The drain side never accesses pcp->lp's contents concurrently, it changes
> pcp->lp's address and makes sure all CPUs are in sync with the new address
> before clearing the stale data.
> 
> Just for the record, I think a better representation of what 'check' in
> rcu_dereference means is:
> 
>  * Do an rcu_dereference(), but check that the conditions under which the
>  * dereference will take place are correct.  Typically the conditions
>  * indicate the various locking conditions that should be held at that
>  * point. The check should return true if the conditions are satisfied.
>  * An implicit check for being in an RCU read-side critical section
>  * (rcu_read_lock()) is included.
> 
> So for the read side, that is, code reading pcp->lp's address and its contents,
> the conditions to be met are: being in an RCU critical section, to make sure RCU
> is keeping track of it, and holding 'pagesets.lock', to avoid concurrently
> accessing pcp->lp's contents. The latter is achieved either by disabling local
> irqs or disabling migration and getting a per-cpu rt_spinlock. Conveniently
> these are actions that implicitly delimit an RCU critical section (see [1] and
> [2]). So the 'pagesets.lock' check fully covers the read side locking/RCU
> concerns.

Yeah, I wasn't aware of [2] especially. It makes sense that RT locks provide
the same guarantees for RCU as non-RT.

> On the write side, the drain has to make sure the pcp->lp address change is
> atomic (this is achieved through rcu_replace_pointer()) and that pcp->drain is
> emptied before a new pointer swap happens. So checking for pcpu_drain_mutex
> being held is good enough.
> 
>> IOW, synchronize_rcu_expedited() AFAICS has nothing (no rcu_read_lock()) to
>> synchronize against? Might accidentally work on !RT thanks to disabled irqs,
>> but not sure about the RT lock semantics of the local_lock...
>>
>> So back to overhead, if I'm correct above we can assume that there would be
>> also rcu_read_lock() in the fast paths.
> 
> As I explained above, no need.
> 
>> The alternative proposed by tglx was IIRC that there would be a spinlock on
>> each cpu, which would be mostly uncontended except when draining. Maybe an
>> uncontended spin lock/unlock would have lower overhead than all of the
>> above? It would be certainly simpler, so I would probably try that first and
>> see if it's acceptable?
> 
> You have a point here. I'll provide a performance rundown of both solutions.
> This one is a bit more complex that's for sure.

Great, thanks!

> Thanks!
> 
> [1] See rcu_read_lock()'s description: "synchronize_rcu() wait for regions of
>     code with preemption disabled, including regions of code with interrupts or
>     softirqs disabled."
> 
> [2] See kernel/locking/spinlock_rt.c: "The RT [spinlock] substitutions
>     explicitly disable migration and take rcu_read_lock() across the lock held
>     section."
> 



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-21  8:27 UTC | newest]

Thread overview: 7+ messages
2021-10-08 16:19 [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Nicolas Saenz Julienne
2021-10-08 16:19 ` [RFC 1/3] mm/page_alloc: Simplify __rmqueue_pcplist()'s arguments Nicolas Saenz Julienne
2021-10-08 16:19 ` [RFC 2/3] mm/page_alloc: Access lists in 'struct per_cpu_pages' indirectly Nicolas Saenz Julienne
2021-10-08 16:19 ` [RFC 3/3] mm/page_alloc: Add remote draining support to per-cpu lists Nicolas Saenz Julienne
2021-10-12 15:45 ` [RFC 0/3] mm/page_alloc: Remote per-cpu lists drain support Vlastimil Babka
2021-10-13 12:50   ` nsaenzju
2021-10-21  8:27     ` Vlastimil Babka
