Linux-rt-users Archive on lore.kernel.org
 help / color / Atom feed
From: Mel Gorman <mgorman@techsingularity.net>
To: Linux-MM <linux-mm@kvack.org>
Cc: Linux-RT-Users <linux-rt-users@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Jesper Dangaard Brouer <brouer@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mgorman@techsingularity.net>
Subject: [RFC PATCH 0/6] Use local_lock for pcp protection and reduce stat overhead
Date: Mon, 29 Mar 2021 13:06:42 +0100
Message-ID: <20210329120648.19040-1-mgorman@techsingularity.net> (raw)

This series requires patches in Andrew's tree so the series is also
available at

git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git mm-percpu-local_lock-v1r15

tldr: Jesper and Chuck, it would be nice to verify if this series helps
	the allocation rate of the bulk page allocator. RT people, this
	*partially* addresses some problems PREEMPT_RT has with the page
	allocator but it needs review.

The PCP (per-cpu page allocator in page_alloc.c) share locking requirements
with vmstat which is inconvenient and causes some issues. Possibly because
of that, the PCP list and vmstat share the same per-cpu space meaning that
it's possible that vmstat updates dirty cache lines holding per-cpu lists
across CPUs unless padding is used. The series splits that structure and
separates the locking.

Second, PREEMPT_RT considers the following sequence to be unsafe
as documented in Documentation/locking/locktypes.rst

   local_irq_disable();
   spin_lock(&lock);

The pcp allocator has this sequence for rmqueue_pcplist (local_irq_save)
-> __rmqueue_pcplist -> rmqueue_bulk (spin_lock). This series explicitly
separates the locking requirements for the PCP list (local_lock) and stat
updates (irqs disabled). Once that is done, the length of time IRQs are
disabled can be reduced and in some cases, IRQ disabling can be replaced
with preempt_disable.

After that, it was very obvious that zone_statistics in particular has way
too much overhead and leaves IRQs disabled for longer than necessary. It
has perfectly accurate counters requiring IRQs be disabled for parallel
RMW sequences when inaccurate ones like vm_events would do. The series
makes the NUMA statistics (NUMA_HIT and friends) inaccurate counters that
only require preempt be disabled.

Finally the bulk page allocator can then do all the stat updates in bulk
with IRQs enabled which should improve the efficiency of the bulk page
allocator. Technically, this could have been done without the local_lock
and vmstat conversion work and the order simply reflects the timing of
when different series were implemented.

No performance data is included because despite the overhead of the
stats, it's within the noise for most workloads but Jesper and Chuck may
observe a significant different with the same tests used for the bulk
page allocator. The series is more likely to be interesting to the RT
folk in terms of slowing getting the PREEMPT tree into mainline.

 drivers/base/node.c    |  18 +--
 include/linux/mmzone.h |  29 +++--
 include/linux/vmstat.h |  65 ++++++-----
 mm/mempolicy.c         |   2 +-
 mm/page_alloc.c        | 173 ++++++++++++++++------------
 mm/vmstat.c            | 254 +++++++++++++++--------------------------
 6 files changed, 254 insertions(+), 287 deletions(-)

-- 
2.26.2


             reply index

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-29 12:06 Mel Gorman [this message]
2021-03-29 12:06 ` [PATCH 1/6] mm/page_alloc: Split per cpu page lists and zone stats Mel Gorman
2021-03-29 12:06 ` [PATCH 2/6] mm/page_alloc: Convert per-cpu list protection to local_lock Mel Gorman
2021-03-31  9:55   ` Thomas Gleixner
2021-03-31 11:01     ` Mel Gorman
2021-03-31 17:42       ` Thomas Gleixner
2021-03-31 17:46         ` Thomas Gleixner
2021-03-31 20:42         ` Mel Gorman
2021-03-29 12:06 ` [PATCH 3/6] mm/vmstat: Convert NUMA statistics to basic NUMA counters Mel Gorman
2021-03-29 12:06 ` [PATCH 4/6] mm/vmstat: Inline NUMA event counter updates Mel Gorman
2021-03-29 12:06 ` [PATCH 5/6] mm/page_alloc: Batch the accounting updates in the bulk allocator Mel Gorman
2021-03-29 12:06 ` [PATCH 6/6] mm/page_alloc: Reduce duration that IRQs are disabled for VM counters Mel Gorman
2021-03-30 18:51 ` [RFC PATCH 0/6] Use local_lock for pcp protection and reduce stat overhead Jesper Dangaard Brouer
2021-03-31  7:38   ` Mel Gorman
2021-03-31  8:17     ` Jesper Dangaard Brouer
2021-03-31  8:52 ` Mel Gorman
2021-03-31  9:51   ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210329120648.19040-1-mgorman@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=brouer@redhat.com \
    --cc=chuck.lever@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-rt-users Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rt-users/0 linux-rt-users/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rt-users linux-rt-users/ https://lore.kernel.org/linux-rt-users \
		linux-rt-users@vger.kernel.org
	public-inbox-index linux-rt-users

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rt-users


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git