LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Daniel Jordan <daniel.m.jordan@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Steffen Klassert <steffen.klassert@secunet.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jonathan Corbet <corbet@lwn.net>,
	Josh Triplett <josh@joshtriplett.org>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Michal Hocko <mhocko@kernel.org>, Pavel Machek <pavel@ucw.cz>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Randy Dunlap <rdunlap@infradead.org>,
	Shile Zhang <shile.zhang@linux.alibaba.com>,
	Tejun Heo <tj@kernel.org>, Zi Yan <ziy@nvidia.com>,
	linux-crypto@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	Daniel Jordan <daniel.m.jordan@oracle.com>
Subject: [PATCH 0/7] padata: parallelize deferred page init
Date: Thu, 30 Apr 2020 16:11:18 -0400
Message-ID: <20200430201125.532129-1-daniel.m.jordan@oracle.com> (raw)

Sometimes the kernel doesn't take full advantage of system memory
bandwidth, leading to a single CPU spending excessive time in
initialization paths where the data scales with memory size.

Multithreading naturally addresses this problem, and this series is the
first step.

It extends padata, a framework that handles many parallel singlethreaded
jobs, to handle multithreaded jobs as well by adding support for
splitting up the work evenly, specifying a minimum amount of work that's
appropriate for one helper thread to do, load balancing between helpers,
and coordinating them.  More documentation in patches 4 and 7.

The first user is deferred struct page init, a large bottleneck in
kernel boot--actually the largest for us and likely others too.  This
path doesn't require concurrency limits, resource control, or priority
adjustments like future users will (vfio, hugetlb fallocate, munmap)
because it happens during boot when the system is otherwise idle and
waiting on page init to finish.

This has been tested on a variety of x86 systems and speeds up kernel
boot by 6% to 49% by making deferred init 63% to 91% faster.  Patch 6
has detailed numbers.  Test results from other systems appreciated.

This series is based on v5.6 plus these three from mmotm:

  mm-call-touch_nmi_watchdog-on-max-order-boundaries-in-deferred-init.patch
  mm-initialize-deferred-pages-with-interrupts-enabled.patch
  mm-call-cond_resched-from-deferred_init_memmap.patch

All of the above can be found in this branch:

  git://oss.oracle.com/git/linux-dmjordan.git padata-mt-definit-v1
  https://oss.oracle.com/git/gitweb.cgi?p=linux-dmjordan.git;a=shortlog;h=refs/heads/padata-mt-definit-v1

The future users and related features are available as work-in-progress
here:

  git://oss.oracle.com/git/linux-dmjordan.git padata-mt-wip-v0.3
  https://oss.oracle.com/git/gitweb.cgi?p=linux-dmjordan.git;a=shortlog;h=refs/heads/padata-mt-wip-v0.3

Thanks to everyone who commented on the last version of this[0],
including Alex Williamson, Jason Gunthorpe, Jonathan Corbet, Michal
Hocko, Pavel Machek, Peter Zijlstra, Randy Dunlap, Robert Elliott, Tejun
Heo, and Zi Yan.

RFC v4 -> padata v1:
 - merged with padata (Peter)
 - got rid of the 'task' nomenclature (Peter, Jon)

future work branch:
 - made lockdep-aware (Jason, Peter)
 - adjust workqueue worker priority with renice_or_cancel() (Tejun)
 - fixed undo problem in VFIO (Alex)

The remaining feedback, mainly resource control awareness (cgroup etc),
is TODO for later series.

[0] https://lore.kernel.org/linux-mm/20181105165558.11698-1-daniel.m.jordan@oracle.com/

Daniel Jordan (7):
  padata: remove exit routine
  padata: initialize earlier
  padata: allocate work structures for parallel jobs from a pool
  padata: add basic support for multithreaded jobs
  mm: move zone iterator outside of deferred_init_maxorder()
  mm: parallelize deferred_init_memmap()
  padata: document multithreaded jobs

 Documentation/core-api/padata.rst |  41 +++--
 include/linux/padata.h            |  43 ++++-
 init/main.c                       |   2 +
 kernel/padata.c                   | 277 ++++++++++++++++++++++++------
 mm/Kconfig                        |   6 +-
 mm/page_alloc.c                   | 118 ++++++-------
 6 files changed, 355 insertions(+), 132 deletions(-)


base-commit: 7111951b8d4973bda27ff663f2cf18b663d15b48
prerequisite-patch-id: 4ad522141e1119a325a9799dad2bd982fbac8b7c
prerequisite-patch-id: 169273327e56f5461101a71dfbd6b4cfd4570cf0
prerequisite-patch-id: 0f34692c8a9673d4c4f6a3545cf8ec3a2abf8620
-- 
2.26.2


             reply index

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-30 20:11 Daniel Jordan [this message]
2020-04-30 20:11 ` [PATCH 1/7] padata: remove exit routine Daniel Jordan
2020-04-30 20:11 ` [PATCH 2/7] padata: initialize earlier Daniel Jordan
2020-04-30 20:11 ` [PATCH 3/7] padata: allocate work structures for parallel jobs from a pool Daniel Jordan
2020-04-30 20:11 ` [PATCH 4/7] padata: add basic support for multithreaded jobs Daniel Jordan
2020-04-30 20:11 ` [PATCH 5/7] mm: move zone iterator outside of deferred_init_maxorder() Daniel Jordan
2020-04-30 21:43   ` Alexander Duyck
2020-05-01  2:45     ` Daniel Jordan
2020-05-04 22:10       ` Alexander Duyck
2020-05-05  0:54         ` Daniel Jordan
2020-05-05 15:27           ` Alexander Duyck
2020-05-06 22:39             ` Daniel Jordan
2020-05-07 15:26               ` Alexander Duyck
2020-05-07 20:20                 ` Daniel Jordan
2020-05-07 21:18                   ` Alexander Duyck
2020-05-07 22:15                     ` Daniel Jordan
2020-04-30 20:11 ` [PATCH 6/7] mm: parallelize deferred_init_memmap() Daniel Jordan
2020-05-04 22:33   ` Alexander Duyck
2020-05-04 23:38     ` Josh Triplett
2020-05-05  0:40       ` Alexander Duyck
2020-05-05  1:48         ` Daniel Jordan
2020-05-05  2:09           ` Daniel Jordan
2020-05-05 14:55             ` Alexander Duyck
2020-05-06 22:21               ` Daniel Jordan
2020-05-06 22:36                 ` Alexander Duyck
2020-05-06 22:43                   ` Daniel Jordan
2020-05-06 23:01                     ` Daniel Jordan
2020-05-05  1:26     ` Daniel Jordan
2020-04-30 20:11 ` [PATCH 7/7] padata: document multithreaded jobs Daniel Jordan
2020-04-30 21:31 ` [PATCH 0/7] padata: parallelize deferred page init Andrew Morton
2020-04-30 21:40   ` Pavel Tatashin
2020-05-01  2:40     ` Daniel Jordan
2020-05-01  0:50   ` Josh Triplett
2020-05-01  1:09 ` Josh Triplett
2020-05-01  2:48   ` Daniel Jordan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200430201125.532129-1-daniel.m.jordan@oracle.com \
    --to=daniel.m.jordan@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=jgg@ziepe.ca \
    --cc=josh@joshtriplett.org \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=pavel@ucw.cz \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=shile.zhang@linux.alibaba.com \
    --cc=steffen.klassert@secunet.com \
    --cc=tj@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git