linux-kernel.vger.kernel.org archive mirror
From: Peng Zhang <zhangpeng.00@bytedance.com>
To: Liam.Howlett@oracle.com, corbet@lwn.net,
	akpm@linux-foundation.org, willy@infradead.org,
	brauner@kernel.org, surenb@google.com,
	michael.christie@oracle.com, mjguzik@gmail.com,
	mathieu.desnoyers@efficios.com, npiggin@gmail.com,
	peterz@infradead.org, oliver.sang@intel.com, mst@redhat.com
Cc: zhangpeng.00@bytedance.com, maple-tree@lists.infradead.org,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [PATCH v5 00/10] Introduce __mt_dup() to improve the performance of fork()
Date: Mon, 16 Oct 2023 11:22:16 +0800
Message-ID: <20231016032226.59199-1-zhangpeng.00@bytedance.com>

Hi all,

This series introduces __mt_dup() to improve the performance of fork(). While
the mmap is being duplicated, all VMAs are traversed and inserted one by one
into the new maple tree, which causes the tree to be rebalanced many times.
Rebalancing a maple tree is a costly operation. To duplicate VMAs more
efficiently, this series adds mtree_dup() and __mt_dup(), which duplicate a
maple tree efficiently.

Here are some algorithmic details about {mtree,__mt}_dup(). We perform a DFS
pre-order traversal of all nodes in the source maple tree and fully copy each
node into the new tree as it is visited. Copying requires memory allocation:
when a newly visited node is a non-leaf node, all of its child nodes are
allocated at once.
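
As a rough illustration of the traversal described above, here is a minimal
user-space sketch in C. It is not the kernel code: struct node, SLOTS,
node_alloc(), dup_subtree(), tree_dup() and the other helpers are hypothetical
names invented for this example, the tree is a plain fixed-fan-out tree rather
than a maple tree, and allocation failure simply exits instead of unwinding.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SLOTS 4 /* hypothetical fan-out, chosen only for this sketch */

struct node {
	struct node *parent;
	int is_leaf;
	int nr;                    /* number of used slots */
	long value[SLOTS];         /* payload, used by leaf nodes */
	struct node *child[SLOTS]; /* children, used by non-leaf nodes */
};

/* Allocate one zeroed node. Real code would unwind instead of exiting. */
static struct node *node_alloc(void)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n) {
		perror("calloc");
		exit(EXIT_FAILURE);
	}
	return n;
}

/* Pre-order step: copy the node first, then allocate and visit children. */
static void dup_subtree(const struct node *src, struct node *dst,
			struct node *parent)
{
	int i;

	memcpy(dst, src, sizeof(*dst)); /* full copy of the source node */
	dst->parent = parent;
	if (src->is_leaf)
		return;

	for (i = 0; i < src->nr; i++)   /* allocate all children at once */
		dst->child[i] = node_alloc();
	for (i = 0; i < src->nr; i++)   /* then descend into each child */
		dup_subtree(src->child[i], dst->child[i], dst);
}

static struct node *tree_dup(const struct node *src)
{
	struct node *root = node_alloc();

	dup_subtree(src, root, NULL);
	return root;
}

/* Build a full tree of the given height; leaf values count up from *next. */
static struct node *tree_build(int height, struct node *parent, long *next)
{
	struct node *n = node_alloc();
	int i;

	assert(height >= 0);
	n->parent = parent;
	n->nr = SLOTS;
	n->is_leaf = (height == 0);
	for (i = 0; i < n->nr; i++) {
		if (n->is_leaf)
			n->value[i] = (*next)++;
		else
			n->child[i] = tree_build(height - 1, n, next);
	}
	return n;
}

/* Return 1 if both trees hold the same data and share no node. */
static int tree_same_but_disjoint(const struct node *a, const struct node *b)
{
	int i;

	if (a == b || a->is_leaf != b->is_leaf || a->nr != b->nr)
		return 0;
	for (i = 0; i < a->nr; i++) {
		if (a->is_leaf) {
			if (a->value[i] != b->value[i])
				return 0;
		} else if (b->child[i]->parent != b ||
			   !tree_same_but_disjoint(a->child[i], b->child[i])) {
			return 0;
		}
	}
	return 1;
}

static void tree_free(struct node *n)
{
	int i;

	if (!n->is_leaf)
		for (i = 0; i < n->nr; i++)
			tree_free(n->child[i]);
	free(n);
}
```

Building a tree with tree_build(), copying it with tree_dup(), and checking the
pair with tree_same_but_disjoint() exercises the sketch. The point of the shape
is that the copy never inserts entries one by one, so nothing in the new tree
needs to be rebalanced.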

For earlier discussion, see [1]. For a more detailed analysis of the
algorithm, please refer to the commit logs of patch [3/10] and patch
[10/10].

byte-unixbench[2] includes a "spawn" test, which can be used to measure the
performance of fork(). I modified it slightly to make it work with
different numbers of VMAs.
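
To make the measurement concrete, the following is a minimal sketch in C of a
fork() throughput test with a configurable number of VMAs. It is not the
byte-unixbench source or the modified version used here; make_vmas(),
now_sec() and count_forks() are hypothetical helpers, and alternating the
page protections is only an assumed way to keep neighbouring anonymous
mappings from merging into one VMA.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Map nr single-page anonymous regions and leave them mapped so that
 * every later fork() has to duplicate them. Alternating protections is
 * an assumption of this sketch, meant to prevent VMA merging.
 * Returns 0 on success, -1 on failure.
 */
static int make_vmas(int nr)
{
	long page = sysconf(_SC_PAGESIZE);
	int i;

	if (page <= 0)
		return -1;
	for (i = 0; i < nr; i++) {
		int prot = (i & 1) ? PROT_READ : (PROT_READ | PROT_WRITE);
		void *p = mmap(NULL, (size_t)page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return -1;
	}
	return 0;
}

/* Monotonic clock reading in seconds. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Count completed fork()+waitpid() cycles within the given time.
 * The child exits immediately, so the loop mostly measures fork() itself.
 * Returns the count, or -1 on error.
 */
static long count_forks(double seconds)
{
	double end = now_sec() + seconds;
	long n = 0;

	while (now_sec() < end) {
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0)
			_exit(0);
		if (waitpid(pid, NULL, 0) < 0)
			return -1;
		n++;
	}
	return n;
}
```

Calling make_vmas() with the desired count and then count_forks(10.0) gives a
"fork() calls per ten seconds" figure comparable in spirit to the rows below,
although absolute numbers will depend on the machine and on CPU binding.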

Below are the test results. The first row shows the number of VMAs.
The second and third rows show the number of fork() calls per ten seconds
on next-20231006 and with this patchset, respectively. The fourth row shows
the relative improvement. The tests were run with CPU binding to avoid
scheduler load balancing, which could make the results unstable. There are
still some fluctuations in the test results, but every configuration
improves over the original performance.

21     121    221    421    821    1621   3221   6421   12821  25621  51221
112100 76261  54227  34035  20195  11112  6017   3161   1606   802    393
114558 83067  65008  45824  28751  16072  8922   4747   2436   1233   599
2.19%  8.92%  19.88% 34.64% 42.37% 44.64% 48.28% 50.17% 51.68% 53.74% 52.42%
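
The percentages in the last row are consistent with the usual relative-change
formula, (after - before) / before. A small C helper (improvement_pct() is a
name made up for this note) reproduces them from the two rows of fork() counts:

```c
#include <assert.h>
#include <stdio.h>

/* Relative improvement of "after" over "before", in percent. */
static double improvement_pct(long before, long after)
{
	return 100.0 * (double)(after - before) / (double)before;
}
```

For the 21-VMA column, improvement_pct(112100, 114558) is about 2.19, and for
the 51221-VMA column, improvement_pct(393, 599) is about 52.42.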

Thanks to Liam for the review.

Changes since v4:
 - Change how a failure of dup_mmap() is handled: it is now handled in
   exit_mmap().
 - Update check_forking() and bench_forking().
 - Add the corresponding copyright statement.

Peng Zhang (10):
  maple_tree: Add mt_free_one() and mt_attr() helpers
  maple_tree: Introduce {mtree,mas}_lock_nested()
  maple_tree: Introduce interfaces __mt_dup() and mtree_dup()
  radix tree test suite: Align kmem_cache_alloc_bulk() with kernel
    behavior.
  maple_tree: Add test for mtree_dup()
  maple_tree: Update the documentation of maple tree
  maple_tree: Skip other tests when BENCH is enabled
  maple_tree: Update check_forking() and bench_forking()
  maple_tree: Preserve the tree attributes when destroying maple tree
  fork: Use __mt_dup() to duplicate maple tree in dup_mmap()

 Documentation/core-api/maple_tree.rst |   4 +
 include/linux/maple_tree.h            |   7 +
 kernel/fork.c                         |  39 ++-
 lib/maple_tree.c                      | 304 ++++++++++++++++++++-
 lib/test_maple_tree.c                 | 123 +++++----
 mm/memory.c                           |   7 +-
 mm/mmap.c                             |   9 +-
 tools/include/linux/rwsem.h           |   4 +
 tools/include/linux/spinlock.h        |   1 +
 tools/testing/radix-tree/linux.c      |  45 +++-
 tools/testing/radix-tree/maple.c      | 363 ++++++++++++++++++++++++++
 11 files changed, 815 insertions(+), 91 deletions(-)

-- 
2.20.1



Thread overview: 18+ messages
2023-10-16  3:22 Peng Zhang [this message]
2023-10-16  3:22 ` [PATCH v5 01/10] maple_tree: Add mt_free_one() and mt_attr() helpers Peng Zhang
2023-10-16  3:22 ` [PATCH v5 02/10] maple_tree: Introduce {mtree,mas}_lock_nested() Peng Zhang
2023-10-16  3:22 ` [PATCH v5 03/10] maple_tree: Introduce interfaces __mt_dup() and mtree_dup() Peng Zhang
2023-10-16 14:10   ` Matthew Wilcox
2023-10-17  2:44     ` Peng Zhang
2023-10-17 13:57   ` Liam R. Howlett
2023-10-24  8:40     ` Peng Zhang
2023-10-16  3:22 ` [PATCH v5 04/10] radix tree test suite: Align kmem_cache_alloc_bulk() with kernel behavior Peng Zhang
2023-10-16  3:22 ` [PATCH v5 05/10] maple_tree: Add test for mtree_dup() Peng Zhang
2023-10-16  3:22 ` [PATCH v5 06/10] maple_tree: Update the documentation of maple tree Peng Zhang
2023-10-16  3:22 ` [PATCH v5 07/10] maple_tree: Skip other tests when BENCH is enabled Peng Zhang
2023-10-16  3:22 ` [PATCH v5 08/10] maple_tree: Update check_forking() and bench_forking() Peng Zhang
2023-10-16  3:22 ` [PATCH v5 09/10] maple_tree: Preserve the tree attributes when destroying maple tree Peng Zhang
2023-10-16  3:22 ` [PATCH v5 10/10] fork: Use __mt_dup() to duplicate maple tree in dup_mmap() Peng Zhang
2023-10-17 13:50   ` Liam R. Howlett
2023-10-24  8:45     ` Peng Zhang
2023-10-16  3:40 ` [PATCH v5 00/10] Introduce __mt_dup() to improve the performance of fork() Peng Zhang
