From: Ben Gardon <bgardon@google.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Peter Shier <pshier@google.com>,
Peter Feiner <pfeiner@google.com>,
Junaid Shahid <junaids@google.com>,
Jim Mattson <jmattson@google.com>,
Yulei Zhang <yulei.kernel@gmail.com>,
Wanpeng Li <kernellwp@gmail.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
Ben Gardon <bgardon@google.com>
Subject: [PATCH v2 00/28] Allow parallel MMU operations with TDP MMU
Date: Tue, 2 Feb 2021 10:57:06 -0800
Message-ID: <20210202185734.1680553-1-bgardon@google.com>
The TDP MMU was implemented to simplify and improve the performance of
KVM's memory management on modern hardware with TDP (EPT / NPT). To build
on the existing performance improvements of the TDP MMU, add the ability
to handle vCPU page faults, enable and disable dirty logging, and remove
mappings in parallel. In the current implementation, vCPU page faults
(actually EPT/NPT violations/misconfigurations) are the largest source of
MMU lock contention on VMs with many vCPUs. This contention, and the
resulting page fault latency, can cause soft lockups in guests and degrade
performance. Handling page faults in parallel is especially useful when
booting VMs, enabling dirty logging, and handling demand paging. In all
of these cases, vCPUs constantly incur page faults on each new page
accessed.
Broadly, the following changes were required to allow parallel page
faults (and other MMU operations):
-- Contention detection and yielding added to rwlocks to bring them up to
   feature parity with spin locks, at least as far as the use of the MMU
   lock is concerned.
-- TDP MMU page table memory is protected with RCU and freed in RCU
   callbacks to allow multiple threads to operate on that memory
   concurrently.
-- The MMU lock was changed to an rwlock on x86. This allows the page
   fault handlers to acquire the MMU lock in read mode and handle page
   faults in parallel, and other operations to maintain exclusive use of
   the lock by acquiring it in write mode.
-- An additional lock is added to protect some data structures needed by
   the page fault handlers, for relatively infrequent operations.
-- The page fault handler is modified to use atomic cmpxchgs to set SPTEs
   and some page fault handler operations are modified slightly to work
   concurrently with other threads.
This series also contains a few bug fixes and optimizations, related to
the above, but not strictly part of enabling parallel page fault handling.
Correctness testing:
The following tests were performed with an SMP kernel and DBX kernel on an
Intel Skylake machine. The tests were run both with and without the TDP
MMU enabled.
-- This series introduces no new failures in kvm-unit-tests
   SMP + no TDP MMU   no new failures
   SMP + TDP MMU      no new failures
   DBX + no TDP MMU   no new failures
   DBX + TDP MMU      no new failures
-- All KVM selftests behave as expected
   SMP + no TDP MMU   all pass except ./x86_64/vmx_preemption_timer_test
   SMP + TDP MMU      all pass except ./x86_64/vmx_preemption_timer_test
   (./x86_64/vmx_preemption_timer_test also fails without this patch set,
   both with the TDP MMU on and off.)
   DBX + no TDP MMU   all pass
   DBX + TDP MMU      all pass
-- A VM running Debian 9 can be booted and all of its memory accessed
   SMP + no TDP MMU   works
   SMP + TDP MMU      works
   DBX + no TDP MMU   works
   DBX + TDP MMU      works
This series can be viewed in Gerrit at:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/7172
Changelog v1 -> v2:
- Removed the MMU lock union + using a spinlock when the TDP MMU is disabled
- Merged RCU commits
- Extended additional MMU operations to operate in parallel
- Amended dirty log perf test to cover newly parallelized code paths
- Misc refactorings (see changelogs for individual commits)
- Big thanks to Sean and Paolo for their thorough review of v1
Ben Gardon (28):
  KVM: x86/mmu: change TDP MMU yield function returns to match
    cond_resched
  KVM: x86/mmu: Add comment on __tdp_mmu_set_spte
  KVM: x86/mmu: Add lockdep when setting a TDP MMU SPTE
  KVM: x86/mmu: Don't redundantly clear TDP MMU pt memory
  KVM: x86/mmu: Factor out handling of removed page tables
  locking/rwlocks: Add contention detection for rwlocks
  sched: Add needbreak for rwlocks
  sched: Add cond_resched_rwlock
  KVM: x86/mmu: Fix braces in kvm_recover_nx_lpages
  KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEs
  KVM: x86/mmu: Merge flush and non-flush tdp_mmu_iter_cond_resched
  KVM: x86/mmu: Rename goal_gfn to next_last_level_gfn
  KVM: x86/mmu: Ensure forward progress when yielding in TDP MMU iter
  KVM: x86/mmu: Yield in TDU MMU iter even if no SPTES changed
  KVM: x86/mmu: Skip no-op changes in TDP MMU functions
  KVM: x86/mmu: Clear dirtied pages mask bit before early break
  KVM: x86/mmu: Protect TDP MMU page table memory with RCU
  KVM: x86/mmu: Use an rwlock for the x86 MMU
  KVM: x86/mmu: Factor out functions to add/remove TDP MMU pages
  KVM: x86/mmu: Use atomic ops to set SPTEs in TDP MMU map
  KVM: x86/mmu: Flush TLBs after zap in TDP MMU PF handler
  KVM: x86/mmu: Mark SPTEs in disconnected pages as removed
  KVM: x86/mmu: Allow parallel page faults for the TDP MMU
  KVM: x86/mmu: Allow zap gfn range to operate under the mmu read lock
  KVM: x86/mmu: Allow zapping collapsible SPTEs to use MMU read lock
  KVM: x86/mmu: Allow enabling / disabling dirty logging under MMU read
    lock
  KVM: selftests: Add backing src parameter to dirty_log_perf_test
  KVM: selftests: Disable dirty logging with vCPUs running
 arch/x86/include/asm/kvm_host.h               |  15 +
 arch/x86/kvm/mmu/mmu.c                        | 120 +--
 arch/x86/kvm/mmu/mmu_internal.h               |   9 +-
 arch/x86/kvm/mmu/page_track.c                 |   8 +-
 arch/x86/kvm/mmu/paging_tmpl.h                |   8 +-
 arch/x86/kvm/mmu/spte.h                       |  21 +-
 arch/x86/kvm/mmu/tdp_iter.c                   |  46 +-
 arch/x86/kvm/mmu/tdp_iter.h                   |  21 +-
 arch/x86/kvm/mmu/tdp_mmu.c                    | 741 ++++++++++++++----
 arch/x86/kvm/mmu/tdp_mmu.h                    |   5 +-
 arch/x86/kvm/x86.c                            |   4 +-
 include/asm-generic/qrwlock.h                 |  24 +-
 include/linux/kvm_host.h                      |   5 +
 include/linux/rwlock.h                        |   7 +
 include/linux/sched.h                         |  29 +
 kernel/sched/core.c                           |  40 +
 .../selftests/kvm/demand_paging_test.c        |   3 +-
 .../selftests/kvm/dirty_log_perf_test.c       |  25 +-
 .../testing/selftests/kvm/include/kvm_util.h  |   6 -
 .../selftests/kvm/include/perf_test_util.h    |   3 +-
 .../testing/selftests/kvm/include/test_util.h |  14 +
 .../selftests/kvm/lib/perf_test_util.c        |   6 +-
 tools/testing/selftests/kvm/lib/test_util.c   |  29 +
 virt/kvm/dirty_ring.c                         |  10 +
 virt/kvm/kvm_main.c                           |  46 +-
 25 files changed, 963 insertions(+), 282 deletions(-)
--
2.30.0.365.g02bc693789-goog