From: Oliver Upton <oliver.upton@linux.dev>
To: Marc Zyngier <maz@kernel.org>, James Morse <james.morse@arm.com>,
Alexandru Elisei <alexandru.elisei@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Quentin Perret <qperret@google.com>,
Ricardo Koller <ricarkol@google.com>,
Reiji Watanabe <reijiw@google.com>,
David Matlack <dmatlack@google.com>,
Ben Gardon <bgardon@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Gavin Shan <gshan@redhat.com>, Peter Xu <peterx@redhat.com>,
Sean Christopherson <seanjc@google.com>
Cc: kvmarm@lists.cs.columbia.edu,
linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org
Subject: [PATCH 00/14] KVM: arm64: Parallel stage-2 fault handling
Date: Tue, 30 Aug 2022 19:41:18 +0000
Message-ID: <20220830194132.962932-1-oliver.upton@linux.dev>
Presently KVM only takes a read lock for stage 2 faults if it believes
the fault can be fixed by relaxing permissions on a PTE (write unprotect
for dirty logging). Otherwise, stage 2 faults grab the write lock, which
predictably can pile up all the vCPUs in a sufficiently large VM.
Like the TDP MMU for x86, this series loosens the locking around
manipulations of the stage 2 page tables to allow parallel faults. RCU
and atomics are exploited to safely build/destroy the stage 2 page
tables in light of multiple software observers.
Patches 1-2 are a cleanup to the way we collapse page tables, with the
added benefit of narrowing the window of time a range of memory is
unmapped.
Patches 3-7 are minor cleanups and refactorings to the way KVM reads
PTEs and traverses the stage 2 page tables to make it amenable to
concurrent modification.
Patches 8-9 use RCU to punt page table cleanup out of the vCPU fault
path, which should also improve fault latency a bit.
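As a rough userspace illustration of why deferring cleanup shortens the fault path, the sketch below queues an unlinked table on a lock-free list and frees it later. It is only a stand-in for call_rcu(): it does no grace-period detection, and every name in it (demo_table, demo_defer_free(), demo_reclaim()) is hypothetical rather than taken from the series.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a stage-2 table page that has been unlinked. */
struct demo_table {
	struct demo_table *next_free;
	unsigned long entries[4];
};

/* Head of the list of tables waiting to be freed. */
static _Atomic(struct demo_table *) demo_free_list;

/*
 * Fault path: push the unlinked table and return immediately.
 * No freeing happens here, so the faulting thread is not delayed.
 */
static void demo_defer_free(struct demo_table *t)
{
	struct demo_table *head = atomic_load(&demo_free_list);

	do {
		t->next_free = head;
	} while (!atomic_compare_exchange_weak(&demo_free_list, &head, t));
}

/*
 * Runs later, at a point where the caller knows no walker can still hold
 * a pointer into the queued tables (the job a grace period does for RCU).
 * Returns the number of tables freed.
 */
static size_t demo_reclaim(void)
{
	struct demo_table *t =
		atomic_exchange(&demo_free_list, (struct demo_table *)NULL);
	size_t freed = 0;

	while (t) {
		struct demo_table *next = t->next_free;

		free(t);
		t = next;
		freed++;
	}
	return freed;
}
```

The only cost left on the fault path in this model is one compare-and-swap loop per unlinked table.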
Patches 10-13 implement the meat of this series, extending the
'break-before-make' sequence with atomics to realize locking on PTEs.
Effectively a cmpxchg() is used to 'break' a PTE, thereby serializing
changes to a given PTE.
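A minimal userspace sketch of that idea, using C11 atomics in place of the kernel's cmpxchg(). The PTE layout, the DEMO_PTE_LOCKED marker and the demo_* helpers are assumptions made for illustration and are not the code in these patches.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_pte_t;

#define DEMO_PTE_VALID	(UINT64_C(1) << 0)
/* Hypothetical software marker: valid bit clear, so hardware sees no mapping. */
#define DEMO_PTE_LOCKED	(UINT64_C(1) << 10)

/*
 * 'Break': swap the PTE for the locked marker, but only if it still holds
 * the value this walker read earlier. Exactly one concurrent walker wins.
 */
static bool demo_pte_try_break(_Atomic demo_pte_t *ptep, demo_pte_t old)
{
	if (old == DEMO_PTE_LOCKED)
		return false;	/* another walker is mid-update */
	return atomic_compare_exchange_strong(ptep, &old, DEMO_PTE_LOCKED);
}

/* 'Make': publish the new PTE, which also releases the per-PTE lock. */
static void demo_pte_make(_Atomic demo_pte_t *ptep, demo_pte_t new)
{
	atomic_store_explicit(ptep, new, memory_order_release);
}

/* Full sequence; returns false if the walker lost the race and must retry. */
static bool demo_pte_replace(_Atomic demo_pte_t *ptep, demo_pte_t old,
			     demo_pte_t new)
{
	if (!demo_pte_try_break(ptep, old))
		return false;
	/* A real implementation would invalidate the TLB at this point. */
	demo_pte_make(ptep, new);
	return true;
}
```

A walker that loses the race sees demo_pte_replace() return false and can simply return to the guest and let the fault be retried.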
Finally, patch 14 flips the switch on all the new code and starts
grabbing the read side of the MMU lock for stage 2 faults.
Applies to 6.0-rc3. Tested with KVM selftests and benchmarked with
dirty_log_perf_test, scaling from 1 to 48 vCPUs with 4GB of memory per
vCPU backed by THP.
./dirty_log_perf_test -s anonymous_thp -m 2 -b 4G -v ${NR_VCPUS}
Time to dirty memory:
+-------+---------+------------------+
| vCPUs | 6.0-rc3 | 6.0-rc3 + series |
+-------+---------+------------------+
|     1 |   0.89s |            0.92s |
|     2 |   1.13s |            1.18s |
|     4 |   2.42s |            1.25s |
|     8 |   5.03s |            1.36s |
|    16 |   8.84s |            2.09s |
|    32 |  19.60s |            4.47s |
|    48 |  31.39s |            6.22s |
+-------+---------+------------------+
It is also worth mentioning that the time to populate memory has
improved:
+-------+---------+------------------+
| vCPUs | 6.0-rc3 | 6.0-rc3 + series |
+-------+---------+------------------+
|     1 |   0.19s |            0.18s |
|     2 |   0.25s |            0.21s |
|     4 |   0.38s |            0.32s |
|     8 |   0.64s |            0.40s |
|    16 |   1.22s |            0.54s |
|    32 |   2.50s |            1.03s |
|    48 |   3.88s |            1.52s |
+-------+---------+------------------+
RFC: https://lore.kernel.org/kvmarm/20220415215901.1737897-1-oupton@google.com/
RFC -> v1:
- Factored out page table teardown from kvm_pgtable_stage2_map()
- Use the RCU callback to tear down a subtree, instead of scheduling a
  callback for every individual table page.
- Reorganized series to (hopefully) avoid intermediate breakage.
- Dropped the use of page headers, instead stuffing KVM metadata into
  page::private directly
Oliver Upton (14):
  KVM: arm64: Add a helper to tear down unlinked stage-2 subtrees
  KVM: arm64: Tear down unlinked stage-2 subtree after break-before-make
  KVM: arm64: Directly read owner id field in stage2_pte_is_counted()
  KVM: arm64: Read the PTE once per visit
  KVM: arm64: Split init and set for table PTE
  KVM: arm64: Return next table from map callbacks
  KVM: arm64: Document behavior of pgtable visitor callback
  KVM: arm64: Protect page table traversal with RCU
  KVM: arm64: Free removed stage-2 tables in RCU callback
  KVM: arm64: Atomically update stage 2 leaf attributes in parallel
    walks
  KVM: arm64: Make changes block->table to leaf PTEs parallel-aware
  KVM: arm64: Make leaf->leaf PTE changes parallel-aware
  KVM: arm64: Make table->block changes parallel-aware
  KVM: arm64: Handle stage-2 faults in parallel
 arch/arm64/include/asm/kvm_pgtable.h  |  59 ++++-
 arch/arm64/kvm/hyp/nvhe/mem_protect.c |   7 +-
 arch/arm64/kvm/hyp/nvhe/setup.c       |   4 +-
 arch/arm64/kvm/hyp/pgtable.c          | 360 ++++++++++++++++----------
 arch/arm64/kvm/mmu.c                  |  65 +++--
 5 files changed, 325 insertions(+), 170 deletions(-)
base-commit: b90cb1053190353cc30f0fef0ef1f378ccc063c5
--
2.37.2.672.g94769d06f0-goog