From: Anish Moorthy <amoorthy@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>, Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>,
Sean Christopherson <seanjc@google.com>,
James Houghton <jthoughton@google.com>,
Anish Moorthy <amoorthy@google.com>,
Ben Gardon <bgardon@google.com>,
David Matlack <dmatlack@google.com>,
Ricardo Koller <ricarkol@google.com>,
Chao Peng <chao.p.peng@linux.intel.com>,
Axel Rasmussen <axelrasmussen@google.com>,
kvm@vger.kernel.org, kvmarm@lists.linux.dev
Subject: [PATCH 0/8] Add memory fault exits to avoid slow GUP
Date: Wed, 15 Feb 2023 01:16:06 +0000
Message-ID: <20230215011614.725983-1-amoorthy@google.com>

This series improves the scalability of userfaultfd-based postcopy live
migration. It implements the no-slow-GUP approach that James Houghton
described in his earlier RFC ([1]). A new capability,
KVM_CAP_MEM_FAULT_NOWAIT, is introduced; it causes KVM to exit to
userspace if fast get_user_pages (GUP) fails while resolving a page
fault. The motivation is to allow (most) EPT violations to be resolved
without going through userfaultfd, which involves serializing faults on
internal locks: see [1] for more details.

After receiving the new exit, userspace can check whether it has
previously UFFDIO_COPY/CONTINUEd the faulting address. If not, then it
knows that fast GUP could not possibly have succeeded, and so the fault
has to be resolved via UFFDIO_COPY/CONTINUE. In these cases a
UFFDIO_WAKE is unnecessary, as the vCPU thread hasn't been put to sleep
waiting on the uffd.

If userspace *has* already COPY/CONTINUEd the address, then it must take
some other action to make fast GUP succeed, such as swapping in the
page (for instance, via MADV_POPULATE_WRITE for writable mappings).

This feature should only be enabled during userfaultfd postcopy, as it
prevents the generation of async page faults.

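
The userspace side of the exit handling described above can be sketched
as a small decision helper. This is only an illustration under stated
assumptions: the series does not prescribe how userspace tracks which
pages it has already COPY/CONTINUEd, so the bitmap and every name below
(page_tracker, fault_action, choose_action, mark_installed) are
hypothetical, and the actual UFFDIO_COPY/CONTINUE and madvise calls are
left to the caller.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * All names here are hypothetical; they are not part of this series or
 * of any KVM/userfaultfd API.
 */
enum fault_action {
	/* Never installed: resolve via UFFDIO_COPY/CONTINUE, no UFFDIO_WAKE. */
	FAULT_ACTION_UFFDIO_INSTALL,
	/*
	 * Already installed: make fast GUP succeed some other way, e.g.
	 * MADV_POPULATE_WRITE for writable mappings.
	 */
	FAULT_ACTION_MAKE_PRESENT,
};

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* One bit per guest page; a set bit means the page was COPY/CONTINUEd. */
struct page_tracker {
	unsigned long *installed;
	size_t nr_pages;	/* callers must ensure page < nr_pages */
};

static bool page_installed(const struct page_tracker *t, size_t page)
{
	return (t->installed[page / BITS_PER_WORD] >> (page % BITS_PER_WORD)) & 1UL;
}

/* Record that userspace has COPY/CONTINUEd this page. */
static void mark_installed(struct page_tracker *t, size_t page)
{
	t->installed[page / BITS_PER_WORD] |= 1UL << (page % BITS_PER_WORD);
}

/* Decide how to respond to a memory fault exit on the given page. */
static enum fault_action choose_action(const struct page_tracker *t, size_t page)
{
	return page_installed(t, page) ? FAULT_ACTION_MAKE_PRESENT
				       : FAULT_ACTION_UFFDIO_INSTALL;
}
```

A vCPU thread would call choose_action() on each exit, perform the
corresponding operation, and call mark_installed() once a
UFFDIO_COPY/CONTINUE has succeeded. The sketch is single-threaded; with
several vCPU threads sharing one tracker, the bitmap updates would need
atomic operations or a lock.
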
The kernel changes needed to implement this on arm64/x86 are small:
most of this series just adds support for the new feature to the demand
paging self test. Performance samples (rates reported in thousands of
pages/s, average of five runs each), generated using [2] on an x86
machine with 256 cores, are shown below.

vCPUs   Paging Rate (w/o new cap)   Paging Rate (w/ new cap)
    1                         150                        340
    2                         191                        477
    4                         210                        809
    8                         155                       1239
   16                         130                       1595
   32                         108                       2299
   64                          86                       3482
  128                          62                       4134
  256                          36                       4012

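
To make the scaling trend in the table easier to read, the per-row
speedup can be computed directly from the two rate columns. The helper
below is only an illustrative sketch: the struct and function names are
hypothetical, and the numbers are copied verbatim from the table above.

```c
#include <stddef.h>

/* One row of the table above; rates are in thousands of pages/s. */
struct paging_sample {
	unsigned int vcpus;
	unsigned int rate_without_cap;
	unsigned int rate_with_cap;
};

static const struct paging_sample samples[] = {
	{   1, 150,  340 },
	{   2, 191,  477 },
	{   4, 210,  809 },
	{   8, 155, 1239 },
	{  16, 130, 1595 },
	{  32, 108, 2299 },
	{  64,  86, 3482 },
	{ 128,  62, 4134 },
	{ 256,  36, 4012 },
};

#define NUM_SAMPLES (sizeof(samples) / sizeof(samples[0]))

/*
 * Speedup in hundredths (226 means 2.26x), truncated toward zero.
 * Integer math keeps the result exact and easy to compare.
 */
static unsigned int speedup_x100(const struct paging_sample *s)
{
	return s->rate_with_cap * 100u / s->rate_without_cap;
}
```

By this measure the improvement grows from roughly 2.3x at 1 vCPU to
roughly 111x at 256 vCPUs, since the rate without the new cap falls as
vCPUs are added while the rate with it keeps rising up to 128 vCPUs.
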
[1] https://lore.kernel.org/linux-mm/CADrL8HVDB3u2EOhXHCrAgJNLwHkj2Lka1B_kkNb0dNwiWiAN_Q@mail.gmail.com/
[2] ./demand_paging_test -b 64M -u MINOR -s shmem -a -v <n> -r <n> [-w]

A quick rundown of the new flags (also detailed in later commits):
  -a registers all of guest memory to a single uffd.
  -r specifies the number of reader threads for polling the uffd.
  -w is what actually enables memory fault exits.

All data was collected after applying the entire series.
This series is based on the latest kvm/next (7cb79f433e75).
Anish Moorthy (8):
selftests/kvm: Fix bug in how demand_paging_test calculates paging
rate
selftests/kvm: Allow many vcpus per UFFD in demand paging test
selftests/kvm: Switch demand paging uffd readers to epoll
kvm: Allow hva_pfn_fast to resolve read-only faults.
kvm: Add cap/kvm_run field for memory fault exits
kvm/x86: Add mem fault exit on EPT violations
kvm/arm64: Implement KVM_CAP_MEM_FAULT_NOWAIT for arm64
selftests/kvm: Handle mem fault exits in demand paging test
Documentation/virt/kvm/api.rst | 42 ++++
arch/arm64/kvm/arm.c | 1 +
arch/arm64/kvm/mmu.c | 14 +-
arch/x86/kvm/mmu/mmu.c | 23 +-
arch/x86/kvm/x86.c | 1 +
include/linux/kvm_host.h | 13 +
include/uapi/linux/kvm.h | 13 +-
tools/include/uapi/linux/kvm.h | 7 +
.../selftests/kvm/aarch64/page_fault_test.c | 4 +-
.../selftests/kvm/demand_paging_test.c | 237 ++++++++++++++----
.../selftests/kvm/include/userfaultfd_util.h | 18 +-
.../selftests/kvm/lib/userfaultfd_util.c | 160 +++++++-----
virt/kvm/kvm_main.c | 48 +++-
13 files changed, 442 insertions(+), 139 deletions(-)
--
2.39.1.581.gbfd45094c4-goog