From: Anish Moorthy <amoorthy@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>, Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>,
	Sean Christopherson <seanjc@google.com>,
	James Houghton <jthoughton@google.com>,
	Anish Moorthy <amoorthy@google.com>,
	Ben Gardon <bgardon@google.com>,
	David Matlack <dmatlack@google.com>,
	Ricardo Koller <ricarkol@google.com>,
	Chao Peng <chao.p.peng@linux.intel.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
	kvm@vger.kernel.org, kvmarm@lists.linux.dev
Subject: [PATCH 0/8] Add memory fault exits to avoid slow GUP
Date: Wed, 15 Feb 2023 01:16:06 +0000
Message-ID: <20230215011614.725983-1-amoorthy@google.com>

This series improves the scalability of userfaultfd-based postcopy live
migration. It implements the no-slow-gup approach which James Houghton
described in his earlier RFC ([1]). A new capability,
KVM_CAP_MEM_FAULT_NOWAIT, is introduced, which causes KVM to exit to
userspace if fast get_user_pages (GUP) fails while resolving a page
fault. The motivation is to allow (most) EPT violations to be resolved
without going through userfaultfd, which involves serializing faults on
internal locks: see [1] for more details.

After receiving the new exit, userspace can check whether it has
previously UFFDIO_COPY/CONTINUEd the faulting address. If not, then it
knows that fast GUP could not possibly have succeeded, and so the fault
has to be resolved via UFFDIO_COPY/CONTINUE. In these cases a
UFFDIO_WAKE is unnecessary, as the vCPU thread hasn't been put to sleep
waiting on the uffd.

If userspace *has* already COPY/CONTINUEd the address, then it must take
some other action to make fast GUP succeed, such as swapping in the
page (for instance, via MADV_POPULATE_WRITE for writable mappings).
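
To make the two cases above concrete, here is a minimal sketch of the
decision userspace might make on a memory fault exit. It is only an
illustration: the names (struct region, on_mem_fault_exit,
mark_installed, the SKETCH_* constants) are hypothetical and not part of
this series, 4 KiB pages are assumed, and the bitmap is not thread-safe,
so a real VMM with many vCPU threads would need atomics or locking. The
actual UFFDIO_COPY/CONTINUE and madvise() calls are left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define SKETCH_NPAGES     1024 /* hypothetical size of the tracked region */

enum fault_action {
	FAULT_RESOLVE_VIA_UFFD, /* issue UFFDIO_COPY/CONTINUE; no UFFDIO_WAKE needed */
	FAULT_MAKE_PRESENT,     /* e.g. madvise(MADV_POPULATE_WRITE) on the page */
};

struct region {
	uint64_t base;                          /* first address of the region */
	uint64_t installed[SKETCH_NPAGES / 64]; /* bit set: page already COPY/CONTINUEd */
};

/* Translate a faulting address into a page index within the region. */
static size_t page_index(const struct region *r, uint64_t addr)
{
	size_t idx;

	assert(addr >= r->base);
	idx = (size_t)((addr - r->base) >> SKETCH_PAGE_SHIFT);
	assert(idx < SKETCH_NPAGES);
	return idx;
}

/* Has userspace already COPY/CONTINUEd the page containing addr? */
static bool page_installed(const struct region *r, uint64_t addr)
{
	size_t idx = page_index(r, addr);

	return (r->installed[idx / 64] >> (idx % 64)) & 1;
}

/* Record a page as installed; call this after the uffd ioctl succeeds. */
static void mark_installed(struct region *r, uint64_t addr)
{
	size_t idx = page_index(r, addr);

	r->installed[idx / 64] |= UINT64_C(1) << (idx % 64);
}

/* Decide how to handle a memory fault exit for the given address. */
static enum fault_action on_mem_fault_exit(const struct region *r, uint64_t addr)
{
	if (!page_installed(r, addr))
		return FAULT_RESOLVE_VIA_UFFD;
	return FAULT_MAKE_PRESENT;
}
```

A caller would invoke on_mem_fault_exit() with the faulting address,
perform the indicated action, call mark_installed() once a
COPY/CONTINUE has succeeded, and then re-enter the guest.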

This feature should only be enabled during userfaultfd postcopy, as it
prevents the generation of async page faults.

The kernel changes needed to implement this on arm64/x86 are small:
most of this series just adds support for the new feature to the demand
paging self test. Performance samples (rates reported in thousands of
pages/s, average of five runs each), generated using [2] on an x86
machine with 256 cores, are shown below.

vCPUs   Paging rate (w/o new cap)   Paging rate (w/ new cap)
1       150                         340
2       191                         477
4       210                         809
8       155                         1239
16      130                         1595
32      108                         2299
64      86                          3482
128     62                          4134
256     36                          4012
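
As a quick way to read the numbers above, the sketch below computes the
ratio of the two paging rates for each row. The struct and function
names are hypothetical and exist only for this illustration; the data is
copied verbatim from the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the table: rates are in thousands of pages/s. */
struct sample {
	int vcpus;
	int rate_without; /* w/o new cap */
	int rate_with;    /* w/ new cap */
};

static const struct sample samples[] = {
	{   1, 150,  340 },
	{   2, 191,  477 },
	{   4, 210,  809 },
	{   8, 155, 1239 },
	{  16, 130, 1595 },
	{  32, 108, 2299 },
	{  64,  86, 3482 },
	{ 128,  62, 4134 },
	{ 256,  36, 4012 },
};

/* Ratio of the paging rate with the cap to the rate without it. */
static double speedup(const struct sample *s)
{
	assert(s->rate_without > 0);
	return (double)s->rate_with / (double)s->rate_without;
}

/* Print one line per row of the table. */
static void print_speedups(void)
{
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("%3d vCPUs: %.1fx\n", samples[i].vcpus, speedup(&samples[i]));
}
```

The ratio grows from roughly 2.3x at 1 vCPU to roughly 111x at 256
vCPUs, because the rate without the cap falls as vCPUs are added while
the rate with the cap keeps rising up to 128 vCPUs.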

[1] https://lore.kernel.org/linux-mm/CADrL8HVDB3u2EOhXHCrAgJNLwHkj2Lka1B_kkNb0dNwiWiAN_Q@mail.gmail.com/
[2] ./demand_paging_test -b 64M -u MINOR -s shmem -a -v <n> -r <n> [-w]
    A quick rundown of the new flags (also detailed in later commits):
        -a registers all of guest memory to a single uffd.
        -r specifies the number of reader threads for polling the uffd.
        -w is what actually enables memory fault exits.
    All data was collected after applying the entire series.

This series is based on the latest kvm/next (7cb79f433e75).

Anish Moorthy (8):
  selftests/kvm: Fix bug in how demand_paging_test calculates paging
    rate
  selftests/kvm: Allow many vcpus per UFFD in demand paging test
  selftests/kvm: Switch demand paging uffd readers to epoll
  kvm: Allow hva_pfn_fast to resolve read-only faults.
  kvm: Add cap/kvm_run field for memory fault exits
  kvm/x86: Add mem fault exit on EPT violations
  kvm/arm64: Implement KVM_CAP_MEM_FAULT_NOWAIT for arm64
  selftests/kvm: Handle mem fault exits in demand paging test

 Documentation/virt/kvm/api.rst                |  42 ++++
 arch/arm64/kvm/arm.c                          |   1 +
 arch/arm64/kvm/mmu.c                          |  14 +-
 arch/x86/kvm/mmu/mmu.c                        |  23 +-
 arch/x86/kvm/x86.c                            |   1 +
 include/linux/kvm_host.h                      |  13 +
 include/uapi/linux/kvm.h                      |  13 +-
 tools/include/uapi/linux/kvm.h                |   7 +
 .../selftests/kvm/aarch64/page_fault_test.c   |   4 +-
 .../selftests/kvm/demand_paging_test.c        | 237 ++++++++++++++----
 .../selftests/kvm/include/userfaultfd_util.h  |  18 +-
 .../selftests/kvm/lib/userfaultfd_util.c      | 160 +++++++-----
 virt/kvm/kvm_main.c                           |  48 +++-
 13 files changed, 442 insertions(+), 139 deletions(-)

-- 
2.39.1.581.gbfd45094c4-goog

