From: Sean Christopherson <seanjc@google.com>
To: Oliver Upton <oliver.upton@linux.dev>
Cc: Anish Moorthy <amoorthy@google.com>,
	jthoughton@google.com, kvm@vger.kernel.org
Subject: Re: [WIP Patch v2 09/14] KVM: Introduce KVM_CAP_MEMORY_FAULT_NOWAIT without implementation
Date: Fri, 17 Mar 2023 13:17:22 -0700
Message-ID: <ZBTK0vzAoWqY1hDh@google.com>
In-Reply-To: <ZBS4o75PVHL4FQqw@linux.dev>

On Fri, Mar 17, 2023, Oliver Upton wrote:
> On Wed, Mar 15, 2023 at 02:17:33AM +0000, Anish Moorthy wrote:
> > Add documentation, memslot flags, useful helper functions, and the
> > actual new capability itself.
> > 
> > Memory fault exits on absent mappings are particularly useful for
> > userfaultfd-based live migration postcopy. When many vCPUs fault upon a
> > single userfaultfd the faults can take a while to surface to userspace
> > due to having to contend for uffd wait queue locks. Bypassing the uffd
> > entirely by triggering a vCPU exit avoids this contention and can improve
> > the fault rate by as much as 10x.
> > ---
> >  Documentation/virt/kvm/api.rst | 37 +++++++++++++++++++++++++++++++---
> >  include/linux/kvm_host.h       |  6 ++++++
> >  include/uapi/linux/kvm.h       |  3 +++
> >  tools/include/uapi/linux/kvm.h |  2 ++
> >  virt/kvm/kvm_main.c            |  7 ++++++-
> >  5 files changed, 51 insertions(+), 4 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > index f9ca18bbec879..4932c0f62eb3d 100644
> > --- a/Documentation/virt/kvm/api.rst
> > +++ b/Documentation/virt/kvm/api.rst
> > @@ -1312,6 +1312,7 @@ yet and must be cleared on entry.
> >    /* for kvm_userspace_memory_region::flags */
> >    #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
> >    #define KVM_MEM_READONLY	(1UL << 1)
> > +  #define KVM_MEM_ABSENT_MAPPING_FAULT (1UL << 2)
> 
> call it KVM_MEM_EXIT_ABSENT_MAPPING

Ooh, look, a bikeshed!  :-)

I don't think it should have "EXIT" in the name.  The exit to userspace is a side
effect, e.g. KVM already exits to userspace on unresolved userfaults.  The only
thing this knob _directly_ controls is whether or not KVM attempts the slow path.
If we give the flag a name like "exit on absent userspace mappings", then KVM will
appear to do the wrong thing when KVM exits on a truly absent userspace mapping.

And as I argued in the last version[*], I am _strongly_ opposed to KVM speculating
on why KVM is exiting to userspace.  I.e. KVM should not set a special flag if
the memslot has "fast only" behavior.  The only thing the flag should do is control
whether or not KVM tries slow paths; what KVM does in response to an unresolved
fault should be an orthogonal thing.
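
A toy model of that split, purely illustrative and not KVM code.  Every name
here (TOY_MEM_FAST_ONLY, struct toy_page, toy_resolve) is hypothetical and
exists only for this sketch; the point is that the flag gates the slow path
and nothing else, and the result carries no "reason" for the caller to
interpret.

```c
#include <stdbool.h>

/* Hypothetical memslot flag for this sketch; not a real KVM UAPI name. */
#define TOY_MEM_FAST_ONLY (1u << 2)

enum toy_result { TOY_RESOLVED, TOY_UNRESOLVED };

/* Toy stand-in for a guest page: would each lookup path find a mapping? */
struct toy_page {
	bool fast_path_hits;	/* mapping already present in the page tables */
	bool slow_path_hits;	/* mapping could be established by the slow path */
};

/*
 * The flag's only job is to decide whether the slow path is attempted.
 * It does not record or guess why a fault went unresolved.
 */
enum toy_result toy_resolve(unsigned int slot_flags, struct toy_page page)
{
	if (page.fast_path_hits)
		return TOY_RESOLVED;

	/* "Fast only" memslot: skip the slow path entirely. */
	if (slot_flags & TOY_MEM_FAST_ONLY)
		return TOY_UNRESOLVED;

	return page.slow_path_hits ? TOY_RESOLVED : TOY_UNRESOLVED;
}
```

Note that a page neither path can map is TOY_UNRESOLVED with or without the
flag, which is why naming the flag after the exit would be misleading.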

E.g. if KVM encounters an unmapped page while prefetching SPTEs, KVM will (correctly)
not exit to userspace and instead simply terminate the prefetch.  Obviously we
could solve that through documentation, but I don't see any benefit in making this
more complex than it needs to be.
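
The prefetch case can be sketched the same way, again as a toy model with
hypothetical names (toy_demand_fault, toy_prefetch) rather than anything from
the KVM sources: the same "unresolved" result leads to different responses
depending on the caller, so the response cannot be a property of the memslot
flag.

```c
#include <stdbool.h>

enum toy_action {
	TOY_MAP_PAGE,
	TOY_STOP_PREFETCH,
	TOY_EXIT_TO_USERSPACE,
};

/* Demand fault: the vCPU cannot make progress, so the fault goes to userspace. */
enum toy_action toy_demand_fault(bool resolved)
{
	return resolved ? TOY_MAP_PAGE : TOY_EXIT_TO_USERSPACE;
}

/* Prefetch is opportunistic: an unresolved page simply ends the prefetch. */
enum toy_action toy_prefetch(bool resolved)
{
	return resolved ? TOY_MAP_PAGE : TOY_STOP_PREFETCH;
}
```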

[*] https://lkml.kernel.org/r/Y%2B0RYMfw6pHrSLX4%40google.com

> > +7.35 KVM_CAP_MEMORY_FAULT_NOWAIT
> > +--------------------------------
> > +
> > +:Architectures: x86, arm64
> > +:Returns: -EINVAL.
> > +
> > +The presence of this capability indicates that userspace may pass the
> > +KVM_MEM_ABSENT_MAPPING_FAULT flag to KVM_SET_USER_MEMORY_REGION to cause KVM_RUN
> > +to exit to populate 'kvm_run.memory_fault' and exit to userspace (*) in response
> > +to page faults for which the userspace page tables do not contain present
> > +mappings. Attempting to enable the capability directly will fail.
> > +
> > +The 'gpa' and 'len' fields of kvm_run.memory_fault will be set to the starting
> > +address and length (in bytes) of the faulting page. 'flags' will be set to
> > +KVM_MEMFAULT_REASON_ABSENT_MAPPING.
> > +
> > +Userspace should determine how best to make the mapping present, then take
> > +appropriate action. For instance, in the case of absent mappings this might
> > +involve establishing the mapping for the first time via UFFDIO_COPY/CONTINUE or
> > +faulting the mapping in using MADV_POPULATE_READ/WRITE. After establishing the
> > +mapping, userspace can return to KVM to retry the previous memory access.
> > +
> > +(*) NOTE: On x86, KVM_CAP_X86_MEMORY_FAULT_EXIT must be enabled for the
> > +KVM_MEMFAULT_REASON_ABSENT_MAPPING_reason: otherwise userspace will only receive
> > +a -EFAULT from KVM_RUN without any useful information.
> 
> I'm not a fan of this architecture-specific dependency. Userspace is already
> explicitly opting in to this behavior by way of the memslot flag. These sort
> of exits are entirely orthogonal to the -EFAULT conversion earlier in the
> series.

Ya, yet another reason not to speculate on why KVM wasn't able to resolve a fault.
