kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oliver Upton <oliver.upton@linux.dev>
To: Sean Christopherson <seanjc@google.com>
Cc: Anish Moorthy <amoorthy@google.com>,
	jthoughton@google.com, kvm@vger.kernel.org
Subject: Re: [WIP Patch v2 09/14] KVM: Introduce KVM_CAP_MEMORY_FAULT_NOWAIT without implementation
Date: Mon, 20 Mar 2023 22:22:08 +0000	[thread overview]
Message-ID: <ZBjckKb6eWx2vSin@linux.dev> (raw)
In-Reply-To: <ZBTK0vzAoWqY1hDh@google.com>

Sean,

On Fri, Mar 17, 2023 at 01:17:22PM -0700, Sean Christopherson wrote:
> On Fri, Mar 17, 2023, Oliver Upton wrote:
> > On Wed, Mar 15, 2023 at 02:17:33AM +0000, Anish Moorthy wrote:
> > > Add documentation, memslot flags, useful helper functions, and the
> > > actual new capability itself.
> > > 
> > > Memory fault exits on absent mappings are particularly useful for
> > > userfaultfd-based live migration postcopy. When many vCPUs fault upon a
> > > single userfaultfd the faults can take a while to surface to userspace
> > > due to having to contend for uffd wait queue locks. Bypassing the uffd
> > > entirely by triggering a vCPU exit avoids this contention and can improve
> > > the fault rate by as much as 10x.
> > > ---
> > >  Documentation/virt/kvm/api.rst | 37 +++++++++++++++++++++++++++++++---
> > >  include/linux/kvm_host.h       |  6 ++++++
> > >  include/uapi/linux/kvm.h       |  3 +++
> > >  tools/include/uapi/linux/kvm.h |  2 ++
> > >  virt/kvm/kvm_main.c            |  7 ++++++-
> > >  5 files changed, 51 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> > > index f9ca18bbec879..4932c0f62eb3d 100644
> > > --- a/Documentation/virt/kvm/api.rst
> > > +++ b/Documentation/virt/kvm/api.rst
> > > @@ -1312,6 +1312,7 @@ yet and must be cleared on entry.
> > >    /* for kvm_userspace_memory_region::flags */
> > >    #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
> > >    #define KVM_MEM_READONLY	(1UL << 1)
> > > +  #define KVM_MEM_ABSENT_MAPPING_FAULT (1UL << 2)
> > 
> > call it KVM_MEM_EXIT_ABSENT_MAPPING
> 
> Ooh, look, a bikeshed!  :-)

Couldn't help myself :)

> I don't think it should have "EXIT" in the name.  The exit to userspace is a side
> effect, e.g. KVM already exits to userspace on unresolved userfaults.  The only
> thing this knob _directly_ controls is whether or not KVM attempts the slow path.
> If we give the flag a name like "exit on absent userspace mappings", then KVM will
> appear to do the wrong thing when KVM exits on a truly absent userspace mapping.
> 
> And as I argued in the last version[*], I am _strongly_ opposed to KVM speculating
> on why KVM is exiting to userspace.  I.e. KVM should not set a special flag if
> the memslot has "fast only" behavior.  The only thing the flag should do is control
> whether or not KVM tries slow paths, what KVM does in response to an unresolved
> fault should be an orthogonal thing.
> 
> E.g. If KVM encounters an unmapped page while prefetching SPTEs, KVM will (correctly)
> not exit to userspace and instead simply terminate the prefetch.  Obviously we
> could solve that through documentation, but I don't see any benefit in making this
> more complex than it needs to be.

I couldn't care less about what the user-facing portion of this thing is
called, TBH. We could just refer to it as KVM_MEM_BIT_2 /s

The only bit I wanted to avoid is having a collision in the kernel between
literal faults arising from hardware and exits to userspace that we are also
calling 'faults'.

> [*] https://lkml.kernel.org/r/Y%2B0RYMfw6pHrSLX4%40google.com
> 
> > > +7.35 KVM_CAP_MEMORY_FAULT_NOWAIT
> > > +--------------------------------
> > > +
> > > +:Architectures: x86, arm64
> > > +:Returns: -EINVAL.
> > > +
> > > +The presence of this capability indicates that userspace may pass the
> > > +KVM_MEM_ABSENT_MAPPING_FAULT flag to KVM_SET_USER_MEMORY_REGION to cause KVM_RUN
> > > +to exit to populate 'kvm_run.memory_fault' and exit to userspace (*) in response
> > > +to page faults for which the userspace page tables do not contain present
> > > +mappings. Attempting to enable the capability directly will fail.
> > > +
> > > +The 'gpa' and 'len' fields of kvm_run.memory_fault will be set to the starting
> > > +address and length (in bytes) of the faulting page. 'flags' will be set to
> > > +KVM_MEMFAULT_REASON_ABSENT_MAPPING.
> > > +
> > > +Userspace should determine how best to make the mapping present, then take
> > > +appropriate action. For instance, in the case of absent mappings this might
> > > +involve establishing the mapping for the first time via UFFDIO_COPY/CONTINUE or
> > > +faulting the mapping in using MADV_POPULATE_READ/WRITE. After establishing the
> > > +mapping, userspace can return to KVM to retry the previous memory access.
> > > +
> > > +(*) NOTE: On x86, KVM_CAP_X86_MEMORY_FAULT_EXIT must be enabled for the
> > > +KVM_MEMFAULT_REASON_ABSENT_MAPPING_reason: otherwise userspace will only receive
> > > +a -EFAULT from KVM_RUN without any useful information.
> > 
> > I'm not a fan of this architecture-specific dependency. Userspace is already
> > explicitly opting in to this behavior by way of the memslot flag. These sort
> > of exits are entirely orthogonal to the -EFAULT conversion earlier in the
> > series.
> 
> Ya, yet another reason not to speculate on why KVM wasn't able to resolve a fault.

Regardless of what we name this memslot flag, we're already getting explicit
opt-in from userspace for new behavior. There seems to be zero value in
supporting memslot_flag && !MEMORY_FAULT_EXIT (i.e. returning EFAULT),
so why even bother?

Requiring two levels of opt-in to have the intended outcome for a single
architecture seems nauseating from a userspace perspective.

-- 
Thanks,
Oliver

  reply	other threads:[~2023-03-20 22:24 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-15  2:17 [WIP Patch v2 00/14] Avoiding slow get-user-pages via memory fault exit Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 01/14] KVM: selftests: Allow many vCPUs and reader threads per UFFD in demand paging test Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 02/14] KVM: selftests: Use EPOLL in userfaultfd_util reader threads and signal errors via TEST_ASSERT Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 03/14] KVM: Allow hva_pfn_fast to resolve read-only faults Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 04/14] KVM: x86: Add KVM_CAP_X86_MEMORY_FAULT_EXIT and associated kvm_run field Anish Moorthy
2023-03-17  0:02   ` Isaku Yamahata
2023-03-17 18:33     ` Anish Moorthy
2023-03-17 19:30       ` Oliver Upton
2023-03-17 21:50       ` Sean Christopherson
2023-03-17 22:44         ` Anish Moorthy
2023-03-20 15:53           ` Sean Christopherson
2023-03-20 18:19             ` Anish Moorthy
2023-03-20 22:11             ` Anish Moorthy
2023-03-21 15:21               ` Sean Christopherson
2023-03-21 18:01                 ` Anish Moorthy
2023-03-21 19:43                   ` Sean Christopherson
2023-03-22 21:06                     ` Anish Moorthy
2023-03-22 23:17                       ` Sean Christopherson
2023-03-28 22:19                     ` Anish Moorthy
2023-04-04 19:34                       ` Sean Christopherson
2023-04-04 20:40                         ` Anish Moorthy
2023-04-04 22:07                           ` Sean Christopherson
2023-04-05 20:21                             ` Anish Moorthy
2023-03-17 18:35   ` Oliver Upton
2023-03-15  2:17 ` [WIP Patch v2 05/14] KVM: x86: Implement memory fault exit for direct_map Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 06/14] KVM: x86: Implement memory fault exit for kvm_handle_page_fault Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 07/14] KVM: x86: Implement memory fault exit for setup_vmgexit_scratch Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 08/14] KVM: x86: Implement memory fault exit for FNAME(fetch) Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 09/14] KVM: Introduce KVM_CAP_MEMORY_FAULT_NOWAIT without implementation Anish Moorthy
2023-03-17 18:59   ` Oliver Upton
2023-03-17 20:15     ` Anish Moorthy
2023-03-17 20:54       ` Sean Christopherson
2023-03-17 23:42         ` Anish Moorthy
2023-03-20 15:13           ` Sean Christopherson
2023-03-20 19:53             ` Anish Moorthy
2023-03-17 20:17     ` Sean Christopherson
2023-03-20 22:22       ` Oliver Upton [this message]
2023-03-21 14:50         ` Sean Christopherson
2023-03-21 20:23           ` Oliver Upton
2023-03-21 21:01             ` Sean Christopherson
2023-03-15  2:17 ` [WIP Patch v2 10/14] KVM: x86: Implement KVM_CAP_MEMORY_FAULT_NOWAIT Anish Moorthy
2023-03-17  0:32   ` Isaku Yamahata
2023-03-15  2:17 ` [WIP Patch v2 11/14] KVM: arm64: Allow user_mem_abort to return 0 to signal a 'normal' exit Anish Moorthy
2023-03-17 18:18   ` Oliver Upton
2023-03-15  2:17 ` [WIP Patch v2 12/14] KVM: arm64: Implement KVM_CAP_MEMORY_FAULT_NOWAIT Anish Moorthy
2023-03-17 18:27   ` Oliver Upton
2023-03-17 19:00     ` Anish Moorthy
2023-03-17 19:03       ` Oliver Upton
2023-03-17 19:24       ` Sean Christopherson
2023-03-15  2:17 ` [WIP Patch v2 13/14] KVM: selftests: Add memslot_flags parameter to memstress_create_vm Anish Moorthy
2023-03-15  2:17 ` [WIP Patch v2 14/14] KVM: selftests: Handle memory fault exits in demand_paging_test Anish Moorthy
2023-03-17 17:43 ` [WIP Patch v2 00/14] Avoiding slow get-user-pages via memory fault exit Oliver Upton
2023-03-17 18:13   ` Sean Christopherson
2023-03-17 18:46     ` David Matlack
2023-03-17 18:54       ` Oliver Upton
2023-03-17 18:59         ` David Matlack
2023-03-17 19:53           ` Anish Moorthy
2023-03-17 22:03             ` Sean Christopherson
2023-03-20 15:56               ` Sean Christopherson
2023-03-17 20:35 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZBjckKb6eWx2vSin@linux.dev \
    --to=oliver.upton@linux.dev \
    --cc=amoorthy@google.com \
    --cc=jthoughton@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).