linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mingwei Zhang <mizhang@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/4] KVM: x86/mmu: Document the "rules" for using host_pfn_mapping_level()
Date: Mon, 18 Jul 2022 20:48:55 +0000	[thread overview]
Message-ID: <YtXHN9rrj6+SRa1Z@google.com> (raw)
In-Reply-To: <YtWPSILmAp/0m5eC@google.com>

On Mon, Jul 18, 2022, Sean Christopherson wrote:
> On Sat, Jul 16, 2022, Mingwei Zhang wrote:
> > On Fri, Jul 15, 2022, Sean Christopherson wrote:
> > > Add a comment to document how host_pfn_mapping_level() can be used safely,
> > > as the line between safe and dangerous is quite thin.  E.g. if KVM were
> > > to ever support in-place promotion to create huge pages, consuming the
> > > level is safe if the caller holds mmu_lock and checks that there's an
> > > existing _leaf_ SPTE, but unsafe if the caller only checks that there's a
> > > non-leaf SPTE.
> > > 
> > > Opportunistically tweak the existing comments to explicitly document why
> > > KVM needs to use READ_ONCE().
> > > 
> > > No functional change intended.
> > > 
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 42 +++++++++++++++++++++++++++++++++++-------
> > >  1 file changed, 35 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index bebff1d5acd4..d5b644f3e003 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -2919,6 +2919,31 @@ static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
> > >  	__direct_pte_prefetch(vcpu, sp, sptep);
> > >  }
> > >  
> > > +/*
> > > + * Lookup the mapping level for @gfn in the current mm.
> > > + *
> > > + * WARNING!  Use of host_pfn_mapping_level() requires the caller and the end
> > > + * consumer to be tied into KVM's handlers for MMU notifier events!
> > Since calling this function won't cause kernel crash now, I guess we can
> > remove the warning sign here, but keep the remaining statement since it
> > is necessary.
> 
> Calling this function won't _directly_ crash the kernel, but improper usage can
> most definitely crash the host kernel, or even worse, silently corrupt host and
> or guest data.  E.g. if KVM were to race with an mmu_notifier event and incorrectly
> map a stale huge page into the guest.
> 
> So yes, the function itself is robust, but usage is still very subtle and delicate.

Understood. So we basically create another "gup_fast_only()" within KVM
and we worry that may confuse other developers so we add the warning
sign.
> 
> > > + *
> > > + * There are several ways to safely use this helper:
> > > + *
> > > + * - Check mmu_notifier_retry_hva() after grabbing the mapping level, before
> > > + *   consuming it.  In this case, mmu_lock doesn't need to be held during the
> > > + *   lookup, but it does need to be held while checking the MMU notifier.
> > 
> > but it does need to be held while checking the MMU notifier and
> > consuming the result.
> 
> I didn't want to include "consuming the result" because arguably the result is
> being consumed while running the guest, and obviously KVM doesn't hold mmu_lock
> while running the guest (though I fully acknowledge the above effectively uses
> "consume" in the sense of shoving the result into SPTEs).  
> 
> > > + *
> > > + * - Hold mmu_lock AND ensure there is no in-progress MMU notifier invalidation
> > > + *   event for the hva.  This can be done by explicit checking the MMU notifier
> 
> s/explicit/explicitly
> 
> > > + *   or by ensuring that KVM already has a valid mapping that covers the hva.
> > 
> > Yes, more specifically, "mmu notifier sequence counter".
> 
> Heh, depends on what the reader interprets as "sequence counter".  If the reader
> interprets that as the literal sequence counter, mmu_notifier_seq, then this phrasing
> is incorrect as mmu_notifier_seq isn't bumped until the invalidation completes,
> i.e. it guards against _past_ invalidations, not in-progress validations.
> 
> My preference is to intentionally not be precise in describing how to check for an
> in-progress invalidation, e.g. so that this comment doesn't need to be updated if
> the details change, and to also to try and force developers to do more than copy
> and paste if they want to use this helper.

Hmm, I was going to say that I strongly disagree about the intentional
unclearness. But then I find that MMU notifier implementation does
require more than just the counter but also the range, so yeah, talking
too much may fall into the weeds. But in general, I think mmu notifier
deserves better documentation in both concept and implementation in KVM.


  reply	other threads:[~2022-07-18 20:49 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-15 23:21 [PATCH 0/4] Huge page related cleanups Sean Christopherson
2022-07-15 23:21 ` [PATCH 1/4] KVM: x86/mmu: Don't require refcounted "struct page" to create huge SPTEs Sean Christopherson
2022-07-16  5:53   ` Mingwei Zhang
2022-07-15 23:21 ` [PATCH 2/4] KVM: x86/mmu: Document the "rules" for using host_pfn_mapping_level() Sean Christopherson
2022-07-16 18:51   ` Mingwei Zhang
2022-07-18 16:50     ` Sean Christopherson
2022-07-18 20:48       ` Mingwei Zhang [this message]
2022-07-15 23:21 ` [PATCH 3/4] KVM: x86/mmu: Don't bottom out on leafs when zapping collapsible SPTEs Sean Christopherson
2022-07-15 23:21 ` [PATCH 4/4] KVM: selftests: Add an option to run vCPUs while disabling dirty logging Sean Christopherson
2022-07-19 18:02 ` [PATCH 0/4] Huge page related cleanups Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YtXHN9rrj6+SRa1Z@google.com \
    --to=mizhang@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).