kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>, kvm <kvm@vger.kernel.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Joao Martins <joao.m.martins@oracle.com>,
	"jmattson@google.com" <jmattson@google.com>,
	"wanpengli@tencent.com" <wanpengli@tencent.com>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"mtosatti@redhat.com" <mtosatti@redhat.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	karahmed@amazon.com
Subject: Re: [RFC PATCH 0/11] Rework gfn_to_pfn_cache
Date: Mon, 15 Nov 2021 22:59:54 +0000	[thread overview]
Message-ID: <YZLmapmzs7sLpu/L@google.com> (raw)
In-Reply-To: <537a1d4e-9168-cd4a-cd2f-cddfd8733b05@redhat.com>

On Mon, Nov 15, 2021, Paolo Bonzini wrote:
> On 11/15/21 20:11, David Woodhouse wrote:
> > > Changing mn_memslots_update_rcuwait to a waitq (and renaming it to
> > > mn_invalidate_waitq) is of course also a possibility.
> > I suspect that's the answer.
> > 
> > I think the actual*invalidation*  of the cache still lives in the
> > invalidate_range() callback where I have it at the moment.

Oooh!  [finally had a lightbulb moment about ->invalidate_range() after years of
befuddlement].

Two things:

  1. Using _only_ ->invalidate_range() is not correct.  ->invalidate_range() is
     required if and only if the old PFN needs to be _unmapped_.  Specifically,
     if the protections are being downgraded without changing the PFN, it doesn't
     need to be called.  E.g. from hugetlb_change_protection():

	/*
	 * No need to call mmu_notifier_invalidate_range() we are downgrading
	 * page table protection not changing it to point to a new page.
	 *
	 * See Documentation/vm/mmu_notifier.rst
	 */

     x86's kvm_arch_mmu_notifier_invalidate_range() is a special snowflake because
     the APIC access page's VMA is controlled by KVM, i.e. is never downgraded, the
     only thing KVM cares about is if the PFN is changed, because that's the only
     thing that can change.

     In this case, if an HVA is downgraded from RW=R, KVM may not invalidate the
     cache and end up writing to memory that is supposed to be read-only.

     I believe we could use ->invalidate_range() to handle the unmap case if KVM's
     ->invalidate_range_start() hook is enhanced to handle the RW=>R case.  The
     "struct mmu_notifier_range" provides the event type, IIUC we could have the
     _start() variant handle MMU_NOTIFY_PROTECTION_{VMA,PAGE} (and maybe
     MMU_NOTIFY_SOFT_DIRTY?), and let the more precise unmap-only variant handle
     everything else.

  2. If we do split the logic across the two hooks, we should (a) do it in a separate
     series and (b) make the logic common to the gfn_to_pfn cache and to the standard
     kvm_unmap_gfn_range().  That would in theory shave a bit of time off walking
     gfn ranges (maybe even moreso with the scalable memslots implementation?), and
     if we're lucky, would resurrect the mostly-dead .change_pte() hook (see commit
     c13fda237f08 ("KVM: Assert that notifier count is elevated in .change_pte()")).

> > But making the req to the affected vCPUs can live in
> > invalidate_range_start(). And then the code which*handles*  that req can
> > wait for the mmu_notifier_count to reach zero before it proceeds. Atomic
> > users of the cache (like the Xen event channel code) don't have to get
> > involved with that.
> > 
> > > Also, for the small requests: since you are at it, can you add the code
> > > in a new file under virt/kvm/?
> > 
> > Hm... only if I can make hva_to_pfn() and probably a handful of other
> > things non-static?
> 
> Yes, I think sooner or later we also want all pfn stuff in one file
> (together with MMU notifiers) and all hva stuff in another; so for now you
> can create virt/kvm/hva_to_pfn.h, or virt/kvm/mm.h, or whatever color of the
> bikeshed you prefer.

Preemptive bikeshed strike... the MMU notifiers aren't strictly "pfn stuff", as
they operate on HVAs.  I don't know exactly what Paolo has in mind, but kvm/mm.h
or kvm/kvm_mm.h seems like it's less likely to become stale in the future.

  reply	other threads:[~2021-11-16  0:43 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-01 14:09 [PATCH] KVM: x86: Fix recording of guest steal time / preempted status David Woodhouse
2021-11-02 16:38 ` [PATCH v2] " David Woodhouse
2021-11-02 17:01   ` Paolo Bonzini
2021-11-02 17:11     ` David Woodhouse
2021-11-02 17:19       ` Paolo Bonzini
2021-11-02 17:26         ` David Woodhouse
2021-11-02 17:36         ` [PATCH v3] " David Woodhouse
2021-11-11 13:23           ` Paolo Bonzini
2021-11-12  8:28             ` David Woodhouse
2021-11-12  9:31               ` Paolo Bonzini
2021-11-12  9:54                 ` David Woodhouse
2021-11-12 10:49                   ` Paolo Bonzini
2021-11-12 11:29                     ` David Woodhouse
2021-11-12 12:27                       ` Paolo Bonzini
2021-11-12 13:28                         ` David Woodhouse
2021-11-12 14:56                           ` Paolo Bonzini
2021-11-12 15:27                             ` David Woodhouse
2021-11-15 16:47                             ` [RFC PATCH 0/11] Rework gfn_to_pfn_cache David Woodhouse
2021-11-15 16:50                               ` [PATCH 01/11] KVM: x86: Fix steal time asm constraints in 32-bit mode David Woodhouse
2021-11-15 16:50                                 ` [PATCH 02/11] KVM: x86/xen: Fix get_attr of KVM_XEN_ATTR_TYPE_SHARED_INFO David Woodhouse
2021-11-15 16:50                                 ` [PATCH 03/11] KVM: selftests: Add event channel upcall support to xen_shinfo_test David Woodhouse
2021-11-15 16:50                                 ` [PATCH 04/11] KVM: x86/xen: Use sizeof_field() instead of open-coding it David Woodhouse
2021-11-15 16:50                                 ` [PATCH 05/11] KVM: nVMX: Use kvm_{read,write}_guest_cached() for shadow_vmcs12 David Woodhouse
2021-11-15 16:50                                 ` [PATCH 06/11] KVM: nVMX: Use kvm_read_guest_offset_cached() for nested VMCS check David Woodhouse
2021-11-15 16:50                                 ` [PATCH 07/11] KVM: nVMX: Use a gfn_to_hva_cache for vmptrld David Woodhouse
2021-11-15 16:50                                 ` [PATCH 08/11] KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache David Woodhouse
2021-11-16 10:21                                   ` Paolo Bonzini
2021-11-17 17:18                                     ` David Woodhouse
2021-11-15 16:50                                 ` [PATCH 09/11] KVM: Reinstate gfn_to_pfn_cache with invalidation support David Woodhouse
2021-11-15 16:50                                 ` [PATCH 10/11] KVM: x86/xen: Maintain valid mapping of Xen shared_info page David Woodhouse
2021-11-15 16:50                                 ` [PATCH 11/11] KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery David Woodhouse
2021-11-15 17:02                                   ` David Woodhouse
2021-11-15 18:49                                   ` Paolo Bonzini
2021-11-15 18:55                                     ` David Woodhouse
2021-11-15 18:50                               ` [RFC PATCH 0/11] Rework gfn_to_pfn_cache Paolo Bonzini
2021-11-15 19:11                                 ` David Woodhouse
2021-11-15 19:26                                   ` Paolo Bonzini
2021-11-15 22:59                                     ` Sean Christopherson [this message]
2021-11-15 23:22                                       ` David Woodhouse
2021-11-16 13:17                                         ` David Woodhouse
2021-11-16 14:11                                           ` Paolo Bonzini
2021-11-16 14:25                                             ` David Woodhouse
2021-11-16 14:57                                               ` Paolo Bonzini
2021-11-16 15:09                                                 ` David Woodhouse
2021-11-16 15:49                                                   ` Paolo Bonzini
2021-11-16 16:06                                                     ` David Woodhouse
2021-11-16 17:42                                                       ` Paolo Bonzini
2021-11-16 17:57                                                         ` David Woodhouse
2021-11-16 18:46                                                           ` Paolo Bonzini
2021-11-16 19:34                                                             ` David Woodhouse
2021-11-15 23:24                                       ` David Woodhouse
2021-11-16 11:50                                     ` [PATCH 0/7] KVM: Add Makefile.kvm for common files David Woodhouse
2021-11-16 11:50                                       ` [PATCH 1/7] KVM: Introduce CONFIG_HAVE_KVM_DIRTY_RING David Woodhouse
2021-11-16 11:50                                         ` [PATCH 2/7] KVM: Add Makefile.kvm for common files, use it for x86 David Woodhouse
2021-11-16 11:50                                         ` [PATCH 3/7] KVM: s390: Use Makefile.kvm for common files David Woodhouse
2021-11-17  7:29                                           ` Christian Borntraeger
2021-11-16 11:50                                         ` [PATCH 4/7] KVM: mips: " David Woodhouse
2021-11-16 11:50                                         ` [PATCH 5/7] KVM: RISC-V: " David Woodhouse
2021-11-16 11:50                                         ` [PATCH 6/7] KVM: powerpc: " David Woodhouse
2021-11-16 18:43                                           ` Sean Christopherson
2021-11-16 19:13                                             ` David Woodhouse
2021-11-16 11:50                                         ` [PATCH 7/7] KVM: arm64: " David Woodhouse
2021-11-15 21:38                                 ` [RFC PATCH 0/11] Rework gfn_to_pfn_cache David Woodhouse
2021-11-12 19:44                 ` [PATCH v3] KVM: x86: Fix recording of guest steal time / preempted status David Woodhouse
2021-11-03  9:47         ` [PATCH v2] " David Woodhouse
2021-11-03 12:35           ` Paolo Bonzini
2021-11-03 12:56             ` David Woodhouse
2021-11-03 13:05               ` Paolo Bonzini
2021-11-03 13:23                 ` David Woodhouse
2021-11-03 13:34                 ` David Woodhouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YZLmapmzs7sLpu/L@google.com \
    --to=seanjc@google.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=dwmw2@infradead.org \
    --cc=jmattson@google.com \
    --cc=joao.m.martins@oracle.com \
    --cc=joro@8bytes.org \
    --cc=karahmed@amazon.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).