KVM Archive on lore.kernel.org
From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Brijesh Singh <brijesh.singh@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	eric van tassell <Eric.VanTassell@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>
Subject: Re: [RFC PATCH 0/8] KVM: x86/mmu: Introduce pinned SPTEs framework
Date: Mon, 3 Aug 2020 10:16:20 -0700
Message-ID: <20200803171620.GC3151@linux.intel.com>
In-Reply-To: <3bf90589-8404-8bd6-925c-427f72528fc2@amd.com>

On Mon, Aug 03, 2020 at 10:52:05AM -0500, Brijesh Singh wrote:
> Thanks for the series, Sean. Some thoughts:
> 
> 
> On 7/31/20 4:23 PM, Sean Christopherson wrote:
> > SEV currently needs to pin guest memory as it doesn't support migrating
> > encrypted pages.  Introduce a framework in KVM's MMU to support pinning
> > pages on demand without requiring additional memory allocations, and with
> > (somewhat hazy) line of sight toward supporting more advanced features for
> > encrypted guest memory, e.g. host page migration.
> 
> 
> Eric's attempt at lazy pinning suffers from the memory allocation
> problem, and your series seems to address it. As you have noticed,
> the SEV enablement in KVM currently does not support migrating
> encrypted pages. But recent SEV firmware provides support for
> migrating encrypted pages (e.g. host page migration). The support is
> available in SEV FW >= 0.17.

I assume SEV also doesn't support ballooning?  Ballooning would be a good
first step toward page migration as I think it'd be easier for KVM to
support, e.g. only needs to deal with the "zap" and not the "move".

> > The idea is to use a software available bit in the SPTE to track that a
> > page has been pinned.  The decision to pin a page and the actual pinning
> > management is handled by vendor code via kvm_x86_ops hooks.  There are
> > intentionally two hooks (zap and unzap) introduced that are not needed for
> > SEV.  I included them to again show how the flag (probably renamed?) could
> > be used for more than just pin/unpin.
> 
> If using the available software bits for tracking the pinning is
> acceptable then it can be used for non-SEV guests (if needed). I
> will look through your patch more carefully, but one immediate question:
> when do we unpin the pages? In the case of SEV, once a page is
> pinned it should not be unpinned until the guest terminates. If we
> unpin the page before the VM terminates then there is a chance that host
> page migration will kick in and move the pages. The KVM MMU code may
> drop SPTEs during zap/unzap, which happens a lot during guest
> execution, and that would lead to a path where vendor-specific code
> unpins the pages during guest execution and causes data corruption
> for the SEV guest.

The pages are unpinned by:

  drop_spte()
  |
  -> rmap_remove()
     |
     -> sev_drop_pinned_spte()


The intent is to allow unpinning pages when the mm_struct dies, i.e. when
the memory is no longer reachable (as opposed to when the last reference to
KVM is put), but typing that out, I realize there are dependencies and
assumptions that don't hold true for SEV as implemented.

  - Parent shadow pages won't be zapped.  Recycling MMU pages and zapping
    all SPs due to memslot updates are the two concerns.

    The easy way out for recycling is to not recycle SPs with pinned
    children, though that may or may not fly with VMM admins.

    I'm trying to resolve the memslot issue[*], but confirming that there's
    no longer an issue with not zapping everything is proving difficult as
    we haven't yet reproduced the original bug.

  - drop_large_spte() won't be invoked.  I believe the only semi-legitimate
    scenario is if the NX huge page workaround is toggled on while a VM is
    running.  Disallowing that if there is an SEV guest seems reasonable?

    There might be an issue with the host page size changing, but I don't
    think that can happen if the page is pinned.  That needs more
    investigation.
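
As a rough illustration of the "don't recycle SPs with pinned children"
idea from the first item, the sketch below skips such pages when
reclaiming.  The struct, its pinned_children counter, and the function
name are hypothetical and exist only for this example; the real MMU
tracks shadow pages differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified shadow page. */
struct toy_shadow_page {
	unsigned int pinned_children;	/* child SPTEs tagged as pinned */
	bool zapped;
};

/*
 * Zap up to nr_to_zap shadow pages, skipping any page that still has
 * pinned children.  Returns the number of pages actually zapped, which
 * may be fewer than requested.
 */
static size_t toy_recycle_shadow_pages(struct toy_shadow_page *sps,
				       size_t nr_sps, size_t nr_to_zap)
{
	size_t zapped = 0;

	for (size_t i = 0; i < nr_sps && zapped < nr_to_zap; i++) {
		if (sps[i].zapped || sps[i].pinned_children)
			continue;
		sps[i].zapped = true;
		zapped++;
	}
	return zapped;
}
```

The return value can fall short of nr_to_zap when many pages have
pinned children, which is the cost of this approach: the recycler may
not be able to free as many MMU pages as it was asked to.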


[*] https://lkml.kernel.org/r/20200703025047.13987-1-sean.j.christopherson@intel.com

> > Bugs in the core implementation are pretty much guaranteed.  The basic
> > concept has been tested, but in a fairly different incarnation.  Most
> > notably, tagging PRESENT SPTEs as PINNED has not been tested, although
> > using the PINNED flag to track zapped (and known to be pinned) SPTEs has
> > been tested.  I cobbled this variation together fairly quickly to get the
> > code out there for discussion.
> >
> > The last patch to pin SEV pages during sev_launch_update_data() is
> > incomplete; it's there to show how we might leverage MMU-based pinning to
> > support pinning pages before the guest is live.
> 
> 
> I will add the SEV-specific bits and give this a try.
> 
> >
> > Sean Christopherson (8):
> >   KVM: x86/mmu: Return old SPTE from mmu_spte_clear_track_bits()
> >   KVM: x86/mmu: Use bits 2:0 to check for present SPTEs
> >   KVM: x86/mmu: Refactor handling of not-present SPTEs in mmu_set_spte()
> >   KVM: x86/mmu: Add infrastructure for pinning PFNs on demand
> >   KVM: SVM: Use the KVM MMU SPTE pinning hooks to pin pages on demand
> >   KVM: x86/mmu: Move 'pfn' variable to caller of direct_page_fault()
> >   KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by SEV
> >   KVM: SVM: Pin SEV pages in MMU during sev_launch_update_data()
> >
> >  arch/x86/include/asm/kvm_host.h |   7 ++
> >  arch/x86/kvm/mmu.h              |   3 +
> >  arch/x86/kvm/mmu/mmu.c          | 186 +++++++++++++++++++++++++-------
> >  arch/x86/kvm/mmu/paging_tmpl.h  |   3 +-
> >  arch/x86/kvm/svm/sev.c          | 141 +++++++++++++++++++++++-
> >  arch/x86/kvm/svm/svm.c          |   3 +
> >  arch/x86/kvm/svm/svm.h          |   3 +
> >  7 files changed, 302 insertions(+), 44 deletions(-)
> >
