linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steve Rutherford <srutherford@google.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Ashish Kalra <ashish.kalra@amd.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"Lendacky, Thomas" <Thomas.Lendacky@amd.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"venu.busireddy@oracle.com" <venu.busireddy@oracle.com>,
	"Singh, Brijesh" <brijesh.singh@amd.com>,
	Will Deacon <will@kernel.org>,
	Quentin Perret <qperret@google.com>
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl
Date: Mon, 8 Mar 2021 13:48:43 -0800	[thread overview]
Message-ID: <CABayD+eCPebG2R74Huoff5yj1PqEMnri2znn26EOf=qRKYw6Aw@mail.gmail.com> (raw)
In-Reply-To: <YEaAXXGZH0uSMA3v@google.com>

On Mon, Mar 8, 2021 at 11:52 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Mon, Mar 08, 2021, Ashish Kalra wrote:
> > On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote:
> > > +Will and Quentin (arm64)
> > >
> > > Moving the non-KVM x86 folks to bcc, I don't they care about KVM details at this
> > > point.
> > >
> > > On Fri, Feb 26, 2021, Ashish Kalra wrote:
> > > > On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
> > > > > On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> > > > > Thanks for grabbing the data!
> > > > >
> > > > > I am fine with both paths. Sean has stated an explicit desire for
> > > > > hypercall exiting, so I think that would be the current consensus.
> > >
> > > Yep, though it'd be good to get Paolo's input, too.
> > >
> > > > > If we want to do hypercall exiting, this should be in a follow-up
> > > > > series where we implement something more generic, e.g. a hypercall
> > > > > exiting bitmap or hypercall exit list. If we are taking the hypercall
> > > > > exit route, we can drop the kvm side of the hypercall.
> > >
> > > I don't think this is a good candidate for arbitrary hypercall interception.  Or
> > > rather, I think hypercall interception should be an orthogonal implementation.
> > >
> > > The guest, including guest firmware, needs to be aware that the hypercall is
> > > supported, and the ABI needs to be well-defined.  Relying on userspace VMMs to
> > > implement a common ABI is an unnecessary risk.
> > >
> > > We could make KVM's default behavior be a nop, i.e. have KVM enforce the ABI but
> > > require further VMM intervention.  But, I just don't see the point, it would
> > > save only a few lines of code.  It would also limit what KVM could do in the
> > > future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to userspace,
> > > then mandatory interception would essentially make it impossible for KVM to do
> > > bookkeeping while still honoring the interception request.
> > >
> > > However, I do think it would make sense to have the userspace exit be a generic
> > > exit type.  But hey, we already have the necessary ABI defined for that!  It's
> > > just not used anywhere.
> > >
> > >     /* KVM_EXIT_HYPERCALL */
> > >     struct {
> > >             __u64 nr;
> > >             __u64 args[6];
> > >             __u64 ret;
> > >             __u32 longmode;
> > >             __u32 pad;
> > >     } hypercall;
> > >
> > >
> > > > > Userspace could also handle the MSR using MSR filters (would need to
> > > > > confirm that).  Then userspace could also be in control of the cpuid bit.
> > >
> > > An MSR is not a great fit; it's x86 specific and limited to 64 bits of data.
> > > The data limitation could be fudged by shoving data into non-standard GPRs, but
> > > that will result in truly heinous guest code, and extensibility issues.
> > >
> > > The data limitation is a moot point, because the x86-only thing is a deal
> > > breaker.  arm64's pKVM work has a near-identical use case for a guest to share
> > > memory with a host.  I can't think of a clever way to avoid having to support
> > > TDX's and SNP's hypervisor-agnostic variants, but we can at least not have
> > > multiple KVM variants.
> > >
> >
> > Potentially, there is another reason for in-kernel hypercall handling
> > considering SEV-SNP. In case of SEV-SNP the RMP table tracks the state
> > of each guest page, for instance pages in hypervisor state, i.e., pages
> > with C=0 and pages in guest valid state with C=1.
> >
> > Now, there shouldn't be a need for page encryption status hypercalls on
> > SEV-SNP as KVM can track & reference guest page status directly using
> > the RMP table.
>
> Relying on the RMP table itself would require locking the RMP table for an
> extended duration, and walking the entire RMP to find shared pages would be
> very inefficient.
>
> > As KVM maintains the RMP table, therefore we will need SET/GET type of
> > interfaces to provide the guest page encryption status to userspace.
>
> Hrm, somehow I temporarily forgot about SNP and TDX adding their own hypercalls
> for converting between shared and private.  And in the case of TDX, the hypercall
> can't be trusted, i.e. is just a hint, otherwise the guest could induce a #MC in
> the host.
>
> But, the different guest behavior doesn't require KVM to maintain a list/tree,
> e.g. adding a dedicated KVM_EXIT_* for notifying userspace of page encryption
> status changes would also suffice.
>
> Actually, that made me think of another argument against maintaining a list in
> KVM: there's no way to notify userspace that a page's status has changed.
> Userspace would need to query KVM to do GET_LIST after every GET_DIRTY.
> Obviously not a huge issue, but it does make migration slightly less efficient.
>
> On a related topic, there are fatal race conditions that will require careful
> coordination between guest and host, and will effectively be wired into the ABI.
> SNP and TDX don't suffer these issues because host awareness of status is atomic
> with respect to the guest actually writing the page with the new encryption
> status.
>
> For SEV live migration...
>
> If the guest does the hypercall after writing the page, then the guest is hosed
> if it gets migrated while writing the page (scenario #1):
>
>   vCPU                 Userspace
>   zero_bytes[0:N]
>                        <transfers written bytes as private instead of shared>
>                        <migrates vCPU>
>   zero_bytes[N+1:4095]
>   set_shared (dest)
>   kaboom!
>
> If userspace does GET_DIRTY after GET_LIST, then the host would transfer bad
> data by consuming a stale list (scenario #2):
>
>   vCPU               Userspace
>                      get_list (from KVM or internally)
>   set_shared (src)
>   zero_page (src)
>                      get_dirty
>                      <transfers private data instead of shared>
>                      <migrates vCPU>
>   kaboom!
>
> If both guest and host order things to avoid #1 and #2, the host can still
> migrate the wrong data (scenario #3):
>
>   vCPU               Userspace
>   set_private
>   zero_bytes[0:4096]
>                      get_dirty
>   set_shared (src)
>                      get_list
>                      <transfers as shared instead of private>
>                      <migrates vCPU>
>   set_private (dest)
>   kaboom!
>
> Scenario #3 is unlikely, but plausible, e.g. if the guest bails from its
> conversion flow for whatever reason, after making the initial hypercall.  Maybe
> it goes without saying, but to address #3, the guest must consider existing data
> as lost the instant it tells the host the page has been converted to a different
> type.
This requires the guest to bail on the conversion flow, and
subsequently treat the page as having it's previous value. The kernel
does not currently do this. I don't think it should ever do this.
Currently, each time the guest changes it's view of a page, it
re-zeroes that page, and it does so after flipping the c-bit and
calling the hypercall (IIRC). This is much more conservative than is
necessary for supporting LM, but does suffice. I think the minimal
requirement for LM support is "Flipping a page's encryption status
(and advertising that flip to the host) twice is not a no-op. The
kernel must reinitialize that page". This invariant should also be
necessary for anything that moves pages around on the same host (COPY,
SWAP_OUT, SWAP_IN).

Relatedly, there is no workaround if the kernel does in-place
reencryption (since the page would be both encrypted and unencrypted).
That said, I believe the spec affirmatively tells you not to do this
outside of SME.

There's also no solution if the guest relies on the value of
ciphertext, or on the value of "decrypted plaintext". The guest could
detect live migration by noticing that a page's value from an
unencrypted perspective has changed when it's value from an encrypted
perspective has not (or the other way around). This would also imply
that a very strange guest could be broken by live migration.

  parent reply	other threads:[~2021-03-08 21:50 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-04  0:35 [PATCH v10 00/17] Add AMD SEV guest live migration support Ashish Kalra
2021-02-04  0:36 ` [PATCH v10 01/16] KVM: SVM: Add KVM_SEV SEND_START command Ashish Kalra
2021-02-04  0:36 ` [PATCH v10 02/16] KVM: SVM: Add KVM_SEND_UPDATE_DATA command Ashish Kalra
2021-02-04  0:37 ` [PATCH v10 03/16] KVM: SVM: Add KVM_SEV_SEND_FINISH command Ashish Kalra
2021-02-04  0:37 ` [PATCH v10 04/16] KVM: SVM: Add support for KVM_SEV_RECEIVE_START command Ashish Kalra
2021-02-04  0:37 ` [PATCH v10 05/16] KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command Ashish Kalra
2021-02-04  0:37 ` [PATCH v10 06/16] KVM: SVM: Add KVM_SEV_RECEIVE_FINISH command Ashish Kalra
2021-02-04  0:38 ` [PATCH v10 07/16] KVM: x86: Add AMD SEV specific Hypercall3 Ashish Kalra
2021-02-04  0:38 ` [PATCH v10 08/16] KVM: X86: Introduce KVM_HC_PAGE_ENC_STATUS hypercall Ashish Kalra
2021-02-04 16:03   ` Tom Lendacky
2021-02-05  1:44   ` Steve Rutherford
2021-02-05  3:32     ` Ashish Kalra
2021-02-04  0:39 ` [PATCH v10 09/16] mm: x86: Invoke hypercall when page encryption status is changed Ashish Kalra
2021-02-04  0:39 ` [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl Ashish Kalra
2021-02-04 16:14   ` Tom Lendacky
2021-02-04 16:34     ` Ashish Kalra
2021-02-17  1:03   ` Sean Christopherson
2021-02-17 14:00     ` Kalra, Ashish
2021-02-17 16:13       ` Sean Christopherson
2021-02-18  6:48         ` Kalra, Ashish
2021-02-18 16:39           ` Sean Christopherson
2021-02-18 17:05             ` Kalra, Ashish
2021-02-18 17:50               ` Sean Christopherson
2021-02-18 18:32     ` Kalra, Ashish
2021-02-24 17:51       ` Ashish Kalra
2021-02-24 18:22         ` Sean Christopherson
2021-02-25 20:20           ` Ashish Kalra
2021-02-25 22:59             ` Steve Rutherford
2021-02-25 23:24               ` Steve Rutherford
2021-02-26 14:04               ` Ashish Kalra
2021-02-26 17:44                 ` Sean Christopherson
2021-03-02 14:55                   ` Ashish Kalra
2021-03-02 15:15                     ` Ashish Kalra
2021-03-03 18:54                     ` Will Deacon
2021-03-03 19:32                       ` Ashish Kalra
2021-03-09 19:10                       ` Ashish Kalra
2021-03-11 18:14                       ` Ashish Kalra
2021-03-11 20:48                         ` Steve Rutherford
2021-03-19 17:59                           ` Ashish Kalra
2021-04-02  1:40                             ` Steve Rutherford
2021-04-02 11:09                               ` Ashish Kalra
2021-03-08 10:40                   ` Ashish Kalra
2021-03-08 19:51                     ` Sean Christopherson
2021-03-08 21:05                       ` Ashish Kalra
2021-03-08 21:11                       ` Brijesh Singh
2021-03-08 21:32                         ` Ashish Kalra
2021-03-08 21:51                         ` Steve Rutherford
2021-03-09 19:42                           ` Sean Christopherson
2021-03-10  3:42                           ` Kalra, Ashish
2021-03-10  3:47                             ` Steve Rutherford
2021-03-08 21:48                       ` Steve Rutherford [this message]
2021-02-17  1:06   ` Sean Christopherson
2021-02-04  0:39 ` [PATCH v10 11/16] KVM: x86: Introduce KVM_SET_SHARED_PAGES_LIST ioctl Ashish Kalra
2021-02-04  0:39 ` [PATCH v10 12/16] KVM: x86: Introduce new KVM_FEATURE_SEV_LIVE_MIGRATION feature & Custom MSR Ashish Kalra
2021-02-05  0:56   ` Steve Rutherford
2021-02-05  3:07     ` Ashish Kalra
2021-02-06  2:54       ` Steve Rutherford
2021-02-06  4:49         ` Ashish Kalra
2021-02-06  5:46         ` Ashish Kalra
2021-02-06 13:56           ` Ashish Kalra
2021-02-08  0:28             ` Ashish Kalra
2021-02-08 22:50               ` Steve Rutherford
2021-02-10 20:36                 ` Ashish Kalra
2021-02-10 22:01                   ` Steve Rutherford
2021-02-10 22:05                     ` Steve Rutherford
2021-02-16 23:20   ` Sean Christopherson
2021-02-04  0:40 ` [PATCH v10 13/16] EFI: Introduce the new AMD Memory Encryption GUID Ashish Kalra
2021-02-04  0:40 ` [PATCH v10 14/16] KVM: x86: Add guest support for detecting and enabling SEV Live Migration feature Ashish Kalra
2021-02-18 17:56   ` Sean Christopherson
2021-02-04  0:40 ` [PATCH v10 15/16] KVM: x86: Add kexec support for SEV Live Migration Ashish Kalra
2021-02-04  0:40 ` [PATCH v10 16/16] KVM: SVM: Bypass DBG_DECRYPT API calls for unencrypted guest memory Ashish Kalra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABayD+eCPebG2R74Huoff5yj1PqEMnri2znn26EOf=qRKYw6Aw@mail.gmail.com' \
    --to=srutherford@google.com \
    --cc=Thomas.Lendacky@amd.com \
    --cc=ashish.kalra@amd.com \
    --cc=brijesh.singh@amd.com \
    --cc=joro@8bytes.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=qperret@google.com \
    --cc=seanjc@google.com \
    --cc=venu.busireddy@oracle.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).