From: Ashish Kalra <ashish.kalra@amd.com>
To: Brijesh Singh <brijesh.singh@amd.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Steve Rutherford <srutherford@google.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"Lendacky, Thomas" <Thomas.Lendacky@amd.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"venu.busireddy@oracle.com" <venu.busireddy@oracle.com>,
	Will Deacon <will@kernel.org>,
	Quentin Perret <qperret@google.com>
Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST ioctl
Date: Mon, 8 Mar 2021 21:32:58 +0000
Message-ID: <20210308213258.GA5580@ashkalra_ubuntu_server>
In-Reply-To: <bdf0767f-c2c4-5863-fd0d-352a3f68f7f9@amd.com>

On Mon, Mar 08, 2021 at 03:11:41PM -0600, Brijesh Singh wrote:
> 
> On 3/8/21 1:51 PM, Sean Christopherson wrote:
> > On Mon, Mar 08, 2021, Ashish Kalra wrote:
> >> On Fri, Feb 26, 2021 at 09:44:41AM -0800, Sean Christopherson wrote:
> >>> +Will and Quentin (arm64)
> >>>
> >>> Moving the non-KVM x86 folks to bcc, I don't think they care about KVM details
> >>> at this point.
> >>>
> >>> On Fri, Feb 26, 2021, Ashish Kalra wrote:
> >>>> On Thu, Feb 25, 2021 at 02:59:27PM -0800, Steve Rutherford wrote:
> >>>>> On Thu, Feb 25, 2021 at 12:20 PM Ashish Kalra <ashish.kalra@amd.com> wrote:
> >>>>> Thanks for grabbing the data!
> >>>>>
> >>>>> I am fine with both paths. Sean has stated an explicit desire for
> >>>>> hypercall exiting, so I think that would be the current consensus.
> >>> Yep, though it'd be good to get Paolo's input, too.
> >>>
> >>>>> If we want to do hypercall exiting, this should be in a follow-up
> >>>>> series where we implement something more generic, e.g. a hypercall
> >>>>> exiting bitmap or hypercall exit list. If we are taking the hypercall
> >>>>> exit route, we can drop the kvm side of the hypercall.
> >>> I don't think this is a good candidate for arbitrary hypercall interception.  Or
> >>> rather, I think hypercall interception should be an orthogonal implementation.
> >>>
> >>> The guest, including guest firmware, needs to be aware that the hypercall is
> >>> supported, and the ABI needs to be well-defined.  Relying on userspace VMMs to
> >>> implement a common ABI is an unnecessary risk.
> >>>
> >>> We could make KVM's default behavior be a nop, i.e. have KVM enforce the ABI but
> >>> require further VMM intervention.  But, I just don't see the point, it would
> >>> save only a few lines of code.  It would also limit what KVM could do in the
> >>> future, e.g. if KVM wanted to do its own bookkeeping _and_ exit to userspace,
> >>> then mandatory interception would essentially make it impossible for KVM to do
> >>> bookkeeping while still honoring the interception request.
> >>>
> >>> However, I do think it would make sense to have the userspace exit be a generic
> >>> exit type.  But hey, we already have the necessary ABI defined for that!  It's
> >>> just not used anywhere.
> >>>
> >>> 	/* KVM_EXIT_HYPERCALL */
> >>> 	struct {
> >>> 		__u64 nr;
> >>> 		__u64 args[6];
> >>> 		__u64 ret;
> >>> 		__u32 longmode;
> >>> 		__u32 pad;
> >>> 	} hypercall;
> >>>
> >>>
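
For illustration only, a minimal sketch of how a VMM might consume that exit
structure to track page encryption status. The hypercall number, the
(gpa, npages, enc) argument layout, the error value, and every name below are
assumptions made for the example, not the ABI under discussion:

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the KVM_EXIT_HYPERCALL layout quoted above. */
struct hypercall_exit {
	uint64_t nr;
	uint64_t args[6];
	uint64_t ret;
	uint32_t longmode;
	uint32_t pad;
};

/* ASSUMPTIONS: placeholder hypercall number, error code and guest size. */
#define HC_PAGE_ENC_STATUS	12
#define HC_ERROR		((uint64_t)-1)
#define GUEST_PAGE_SHIFT	12
#define GUEST_NR_PAGES		64

/* One entry per guest page: 1 = encrypted (private), 0 = shared. */
static uint8_t page_encrypted[GUEST_NR_PAGES];

/*
 * Hypothetical VMM-side handler: args[0] = gpa, args[1] = npages,
 * args[2] = new encryption status.  The result is written to hc->ret.
 */
static void handle_hypercall_exit(struct hypercall_exit *hc)
{
	uint64_t gfn, npages, i;

	assert(hc);

	if (hc->nr != HC_PAGE_ENC_STATUS) {
		hc->ret = HC_ERROR;	/* unknown hypercall */
		return;
	}

	gfn = hc->args[0] >> GUEST_PAGE_SHIFT;
	npages = hc->args[1];

	/* Reject ranges that fall outside the guest (overflow-safe). */
	if (gfn >= GUEST_NR_PAGES || npages > GUEST_NR_PAGES - gfn) {
		hc->ret = HC_ERROR;
		return;
	}

	for (i = 0; i < npages; i++)
		page_encrypted[gfn + i] = hc->args[2] ? 1 : 0;

	hc->ret = 0;
}
```

The point of the sketch is only that the existing exit layout already carries
enough arguments for a range-based status update.
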
> >>>>> Userspace could also handle the MSR using MSR filters (would need to
> >>>>> confirm that).  Then userspace could also be in control of the cpuid bit.
> >>> An MSR is not a great fit; it's x86 specific and limited to 64 bits of data.
> >>> The data limitation could be fudged by shoving data into non-standard GPRs, but
> >>> that will result in truly heinous guest code, and extensibility issues.
> >>>
> >>> The data limitation is a moot point, because the x86-only thing is a deal
> >>> breaker.  arm64's pKVM work has a near-identical use case for a guest to share
> >>> memory with a host.  I can't think of a clever way to avoid having to support
> >>> TDX's and SNP's hypervisor-agnostic variants, but we can at least not have
> >>> multiple KVM variants.
> >>>
> >> Potentially, there is another reason for in-kernel hypercall handling
> >> considering SEV-SNP. In case of SEV-SNP the RMP table tracks the state
> >> of each guest page, for instance pages in hypervisor state, i.e., pages
> >> with C=0 and pages in guest valid state with C=1.
> >>
> >> Now, there shouldn't be a need for page encryption status hypercalls on
> >> SEV-SNP as KVM can track & reference guest page status directly using
> >> the RMP table.
> > Relying on the RMP table itself would require locking the RMP table for an
> > extended duration, and walking the entire RMP to find shared pages would be
> > very inefficient.
> >
> >> Since KVM maintains the RMP table, we will need SET/GET type of
> >> interfaces to provide the guest page encryption status to userspace.
> > Hrm, somehow I temporarily forgot about SNP and TDX adding their own hypercalls
> > for converting between shared and private.  And in the case of TDX, the hypercall
> > can't be trusted, i.e. is just a hint, otherwise the guest could induce a #MC in
> > the host.
> >
> > But, the different guest behavior doesn't require KVM to maintain a list/tree,
> > e.g. adding a dedicated KVM_EXIT_* for notifying userspace of page encryption
> > status changes would also suffice.  
> >
> > Actually, that made me think of another argument against maintaining a list in
> > KVM: there's no way to notify userspace that a page's status has changed.
> > Userspace would need to query KVM to do GET_LIST after every GET_DIRTY.
> > Obviously not a huge issue, but it does make migration slightly less efficient.
> >
> > On a related topic, there are fatal race conditions that will require careful
> > coordination between guest and host, and will effectively be wired into the ABI.
> > SNP and TDX don't suffer these issues because host awareness of status is atomic
> > with respect to the guest actually writing the page with the new encryption
> > status.
> >
> > For SEV live migration...
> >
> > If the guest does the hypercall after writing the page, then the guest is hosed
> > if it gets migrated while writing the page (scenario #1):
> >
> >   vCPU                 Userspace
> >   zero_bytes[0:N]
> >                        <transfers written bytes as private instead of shared>
> >                        <migrates vCPU>
> >   zero_bytes[N+1:4095]
> >   set_shared (dest)
> >   kaboom!
> 
> 
> Maybe I am missing something, but this is not any different from a normal
> operation inside a guest. Making a page shared/private in the page table
> does not update the content of the page itself. In your above case, I
> assume zero_bytes[N+1:4095] are written by the destination VM. The
> memory region was private in the source VM page table, so those writes
> will be performed encrypted. The destination VM later changed the memory
> to shared, but nobody wrote to the memory after it was transitioned
> to shared, so a reader of the memory should get ciphertext
> unless there was a write after the set_shared (dest).
> 
> 
> > If userspace does GET_DIRTY after GET_LIST, then the host would transfer bad
> > data by consuming a stale list (scenario #2):
> >
> >   vCPU               Userspace
> >                      get_list (from KVM or internally)
> >   set_shared (src)
> >   zero_page (src)
> >                      get_dirty
> >                      <transfers private data instead of shared>
> >                      <migrates vCPU>
> >   kaboom!
> 
> 
> I don't remember how things are done in Ashish's recent QEMU/KVM patches,
> but in the previous series, get_dirty() happens before querying the
> encrypted state. There was some logic in the VMM to resync the encrypted
> bitmap during the final migration stage and perform any additional data
> transfer since the last sync.
> 
> 

Yes, we do that. In fact, we added logic in the VMM to resync the
encrypted bitmap after every migration iteration; if any page's
encryption state has changed since the previous sync, we perform
additional data transfers for those pages.
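
A minimal sketch of that per-iteration resync, for illustration only. The
helper name, the one-bit-per-page bitmap layout, and the choice to fold
status changes into the dirty bitmap are assumptions for the example, not
the actual VMM code:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: compare the encryption-status bitmap from the
 * previous sync (prev) with the freshly queried one (cur).  Every page
 * whose status flipped is marked dirty so it gets transferred again,
 * and prev is updated to the new status.  One bit per guest page; a set
 * bit in prev/cur means "encrypted".  Returns the number of pages whose
 * status changed.
 */
static size_t resync_enc_bitmap(unsigned long *prev, const unsigned long *cur,
				unsigned long *dirty, size_t nlongs)
{
	size_t changed = 0;
	size_t i;

	assert(prev && cur && dirty);

	for (i = 0; i < nlongs; i++) {
		unsigned long diff = prev[i] ^ cur[i];	/* flipped pages */

		dirty[i] |= diff;	/* force a re-send of flipped pages */
		prev[i] = cur[i];	/* remember the new status */

		while (diff) {		/* count the flipped pages */
			diff &= diff - 1;
			changed++;
		}
	}

	return changed;
}
```

Calling something like this after each get_dirty() pass means a page that
changed status without being written again is still re-sent with its current
status.
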

Thanks,
Ashish


