kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: x86: Allow restore of some sregs with protected state
@ 2023-03-20 22:51 Alexander Graf
  2023-03-21 17:03 ` Sean Christopherson
  0 siblings, 1 reply; 4+ messages in thread
From: Alexander Graf @ 2023-03-20 22:51 UTC (permalink / raw)
  To: kvm
  Cc: Sean Christopherson, Paolo Bonzini, Borislav Petkov, x86,
	Tom Lendacky, Michael Roth, Sabin Rapan

With protected state (like SEV-ES and SEV-SNP), KVM does not have direct
access to guest registers. However, we deflect modifications to CR0,
CR4 and EFER to the host. We also carry the apic_base register and learn
about CR8 directly from a VMCB field.

That means these bits of information do exist in the host's KVM data
structures. If we ever want to resume consumption of an already
initialized VMSA (guest state), we will need to also restore these
additional bits of KVM state.

Prepare ourselves for such a world by allowing set_sregs to set the
relevant fields. This way, it mirrors get_sregs properly that already
exposes them to user space.

Signed-off-by: Alexander Graf <graf@amazon.com>
---
 arch/x86/kvm/x86.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7713420abab0..88fa8b657a2f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
 	int idx;
 	struct desc_ptr dt;
 
-	if (!kvm_is_valid_sregs(vcpu, sregs))
+	if (!vcpu->arch.guest_state_protected &&
+	    !kvm_is_valid_sregs(vcpu, sregs))
 		return -EINVAL;
 
 	apic_base_msr.data = sregs->apic_base;
@@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
 	if (kvm_set_apic_base(vcpu, &apic_base_msr))
 		return -EINVAL;
 
-	if (vcpu->arch.guest_state_protected)
+	if (vcpu->arch.guest_state_protected) {
+		/*
+		 * While most actual guest state is hidden from us,
+		 * CR{0,4,8}, efer and apic_base still hold KVM state
+		 * with protection enabled, so let's allow restoring
+		 */
+		kvm_set_cr8(vcpu, sregs->cr8);
+		kvm_x86_ops.set_efer(vcpu, sregs->efer);
+		kvm_x86_ops.set_cr0(vcpu, sregs->cr0);
+		vcpu->arch.cr0 = sregs->cr0;
+		kvm_x86_ops.set_cr4(vcpu, sregs->cr4);
 		return 0;
+	}
 
 	dt.size = sregs->idt.limit;
 	dt.address = sregs->idt.base;
-- 
2.39.2




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: x86: Allow restore of some sregs with protected state
  2023-03-20 22:51 [PATCH] KVM: x86: Allow restore of some sregs with protected state Alexander Graf
@ 2023-03-21 17:03 ` Sean Christopherson
  2023-03-21 18:15   ` Peter Gonda
  0 siblings, 1 reply; 4+ messages in thread
From: Sean Christopherson @ 2023-03-21 17:03 UTC (permalink / raw)
  To: Alexander Graf
  Cc: kvm, Paolo Bonzini, Borislav Petkov, x86, Tom Lendacky,
	Michael Roth, Sabin Rapan, Peter Gonda

+Peter

On Mon, Mar 20, 2023, Alexander Graf wrote:
> With protected state (like SEV-ES and SEV-SNP), KVM does not have direct
> access to guest registers. However, we deflect modifications to CR0,

Please avoid pronouns in changelogs and comments.

> CR4 and EFER to the host. We also carry the apic_base register and learn
> about CR8 directly from a VMCB field.
> 
> That means these bits of information do exist in the host's KVM data
> structures. If we ever want to resume consumption of an already
> initialized VMSA (guest state), we will need to also restore these
> additional bits of KVM state.

For some definitions of "need".  I've looked at this code multiple times in the
past, and even posted patches[1], but I'm still unconvinced that trapping
CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix
this stalled out.

 : If KVM chugs along happily without these patches, I'd love to pivot and yank out
 : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop
 : as well.

[1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@google.com
[2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@google.com

> Prepare ourselves for such a world by allowing set_sregs to set the
> relevant fields. This way, it mirrors get_sregs properly that already
> exposes them to user space.
> 
> Signed-off-by: Alexander Graf <graf@amazon.com>
> ---
>  arch/x86/kvm/x86.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7713420abab0..88fa8b657a2f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
>  	int idx;
>  	struct desc_ptr dt;
>  
> -	if (!kvm_is_valid_sregs(vcpu, sregs))
> +	if (!vcpu->arch.guest_state_protected &&
> +	    !kvm_is_valid_sregs(vcpu, sregs))

This is broken, userspace shouldn't be allowed to stuff complete garbage just
because guest state is protected.

>  		return -EINVAL;
>  
>  	apic_base_msr.data = sregs->apic_base;
> @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
>  	if (kvm_set_apic_base(vcpu, &apic_base_msr))
>  		return -EINVAL;
>  
> -	if (vcpu->arch.guest_state_protected)
> +	if (vcpu->arch.guest_state_protected) {
> +		/*
> +		 * While most actual guest state is hidden from us,
> +		 * CR{0,4,8}, efer and apic_base still hold KVM state
> +		 * with protection enabled, so let's allow restoring
> +		 */
> +		kvm_set_cr8(vcpu, sregs->cr8);
> +		kvm_x86_ops.set_efer(vcpu, sregs->efer);
> +		kvm_x86_ops.set_cr0(vcpu, sregs->cr0);

Use static_call().  This code is also broken in that it doesn't set mmu_reset_needed.
This patch also misses the problematic behavior in svm_set_cr0().

And I don't like having a completely separate one-off flow for protected guests.
It's more code and arguably uglier, but I would prefer to explicitly skip each
chunk as needed so that we aren't effectively maintaining multiple flows.

Easiest thing is probably to just resurrect my aforementioned series, pending the
result of the discussion on whether or not KVM should be trapping CR0/CR4/EFER
writes in the first place.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: x86: Allow restore of some sregs with protected state
  2023-03-21 17:03 ` Sean Christopherson
@ 2023-03-21 18:15   ` Peter Gonda
  2023-03-21 23:46     ` Alexander Graf
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Gonda @ 2023-03-21 18:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Alexander Graf, kvm, Paolo Bonzini, Borislav Petkov, x86,
	Tom Lendacky, Michael Roth, Sabin Rapan

On Tue, Mar 21, 2023 at 11:03 AM Sean Christopherson <seanjc@google.com> wrote:
>
> +Peter
>
> On Mon, Mar 20, 2023, Alexander Graf wrote:
> > With protected state (like SEV-ES and SEV-SNP), KVM does not have direct
> > access to guest registers. However, we deflect modifications to CR0,
>
> Please avoid pronouns in changelogs and comments.
>
> > CR4 and EFER to the host. We also carry the apic_base register and learn
> > about CR8 directly from a VMCB field.
> >
> > That means these bits of information do exist in the host's KVM data
> > structures. If we ever want to resume consumption of an already
> > initialized VMSA (guest state), we will need to also restore these
> > additional bits of KVM state.
>
> For some definitions of "need".  I've looked at this code multiple times in the
> past, and even posted patches[1], but I'm still unconvinced that trapping
> CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix
> this stalled out.
>
>  : If KVM chugs along happily without these patches, I'd love to pivot and yank out
>  : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop
>  : as well.
>
> [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@google.com
> [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@google.com

Yea we are using similar patches to do intra-host migration for SNP VMs.

I have dropped the ball on my AI from that thread. Let me look/test this patch.

>
> > Prepare ourselves for such a world by allowing set_sregs to set the
> > relevant fields. This way, it mirrors get_sregs properly that already
> > exposes them to user space.
> >
> > Signed-off-by: Alexander Graf <graf@amazon.com>
> > ---
> >  arch/x86/kvm/x86.c | 16 ++++++++++++++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 7713420abab0..88fa8b657a2f 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -11370,7 +11370,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
> >       int idx;
> >       struct desc_ptr dt;
> >
> > -     if (!kvm_is_valid_sregs(vcpu, sregs))
> > +     if (!vcpu->arch.guest_state_protected &&
> > +         !kvm_is_valid_sregs(vcpu, sregs))
>
> This is broken, userspace shouldn't be allowed to stuff complete garbage just
> because guest state is protected.

Agreed, we've been using something like this:

kvm_valid_sregs()
-       if (sregs->fer & EFER_LME && !vcpu->arch.guest_state_protected) {
+       if (sregs->efer & EFER_LME) {


>
> >               return -EINVAL;
> >
> >       apic_base_msr.data = sregs->apic_base;
> > @@ -11378,8 +11379,19 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
> >       if (kvm_set_apic_base(vcpu, &apic_base_msr))
> >               return -EINVAL;
> >
> > -     if (vcpu->arch.guest_state_protected)
> > +     if (vcpu->arch.guest_state_protected) {
> > +             /*
> > +              * While most actual guest state is hidden from us,
> > +              * CR{0,4,8}, efer and apic_base still hold KVM state
> > +              * with protection enabled, so let's allow restoring
> > +              */
> > +             kvm_set_cr8(vcpu, sregs->cr8);
> > +             kvm_x86_ops.set_efer(vcpu, sregs->efer);
> > +             kvm_x86_ops.set_cr0(vcpu, sregs->cr0);
>
> Use static_call().  This code is also broken in that it doesn't set mmu_reset_needed.
> This patch also misses the problematic behavior in svm_set_cr0().
>
> And I don't like having a completely separate one-off flow for protected guests.
> It's more code and arguably uglier, but I would prefer to explicitly skip each
> chunk as needed so that we aren't effectively maintaining multiple flows.
>
> Easiest thing is probably to just resurrect my aforementioned series, pending the
> result of the discussion on whether or not KVM should be trapping CR0/CR4/EFER
> writes in the first place.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] KVM: x86: Allow restore of some sregs with protected state
  2023-03-21 18:15   ` Peter Gonda
@ 2023-03-21 23:46     ` Alexander Graf
  0 siblings, 0 replies; 4+ messages in thread
From: Alexander Graf @ 2023-03-21 23:46 UTC (permalink / raw)
  To: Peter Gonda, Sean Christopherson
  Cc: kvm, Paolo Bonzini, Borislav Petkov, x86, Tom Lendacky,
	Michael Roth, Sabin Rapan


On 21.03.23 19:15, Peter Gonda wrote:

> On Tue, Mar 21, 2023 at 11:03 AM Sean Christopherson <seanjc@google.com> wrote:
>> +Peter
>>
>> On Mon, Mar 20, 2023, Alexander Graf wrote:
>>> With protected state (like SEV-ES and SEV-SNP), KVM does not have direct
>>> access to guest registers. However, we deflect modifications to CR0,
>> Please avoid pronouns in changelogs and comments.
>>
>>> CR4 and EFER to the host. We also carry the apic_base register and learn
>>> about CR8 directly from a VMCB field.
>>>
>>> That means these bits of information do exist in the host's KVM data
>>> structures. If we ever want to resume consumption of an already
>>> initialized VMSA (guest state), we will need to also restore these
>>> additional bits of KVM state.
>> For some definitions of "need".  I've looked at this code multiple times in the
>> past, and even posted patches[1], but I'm still unconvinced that trapping
>> CR0, CR4, and EFER updates is necessary[2], which is partly why series to fix
>> this stalled out.
>>
>>   : If KVM chugs along happily without these patches, I'd love to pivot and yank out
>>   : all of the CR0/4/8 and EFER trapping/tracking, and then make KVM_GET_SREGS a nop
>>   : as well.
>>
>> [1] https://lore.kernel.org/all/20210507165947.2502412-1-seanjc@google.com
>> [2] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@google.com
> Yea we are using similar patches to do intra-host migration for SNP VMs.
>
> I have dropped the ball on my AI from that thread. Let me look/test this patch.


Awesome, thanks. If we can get away without any of the above states and 
make sregs completely useless for protected state, I'd be even happier :)


Alex





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-03-21 23:46 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-20 22:51 [PATCH] KVM: x86: Allow restore of some sregs with protected state Alexander Graf
2023-03-21 17:03 ` Sean Christopherson
2023-03-21 18:15   ` Peter Gonda
2023-03-21 23:46     ` Alexander Graf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).