kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled
@ 2022-10-24 12:48 Emanuele Giuseppe Esposito
  2022-10-24 16:52 ` Sean Christopherson
  0 siblings, 1 reply; 6+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-10-24 12:48 UTC (permalink / raw)
  To: kvm
  Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Bandan Das,
	linux-kernel, Emanuele Giuseppe Esposito

Currently vmx enables SECONDARY_EXEC_ENCLS_EXITING even when sgx
is not set in the host MSR.
This was probably introduced when sgx was not yet fully supported, and
we wanted to trap guests trying to use the feature.

When booting a guest, KVM checks that the cpuid bit is actually set
in vmx.c, and if not, it does not enable the feature.

However, in nesting this control bit is blindly copied, and will be
propagated to VMCS12 and VMCS02. Therefore, when L1 tries to boot
the guest, the host will try to execute VMLOAD with VMCS02 containing
a feature that the hardware does not support, making it fail with
hardware error 0x7.

According with section A.3.3 of Intel System Programming Guide,
we should *always* check the value in the actual
MSR_IA32_VMX_PROCBASED_CTLS2 before enabling this bit.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 8f67a9c4a287..f651084010cc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6678,6 +6678,25 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
 	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
 }
 
+/*
+ * According with section A.3.3 of Intel System Programming Guide,
+ * we *can* set the guest MSR control X (in our case
+ * SECONDARY_EXEC_ENCLS_EXITING) *iff* bit 32+X of
+ * MSR_IA32_VMX_PROCBASED_CTLS2 is set to 1.
+ * Otherwise it must remain zero.
+ */
+static void nested_vmx_setup_encls_exiting(struct nested_vmx_msrs *msrs)
+{
+	u32 vmx_msr_procb_low, vmx_msr_procb_high;
+
+	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2, vmx_msr_procb_low, vmx_msr_procb_high);
+
+	WARN_ON(vmx_msr_procb_low & SECONDARY_EXEC_ENCLS_EXITING);
+
+	if (enable_sgx && (vmx_msr_procb_high & SECONDARY_EXEC_ENCLS_EXITING))
+		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
+}
+
 /*
  * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
  * returned for the various VMX controls MSRs when nested VMX is enabled.
@@ -6874,8 +6893,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
 		msrs->secondary_ctls_high |=
 			SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
 
-	if (enable_sgx)
-		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
+	nested_vmx_setup_encls_exiting(msrs);
 
 	/* miscellaneous data */
 	msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled
  2022-10-24 12:48 [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled Emanuele Giuseppe Esposito
@ 2022-10-24 16:52 ` Sean Christopherson
  2022-10-25 12:36   ` Emanuele Giuseppe Esposito
  0 siblings, 1 reply; 6+ messages in thread
From: Sean Christopherson @ 2022-10-24 16:52 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito
  Cc: kvm, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Bandan Das,
	linux-kernel

"nVMX" instead of "vmx/nested"

On Mon, Oct 24, 2022, Emanuele Giuseppe Esposito wrote:
> Currently vmx enables SECONDARY_EXEC_ENCLS_EXITING even when sgx
> is not set in the host MSR.
> This was probably introduced when sgx was not yet fully supported, and
> we wanted to trap guests trying to use the feature.

Nah, it's just a boneheaded bug.

> When booting a guest, KVM checks that the cpuid bit is actually set
> in vmx.c, and if not, it does not enable the feature.

The CPUID thing is a red herring.  That's an _additional_ restriction, KVM honors
MSR_IA32_VMX_PROCBASED_CTLS2 when configuring vmcs01.  See adjust_vmx_controls()
for secondary controls in setup_vmcs_config().

> However, in nesting this control bit is blindly copied, and will be

It's not "copied", KVM sets the bit in the nVMX MSR irrespective of host support,
which is the problem.

> propagated to VMCS12 and VMCS02. Therefore, when L1 tries to boot
> the guest, the host will try to execute VMLOAD with VMCS02 containing
> a feature that the hardware does not support, making it fail with
> hardware error 0x7.
> 
> According with section A.3.3 of Intel System Programming Guide,
> we should *always* check the value in the actual

s/we/software

> MSR_IA32_VMX_PROCBASED_CTLS2 before enabling this bit.
> 
> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
> 

Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable@vger.kernel.org

> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 8f67a9c4a287..f651084010cc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6678,6 +6678,25 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
>  	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
>  }
>  
> +/*
> + * According with section A.3.3 of

Avoid referencing sections/tables by "number" in comments (and changelogs), as the
comment is all but guaranteed to become stale because future versions of the SDM
will shift things around.

The slightly more robust way to reference a specific SDM/APM section is to use the
title, e.g. According to section "Secondary Processor-Based VM-Execution Controls"
in Intel's SDM ...", as hardware vendors are less likely to arbitrarily rename
sections and tables.  It's a bit more work for readers, but any decent PDF viewer
can search these days.

> Intel System Programming Guide

KVM typically uses "Intel's SDM" (and "AMD's APM").  Like "VMX" or "SVM", it's ok
to use the SDM acronym without introducing since "SDM" is 

> + * we *can* set the guest MSR control X (in our case

Avoid pronouns in comments.  "we" and "us" are ambiguous, e.g. "we" can mean KVM,
the developer, the user, etc...

> + * SECONDARY_EXEC_ENCLS_EXITING) *iff* bit 32+X of
> + * MSR_IA32_VMX_PROCBASED_CTLS2 is set to 1.
> + * Otherwise it must remain zero.

As a general rule, if you find yourself writing a comment and a helper for
something that KVM absolutely needs to get right (honoring VMX MSRs), then odds
are very good that there's a simpler/easier fix, i.e. that you're effectively
re-inventing part of the weel.

> + */
> +static void nested_vmx_setup_encls_exiting(struct nested_vmx_msrs *msrs)
> +{
> +	u32 vmx_msr_procb_low, vmx_msr_procb_high;
> +
> +	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2, vmx_msr_procb_low, vmx_msr_procb_high);
> +
> +	WARN_ON(vmx_msr_procb_low & SECONDARY_EXEC_ENCLS_EXITING);
> +
> +	if (enable_sgx && (vmx_msr_procb_high & SECONDARY_EXEC_ENCLS_EXITING))
> +		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
> +}
> +
>  /*
>   * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
>   * returned for the various VMX controls MSRs when nested VMX is enabled.
> @@ -6874,8 +6893,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
>  		msrs->secondary_ctls_high |=
>  			SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
>  
> -	if (enable_sgx)
> -		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;

The issue here is that I, who wrote this code long, long ago, copied the pattern
used for enable_unrestricted_guest, flexpriority_enabled, etc... without fully
appreciating the logic.  Unlike those module params, enable_sgx doesn't track
hardware support, i.e. enable_sgx isn't cleared when SGX can't be enabled due to
lack of hardware support.  As a result, KVM effectively enumerates to L1 that the
control is always available, i.e. that KVM emulates ENCLS-exiting for L1, but KVM
obviously doesn't actually emulating the behavior.

Not updating enable_sgx is responsible for a second bug: vmx_set_cpu_caps() doesn't
clear the SGX bits when hardware support is unavailable.  This is a much less
problematic bug as as it only pops up if SGX is soft-disabled (the case being
handled by cpu_has_sgx()) or if SGX is supported for bare metal but not in the
VMCS (will never happen when running on bare metal, but can theoertically happen
when running in a VM).

Last but not least, KVM should ideally have module params reflect KVM's actual
configuration.

Killing all birds with one stone, simply clear enable_sgx when ENCLS-exiting isn't
supported.  The #ifdef is a little gross, but I think it's marginally less ugly
than having vmx.c define a dummy boolean.

Compile tested only...

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9dba04b6b019..65f092e4a81b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8263,6 +8263,11 @@ static __init int hardware_setup(void)
        if (!cpu_has_virtual_nmis())
                enable_vnmi = 0;
 
+#ifdef CONFIG_X86_SGX_KVM
+       if (!cpu_has_vmx_encls_vmexit())
+               enable_sgx = false;
+#endif
+
        /*
         * set_apic_access_page_addr() is used to reload apic access
         * page upon invalidation.  No need to do anything if not

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled
  2022-10-24 16:52 ` Sean Christopherson
@ 2022-10-25 12:36   ` Emanuele Giuseppe Esposito
  0 siblings, 0 replies; 6+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-10-25 12:36 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Bandan Das,
	linux-kernel



Am 24/10/2022 um 18:52 schrieb Sean Christopherson:
> "nVMX" instead of "vmx/nested"
> 
> On Mon, Oct 24, 2022, Emanuele Giuseppe Esposito wrote:
>> Currently vmx enables SECONDARY_EXEC_ENCLS_EXITING even when sgx
>> is not set in the host MSR.
>> This was probably introduced when sgx was not yet fully supported, and
>> we wanted to trap guests trying to use the feature.
> 
> Nah, it's just a boneheaded bug.
> 
>> When booting a guest, KVM checks that the cpuid bit is actually set
>> in vmx.c, and if not, it does not enable the feature.
> 
> The CPUID thing is a red herring.  That's an _additional_ restriction, KVM honors
> MSR_IA32_VMX_PROCBASED_CTLS2 when configuring vmcs01.  See adjust_vmx_controls()
> for secondary controls in setup_vmcs_config().
> 
>> However, in nesting this control bit is blindly copied, and will be
> 
> It's not "copied", KVM sets the bit in the nVMX MSR irrespective of host support,
> which is the problem.
> 
>> propagated to VMCS12 and VMCS02. Therefore, when L1 tries to boot
>> the guest, the host will try to execute VMLOAD with VMCS02 containing
>> a feature that the hardware does not support, making it fail with
>> hardware error 0x7.
>>
>> According with section A.3.3 of Intel System Programming Guide,
>> we should *always* check the value in the actual
> 
> s/we/software
> 
>> MSR_IA32_VMX_PROCBASED_CTLS2 before enabling this bit.
>>
>> RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
>>
> 
> Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
> Cc: stable@vger.kernel.org
> 
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>  arch/x86/kvm/vmx/nested.c | 22 ++++++++++++++++++++--
>>  1 file changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index 8f67a9c4a287..f651084010cc 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -6678,6 +6678,25 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
>>  	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
>>  }
>>  
>> +/*
>> + * According with section A.3.3 of
> 
> Avoid referencing sections/tables by "number" in comments (and changelogs), as the
> comment is all but guaranteed to become stale because future versions of the SDM
> will shift things around.
> 
> The slightly more robust way to reference a specific SDM/APM section is to use the
> title, e.g. According to section "Secondary Processor-Based VM-Execution Controls"
> in Intel's SDM ...", as hardware vendors are less likely to arbitrarily rename
> sections and tables.  It's a bit more work for readers, but any decent PDF viewer
> can search these days.
> 
>> Intel System Programming Guide
> 
> KVM typically uses "Intel's SDM" (and "AMD's APM").  Like "VMX" or "SVM", it's ok
> to use the SDM acronym without introducing since "SDM" is 
> 
>> + * we *can* set the guest MSR control X (in our case
> 
> Avoid pronouns in comments.  "we" and "us" are ambiguous, e.g. "we" can mean KVM,
> the developer, the user, etc...
> 
>> + * SECONDARY_EXEC_ENCLS_EXITING) *iff* bit 32+X of
>> + * MSR_IA32_VMX_PROCBASED_CTLS2 is set to 1.
>> + * Otherwise it must remain zero.
> 
> As a general rule, if you find yourself writing a comment and a helper for
> something that KVM absolutely needs to get right (honoring VMX MSRs), then odds
> are very good that there's a simpler/easier fix, i.e. that you're effectively
> re-inventing part of the weel.
> 
>> + */
>> +static void nested_vmx_setup_encls_exiting(struct nested_vmx_msrs *msrs)
>> +{
>> +	u32 vmx_msr_procb_low, vmx_msr_procb_high;
>> +
>> +	rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2, vmx_msr_procb_low, vmx_msr_procb_high);
>> +
>> +	WARN_ON(vmx_msr_procb_low & SECONDARY_EXEC_ENCLS_EXITING);
>> +
>> +	if (enable_sgx && (vmx_msr_procb_high & SECONDARY_EXEC_ENCLS_EXITING))
>> +		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
>> +}
>> +
>>  /*
>>   * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
>>   * returned for the various VMX controls MSRs when nested VMX is enabled.
>> @@ -6874,8 +6893,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
>>  		msrs->secondary_ctls_high |=
>>  			SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
>>  
>> -	if (enable_sgx)
>> -		msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;
> 
> The issue here is that I, who wrote this code long, long ago, copied the pattern
> used for enable_unrestricted_guest, flexpriority_enabled, etc... without fully
> appreciating the logic.  Unlike those module params, enable_sgx doesn't track
> hardware support, i.e. enable_sgx isn't cleared when SGX can't be enabled due to
> lack of hardware support.  As a result, KVM effectively enumerates to L1 that the
> control is always available, i.e. that KVM emulates ENCLS-exiting for L1, but KVM
> obviously doesn't actually emulating the behavior.
> 
> Not updating enable_sgx is responsible for a second bug: vmx_set_cpu_caps() doesn't
> clear the SGX bits when hardware support is unavailable.  This is a much less
> problematic bug as as it only pops up if SGX is soft-disabled (the case being
> handled by cpu_has_sgx()) or if SGX is supported for bare metal but not in the
> VMCS (will never happen when running on bare metal, but can theoertically happen
> when running in a VM).
> 
> Last but not least, KVM should ideally have module params reflect KVM's actual
> configuration.
> 
> Killing all birds with one stone, simply clear enable_sgx when ENCLS-exiting isn't
> supported.  The #ifdef is a little gross, but I think it's marginally less ugly
> than having vmx.c define a dummy boolean.
> 
> Compile tested only...
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9dba04b6b019..65f092e4a81b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -8263,6 +8263,11 @@ static __init int hardware_setup(void)
>         if (!cpu_has_virtual_nmis())
>                 enable_vnmi = 0;
>  
> +#ifdef CONFIG_X86_SGX_KVM
> +       if (!cpu_has_vmx_encls_vmexit())
> +               enable_sgx = false;
> +#endif
> +
>         /*
>          * set_apic_access_page_addr() is used to reload apic access
>          * page upon invalidation.  No need to do anything if not
> 

Thanks for all the suggestions and explanations, I am going to apply
your changes and send v2!

Emanuele


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled
  2022-10-25 17:21 ` Sean Christopherson
@ 2022-10-27 10:29   ` Paolo Bonzini
  0 siblings, 0 replies; 6+ messages in thread
From: Paolo Bonzini @ 2022-10-27 10:29 UTC (permalink / raw)
  To: Sean Christopherson, Emanuele Giuseppe Esposito
  Cc: kvm, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86, H. Peter Anvin, Bandan Das, linux-kernel, stable

On 10/25/22 19:21, Sean Christopherson wrote:
> Shortlog scope is still wrong, should be "KVM: nVMX:"
> 
> The shortlog is also somewhat is misleading/confusing, as it's not at all obvious
> that "sgx enabled" means "KVM's sgx_module param is enabled" and not "SGX is enabled
> in the system".
> 
> E.g.
> 
>    KVM: nVMX: Advertise ENCLS_EXITING to L1 iff SGX is fully supported

Queued with this commit message:

---
KVM: VMX: fully disable SGX if SECONDARY_EXEC_ENCLS_EXITING unavailable

Clear enable_sgx if ENCLS-exiting is not supported, i.e. if SGX cannot be
virtualized.  When KVM is loaded, adjust_vmx_controls checks that the
bit is available before enabling the feature; however, other parts of the
code check enable_sgx and not clearing the variable caused two different
bugs, mostly affecting nested virtualization scenarios.

First, because enable_sgx remained true, SECONDARY_EXEC_ENCLS_EXITING
would be marked available in the capability MSR that are accessed by a
nested hypervisor.  KVM would then propagate the control from vmcs12
to vmcs02 even if it isn't supported by the processor, thus causing an
unexpected VM-Fail (exit code 0x7) in L1.

Second, vmx_set_cpu_caps() would not clear the SGX bits when hardware
support is unavailable.  This is a much less problematic bug as it only
happens if SGX is soft-disabled (available in the processor but hidden
in CPUID) or if SGX is supported for bare metal but not in the VMCS
(will never happen when running on bare metal, but can theoertically
happen when running in a VM).

Last but not least, this ensures that module params in sysfs reflect
KVM's actual configuration.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221025123749.2201649-1-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---

The bug is strictly speaking not in nVMX, although that's where most
of the symptoms surface.

Paolo


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled
  2022-10-25 12:37 Emanuele Giuseppe Esposito
@ 2022-10-25 17:21 ` Sean Christopherson
  2022-10-27 10:29   ` Paolo Bonzini
  0 siblings, 1 reply; 6+ messages in thread
From: Sean Christopherson @ 2022-10-25 17:21 UTC (permalink / raw)
  To: Emanuele Giuseppe Esposito
  Cc: kvm, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Bandan Das,
	linux-kernel, stable

Shortlog scope is still wrong, should be "KVM: nVMX:"

The shortlog is also somewhat is misleading/confusing, as it's not at all obvious
that "sgx enabled" means "KVM's sgx_module param is enabled" and not "SGX is enabled
in the system".

E.g.

  KVM: nVMX: Advertise ENCLS_EXITING to L1 iff SGX is fully supported


On Tue, Oct 25, 2022, Emanuele Giuseppe Esposito wrote:
> Currently vmx

s/vmx/KVM

> enables SECONDARY_EXEC_ENCLS_EXITING even when sgx is not set in the host MSR.

"sgx is not set in the host MSR" is ambiguous.  "sgx ... in the host MSR" could
easily refer to the SGX_ENABLED bit in IA32_FEATURE_CONTROL, it could refer to
the ENCLS_EXITING bit in the allowed-1 half of IA32_VMX_PROCBASED_CTLS2, etc...

In other words, please be more precise.

This statement is also wrong in that it implies that KVM _always_ sets ENCLS_EXITING,
whereas the bug is purely limited to nested virtualization.

E.g.

Clear enable_sgx if ENCLS-exiting is not supported, i.e. if SGX cannot be
virtualized.  This fixes a bug where KVM would advertise ENCLS-exiting to
L1 and propagate the control from vmcs12 to vmcs02 even if ENCLS-exiting
isn't supported in secondary execution controls, e.g. because SGX isn't
fully enabled, and thus induce an unexpected VM-Fail in L1.

> When booting a guest, KVM checks that the cpuid bit is actually set
> in vmx.c, and if not, it does not enable the feature.

Again, this is nothing to do with the failure.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled
@ 2022-10-25 12:37 Emanuele Giuseppe Esposito
  2022-10-25 17:21 ` Sean Christopherson
  0 siblings, 1 reply; 6+ messages in thread
From: Emanuele Giuseppe Esposito @ 2022-10-25 12:37 UTC (permalink / raw)
  To: kvm
  Cc: Sean Christopherson, Paolo Bonzini, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Bandan Das,
	linux-kernel, Emanuele Giuseppe Esposito, stable

Currently vmx enables SECONDARY_EXEC_ENCLS_EXITING even when sgx
is not set in the host MSR.

When booting a guest, KVM checks that the cpuid bit is actually set
in vmx.c, and if not, it does not enable the feature.

However, in nesting this control bit is blindly set, and will be
propagated to VMCS12 and VMCS02. Therefore, when L1 tries to boot
the guest, the host will try to execute VMLOAD with VMCS02 containing
a feature that the hardware does not support, making it fail with
hardware error 0x7.

According to section "Secondary Processor-Based VM-Execution Controls"
in the Intel SDM, software should *always* check the value in the
actual MSR_IA32_VMX_PROCBASED_CTLS2 before enabling this bit.

Not updating enable_sgx is responsible for a second bug:
vmx_set_cpu_caps() doesn't clear the SGX bits when hardware support is
unavailable.  This is a much less problematic bug as it only pops up
if SGX is soft-disabled (the case being handled by cpu_has_sgx()) or if
SGX is supported for bare metal but not in the VMCS (will never happen
when running on bare metal, but can theoertically happen when running in
a VM).

Last but not least, KVM should ideally have module params reflect KVM's
actual configuration.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128

Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable@vger.kernel.org

Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9dba04b6b019..ea0c65d3c08a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8263,6 +8263,11 @@ static __init int hardware_setup(void)
 	if (!cpu_has_virtual_nmis())
 		enable_vnmi = 0;
 
+	#ifdef CONFIG_X86_SGX_KVM
+		if (!cpu_has_vmx_encls_vmexit())
+			enable_sgx = false;
+	#endif
+
 	/*
 	 * set_apic_access_page_addr() is used to reload apic access
 	 * page upon invalidation.  No need to do anything if not
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-10-27 10:30 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-24 12:48 [PATCH] KVM: vmx/nested: avoid blindly setting SECONDARY_EXEC_ENCLS_EXITING when sgx is enabled Emanuele Giuseppe Esposito
2022-10-24 16:52 ` Sean Christopherson
2022-10-25 12:36   ` Emanuele Giuseppe Esposito
2022-10-25 12:37 Emanuele Giuseppe Esposito
2022-10-25 17:21 ` Sean Christopherson
2022-10-27 10:29   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).