linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
@ 2021-06-18 21:46 Sean Christopherson
  2021-06-21 16:39 ` Paolo Bonzini
  2021-07-22  7:59 ` Maxim Levitsky
  0 siblings, 2 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-06-18 21:46 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

Calculate the max VMCS index for vmcs12 by walking the array to find the
actual max index.  Hardcoding the index is prone to bitrot, and the
calculation is only done on KVM bringup (albeit on every CPU, but there
aren't _that_ many null entries in the array).

Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---

Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
as QEMU also hardcodes and overwrites the MSR.  The test passes if I
hack KVM to ignore userspace (it was easier than rebuilding QEMU).

 arch/x86/kvm/vmx/nested.c | 37 +++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/vmx/vmcs.h   |  8 ++++++++
 arch/x86/kvm/vmx/vmcs12.h |  6 ------
 3 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b531e08a095b..183fd9d62fc5 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6374,6 +6374,40 @@ void nested_vmx_set_vmcs_shadowing_bitmap(void)
 	}
 }
 
+/*
+ * Indexing into the vmcs12 uses the VMCS encoding rotated left by 6.  Undo
+ * that madness to get the encoding for comparison.
+ */
+#define VMCS12_IDX_TO_ENC(idx) ((u16)(((u16)(idx) >> 6) | ((u16)(idx) << 10)))
+
+static u64 nested_vmx_calc_vmcs_enum_msr(void)
+{
+	/*
+	 * Note these are the so called "index" of the VMCS field encoding, not
+	 * the index into vmcs12.
+	 */
+	unsigned int max_idx, idx;
+	int i;
+
+	/*
+	 * For better or worse, KVM allows VMREAD/VMWRITE to all fields in
+	 * vmcs12, regardless of whether or not the associated feature is
+	 * exposed to L1.  Simply find the field with the highest index.
+	 */
+	max_idx = 0;
+	for (i = 0; i < nr_vmcs12_fields; i++) {
+		/* The vmcs12 table is very, very sparsely populated. */
+		if (!vmcs_field_to_offset_table[i])
+			continue;
+
+		idx = vmcs_field_index(VMCS12_IDX_TO_ENC(i));
+		if (idx > max_idx)
+			max_idx = idx;
+	}
+
+	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
+}
+
 /*
  * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
  * returned for the various VMX controls MSRs when nested VMX is enabled.
@@ -6619,8 +6653,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
 	rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
 	rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
 
-	/* highest index: VMX_PREEMPTION_TIMER_VALUE */
-	msrs->vmcs_enum = VMCS12_MAX_FIELD_INDEX << 1;
+	msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
 }
 
 void nested_vmx_hardware_unsetup(void)
diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
index 1472c6c376f7..de3b04d4b587 100644
--- a/arch/x86/kvm/vmx/vmcs.h
+++ b/arch/x86/kvm/vmx/vmcs.h
@@ -164,4 +164,12 @@ static inline int vmcs_field_readonly(unsigned long field)
 	return (((field >> 10) & 0x3) == 1);
 }
 
+#define VMCS_FIELD_INDEX_SHIFT		(1)
+#define VMCS_FIELD_INDEX_MASK		GENMASK(9, 1)
+
+static inline unsigned int vmcs_field_index(unsigned long field)
+{
+	return (field & VMCS_FIELD_INDEX_MASK) >> VMCS_FIELD_INDEX_SHIFT;
+}
+
 #endif /* __KVM_X86_VMX_VMCS_H */
diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
index bb81a23afe89..5e0e1b39f495 100644
--- a/arch/x86/kvm/vmx/vmcs12.h
+++ b/arch/x86/kvm/vmx/vmcs12.h
@@ -205,12 +205,6 @@ struct __packed vmcs12 {
  */
 #define VMCS12_SIZE		KVM_STATE_NESTED_VMX_VMCS_SIZE
 
-/*
- * VMCS12_MAX_FIELD_INDEX is the highest index value used in any
- * supported VMCS12 field encoding.
- */
-#define VMCS12_MAX_FIELD_INDEX 0x17
-
 /*
  * For save/restore compatibility, the vmcs12 field offsets must not change.
  */
-- 
2.32.0.288.g62a8d224e6-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-06-18 21:46 [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12 Sean Christopherson
@ 2021-06-21 16:39 ` Paolo Bonzini
  2021-06-21 17:08   ` Sean Christopherson
  2021-07-22  7:59 ` Maxim Levitsky
  1 sibling, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2021-06-21 16:39 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On 18/06/21 23:46, Sean Christopherson wrote:
> Calculate the max VMCS index for vmcs12 by walking the array to find the
> actual max index.  Hardcoding the index is prone to bitrot, and the
> calculation is only done on KVM bringup (albeit on every CPU, but there
> aren't _that_ many null entries in the array).
> 
> Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
> 
> Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
> as QEMU also hardcodes and overwrites the MSR.  The test passes if I
> hack KVM to ignore userspace (it was easier than rebuilding QEMU).

Queued, thanks.  Without having checked the kvm-unit-tests sources very 
thoroughly, this might be a configuration issue in kvm-unit-tests; in 
theory "-cpu host" (unlike "-cpu host,migratable=no") should not enable 
TSC scaling.

Paolo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-06-21 16:39 ` Paolo Bonzini
@ 2021-06-21 17:08   ` Sean Christopherson
  2021-07-06  3:05     ` Hu, Robert
  2021-07-21 10:02     ` Hu, Robert
  0 siblings, 2 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-06-21 17:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On Mon, Jun 21, 2021, Paolo Bonzini wrote:
> On 18/06/21 23:46, Sean Christopherson wrote:
> > Calculate the max VMCS index for vmcs12 by walking the array to find the
> > actual max index.  Hardcoding the index is prone to bitrot, and the
> > calculation is only done on KVM bringup (albeit on every CPU, but there
> > aren't _that_ many null entries in the array).
> > 
> > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> > 
> > Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
> > as QEMU also hardcodes and overwrites the MSR.  The test passes if I
> > hack KVM to ignore userspace (it was easier than rebuilding QEMU).
> 
> Queued, thanks.  Without having checked the kvm-unit-tests sources very
> thoroughly, this might be a configuration issue in kvm-unit-tests; in theory
> "-cpu host" (unlike "-cpu host,migratable=no") should not enable TSC
> scaling.

As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined fields,
whether or not the field should actually exist for the vCPU model doesn't enter
into the equation.  That's technically wrong as there are a number of fields
that the SDM explicitly states exist iff a certain feature is supported.  To fix
that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
checked against the vCPU model, though updating the MSR would probably fall onto
userspace's shoulders?

And FWIW, this is the QEMU code:

  #define VMCS12_MAX_FIELD_INDEX (0x17)

  static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
  {
      ...

      /*
       * Just to be safe, write these with constant values.  The CRn_FIXED1
       * MSRs are generated by KVM based on the vCPU's CPUID.
       */
      kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
                        CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
      kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
                        CR4_VMXE_MASK);
      kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
                        VMCS12_MAX_FIELD_INDEX << 1);
  }


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-06-21 17:08   ` Sean Christopherson
@ 2021-07-06  3:05     ` Hu, Robert
  2021-07-06  5:42       ` Paolo Bonzini
  2021-07-21 10:02     ` Hu, Robert
  1 sibling, 1 reply; 10+ messages in thread
From: Hu, Robert @ 2021-07-06  3:05 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel



> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Tuesday, June 22, 2021 01:08
> To: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>; Wanpeng Li
> <wanpengli@tencent.com>; Jim Mattson <jmattson@google.com>; Joerg
> Roedel <joro@8bytes.org>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for
> vmcs12
> 
> On Mon, Jun 21, 2021, Paolo Bonzini wrote:
> > On 18/06/21 23:46, Sean Christopherson wrote:
> > > Calculate the max VMCS index for vmcs12 by walking the array to find
> > > the actual max index.  Hardcoding the index is prone to bitrot, and
> > > the calculation is only done on KVM bringup (albeit on every CPU,
> > > but there aren't _that_ many null entries in the array).
> > >
> > > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in
> > > VMCS12")
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >
> > > Note, the vmx test in kvm-unit-tests will still fail using stock
> > > QEMU, as QEMU also hardcodes and overwrites the MSR.  The test
> > > passes if I hack KVM to ignore userspace (it was easier than rebuilding
> QEMU).
> >
> > Queued, thanks.  Without having checked the kvm-unit-tests sources
> > very thoroughly, this might be a configuration issue in
> > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > host,migratable=no") should not enable TSC scaling.
> 
> As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
> fields, whether or not the field should actually exist for the vCPU model doesn't
> enter into the equation.  That's technically wrong as there are a number of
> fields that the SDM explicitly states exist iff a certain feature is supported.  To
> fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
> checked against the vCPU model, though updating the MSR would probably fall
> onto userspace's shoulders?
[Hu, Robert] 
Perhaps more easier and proper to do this in KVM side.
QEMU sets actual feature set down to KVM, and KVM updates IA32_VMX_VMCS_ENUM
MSR accordingly. We don't see a channel that QEMU constructs a VMCS and sets a whole
to KVM.

> 
> And FWIW, this is the QEMU code:
> 
>   #define VMCS12_MAX_FIELD_INDEX (0x17)
> 
>   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
>   {
>       ...
> 
>       /*
>        * Just to be safe, write these with constant values.  The CRn_FIXED1
>        * MSRs are generated by KVM based on the vCPU's CPUID.
>        */
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
>                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
>                         CR4_VMXE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
>                         VMCS12_MAX_FIELD_INDEX << 1);
>   }


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-07-06  3:05     ` Hu, Robert
@ 2021-07-06  5:42       ` Paolo Bonzini
  0 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2021-07-06  5:42 UTC (permalink / raw)
  To: Hu, Robert, Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On 06/07/21 05:05, Hu, Robert wrote:
>> As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
>> fields, whether or not the field should actually exist for the vCPU model doesn't
>> enter into the equation.  That's technically wrong as there are a number of
>> fields that the SDM explicitly states exist iff a certain feature is supported.  To
>> fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
>> checked against the vCPU model, though updating the MSR would probably fall
>> onto userspace's shoulders?
> 
> [Hu, Robert]
> Perhaps more easier and proper to do this in KVM side.
> QEMU sets actual feature set down to KVM, and KVM updates IA32_VMX_VMCS_ENUM
> MSR accordingly. We don't see a channel that QEMU constructs a VMCS and sets a whole
> to KVM.

Yes, it's possible to do that too.  If that is included in Linux 5.14, 
we can remove it from QEMU.

Paolo


^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-06-21 17:08   ` Sean Christopherson
  2021-07-06  3:05     ` Hu, Robert
@ 2021-07-21 10:02     ` Hu, Robert
  2021-07-21 16:18       ` Sean Christopherson
  1 sibling, 1 reply; 10+ messages in thread
From: Hu, Robert @ 2021-07-21 10:02 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Robert Hoo

> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Tuesday, June 22, 2021 01:08
> To: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Vitaly Kuznetsov <vkuznets@redhat.com>; Wanpeng Li
> <wanpengli@tencent.com>; Jim Mattson <jmattson@google.com>; Joerg
> Roedel <joro@8bytes.org>; kvm@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for
> vmcs12
> 
> On Mon, Jun 21, 2021, Paolo Bonzini wrote:
> > On 18/06/21 23:46, Sean Christopherson wrote:
> > > Calculate the max VMCS index for vmcs12 by walking the array to find
> > > the actual max index.  Hardcoding the index is prone to bitrot, and
> > > the calculation is only done on KVM bringup (albeit on every CPU,
> > > but there aren't _that_ many null entries in the array).
> > >
> > > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in
> > > VMCS12")
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> > >
> > > Note, the vmx test in kvm-unit-tests will still fail using stock
> > > QEMU, as QEMU also hardcodes and overwrites the MSR.  The test
> > > passes if I hack KVM to ignore userspace (it was easier than rebuilding
> QEMU).
> >
> > Queued, thanks.  Without having checked the kvm-unit-tests sources
> > very thoroughly, this might be a configuration issue in
> > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > host,migratable=no") should not enable TSC scaling.
> 
> As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
> fields, whether or not the field should actually exist for the vCPU model doesn't
> enter into the equation.  That's technically wrong as there are a number of
> fields that the SDM explicitly states exist iff a certain feature is supported.  
[Hu, Robert] 
It's right that a number of fields' existence depends on some certain feature.
Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest index
value used in the encoding of any field *supported* by the processor", rather than
*existed*.
So my understanding is no matter what VMCS exec control field's value is set, value
of IA32_VMX_VMCS_ENUM shall not be affected, as it reports the physical CPU's capability
rather than runtime VMCS configuration.
Back to nested case, L1's VMCS configuration lays the "physical" capability for L2, right?
nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put to vcpu
scope, rather than current kvm global.
Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage, before vcpu features
are settled. I think should be moved to later stage as well.
 
> To fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table that is
> checked against the vCPU model, though updating the MSR would probably fall
> onto userspace's shoulders?
> 
> And FWIW, this is the QEMU code:
> 
>   #define VMCS12_MAX_FIELD_INDEX (0x17)
> 
>   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
>   {
>       ...
> 
>       /*
>        * Just to be safe, write these with constant values.  The CRn_FIXED1
>        * MSRs are generated by KVM based on the vCPU's CPUID.
>        */
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
>                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
>                         CR4_VMXE_MASK);
>       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
>                         VMCS12_MAX_FIELD_INDEX << 1);
>   }


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-07-21 10:02     ` Hu, Robert
@ 2021-07-21 16:18       ` Sean Christopherson
  2021-07-22  2:43         ` Robert Hoo
  0 siblings, 1 reply; 10+ messages in thread
From: Sean Christopherson @ 2021-07-21 16:18 UTC (permalink / raw)
  To: Hu, Robert
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Robert Hoo

On Wed, Jul 21, 2021, Hu, Robert wrote:
> > > Queued, thanks.  Without having checked the kvm-unit-tests sources
> > > very thoroughly, this might be a configuration issue in
> > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > > host,migratable=no") should not enable TSC scaling.
> > 
> > As noted in the code comments, KVM allows VMREAD/VMWRITE to all defined
> > fields, whether or not the field should actually exist for the vCPU model doesn't
> > enter into the equation.  That's technically wrong as there are a number of
> > fields that the SDM explicitly states exist iff a certain feature is supported.  
>
> It's right that a number of fields' existence depends on some certain feature.
> Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest index
> value used in the encoding of any field *supported* by the processor", rather than
> *existed*.

Yes.

> So my understanding is no matter what VMCS exec control field's value is set,
> value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports the physical
> CPU's capability rather than runtime VMCS configuration.

Yes.

> Back to nested case, L1's VMCS configuration lays the "physical" capability
> for L2, right?

Yes. 

> nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put to vcpu
> scope, rather than current kvm global.
>
> Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage, before
> vcpu features are settled. I think should be moved to later stage as well.

Just moving the helper (or adding another call) wouldn't fix the underlying
problem that KVM doesn't correctly model the interplay between VMX features and
VMCS fields.  It's arguably less wrong than letting userspace stuff an incorrect
value, but it's not 100% correct and ignoring/overriding userspace is sketchy at
best.  As suggested below, the full fix is to fail VMREAD/VMWRITE to fields that
shouldn't exist given the vCPU model.

> > To fix that we'd need to add a "feature flag" to vmcs_field_to_offset_table
> > that is checked against the vCPU model, though updating the MSR would
> > probably fall onto userspace's shoulders?
> > 
> > And FWIW, this is the QEMU code:
> > 
> >   #define VMCS12_MAX_FIELD_INDEX (0x17)
> > 
> >   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray f)
> >   {
> >       ...
> > 
> >       /*
> >        * Just to be safe, write these with constant values.  The CRn_FIXED1
> >        * MSRs are generated by KVM based on the vCPU's CPUID.
> >        */
> >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
> >                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
> >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
> >                         CR4_VMXE_MASK);
> >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
> >                         VMCS12_MAX_FIELD_INDEX << 1);
> >   }
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-07-21 16:18       ` Sean Christopherson
@ 2021-07-22  2:43         ` Robert Hoo
  0 siblings, 0 replies; 10+ messages in thread
From: Robert Hoo @ 2021-07-22  2:43 UTC (permalink / raw)
  To: Sean Christopherson, Hu, Robert
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On Wed, 2021-07-21 at 16:18 +0000, Sean Christopherson wrote:
> On Wed, Jul 21, 2021, Hu, Robert wrote:
> > > > Queued, thanks.  Without having checked the kvm-unit-tests
> > > > sources
> > > > very thoroughly, this might be a configuration issue in
> > > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > > > host,migratable=no") should not enable TSC scaling.
> > > 
> > > As noted in the code comments, KVM allows VMREAD/VMWRITE to all
> > > defined
> > > fields, whether or not the field should actually exist for the
> > > vCPU model doesn't
> > > enter into the equation.  That's technically wrong as there are a
> > > number of
> > > fields that the SDM explicitly states exist iff a certain feature
> > > is supported.  
> > 
> > It's right that a number of fields' existence depends on some
> > certain feature.
> > Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest
> > index
> > value used in the encoding of any field *supported* by the
> > processor", rather than
> > *existed*.
> 
> Yes.
> 
> > So my understanding is no matter what VMCS exec control field's
> > value is set,
> > value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports
> > the physical
> > CPU's capability rather than runtime VMCS configuration.
> 
> Yes.
> 
> > Back to nested case, L1's VMCS configuration lays the "physical"
> > capability
> > for L2, right?
> 
> Yes. 
> 
> > nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put
> > to vcpu
> > scope, rather than current kvm global.
> > 
> > Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage,
> > before
> > vcpu features are settled. I think should be moved to later stage
> > as well.
> 
> Just moving the helper (or adding another call) wouldn't fix the
> underlying
> problem that KVM doesn't correctly model the interplay between VMX
> features and
> VMCS fields.  It's arguably less wrong than letting userspace stuff
> an incorrect
> value, but it's not 100% correct and ignoring/overriding userspace is
> sketchy at
> best.  
I think IA32_VMX_VMCS_ENUM MSR shall not be set by QEMU, it is actually
already indirectly set by user space CPU feature set.

> As suggested below, the full fix is to fail VMREAD/VMWRITE to fields
> that
> shouldn't exist given the vCPU model.
> 
> > > To fix that we'd need to add a "feature flag" to
> > > vmcs_field_to_offset_table
> > > that is checked against the vCPU model, though updating the MSR
> > > would
> > > probably fall onto userspace's shoulders?
> > > 
> > > And FWIW, this is the QEMU code:
> > > 
> > >   #define VMCS12_MAX_FIELD_INDEX (0x17)
> > > 
> > >   static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray
> > > f)
> > >   {
> > >       ...
> > > 
> > >       /*
> > >        * Just to be safe, write these with constant values.  The
> > > CRn_FIXED1
> > >        * MSRs are generated by KVM based on the vCPU's CPUID.
> > >        */
> > >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
> > >                         CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
> > >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
> > >                         CR4_VMXE_MASK);
> > >       kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
> > >                         VMCS12_MAX_FIELD_INDEX << 1);
> > >   }


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-06-18 21:46 [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12 Sean Christopherson
  2021-06-21 16:39 ` Paolo Bonzini
@ 2021-07-22  7:59 ` Maxim Levitsky
  2021-07-22 15:04   ` Sean Christopherson
  1 sibling, 1 reply; 10+ messages in thread
From: Maxim Levitsky @ 2021-07-22  7:59 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel

On Fri, 2021-06-18 at 14:46 -0700, Sean Christopherson wrote:
> Calculate the max VMCS index for vmcs12 by walking the array to find the
> actual max index.  Hardcoding the index is prone to bitrot, and the
> calculation is only done on KVM bringup (albeit on every CPU, but there
> aren't _that_ many null entries in the array).
> 
> Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Could you give me an example on how this fails in the KVM unit tests?
I have a bug report here and it might be related so I want to save some
time triaging it.

Best regards,
	Maxim Levitsky

> ---
> 
> Note, the vmx test in kvm-unit-tests will still fail using stock QEMU,
> as QEMU also hardcodes and overwrites the MSR.  The test passes if I
> hack KVM to ignore userspace (it was easier than rebuilding QEMU).
> 
>  arch/x86/kvm/vmx/nested.c | 37 +++++++++++++++++++++++++++++++++++--
>  arch/x86/kvm/vmx/vmcs.h   |  8 ++++++++
>  arch/x86/kvm/vmx/vmcs12.h |  6 ------
>  3 files changed, 43 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index b531e08a095b..183fd9d62fc5 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6374,6 +6374,40 @@ void nested_vmx_set_vmcs_shadowing_bitmap(void)
>  	}
>  }
>  
> +/*
> + * Indexing into the vmcs12 uses the VMCS encoding rotated left by 6.  Undo
> + * that madness to get the encoding for comparison.
> + */
> +#define VMCS12_IDX_TO_ENC(idx) ((u16)(((u16)(idx) >> 6) | ((u16)(idx) << 10)))
> +
> +static u64 nested_vmx_calc_vmcs_enum_msr(void)
> +{
> +	/*
> +	 * Note these are the so called "index" of the VMCS field encoding, not
> +	 * the index into vmcs12.
> +	 */
> +	unsigned int max_idx, idx;
> +	int i;
> +
> +	/*
> +	 * For better or worse, KVM allows VMREAD/VMWRITE to all fields in
> +	 * vmcs12, regardless of whether or not the associated feature is
> +	 * exposed to L1.  Simply find the field with the highest index.
> +	 */
> +	max_idx = 0;
> +	for (i = 0; i < nr_vmcs12_fields; i++) {
> +		/* The vmcs12 table is very, very sparsely populated. */
> +		if (!vmcs_field_to_offset_table[i])
> +			continue;
> +
> +		idx = vmcs_field_index(VMCS12_IDX_TO_ENC(i));
> +		if (idx > max_idx)
> +			max_idx = idx;
> +	}
> +
> +	return (u64)max_idx << VMCS_FIELD_INDEX_SHIFT;
> +}
> +
>  /*
>   * nested_vmx_setup_ctls_msrs() sets up variables containing the values to be
>   * returned for the various VMX controls MSRs when nested VMX is enabled.
> @@ -6619,8 +6653,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>  	rdmsrl(MSR_IA32_VMX_CR0_FIXED1, msrs->cr0_fixed1);
>  	rdmsrl(MSR_IA32_VMX_CR4_FIXED1, msrs->cr4_fixed1);
>  
> -	/* highest index: VMX_PREEMPTION_TIMER_VALUE */
> -	msrs->vmcs_enum = VMCS12_MAX_FIELD_INDEX << 1;
> +	msrs->vmcs_enum = nested_vmx_calc_vmcs_enum_msr();
>  }
>  
>  void nested_vmx_hardware_unsetup(void)
> diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h
> index 1472c6c376f7..de3b04d4b587 100644
> --- a/arch/x86/kvm/vmx/vmcs.h
> +++ b/arch/x86/kvm/vmx/vmcs.h
> @@ -164,4 +164,12 @@ static inline int vmcs_field_readonly(unsigned long field)
>  	return (((field >> 10) & 0x3) == 1);
>  }
>  
> +#define VMCS_FIELD_INDEX_SHIFT		(1)
> +#define VMCS_FIELD_INDEX_MASK		GENMASK(9, 1)
> +
> +static inline unsigned int vmcs_field_index(unsigned long field)
> +{
> +	return (field & VMCS_FIELD_INDEX_MASK) >> VMCS_FIELD_INDEX_SHIFT;
> +}
> +
>  #endif /* __KVM_X86_VMX_VMCS_H */
> diff --git a/arch/x86/kvm/vmx/vmcs12.h b/arch/x86/kvm/vmx/vmcs12.h
> index bb81a23afe89..5e0e1b39f495 100644
> --- a/arch/x86/kvm/vmx/vmcs12.h
> +++ b/arch/x86/kvm/vmx/vmcs12.h
> @@ -205,12 +205,6 @@ struct __packed vmcs12 {
>   */
>  #define VMCS12_SIZE		KVM_STATE_NESTED_VMX_VMCS_SIZE
>  
> -/*
> - * VMCS12_MAX_FIELD_INDEX is the highest index value used in any
> - * supported VMCS12 field encoding.
> - */
> -#define VMCS12_MAX_FIELD_INDEX 0x17
> -
>  /*
>   * For save/restore compatibility, the vmcs12 field offsets must not change.
>   */



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12
  2021-07-22  7:59 ` Maxim Levitsky
@ 2021-07-22 15:04   ` Sean Christopherson
  0 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-07-22 15:04 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel

On Thu, Jul 22, 2021, Maxim Levitsky wrote:
> On Fri, 2021-06-18 at 14:46 -0700, Sean Christopherson wrote:
> > Calculate the max VMCS index for vmcs12 by walking the array to find the
> > actual max index.  Hardcoding the index is prone to bitrot, and the
> > calculation is only done on KVM bringup (albeit on every CPU, but there
> > aren't _that_ many null entries in the array).
> > 
> > Fixes: 3c0f99366e34 ("KVM: nVMX: Add a TSC multiplier field in VMCS12")
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> 
> Could you give me an example on how this fails in the KVM unit tests?
> I have a bug report here and it might be related so I want to save some
> time triaging it.

FAIL: VMX_VMCS_ENUM.MAX_INDEX expected: 19, actual: 17

FWIW, unless a kernel/hypervisor is sanity checking VMREAD/VMWRITE or doing something
clever with the MAX_INDEX, I wouldn't expect this to cause any real world failures.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-07-22 15:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-18 21:46 [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12 Sean Christopherson
2021-06-21 16:39 ` Paolo Bonzini
2021-06-21 17:08   ` Sean Christopherson
2021-07-06  3:05     ` Hu, Robert
2021-07-06  5:42       ` Paolo Bonzini
2021-07-21 10:02     ` Hu, Robert
2021-07-21 16:18       ` Sean Christopherson
2021-07-22  2:43         ` Robert Hoo
2021-07-22  7:59 ` Maxim Levitsky
2021-07-22 15:04   ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).