LKML Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/2] Fix issue with not starting nesting guests on my system
@ 2020-05-23 16:14 Maxim Levitsky
  2020-05-23 16:14 ` [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities Maxim Levitsky
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-23 16:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, H. Peter Anvin, Tao Xu, Sean Christopherson,
	Jim Mattson, linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu, Maxim Levitsky

On my AMD machine I noticed that I can't start any nested guests,\r
because nested KVM (everything from master git branches) complains\r
that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system doesn't support\r
at all anyway.\r
\r
I traced it to the recently added UMWAIT support to qemu and kvm.\r
The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST without\r
checking that it the underlying feature is supported in CPUID.\r
It happened to work when non nested because as a precation kvm,\r
tries to read each MSR on host before adding it to that list,\r
and when read gets a #GP it ignores it.\r
\r
When running nested, the L1 hypervisor can be set to ignore unknown\r
msr read/writes (I need this for some other guests), thus this safety\r
check doesn't work anymore.\r
\r
V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm capability\r
    * dropped the cosmetic fix patch as it is now fixed in kvm/queue\r
\r
Best regards,\r
	Maxim Levitsky\r
\r
Maxim Levitsky (2):\r
  kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities\r
  kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally\r
\r
 arch/x86/kvm/vmx/vmx.c | 3 +++\r
 arch/x86/kvm/x86.c     | 4 ++++\r
 2 files changed, 7 insertions(+)\r
\r
-- \r
2.26.2\r
\r


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
  2020-05-23 16:14 [PATCH 0/2] Fix issue with not starting nesting guests on my system Maxim Levitsky
@ 2020-05-23 16:14 ` Maxim Levitsky
  2020-05-27  1:20   ` Sean Christopherson
  2020-05-23 16:14 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-23 16:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, H. Peter Anvin, Tao Xu, Sean Christopherson,
	Jim Mattson, linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu, Maxim Levitsky

Even though we might not allow the guest to use
WAITPKG's new instructions, we should tell KVM
that the feature is supported by the host CPU.

Note that vmx_waitpkg_supported checks that WAITPKG
_can_ be set in secondary execution controls as specified
by VMX capability MSR, rather that we actually enable it for a guest.

Fixes: e69e72faa3a0 KVM: x86: Add support for user wait instructions

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/vmx.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 55712dd86bafa..fca493d4517c5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7298,6 +7298,9 @@ static __init void vmx_set_cpu_caps(void)
 	/* CPUID 0x80000001 */
 	if (!cpu_has_vmx_rdtscp())
 		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
+
+	if (vmx_waitpkg_supported())
+		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
 }
 
 static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
-- 
2.26.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-23 16:14 [PATCH 0/2] Fix issue with not starting nesting guests on my system Maxim Levitsky
  2020-05-23 16:14 ` [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities Maxim Levitsky
@ 2020-05-23 16:14 ` Maxim Levitsky
  2020-05-27  1:21   ` Sean Christopherson
  2020-05-27  1:03 ` [PATCH 0/2] Fix issue with not starting nesting guests on my system Krish Sadhukhan
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-23 16:14 UTC (permalink / raw)
  To: kvm
  Cc: Paolo Bonzini, H. Peter Anvin, Tao Xu, Sean Christopherson,
	Jim Mattson, linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu, Maxim Levitsky

This msr is only available when the host supports WAITPKG feature.

This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.

Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/x86.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b226fb8abe41b..4752293312947 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5316,6 +5316,10 @@ static void kvm_init_msr_list(void)
 			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
 				continue;
 			break;
+		case MSR_IA32_UMWAIT_CONTROL:
+			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+				continue;
+			break;
 		default:
 			break;
 		}
-- 
2.26.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/2] Fix issue with not starting nesting guests on my system
  2020-05-23 16:14 [PATCH 0/2] Fix issue with not starting nesting guests on my system Maxim Levitsky
  2020-05-23 16:14 ` [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities Maxim Levitsky
  2020-05-23 16:14 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
@ 2020-05-27  1:03 ` Krish Sadhukhan
  2020-05-27  1:22   ` Sean Christopherson
  2020-05-27  1:13 ` Sean Christopherson
  2020-05-27 17:00 ` Paolo Bonzini
  4 siblings, 1 reply; 27+ messages in thread
From: Krish Sadhukhan @ 2020-05-27  1:03 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: Paolo Bonzini, H. Peter Anvin, Tao Xu, Sean Christopherson,
	Jim Mattson, linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu


On 5/23/20 9:14 AM, Maxim Levitsky wrote:
> On my AMD machine I noticed that I can't start any nested guests,
> because nested KVM (everything from master git branches) complains
> that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system doesn't support
> at all anyway.
>
> I traced it to the recently added UMWAIT support to qemu and kvm.
> The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST without
> checking that it the underlying feature is supported in CPUID.
> It happened to work when non nested because as a precation kvm,
> tries to read each MSR on host before adding it to that list,
> and when read gets a #GP it ignores it.
>
> When running nested, the L1 hypervisor can be set to ignore unknown
> msr read/writes (I need this for some other guests), thus this safety
> check doesn't work anymore.
>
> V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm capability
>      * dropped the cosmetic fix patch as it is now fixed in kvm/queue
>
> Best regards,
> 	Maxim Levitsky
>
> Maxim Levitsky (2):
>    kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
>    kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
>
>   arch/x86/kvm/vmx/vmx.c | 3 +++
>   arch/x86/kvm/x86.c     | 4 ++++
>   2 files changed, 7 insertions(+)
>
Nit: The added 'break' statement in patch# 2 is not required.

Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/2] Fix issue with not starting nesting guests on my system
  2020-05-23 16:14 [PATCH 0/2] Fix issue with not starting nesting guests on my system Maxim Levitsky
                   ` (2 preceding siblings ...)
  2020-05-27  1:03 ` [PATCH 0/2] Fix issue with not starting nesting guests on my system Krish Sadhukhan
@ 2020-05-27  1:13 ` Sean Christopherson
  2020-05-27 15:17   ` Maxim Levitsky
  2020-05-27 15:17   ` Maxim Levitsky
  2020-05-27 17:00 ` Paolo Bonzini
  4 siblings, 2 replies; 27+ messages in thread
From: Sean Christopherson @ 2020-05-27  1:13 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Sat, May 23, 2020 at 07:14:53PM +0300, Maxim Levitsky wrote:
> On my AMD machine I noticed that I can't start any nested guests,
> because nested KVM (everything from master git branches) complains
> that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system doesn't support
> at all anyway.
> 
> I traced it to the recently added UMWAIT support to qemu and kvm.
> The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST without
> checking that it the underlying feature is supported in CPUID.
> It happened to work when non nested because as a precation kvm,
> tries to read each MSR on host before adding it to that list,
> and when read gets a #GP it ignores it.
> 
> When running nested, the L1 hypervisor can be set to ignore unknown
> msr read/writes (I need this for some other guests), thus this safety
> check doesn't work anymore.
> 
> V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm capability
>     * dropped the cosmetic fix patch as it is now fixed in kvm/queue
> 
> Best regards,
> 	Maxim Levitsky
> 
> Maxim Levitsky (2):
>   kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
>   kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally

Standard scoping in the shortlog is "KVM: VMX:" and "KVM: x86:".

> 
>  arch/x86/kvm/vmx/vmx.c | 3 +++
>  arch/x86/kvm/x86.c     | 4 ++++
>  2 files changed, 7 insertions(+)
> 
> -- 
> 2.26.2
> 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
  2020-05-23 16:14 ` [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities Maxim Levitsky
@ 2020-05-27  1:20   ` Sean Christopherson
  2020-05-27 15:16     ` Maxim Levitsky
  0 siblings, 1 reply; 27+ messages in thread
From: Sean Christopherson @ 2020-05-27  1:20 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Sat, May 23, 2020 at 07:14:54PM +0300, Maxim Levitsky wrote:
> Even though we might not allow the guest to use
> WAITPKG's new instructions, we should tell KVM
> that the feature is supported by the host CPU.
> 
> Note that vmx_waitpkg_supported checks that WAITPKG
> _can_ be set in secondary execution controls as specified
> by VMX capability MSR, rather that we actually enable it for a guest.

These line wraps are quite weird and inconsistent.

> 
> Fixes: e69e72faa3a0 KVM: x86: Add support for user wait instructions

Checkpatch doesn't complain,  but the preferred Fixes format is

  Fixes: e69e72faa3a07 ("KVM: x86: Add support for user wait instructions")

e.g.

  git show -s --pretty='tformat:%h ("%s")'

For the code itself:

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

> Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 55712dd86bafa..fca493d4517c5 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7298,6 +7298,9 @@ static __init void vmx_set_cpu_caps(void)
>  	/* CPUID 0x80000001 */
>  	if (!cpu_has_vmx_rdtscp())
>  		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
> +
> +	if (vmx_waitpkg_supported())
> +		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
>  }
>  
>  static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-23 16:14 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
@ 2020-05-27  1:21   ` Sean Christopherson
  2020-05-27 15:17     ` Maxim Levitsky
  0 siblings, 1 reply; 27+ messages in thread
From: Sean Christopherson @ 2020-05-27  1:21 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Sat, May 23, 2020 at 07:14:55PM +0300, Maxim Levitsky wrote:
> This msr is only available when the host supports WAITPKG feature.
> 
> This breaks a nested guest, if the L1 hypervisor is set to ignore
> unknown msrs, because the only other safety check that the
> kernel does is that it attempts to read the msr and
> rejects it if it gets an exception.
> 
> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Same comments on the line wraps and Fixes tag.

For the code:

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b226fb8abe41b..4752293312947 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5316,6 +5316,10 @@ static void kvm_init_msr_list(void)
>  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>  				continue;
>  			break;
> +		case MSR_IA32_UMWAIT_CONTROL:
> +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> +				continue;
> +			break;
>  		default:
>  			break;
>  		}
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/2] Fix issue with not starting nesting guests on my system
  2020-05-27  1:03 ` [PATCH 0/2] Fix issue with not starting nesting guests on my system Krish Sadhukhan
@ 2020-05-27  1:22   ` Sean Christopherson
  0 siblings, 0 replies; 27+ messages in thread
From: Sean Christopherson @ 2020-05-27  1:22 UTC (permalink / raw)
  To: Krish Sadhukhan
  Cc: Maxim Levitsky, kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu,
	Jim Mattson, linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Tue, May 26, 2020 at 06:03:32PM -0700, Krish Sadhukhan wrote:
> 
> On 5/23/20 9:14 AM, Maxim Levitsky wrote:
> >On my AMD machine I noticed that I can't start any nested guests,
> >because nested KVM (everything from master git branches) complains
> >that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system doesn't support
> >at all anyway.
> >
> >I traced it to the recently added UMWAIT support to qemu and kvm.
> >The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST without
> >checking that it the underlying feature is supported in CPUID.
> >It happened to work when non nested because as a precation kvm,
> >tries to read each MSR on host before adding it to that list,
> >and when read gets a #GP it ignores it.
> >
> >When running nested, the L1 hypervisor can be set to ignore unknown
> >msr read/writes (I need this for some other guests), thus this safety
> >check doesn't work anymore.
> >
> >V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm capability
> >     * dropped the cosmetic fix patch as it is now fixed in kvm/queue
> >
> >Best regards,
> >	Maxim Levitsky
> >
> >Maxim Levitsky (2):
> >   kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
> >   kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
> >
> >  arch/x86/kvm/vmx/vmx.c | 3 +++
> >  arch/x86/kvm/x86.c     | 4 ++++
> >  2 files changed, 7 insertions(+)
> >
> Nit: The added 'break' statement in patch# 2 is not required.

It is unless you want to add a fallthrough annotation.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
  2020-05-27  1:20   ` Sean Christopherson
@ 2020-05-27 15:16     ` Maxim Levitsky
  0 siblings, 0 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-27 15:16 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Tue, 2020-05-26 at 18:20 -0700, Sean Christopherson wrote:
> On Sat, May 23, 2020 at 07:14:54PM +0300, Maxim Levitsky wrote:
> > Even though we might not allow the guest to use
> > WAITPKG's new instructions, we should tell KVM
> > that the feature is supported by the host CPU.
> > 
> > Note that vmx_waitpkg_supported checks that WAITPKG
> > _can_ be set in secondary execution controls as specified
> > by VMX capability MSR, rather that we actually enable it for a
> > guest.
> 
> These line wraps are quite weird and inconsistent.
Known issue for me, I usually don't have line wrapping enabled,
and I wrap the lines a bit earlier that 72 character limit. 
I'll re-formatted the commit message to be on 72 line format and I will
try now to pay much more attention to that.

> 
> > Fixes: e69e72faa3a0 KVM: x86: Add support for user wait
> > instructions
> 
> Checkpatch doesn't complain,  but the preferred Fixes format is
> 
>   Fixes: e69e72faa3a07 ("KVM: x86: Add support for user wait
> instructions")


> 
> e.g.
> 
>   git show -s --pretty='tformat:%h ("%s")'

Got it, and added to git aliases :-)

> 
> For the code itself:
> 
> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Thank you!

> 
> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 55712dd86bafa..fca493d4517c5 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7298,6 +7298,9 @@ static __init void vmx_set_cpu_caps(void)
> >  	/* CPUID 0x80000001 */
> >  	if (!cpu_has_vmx_rdtscp())
> >  		kvm_cpu_cap_clear(X86_FEATURE_RDTSCP);
> > +
> > +	if (vmx_waitpkg_supported())
> > +		kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
> >  }
> >  
> >  static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu)
> > -- 
> > 2.26.2
> > 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-27  1:21   ` Sean Christopherson
@ 2020-05-27 15:17     ` Maxim Levitsky
  0 siblings, 0 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-27 15:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Tue, 2020-05-26 at 18:21 -0700, Sean Christopherson wrote:
> On Sat, May 23, 2020 at 07:14:55PM +0300, Maxim Levitsky wrote:
> > This msr is only available when the host supports WAITPKG feature.
> > 
> > This breaks a nested guest, if the L1 hypervisor is set to ignore
> > unknown msrs, because the only other safety check that the
> > kernel does is that it attempts to read the msr and
> > rejects it if it gets an exception.
> > 
> > Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
> 
> Same comments on the line wraps and Fixes tag.
I rewrote the commit message and I hope that the new version
is better. I fixed the 'fixes' message as well.

> 
> For the code:
> 
> Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Thank you!

Best regards,
	Maxim Levitsky


> 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/x86.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index b226fb8abe41b..4752293312947 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5316,6 +5316,10 @@ static void kvm_init_msr_list(void)
> >  			    min(INTEL_PMC_MAX_GENERIC,
> > x86_pmu.num_counters_gp))
> >  				continue;
> >  			break;
> > +		case MSR_IA32_UMWAIT_CONTROL:
> > +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> > +				continue;
> > +			break;
> >  		default:
> >  			break;
> >  		}
> > -- 
> > 2.26.2
> > 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/2] Fix issue with not starting nesting guests on my system
  2020-05-27  1:13 ` Sean Christopherson
@ 2020-05-27 15:17   ` Maxim Levitsky
  2020-05-27 15:17   ` Maxim Levitsky
  1 sibling, 0 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-27 15:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Tue, 2020-05-26 at 18:13 -0700, Sean Christopherson wrote:
> On Sat, May 23, 2020 at 07:14:53PM +0300, Maxim Levitsky wrote:
> > On my AMD machine I noticed that I can't start any nested guests,
> > because nested KVM (everything from master git branches) complains
> > that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system
> > doesn't support
> > at all anyway.
> > 
> > I traced it to the recently added UMWAIT support to qemu and kvm.
> > The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST
> > without
> > checking that it the underlying feature is supported in CPUID.
> > It happened to work when non nested because as a precation kvm,
> > tries to read each MSR on host before adding it to that list,
> > and when read gets a #GP it ignores it.
> > 
> > When running nested, the L1 hypervisor can be set to ignore unknown
> > msr read/writes (I need this for some other guests), thus this
> > safety
> > check doesn't work anymore.
> > 
> > V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm
> > capability
> >     * dropped the cosmetic fix patch as it is now fixed in
> > kvm/queue
> > 
> > Best regards,
> > 	Maxim Levitsky
> > 
> > Maxim Levitsky (2):
> >   kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
> >   kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
> 
> Standard scoping in the shortlog is "KVM: VMX:" and "KVM: x86:".
This another thing I usually mess up in the commit messages.
Fixed and noted for futher patches 

> 
> >  arch/x86/kvm/vmx/vmx.c | 3 +++
> >  arch/x86/kvm/x86.c     | 4 ++++
> >  2 files changed, 7 insertions(+)
> > 
> > -- 
> > 2.26.2
> > 
> > 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/2] Fix issue with not starting nesting guests on my system
  2020-05-27  1:13 ` Sean Christopherson
  2020-05-27 15:17   ` Maxim Levitsky
@ 2020-05-27 15:17   ` Maxim Levitsky
  1 sibling, 0 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-27 15:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, H. Peter Anvin, Tao Xu, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On Tue, 2020-05-26 at 18:13 -0700, Sean Christopherson wrote:
> On Sat, May 23, 2020 at 07:14:53PM +0300, Maxim Levitsky wrote:
> > On my AMD machine I noticed that I can't start any nested guests,
> > because nested KVM (everything from master git branches) complains
> > that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system
> > doesn't support
> > at all anyway.
> > 
> > I traced it to the recently added UMWAIT support to qemu and kvm.
> > The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST
> > without
> > checking that it the underlying feature is supported in CPUID.
> > It happened to work when non nested because as a precation kvm,
> > tries to read each MSR on host before adding it to that list,
> > and when read gets a #GP it ignores it.
> > 
> > When running nested, the L1 hypervisor can be set to ignore unknown
> > msr read/writes (I need this for some other guests), thus this
> > safety
> > check doesn't work anymore.
> > 
> > V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm
> > capability
> >     * dropped the cosmetic fix patch as it is now fixed in kvm/queue
> > 
> > Best regards,
> > 	Maxim Levitsky
> > 
> > Maxim Levitsky (2):
> >   kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
> >   kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
> 
> Standard scoping in the shortlog is "KVM: VMX:" and "KVM: x86:".
Noted and I will use it from now on.
Thanks!

Best regards,
	Maxim Levitsky

> 
> >  arch/x86/kvm/vmx/vmx.c | 3 +++
> >  arch/x86/kvm/x86.c     | 4 ++++
> >  2 files changed, 7 insertions(+)
> > 
> > -- 
> > 2.26.2
> > 
> > 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/2] Fix issue with not starting nesting guests on my system
  2020-05-23 16:14 [PATCH 0/2] Fix issue with not starting nesting guests on my system Maxim Levitsky
                   ` (3 preceding siblings ...)
  2020-05-27  1:13 ` Sean Christopherson
@ 2020-05-27 17:00 ` Paolo Bonzini
  4 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2020-05-27 17:00 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: H. Peter Anvin, Tao Xu, Sean Christopherson, Jim Mattson,
	linux-kernel, Joerg Roedel,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Wanpeng Li, Ingo Molnar, Thomas Gleixner, Borislav Petkov,
	Vitaly Kuznetsov, Jingqi Liu

On 23/05/20 18:14, Maxim Levitsky wrote:
> On my AMD machine I noticed that I can't start any nested guests,
> because nested KVM (everything from master git branches) complains
> that it can't find msr MSR_IA32_UMWAIT_CONTROL which my system doesn't support
> at all anyway.
> 
> I traced it to the recently added UMWAIT support to qemu and kvm.
> The kvm portion exposed the new MSR in KVM_GET_MSR_INDEX_LIST without
> checking that it the underlying feature is supported in CPUID.
> It happened to work when non nested because as a precation kvm,
> tries to read each MSR on host before adding it to that list,
> and when read gets a #GP it ignores it.
> 
> When running nested, the L1 hypervisor can be set to ignore unknown
> msr read/writes (I need this for some other guests), thus this safety
> check doesn't work anymore.
> 
> V2: * added a patch to setup correctly the X86_FEATURE_WAITPKG kvm capability
>     * dropped the cosmetic fix patch as it is now fixed in kvm/queue
> 
> Best regards,
> 	Maxim Levitsky
> 
> Maxim Levitsky (2):
>   kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities
>   kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
> 
>  arch/x86/kvm/vmx/vmx.c | 3 +++
>  arch/x86/kvm/x86.c     | 4 ++++
>  2 files changed, 7 insertions(+)
> 

Queued for 5.7, thanks (with cosmetic touches to the commit message, and
moving the "case" earlier to avoid conflicts).

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-21  6:44           ` Tao Xu
@ 2020-05-21  8:40             ` Paolo Bonzini
  0 siblings, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2020-05-21  8:40 UTC (permalink / raw)
  To: Tao Xu, Xiaoyao Li; +Cc: Maxim Levitsky, kvm, linux-kernel

On 21/05/20 08:44, Tao Xu wrote:
> 
> I am sorry, I mean:
> By default, we don't expose WAITPKG to guest. For QEMU, we can use
> "-overcommit cpu-pm=on" to use WAITPKG.

But UMONITOR, UMWAIT and TPAUSE are not NOPs on older processors (which
I should have checked before committing your patch, I admit).  So you
have broken "-cpu host -overcommit cpu-pm=on" on any processor that
doesn't have WAITPKG.  I'll send a patch.

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-21  4:33     ` Xiaoyao Li
  2020-05-21  5:28       ` Tao Xu
@ 2020-05-21  8:24       ` Paolo Bonzini
  1 sibling, 0 replies; 27+ messages in thread
From: Paolo Bonzini @ 2020-05-21  8:24 UTC (permalink / raw)
  To: Xiaoyao Li, Maxim Levitsky, kvm, Tao Xu; +Cc: linux-kernel

On 21/05/20 06:33, Xiaoyao Li wrote:
> I remember there is certainly some reason why we don't expose WAITPKG to
> guest by default.

That's a userspace policy decision.  KVM_GET_SUPPORTED_CPUID should
still tell userspace that it's supported.

Paolo

> Tao, please help clarify it.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 16:56     ` Maxim Levitsky
  2020-05-20 17:15       ` Vitaly Kuznetsov
@ 2020-05-21  8:03       ` Xiaoyao Li
  1 sibling, 0 replies; 27+ messages in thread
From: Xiaoyao Li @ 2020-05-21  8:03 UTC (permalink / raw)
  To: Maxim Levitsky, Vitaly Kuznetsov, kvm; +Cc: linux-kernel

On 5/21/2020 12:56 AM, Maxim Levitsky wrote:
> On Wed, 2020-05-20 at 18:33 +0200, Vitaly Kuznetsov wrote:
>> Maxim Levitsky <mlevitsk@redhat.com> writes:
>>
>>> This msr is only available when the host supports WAITPKG feature.
>>>
>>> This breaks a nested guest, if the L1 hypervisor is set to ignore
>>> unknown msrs, because the only other safety check that the
>>> kernel does is that it attempts to read the msr and
>>> rejects it if it gets an exception.
>>>
>>> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>>>
>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>> ---
>>>   arch/x86/kvm/x86.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index fe3a24fd6b263..9c507b32b1b77 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>>>   			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>>>   			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>>>   				continue;
>>> +			break;
>>> +		case MSR_IA32_UMWAIT_CONTROL:
>>> +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
>>> +				continue;
>>
>> I'm probably missing something but (if I understand correctly) the only
>> effect of dropping MSR_IA32_UMWAIT_CONTROL from msrs_to_save would be
>> that KVM userspace won't see it in e.g. KVM_GET_MSR_INDEX_LIST. But why
>> is this causing an issue? I see both vmx_get_msr()/vmx_set_msr() have
>> 'host_initiated' check:
>>
>>         case MSR_IA32_UMWAIT_CONTROL:
>>                  if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
>>                          return 1;
> 
> Here it fails like that:
> 
> 1. KVM_GET_MSR_INDEX_LIST returns this msrs, and qemu notes that
>     it is supported in 'has_msr_umwait' global var

In general, KVM_GET_MSR_INDEX_LIST won't return MSR_IA32_UMWAIT_CONTROL 
if KVM cannot read this MSR, see kvm_init_msr_list().

You hit issue because you used "ignore_msrs".




^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-21  6:37         ` Xiaoyao Li
@ 2020-05-21  6:44           ` Tao Xu
  2020-05-21  8:40             ` Paolo Bonzini
  0 siblings, 1 reply; 27+ messages in thread
From: Tao Xu @ 2020-05-21  6:44 UTC (permalink / raw)
  To: Xiaoyao Li; +Cc: Paolo Bonzini, Maxim Levitsky, kvm, linux-kernel



On 5/21/2020 2:37 PM, Xiaoyao Li wrote:
> On 5/21/2020 1:28 PM, Tao Xu wrote:
>>
>>
>> On 5/21/2020 12:33 PM, Xiaoyao Li wrote:
>>> On 5/21/2020 5:05 AM, Paolo Bonzini wrote:
>>>> On 20/05/20 18:07, Maxim Levitsky wrote:
>>>>> This msr is only available when the host supports WAITPKG feature.
>>>>>
>>>>> This breaks a nested guest, if the L1 hypervisor is set to ignore
>>>>> unknown msrs, because the only other safety check that the
>>>>> kernel does is that it attempts to read the msr and
>>>>> rejects it if it gets an exception.
>>>>>
>>>>> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>>>>>
>>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>>> ---
>>>>>   arch/x86/kvm/x86.c | 4 ++++
>>>>>   1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>> index fe3a24fd6b263..9c507b32b1b77 100644
>>>>> --- a/arch/x86/kvm/x86.c
>>>>> +++ b/arch/x86/kvm/x86.c
>>>>> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>>>>>               if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>>>>>                   min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>>>>>                   continue;
>>>>> +            break;
>>>>> +        case MSR_IA32_UMWAIT_CONTROL:
>>>>> +            if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
>>>>> +                continue;
>>>>>           default:
>>>>>               break;
>>>>>           }
>>>>
>>>> The patch is correct, and matches what is done for the other entries of
>>>> msrs_to_save_all.  However, while looking at it I noticed that
>>>> X86_FEATURE_WAITPKG is actually never added, and that is because it was
>>>> also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: 
>>>> x86:
>>>> Add support for user wait instructions", 2019-09-24), which was before
>>>> the kvm_cpu_cap mechanism was added.
>>>>
>>>> So while at it you should also fix that.  The right way to do that 
>>>> is to
>>>> add a
>>>>
>>>>          if (vmx_waitpkg_supported())
>>>>                  kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
>>>
>>> + Tao
>>>
>>> I remember there is certainly some reason why we don't expose WAITPKG 
>>> to guest by default.
>>>
>>> Tao, please help clarify it.
>>>
>>> Thanks,
>>> -Xiaoyao
>>>
>>
>> Because in VM, umwait and tpause can put a (psysical) CPU into a power 
>> saving state. So from host view, this cpu will be 100% usage by VM. 
>> Although umwait and tpause just cause short wait(maybe 100 
>> microseconds), we still want to unconditionally expose WAITPKG in VM.
> 
> I guess you typed "unconditionally" by mistake that you meant to say 
> "conditionally" in fact?

I am sorry, I mean:
By default, we don't expose WAITPKG to guest. For QEMU, we can use 
"-overcommit cpu-pm=on" to use WAITPKG.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-21  5:28       ` Tao Xu
@ 2020-05-21  6:37         ` Xiaoyao Li
  2020-05-21  6:44           ` Tao Xu
  0 siblings, 1 reply; 27+ messages in thread
From: Xiaoyao Li @ 2020-05-21  6:37 UTC (permalink / raw)
  To: Tao Xu, Paolo Bonzini, Maxim Levitsky, kvm; +Cc: linux-kernel

On 5/21/2020 1:28 PM, Tao Xu wrote:
> 
> 
> On 5/21/2020 12:33 PM, Xiaoyao Li wrote:
>> On 5/21/2020 5:05 AM, Paolo Bonzini wrote:
>>> On 20/05/20 18:07, Maxim Levitsky wrote:
>>>> This msr is only available when the host supports WAITPKG feature.
>>>>
>>>> This breaks a nested guest, if the L1 hypervisor is set to ignore
>>>> unknown msrs, because the only other safety check that the
>>>> kernel does is that it attempts to read the msr and
>>>> rejects it if it gets an exception.
>>>>
>>>> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>>>>
>>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>>> ---
>>>>   arch/x86/kvm/x86.c | 4 ++++
>>>>   1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index fe3a24fd6b263..9c507b32b1b77 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>>>>               if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>>>>                   min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>>>>                   continue;
>>>> +            break;
>>>> +        case MSR_IA32_UMWAIT_CONTROL:
>>>> +            if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
>>>> +                continue;
>>>>           default:
>>>>               break;
>>>>           }
>>>
>>> The patch is correct, and matches what is done for the other entries of
>>> msrs_to_save_all.  However, while looking at it I noticed that
>>> X86_FEATURE_WAITPKG is actually never added, and that is because it was
>>> also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
>>> Add support for user wait instructions", 2019-09-24), which was before
>>> the kvm_cpu_cap mechanism was added.
>>>
>>> So while at it you should also fix that.  The right way to do that is to
>>> add a
>>>
>>>          if (vmx_waitpkg_supported())
>>>                  kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
>>
>> + Tao
>>
>> I remember there is certainly some reason why we don't expose WAITPKG 
>> to guest by default.
>>
>> Tao, please help clarify it.
>>
>> Thanks,
>> -Xiaoyao
>>
> 
> Because in VM, umwait and tpause can put a (psysical) CPU into a power 
> saving state. So from host view, this cpu will be 100% usage by VM. 
> Although umwait and tpause just cause short wait(maybe 100 
> microseconds), we still want to unconditionally expose WAITPKG in VM.

I guess you typed "unconditionally" by mistake that you meant to say 
"conditionally" in fact?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-21  4:33     ` Xiaoyao Li
@ 2020-05-21  5:28       ` Tao Xu
  2020-05-21  6:37         ` Xiaoyao Li
  2020-05-21  8:24       ` Paolo Bonzini
  1 sibling, 1 reply; 27+ messages in thread
From: Tao Xu @ 2020-05-21  5:28 UTC (permalink / raw)
  To: Xiaoyao Li, Paolo Bonzini, Maxim Levitsky, kvm; +Cc: linux-kernel



On 5/21/2020 12:33 PM, Xiaoyao Li wrote:
> On 5/21/2020 5:05 AM, Paolo Bonzini wrote:
>> On 20/05/20 18:07, Maxim Levitsky wrote:
>>> This msr is only available when the host supports WAITPKG feature.
>>>
>>> This breaks a nested guest, if the L1 hypervisor is set to ignore
>>> unknown msrs, because the only other safety check that the
>>> kernel does is that it attempts to read the msr and
>>> rejects it if it gets an exception.
>>>
>>> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>>>
>>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>>> ---
>>>   arch/x86/kvm/x86.c | 4 ++++
>>>   1 file changed, 4 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index fe3a24fd6b263..9c507b32b1b77 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>>>               if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>>>                   min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>>>                   continue;
>>> +            break;
>>> +        case MSR_IA32_UMWAIT_CONTROL:
>>> +            if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
>>> +                continue;
>>>           default:
>>>               break;
>>>           }
>>
>> The patch is correct, and matches what is done for the other entries of
>> msrs_to_save_all.  However, while looking at it I noticed that
>> X86_FEATURE_WAITPKG is actually never added, and that is because it was
>> also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
>> Add support for user wait instructions", 2019-09-24), which was before
>> the kvm_cpu_cap mechanism was added.
>>
>> So while at it you should also fix that.  The right way to do that is to
>> add a
>>
>>          if (vmx_waitpkg_supported())
>>                  kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
> 
> + Tao
> 
> I remember there is certainly some reason why we don't expose WAITPKG to 
> guest by default.
> 
> Tao, please help clarify it.
> 
> Thanks,
> -Xiaoyao
> 

Because in VM, umwait and tpause can put a (psysical) CPU into a power 
saving state. So from host view, this cpu will be 100% usage by VM. 
Although umwait and tpause just cause short wait(maybe 100 
microseconds), we still want to unconditionally expose WAITPKG in VM.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 21:05   ` Paolo Bonzini
  2020-05-20 21:09     ` Maxim Levitsky
@ 2020-05-21  4:33     ` Xiaoyao Li
  2020-05-21  5:28       ` Tao Xu
  2020-05-21  8:24       ` Paolo Bonzini
  1 sibling, 2 replies; 27+ messages in thread
From: Xiaoyao Li @ 2020-05-21  4:33 UTC (permalink / raw)
  To: Paolo Bonzini, Maxim Levitsky, kvm, Tao Xu; +Cc: linux-kernel

On 5/21/2020 5:05 AM, Paolo Bonzini wrote:
> On 20/05/20 18:07, Maxim Levitsky wrote:
>> This msr is only available when the host supports WAITPKG feature.
>>
>> This breaks a nested guest, if the L1 hypervisor is set to ignore
>> unknown msrs, because the only other safety check that the
>> kernel does is that it attempts to read the msr and
>> rejects it if it gets an exception.
>>
>> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>>
>> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>> ---
>>   arch/x86/kvm/x86.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index fe3a24fd6b263..9c507b32b1b77 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>>   			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>>   			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>>   				continue;
>> +			break;
>> +		case MSR_IA32_UMWAIT_CONTROL:
>> +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
>> +				continue;
>>   		default:
>>   			break;
>>   		}
> 
> The patch is correct, and matches what is done for the other entries of
> msrs_to_save_all.  However, while looking at it I noticed that
> X86_FEATURE_WAITPKG is actually never added, and that is because it was
> also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
> Add support for user wait instructions", 2019-09-24), which was before
> the kvm_cpu_cap mechanism was added.
> 
> So while at it you should also fix that.  The right way to do that is to
> add a
> 
>          if (vmx_waitpkg_supported())
>                  kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);

+ Tao

I remember there is certainly some reason why we don't expose WAITPKG to 
guest by default.

Tao, please help clarify it.

Thanks,
-Xiaoyao

> 
> in vmx_set_cpu_caps.
> 
> Thanks,
> 
> Paolo
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 21:05   ` Paolo Bonzini
@ 2020-05-20 21:09     ` Maxim Levitsky
  2020-05-21  4:33     ` Xiaoyao Li
  1 sibling, 0 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-20 21:09 UTC (permalink / raw)
  To: Paolo Bonzini, kvm; +Cc: linux-kernel

On Wed, 2020-05-20 at 23:05 +0200, Paolo Bonzini wrote:
> On 20/05/20 18:07, Maxim Levitsky wrote:
> > This msr is only available when the host supports WAITPKG feature.
> > 
> > This breaks a nested guest, if the L1 hypervisor is set to ignore
> > unknown msrs, because the only other safety check that the
> > kernel does is that it attempts to read the msr and
> > rejects it if it gets an exception.
> > 
> > Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/x86.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index fe3a24fd6b263..9c507b32b1b77 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
> >  			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
> >  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
> >  				continue;
> > +			break;
> > +		case MSR_IA32_UMWAIT_CONTROL:
> > +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> > +				continue;
> >  		default:
> >  			break;
> >  		}
> 
> The patch is correct, and matches what is done for the other entries of
> msrs_to_save_all.  However, while looking at it I noticed that
> X86_FEATURE_WAITPKG is actually never added, and that is because it was
> also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
> Add support for user wait instructions", 2019-09-24), which was before
> the kvm_cpu_cap mechanism was added.
> 
> So while at it you should also fix that.  The right way to do that is to
> add a
> 
>         if (vmx_waitpkg_supported())
>                 kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);
> 
> in vmx_set_cpu_caps.
> 
> Thanks,

Thank you very much for finding this. I didn't expect this to be broken.
I will send a new version with this fix as well tomorrow.

Best regards,
	Maxim Levitsky


> 
> Paolo
> 



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 16:07 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
  2020-05-20 16:33   ` Vitaly Kuznetsov
@ 2020-05-20 21:05   ` Paolo Bonzini
  2020-05-20 21:09     ` Maxim Levitsky
  2020-05-21  4:33     ` Xiaoyao Li
  1 sibling, 2 replies; 27+ messages in thread
From: Paolo Bonzini @ 2020-05-20 21:05 UTC (permalink / raw)
  To: Maxim Levitsky, kvm; +Cc: linux-kernel

On 20/05/20 18:07, Maxim Levitsky wrote:
> This msr is only available when the host supports WAITPKG feature.
> 
> This breaks a nested guest, if the L1 hypervisor is set to ignore
> unknown msrs, because the only other safety check that the
> kernel does is that it attempts to read the msr and
> rejects it if it gets an exception.
> 
> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe3a24fd6b263..9c507b32b1b77 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>  			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>  				continue;
> +			break;
> +		case MSR_IA32_UMWAIT_CONTROL:
> +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> +				continue;
>  		default:
>  			break;
>  		}

The patch is correct, and matches what is done for the other entries of
msrs_to_save_all.  However, while looking at it I noticed that
X86_FEATURE_WAITPKG is actually never added, and that is because it was
also not added to the supported CPUID in commit e69e72faa3a0 ("KVM: x86:
Add support for user wait instructions", 2019-09-24), which was before
the kvm_cpu_cap mechanism was added.

So while at it you should also fix that.  The right way to do that is to
add a

        if (vmx_waitpkg_supported())
                kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG);

in vmx_set_cpu_caps.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 17:15       ` Vitaly Kuznetsov
@ 2020-05-20 17:39         ` Maxim Levitsky
  0 siblings, 0 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-20 17:39 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm; +Cc: linux-kernel, Paolo Bonzini

On Wed, 2020-05-20 at 19:15 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > On Wed, 2020-05-20 at 18:33 +0200, Vitaly Kuznetsov wrote:
> > > Maxim Levitsky <mlevitsk@redhat.com> writes:
> > > 
> > > > This msr is only available when the host supports WAITPKG feature.
> > > > 
> > > > This breaks a nested guest, if the L1 hypervisor is set to ignore
> > > > unknown msrs, because the only other safety check that the
> > > > kernel does is that it attempts to read the msr and
> > > > rejects it if it gets an exception.
> > > > 
> > > > Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
> > > > 
> > > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > > ---
> > > >  arch/x86/kvm/x86.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index fe3a24fd6b263..9c507b32b1b77 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
> > > >  			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
> > > >  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
> > > >  				continue;
> > > > +			break;
> > > > +		case MSR_IA32_UMWAIT_CONTROL:
> > > > +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> > > > +				continue;
> > > 
> > > I'm probably missing something but (if I understand correctly) the only
> > > effect of dropping MSR_IA32_UMWAIT_CONTROL from msrs_to_save would be
> > > that KVM userspace won't see it in e.g. KVM_GET_MSR_INDEX_LIST. But why
> > > is this causing an issue? I see both vmx_get_msr()/vmx_set_msr() have
> > > 'host_initiated' check:
> > > 
> > >        case MSR_IA32_UMWAIT_CONTROL:
> > >                 if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
> > >                         return 1;
> > 
> > Here it fails like that:
> > 
> > 1. KVM_GET_MSR_INDEX_LIST returns this msrs, and qemu notes that
> >    it is supported in 'has_msr_umwait' global var
> > 
> > 2. Qemu does kvm_arch_get/put_registers->kvm_get/put_msrs->ioctl(KVM_GET_MSRS)
> >    and while doing this it adds MSR_IA32_UMWAIT_CONTROL to that msr list.
> >    That reaches 'svm_get_msr', and this one knows nothing about MSR_IA32_UMWAIT_CONTROL.
> > 
> > So the difference here is that vmx_get_msr not called at all.
> > I can add this msr to svm_get_msr instead but that feels wrong since this feature
> > is not yet supported on AMD.
> > When AMD adds support for this feature, then the VMX specific code can be moved to
> > kvm_get_msr_common I guess.
> > 
> > 
> 
> Oh, SVM, I missed that completely) 
> 
> > > so KVM userspace should be able to read/write this MSR even when there's
> > > no hardware support for it. Or who's trying to read/write it?
> > > 
> > > Also, kvm_cpu_cap_has() check is not equal to vmx_has_waitpkg() which
> > > checks secondary execution controls.
> > 
> > I was afraid that something like that will happen, but in this particular
> > case we can only check CPUID support and if supported, the then it means
> > we are dealing with intel system and thus vmx_get_msr will be called and
> > ignore that msr.
> > 
> > Calling vmx_has_waitpkg from the common code doesn't seem right, and besides,
> > it checks the secondary controls which are set by the host and can change,
> > at least in theory during runtime (I don't know if KVM does this).
> > 
> > Note that if I now understand correctly, the 'host_initiated' means
> > that MSR read/write is done by the host itself and not on behalf of the guest.
> 
> Yes, it does that. 
> 
> We have kvm_x86_ops.has_emulated_msr() mechanism, can we use it here?
> E.g. completely untested
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 38f6aeefeb55..c19a9542e6c3 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3471,6 +3471,8 @@ static bool svm_has_emulated_msr(int index)
>         case MSR_IA32_MCG_EXT_CTL:
>         case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
>                 return false;
> +       case MSR_IA32_UMWAIT_CONTROL:
> +               return false;
>         default:
>                 break;
>         }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d786c7d27ce5..f45153ef3b81 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1183,7 +1183,6 @@ static const u32 msrs_to_save_all[] = {
>         MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
>         MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
>         MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
> -       MSR_IA32_UMWAIT_CONTROL,
>  
>         MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
>         MSR_ARCH_PERFMON_FIXED_CTR0 + 2, MSR_ARCH_PERFMON_FIXED_CTR0 + 3,
> @@ -1266,6 +1265,7 @@ static const u32 emulated_msrs_all[] = {
>         MSR_IA32_VMX_PROCBASED_CTLS2,
>         MSR_IA32_VMX_EPT_VPID_CAP,
>         MSR_IA32_VMX_VMFUNC,
> +       MSR_IA32_UMWAIT_CONTROL,
>  
>         MSR_K7_HWCR,
>         MSR_KVM_POLL_CONTROL,
> 
I don't see any reason why the above won't work, and to be honest
I also took a look at this but to me it wasn't clear what the purpose
of the emulated msrs is, this is why I took the approach in the patch I had sent.

It 'seems' (although this is not enforced anywhere) that emulated msr list is
intended for MSRs that are emulated by KVM, which means that KVM traps these msrs,
and give guest arbitrary values it thinks that the guest should see.

However MSR_IA32_UMWAIT_CONTROL appears to be exposed directly to the guest
without any traps, with the virtualization done by cpu, and the only intervention
we do is to set a value to be load when guest mode is entered and value to be
loaded when guest mode is done (using VMX msr entry/exit msr lists),
I see that done by atomic_switch_umwait_control_msr.

So I am not sure if we should add it to emulated_msrs_all list.

Paulo, what do you think about this? I personally don't mind how to fix
this as long as it works and everyone agrees on the patch.

Best regards,
	Maxim Levitsky





^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 16:56     ` Maxim Levitsky
@ 2020-05-20 17:15       ` Vitaly Kuznetsov
  2020-05-20 17:39         ` Maxim Levitsky
  2020-05-21  8:03       ` Xiaoyao Li
  1 sibling, 1 reply; 27+ messages in thread
From: Vitaly Kuznetsov @ 2020-05-20 17:15 UTC (permalink / raw)
  To: Maxim Levitsky, kvm; +Cc: linux-kernel

Maxim Levitsky <mlevitsk@redhat.com> writes:

> On Wed, 2020-05-20 at 18:33 +0200, Vitaly Kuznetsov wrote:
>> Maxim Levitsky <mlevitsk@redhat.com> writes:
>> 
>> > This msr is only available when the host supports WAITPKG feature.
>> > 
>> > This breaks a nested guest, if the L1 hypervisor is set to ignore
>> > unknown msrs, because the only other safety check that the
>> > kernel does is that it attempts to read the msr and
>> > rejects it if it gets an exception.
>> > 
>> > Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>> > 
>> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
>> > ---
>> >  arch/x86/kvm/x86.c | 4 ++++
>> >  1 file changed, 4 insertions(+)
>> > 
>> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> > index fe3a24fd6b263..9c507b32b1b77 100644
>> > --- a/arch/x86/kvm/x86.c
>> > +++ b/arch/x86/kvm/x86.c
>> > @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>> >  			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>> >  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>> >  				continue;
>> > +			break;
>> > +		case MSR_IA32_UMWAIT_CONTROL:
>> > +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
>> > +				continue;
>> 
>> I'm probably missing something but (if I understand correctly) the only
>> effect of dropping MSR_IA32_UMWAIT_CONTROL from msrs_to_save would be
>> that KVM userspace won't see it in e.g. KVM_GET_MSR_INDEX_LIST. But why
>> is this causing an issue? I see both vmx_get_msr()/vmx_set_msr() have
>> 'host_initiated' check:
>> 
>>        case MSR_IA32_UMWAIT_CONTROL:
>>                 if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
>>                         return 1;
>
> Here it fails like that:
>
> 1. KVM_GET_MSR_INDEX_LIST returns this msrs, and qemu notes that
>    it is supported in 'has_msr_umwait' global var
>
> 2. Qemu does kvm_arch_get/put_registers->kvm_get/put_msrs->ioctl(KVM_GET_MSRS)
>    and while doing this it adds MSR_IA32_UMWAIT_CONTROL to that msr list.
>    That reaches 'svm_get_msr', and this one knows nothing about MSR_IA32_UMWAIT_CONTROL.
>
> So the difference here is that vmx_get_msr not called at all.
> I can add this msr to svm_get_msr instead but that feels wrong since this feature
> is not yet supported on AMD.
> When AMD adds support for this feature, then the VMX specific code can be moved to
> kvm_get_msr_common I guess.
>
>

Oh, SVM, I missed that completely) 

>
>> 
>> so KVM userspace should be able to read/write this MSR even when there's
>> no hardware support for it. Or who's trying to read/write it?
>> 
>> Also, kvm_cpu_cap_has() check is not equal to vmx_has_waitpkg() which
>> checks secondary execution controls.
>
> I was afraid that something like that will happen, but in this particular
> case we can only check CPUID support and if supported, the then it means
> we are dealing with intel system and thus vmx_get_msr will be called and
> ignore that msr.
>
> Calling vmx_has_waitpkg from the common code doesn't seem right, and besides,
> it checks the secondary controls which are set by the host and can change,
> at least in theory during runtime (I don't know if KVM does this).
>
> Note that if I now understand correctly, the 'host_initiated' means
> that MSR read/write is done by the host itself and not on behalf of the guest.

Yes, it does that. 

We have kvm_x86_ops.has_emulated_msr() mechanism, can we use it here?
E.g. completely untested

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 38f6aeefeb55..c19a9542e6c3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3471,6 +3471,8 @@ static bool svm_has_emulated_msr(int index)
        case MSR_IA32_MCG_EXT_CTL:
        case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
                return false;
+       case MSR_IA32_UMWAIT_CONTROL:
+               return false;
        default:
                break;
        }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d786c7d27ce5..f45153ef3b81 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1183,7 +1183,6 @@ static const u32 msrs_to_save_all[] = {
        MSR_IA32_RTIT_ADDR1_A, MSR_IA32_RTIT_ADDR1_B,
        MSR_IA32_RTIT_ADDR2_A, MSR_IA32_RTIT_ADDR2_B,
        MSR_IA32_RTIT_ADDR3_A, MSR_IA32_RTIT_ADDR3_B,
-       MSR_IA32_UMWAIT_CONTROL,
 
        MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
        MSR_ARCH_PERFMON_FIXED_CTR0 + 2, MSR_ARCH_PERFMON_FIXED_CTR0 + 3,
@@ -1266,6 +1265,7 @@ static const u32 emulated_msrs_all[] = {
        MSR_IA32_VMX_PROCBASED_CTLS2,
        MSR_IA32_VMX_EPT_VPID_CAP,
        MSR_IA32_VMX_VMFUNC,
+       MSR_IA32_UMWAIT_CONTROL,
 
        MSR_K7_HWCR,
        MSR_KVM_POLL_CONTROL,

-- 
Vitaly


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 16:33   ` Vitaly Kuznetsov
@ 2020-05-20 16:56     ` Maxim Levitsky
  2020-05-20 17:15       ` Vitaly Kuznetsov
  2020-05-21  8:03       ` Xiaoyao Li
  0 siblings, 2 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-20 16:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov, kvm; +Cc: linux-kernel

On Wed, 2020-05-20 at 18:33 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky <mlevitsk@redhat.com> writes:
> 
> > This msr is only available when the host supports WAITPKG feature.
> > 
> > This breaks a nested guest, if the L1 hypervisor is set to ignore
> > unknown msrs, because the only other safety check that the
> > kernel does is that it attempts to read the msr and
> > rejects it if it gets an exception.
> > 
> > Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/x86.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index fe3a24fd6b263..9c507b32b1b77 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
> >  			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
> >  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
> >  				continue;
> > +			break;
> > +		case MSR_IA32_UMWAIT_CONTROL:
> > +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> > +				continue;
> 
> I'm probably missing something but (if I understand correctly) the only
> effect of dropping MSR_IA32_UMWAIT_CONTROL from msrs_to_save would be
> that KVM userspace won't see it in e.g. KVM_GET_MSR_INDEX_LIST. But why
> is this causing an issue? I see both vmx_get_msr()/vmx_set_msr() have
> 'host_initiated' check:
> 
>        case MSR_IA32_UMWAIT_CONTROL:
>                 if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
>                         return 1;

Here it fails like that:

1. KVM_GET_MSR_INDEX_LIST returns this msrs, and qemu notes that
   it is supported in 'has_msr_umwait' global var

2. Qemu does kvm_arch_get/put_registers->kvm_get/put_msrs->ioctl(KVM_GET_MSRS)
   and while doing this it adds MSR_IA32_UMWAIT_CONTROL to that msr list.
   That reaches 'svm_get_msr', and this one knows nothing about MSR_IA32_UMWAIT_CONTROL.

So the difference here is that vmx_get_msr not called at all.
I can add this msr to svm_get_msr instead but that feels wrong since this feature
is not yet supported on AMD.
When AMD adds support for this feature, then the VMX specific code can be moved to
kvm_get_msr_common I guess.



> 
> so KVM userspace should be able to read/write this MSR even when there's
> no hardware support for it. Or who's trying to read/write it?
> 
> Also, kvm_cpu_cap_has() check is not equal to vmx_has_waitpkg() which
> checks secondary execution controls.

I was afraid that something like that will happen, but in this particular
case we can only check CPUID support and if supported, the then it means
we are dealing with intel system and thus vmx_get_msr will be called and
ignore that msr.

Calling vmx_has_waitpkg from the common code doesn't seem right, and besides,
it checks the secondary controls which are set by the host and can change,
at least in theory during runtime (I don't know if KVM does this).

Note that if I now understand correctly, the 'host_initiated' means that MSR read/write
is done by the host itself and not on behalf of the guest.


Best regards,
	Maxim Levitsky

> 
> >  		default:
> >  			break;
> >  		}



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 16:07 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
@ 2020-05-20 16:33   ` Vitaly Kuznetsov
  2020-05-20 16:56     ` Maxim Levitsky
  2020-05-20 21:05   ` Paolo Bonzini
  1 sibling, 1 reply; 27+ messages in thread
From: Vitaly Kuznetsov @ 2020-05-20 16:33 UTC (permalink / raw)
  To: Maxim Levitsky, kvm; +Cc: linux-kernel, Maxim Levitsky

Maxim Levitsky <mlevitsk@redhat.com> writes:

> This msr is only available when the host supports WAITPKG feature.
>
> This breaks a nested guest, if the L1 hypervisor is set to ignore
> unknown msrs, because the only other safety check that the
> kernel does is that it attempts to read the msr and
> rejects it if it gets an exception.
>
> Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/x86.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fe3a24fd6b263..9c507b32b1b77 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
>  			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
>  			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
>  				continue;
> +			break;
> +		case MSR_IA32_UMWAIT_CONTROL:
> +			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
> +				continue;

I'm probably missing something but (if I understand correctly) the only
effect of dropping MSR_IA32_UMWAIT_CONTROL from msrs_to_save would be
that KVM userspace won't see it in e.g. KVM_GET_MSR_INDEX_LIST. But why
is this causing an issue? I see both vmx_get_msr()/vmx_set_msr() have
'host_initiated' check:

       case MSR_IA32_UMWAIT_CONTROL:
                if (!msr_info->host_initiated && !vmx_has_waitpkg(vmx))
                        return 1;

so KVM userspace should be able to read/write this MSR even when there's
no hardware support for it. Or who's trying to read/write it?

Also, kvm_cpu_cap_has() check is not equal to vmx_has_waitpkg() which
checks secondary execution controls.

>  		default:
>  			break;
>  		}

-- 
Vitaly


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally
  2020-05-20 16:07 [PATCH 0/2] Fix breakage from adding MSR_IA32_UMWAIT_CONTROL Maxim Levitsky
@ 2020-05-20 16:07 ` Maxim Levitsky
  2020-05-20 16:33   ` Vitaly Kuznetsov
  2020-05-20 21:05   ` Paolo Bonzini
  0 siblings, 2 replies; 27+ messages in thread
From: Maxim Levitsky @ 2020-05-20 16:07 UTC (permalink / raw)
  To: kvm; +Cc: linux-kernel, Maxim Levitsky

This msr is only available when the host supports WAITPKG feature.

This breaks a nested guest, if the L1 hypervisor is set to ignore
unknown msrs, because the only other safety check that the
kernel does is that it attempts to read the msr and
rejects it if it gets an exception.

Fixes: 6e3ba4abce KVM: vmx: Emulate MSR IA32_UMWAIT_CONTROL

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/x86.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fe3a24fd6b263..9c507b32b1b77 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5314,6 +5314,10 @@ static void kvm_init_msr_list(void)
 			if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
 			    min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
 				continue;
+			break;
+		case MSR_IA32_UMWAIT_CONTROL:
+			if (!kvm_cpu_cap_has(X86_FEATURE_WAITPKG))
+				continue;
 		default:
 			break;
 		}
-- 
2.26.2


^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, back to index

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-23 16:14 [PATCH 0/2] Fix issue with not starting nesting guests on my system Maxim Levitsky
2020-05-23 16:14 ` [PATCH 1/2] kvm/x86/vmx: enable X86_FEATURE_WAITPKG in KVM capabilities Maxim Levitsky
2020-05-27  1:20   ` Sean Christopherson
2020-05-27 15:16     ` Maxim Levitsky
2020-05-23 16:14 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
2020-05-27  1:21   ` Sean Christopherson
2020-05-27 15:17     ` Maxim Levitsky
2020-05-27  1:03 ` [PATCH 0/2] Fix issue with not starting nesting guests on my system Krish Sadhukhan
2020-05-27  1:22   ` Sean Christopherson
2020-05-27  1:13 ` Sean Christopherson
2020-05-27 15:17   ` Maxim Levitsky
2020-05-27 15:17   ` Maxim Levitsky
2020-05-27 17:00 ` Paolo Bonzini
  -- strict thread matches above, loose matches on Subject: below --
2020-05-20 16:07 [PATCH 0/2] Fix breakage from adding MSR_IA32_UMWAIT_CONTROL Maxim Levitsky
2020-05-20 16:07 ` [PATCH 2/2] kvm/x86: don't expose MSR_IA32_UMWAIT_CONTROL unconditionally Maxim Levitsky
2020-05-20 16:33   ` Vitaly Kuznetsov
2020-05-20 16:56     ` Maxim Levitsky
2020-05-20 17:15       ` Vitaly Kuznetsov
2020-05-20 17:39         ` Maxim Levitsky
2020-05-21  8:03       ` Xiaoyao Li
2020-05-20 21:05   ` Paolo Bonzini
2020-05-20 21:09     ` Maxim Levitsky
2020-05-21  4:33     ` Xiaoyao Li
2020-05-21  5:28       ` Tao Xu
2020-05-21  6:37         ` Xiaoyao Li
2020-05-21  6:44           ` Tao Xu
2020-05-21  8:40             ` Paolo Bonzini
2020-05-21  8:24       ` Paolo Bonzini

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git