* [PATCH 1/3] KVM: Documentation: Add disable pause exits to KVM_CAP_X86_DISABLE_EXITS @ 2019-05-21 6:06 Wanpeng Li 2019-05-21 6:06 ` [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts Wanpeng Li 2019-05-21 6:06 ` [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit Wanpeng Li 0 siblings, 2 replies; 11+ messages in thread From: Wanpeng Li @ 2019-05-21 6:06 UTC (permalink / raw) To: linux-kernel, kvm Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Liran Alon From: Wanpeng Li <wanpengli@tencent.com> Commit b31c114b (KVM: X86: Provide a capability to disable PAUSE intercepts) forgot to add the KVM_X86_DISABLE_EXITS_PAUSE into api doc. This patch adds it. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> --- Documentation/virtual/kvm/api.txt | 1 + 1 file changed, 1 insertion(+) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index ba6c42c..33cd92d 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -4893,6 +4893,7 @@ Valid bits in args[0] are #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) +#define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) Enabling this capability on a VM provides userspace with a way to no longer intercept some instructions for improved latency in some -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-05-21 6:06 [PATCH 1/3] KVM: Documentation: Add disable pause exits to KVM_CAP_X86_DISABLE_EXITS Wanpeng Li @ 2019-05-21 6:06 ` Wanpeng Li 2019-06-04 16:53 ` Paolo Bonzini 2019-05-21 6:06 ` [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit Wanpeng Li 1 sibling, 1 reply; 11+ messages in thread From: Wanpeng Li @ 2019-05-21 6:06 UTC (permalink / raw) To: linux-kernel, kvm Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Liran Alon From: Wanpeng Li <wanpengli@tencent.com> Allow guest reads CORE cstate when exposing host CPU power management capabilities to the guest. PKG cstate is restricted to avoid a guest to get the whole package information in multi-tenant scenario. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Liran Alon <liran.alon@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> --- v1 -> v2: * use a separate bit for KVM_CAP_X86_DISABLE_EXITS Documentation/virtual/kvm/api.txt | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/vmx/vmx.c | 6 ++++++ arch/x86/kvm/x86.c | 5 ++++- arch/x86/kvm/x86.h | 5 +++++ include/uapi/linux/kvm.h | 4 +++- tools/include/uapi/linux/kvm.h | 4 +++- 7 files changed, 23 insertions(+), 3 deletions(-) diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt index 33cd92d..91fd86f 100644 --- a/Documentation/virtual/kvm/api.txt +++ b/Documentation/virtual/kvm/api.txt @@ -4894,6 +4894,7 @@ Valid bits in args[0] are #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) +#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) Enabling this capability on a VM provides userspace with a way to no longer intercept some instructions for improved latency in some diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index d5457c7..1ce8289 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -882,6 +882,7 @@ struct kvm_arch { bool mwait_in_guest; bool hlt_in_guest; bool pause_in_guest; + bool cstate_in_guest; unsigned long irq_sources_bitmap; s64 kvmclock_offset; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 0861c71..da24f18 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6637,6 +6637,12 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW); vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW); vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW); + if (kvm_cstate_in_guest(kvm)) { + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R); + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R); + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R); + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R); + } vmx->msr_bitmap_mode = 0; vmx->loaded_vmcs = &vmx->vmcs01; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 202048e..765fe59 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3098,7 +3098,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext) r = KVM_CLOCK_TSC_STABLE; break; case KVM_CAP_X86_DISABLE_EXITS: - r |= KVM_X86_DISABLE_EXITS_HLT | KVM_X86_DISABLE_EXITS_PAUSE; + r |= KVM_X86_DISABLE_EXITS_HLT | KVM_X86_DISABLE_EXITS_PAUSE | + KVM_X86_DISABLE_EXITS_CSTATE; if(kvm_can_mwait_in_guest()) r |= KVM_X86_DISABLE_EXITS_MWAIT; break; @@ -4612,6 +4613,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, kvm->arch.hlt_in_guest = true; if (cap->args[0] & KVM_X86_DISABLE_EXITS_PAUSE) kvm->arch.pause_in_guest = true; + if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE) + kvm->arch.cstate_in_guest = true; r = 0; break; case KVM_CAP_MSR_PLATFORM_INFO: diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index a470ff0..275b3b6 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -333,6 +333,11 @@ static inline bool kvm_pause_in_guest(struct kvm *kvm) return kvm->arch.pause_in_guest; } +static inline bool kvm_cstate_in_guest(struct kvm *kvm) +{ + return kvm->arch.cstate_in_guest; +} + DECLARE_PER_CPU(struct kvm_vcpu *, current_vcpu); static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu) diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 2fe12b4..c2152f3 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -696,9 +696,11 @@ struct kvm_ioeventfd { #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) +#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) #define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT | \ KVM_X86_DISABLE_EXITS_HLT | \ - KVM_X86_DISABLE_EXITS_PAUSE) + KVM_X86_DISABLE_EXITS_PAUSE | \ + KVM_X86_DISABLE_EXITS_CSTATE) /* for KVM_ENABLE_CAP */ struct kvm_enable_cap { diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h index 6d4ea4b..ef3303f 100644 --- a/tools/include/uapi/linux/kvm.h +++ b/tools/include/uapi/linux/kvm.h @@ -696,9 +696,11 @@ struct kvm_ioeventfd { #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) +#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) #define KVM_X86_DISABLE_VALID_EXITS (KVM_X86_DISABLE_EXITS_MWAIT | \ KVM_X86_DISABLE_EXITS_HLT | \ - KVM_X86_DISABLE_EXITS_PAUSE) + KVM_X86_DISABLE_EXITS_PAUSE | \ + KVM_X86_DISABLE_EXITS_CSTATE) /* for KVM_ENABLE_CAP */ struct kvm_enable_cap { -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-05-21 6:06 ` [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts Wanpeng Li @ 2019-06-04 16:53 ` Paolo Bonzini 2019-06-05 10:56 ` Wanpeng Li 2019-06-11 7:38 ` Wanpeng Li 0 siblings, 2 replies; 11+ messages in thread From: Paolo Bonzini @ 2019-06-04 16:53 UTC (permalink / raw) To: Wanpeng Li, linux-kernel, kvm Cc: Radim Krčmář, Sean Christopherson, Liran Alon On 21/05/19 08:06, Wanpeng Li wrote: > From: Wanpeng Li <wanpengli@tencent.com> > > Allow guest reads CORE cstate when exposing host CPU power management capabilities > to the guest. PKG cstate is restricted to avoid a guest to get the whole package > information in multi-tenant scenario. > > Cc: Paolo Bonzini <pbonzini@redhat.com> > Cc: Radim Krčmář <rkrcmar@redhat.com> > Cc: Sean Christopherson <sean.j.christopherson@intel.com> > Cc: Liran Alon <liran.alon@oracle.com> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com> > --- > v1 -> v2: > * use a separate bit for KVM_CAP_X86_DISABLE_EXITS > > Documentation/virtual/kvm/api.txt | 1 + > arch/x86/include/asm/kvm_host.h | 1 + > arch/x86/kvm/vmx/vmx.c | 6 ++++++ > arch/x86/kvm/x86.c | 5 ++++- > arch/x86/kvm/x86.h | 5 +++++ > include/uapi/linux/kvm.h | 4 +++- > tools/include/uapi/linux/kvm.h | 4 +++- > 7 files changed, 23 insertions(+), 3 deletions(-) > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > index 33cd92d..91fd86f 100644 > --- a/Documentation/virtual/kvm/api.txt > +++ b/Documentation/virtual/kvm/api.txt > @@ -4894,6 +4894,7 @@ Valid bits in args[0] are > #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) > #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) > #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) > +#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) > > Enabling this capability on a VM provides userspace with a way to no > longer intercept some instructions for improved latency in some > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index d5457c7..1ce8289 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -882,6 +882,7 @@ struct kvm_arch { > bool mwait_in_guest; > bool hlt_in_guest; > bool pause_in_guest; > + bool cstate_in_guest; > > unsigned long irq_sources_bitmap; > s64 kvmclock_offset; > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 0861c71..da24f18 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -6637,6 +6637,12 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW); > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW); > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW); > + if (kvm_cstate_in_guest(kvm)) { > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R); > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R); > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R); > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R); I think I have changed my mind on the implementation of this, sorry. 1) We should emulate these MSRs always, otherwise the guest API changes between different values of KVM_CAP_X86_DISABLE_EXITS which is not intended. Also, KVM_CAP_X86_DISABLE_EXITS does not prevent live migration, so it should be possible to set the MSRs in the host to change the delta between the host and guest values. 2) If both KVM_X86_DISABLE_EXITS_HLT and KVM_X86_DISABLE_EXITS_MWAIT are disabled (i.e. exit happens), the MSRs will be purely emulated. C3/C6/C7 residency will never increase (it will remain the value that is set by the host). When the VM executes an hlt vmexit, it should save the current TSC. When it comes back, the C1 residency MSR should be increased by the time that has passed. 3) If KVM_X86_DISABLE_EXITS_HLT is enabled but KVM_X86_DISABLE_EXITS_MWAIT is disabled (i.e. mait exits happen), C3/C6/C7 residency will also never increase, but the C1 residency value should be read using rdmsr from the host, with a delta added from the host value. 4) If KVM_X86_DISABLE_EXITS_HLT and KVM_X86_DISABLE_EXITS_MWAIT are both disabled (i.e. mwait exits do not happen), all four residency values should be read using rdmsr from the host, with a delta added from the host value. 5) If KVM_X86_DISABLE_EXITS_HLT is disabled and KVM_X86_DISABLE_EXITS_MWAIT is enabled, the configuration makes no sense so it's okay not to be very optimized. In this case, the residency value should be read as in (4), but hlt vmexits will be accounted as in (2) so we need to be careful not to double-count the residency during hlt. This means doing four rdmsr before the beginning of the hlt vmexit and four at the end of the hlt vmexit. Therefore the data structure should be something like struct kvm_residency_msr { u64 value; bool delta_from_host; bool count_with_host; } u64 kvm_residency_read_host(struct kvm_residency_msr *msr) { u64 unscaled_value = rdmsrl(msr->index); // apply TSC scaling... return ... } u64 kvm_residency_read(struct kvm_residency_msr *msr) { return msr->value + (msr->delta_from_host ? kvm_residency_read_host(msr) : 0); } void kvm_residency_write(struct kvm_residency_msr *msr, u64 value) { msr->value = value - (msr->delta_from_host ? kvm_residency_read_host(msr) : 0); } // count_with_host is true for C1 iff any of KVM_CAP_DISABLE_EXITS_HLT // or KVM_CAP_DISABLE_EXITS_MWAIT is set // count_with_host is true for C3/C6/C7 iff KVM_CAP_DISABLE_EXITS_MWAIT is set void kvm_residency_setup(struct kvm_residency_msr *msr, u16 index, bool count_with_host) { /* Preserve value on calls after the first */ u64 value = msr->index ? kvm_residency_read(msr) : 0; msr->delta_from_host = msr->count_with_host = count_with_host; msr->index = index; kvm_residency_write(msr, value); } // The following functions are called from hlt vmexits. void kvm_residency_start_hlt(struct kvm_residency_msr *msr) { if (msr->count_with_host) { WARN_ON(msr->delta_from_host); msr->value += kvm_residency_read_host(msr); msr->delta_from_host = false; } } // host_tsc_waited is 0 except for MSR_CORE_C1_RES void kvm_residency_end_hlt(struct kvm_residency_msr *msr, u64 host_tsc_waited) { if (msr->count_with_host) { WARN_ON(!msr->delta_from_host); msr->value -= kvm_residency_read_host(msr); msr->delta_from_host = true; } if (host_tsc_waited) { // ... apply TSC scaling to host_tsc_waited ... msr->value += ...; } } Thanks, Paolo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-06-04 16:53 ` Paolo Bonzini @ 2019-06-05 10:56 ` Wanpeng Li 2019-06-11 7:38 ` Wanpeng Li 1 sibling, 0 replies; 11+ messages in thread From: Wanpeng Li @ 2019-06-05 10:56 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Liran Alon On Wed, 5 Jun 2019 at 00:53, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 21/05/19 08:06, Wanpeng Li wrote: > > From: Wanpeng Li <wanpengli@tencent.com> > > > > Allow guest reads CORE cstate when exposing host CPU power management capabilities > > to the guest. PKG cstate is restricted to avoid a guest to get the whole package > > information in multi-tenant scenario. > > > > Cc: Paolo Bonzini <pbonzini@redhat.com> > > Cc: Radim Krčmář <rkrcmar@redhat.com> > > Cc: Sean Christopherson <sean.j.christopherson@intel.com> > > Cc: Liran Alon <liran.alon@oracle.com> > > Signed-off-by: Wanpeng Li <wanpengli@tencent.com> > > --- > > v1 -> v2: > > * use a separate bit for KVM_CAP_X86_DISABLE_EXITS > > > > Documentation/virtual/kvm/api.txt | 1 + > > arch/x86/include/asm/kvm_host.h | 1 + > > arch/x86/kvm/vmx/vmx.c | 6 ++++++ > > arch/x86/kvm/x86.c | 5 ++++- > > arch/x86/kvm/x86.h | 5 +++++ > > include/uapi/linux/kvm.h | 4 +++- > > tools/include/uapi/linux/kvm.h | 4 +++- > > 7 files changed, 23 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > > index 33cd92d..91fd86f 100644 > > --- a/Documentation/virtual/kvm/api.txt > > +++ b/Documentation/virtual/kvm/api.txt > > @@ -4894,6 +4894,7 @@ Valid bits in args[0] are > > #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) > > #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) > > #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) > > +#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) > > > > Enabling this capability on a VM provides userspace with a way to no > > longer intercept some instructions for improved latency in some > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index d5457c7..1ce8289 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -882,6 +882,7 @@ struct kvm_arch { > > bool mwait_in_guest; > > bool hlt_in_guest; > > bool pause_in_guest; > > + bool cstate_in_guest; > > > > unsigned long irq_sources_bitmap; > > s64 kvmclock_offset; > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index 0861c71..da24f18 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -6637,6 +6637,12 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) > > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW); > > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW); > > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW); > > + if (kvm_cstate_in_guest(kvm)) { > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R); > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R); > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R); > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R); > > I think I have changed my mind on the implementation of this, sorry. > > 1) We should emulate these MSRs always, otherwise the guest API changes > between different values of KVM_CAP_X86_DISABLE_EXITS which is not > intended. Also, KVM_CAP_X86_DISABLE_EXITS does not prevent live > migration, so it should be possible to set the MSRs in the host to > change the delta between the host and guest values. > > 2) If both KVM_X86_DISABLE_EXITS_HLT and KVM_X86_DISABLE_EXITS_MWAIT are > disabled (i.e. exit happens), the MSRs will be purely emulated. > C3/C6/C7 residency will never increase (it will remain the value that is > set by the host). When the VM executes an hlt vmexit, it should save > the current TSC. When it comes back, the C1 residency MSR should be > increased by the time that has passed. > > 3) If KVM_X86_DISABLE_EXITS_HLT is enabled but > KVM_X86_DISABLE_EXITS_MWAIT is disabled (i.e. mait exits happen), > C3/C6/C7 residency will also never increase, but the C1 residency value > should be read using rdmsr from the host, with a delta added from the > host value. > > 4) If KVM_X86_DISABLE_EXITS_HLT and KVM_X86_DISABLE_EXITS_MWAIT are both > disabled (i.e. mwait exits do not happen), all four residency values > should be read using rdmsr from the host, with a delta added from the > host value. > > 5) If KVM_X86_DISABLE_EXITS_HLT is disabled and > KVM_X86_DISABLE_EXITS_MWAIT is enabled, the configuration makes no sense > so it's okay not to be very optimized. In this case, the residency > value should be read as in (4), but hlt vmexits will be accounted as in > (2) so we need to be careful not to double-count the residency during > hlt. This means doing four rdmsr before the beginning of the hlt vmexit > and four at the end of the hlt vmexit. I will have a try, thanks Paolo! :) Regards, Wanpeng Li > > Therefore the data structure should be something like > > struct kvm_residency_msr { > u64 value; > bool delta_from_host; > bool count_with_host; > } > > u64 kvm_residency_read_host(struct kvm_residency_msr *msr) > { > u64 unscaled_value = rdmsrl(msr->index); > // apply TSC scaling... > return ... > } > > u64 kvm_residency_read(struct kvm_residency_msr *msr) > { > return msr->value + > (msr->delta_from_host ? kvm_residency_read_host(msr) : 0); > } > > void kvm_residency_write(struct kvm_residency_msr *msr, > u64 value) > { > msr->value = value - > (msr->delta_from_host ? kvm_residency_read_host(msr) : 0); > } > > // count_with_host is true for C1 iff any of KVM_CAP_DISABLE_EXITS_HLT > // or KVM_CAP_DISABLE_EXITS_MWAIT is set > // count_with_host is true for C3/C6/C7 iff KVM_CAP_DISABLE_EXITS_MWAIT > is set > void kvm_residency_setup(struct kvm_residency_msr *msr, u16 index, > bool count_with_host) > { > /* Preserve value on calls after the first */ > u64 value = msr->index ? kvm_residency_read(msr) : 0; > msr->delta_from_host = msr->count_with_host = count_with_host; > msr->index = index; > kvm_residency_write(msr, value); > } > > // The following functions are called from hlt vmexits. > > void kvm_residency_start_hlt(struct kvm_residency_msr *msr) > { > if (msr->count_with_host) { > WARN_ON(msr->delta_from_host); > msr->value += kvm_residency_read_host(msr); > msr->delta_from_host = false; > } > } > > // host_tsc_waited is 0 except for MSR_CORE_C1_RES > void kvm_residency_end_hlt(struct kvm_residency_msr *msr, > u64 host_tsc_waited) > { > if (msr->count_with_host) { > WARN_ON(!msr->delta_from_host); > msr->value -= kvm_residency_read_host(msr); > msr->delta_from_host = true; > } > if (host_tsc_waited) { > // ... apply TSC scaling to host_tsc_waited ... > msr->value += ...; > } > } > > Thanks, > > Paolo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-06-04 16:53 ` Paolo Bonzini 2019-06-05 10:56 ` Wanpeng Li @ 2019-06-11 7:38 ` Wanpeng Li 2019-06-11 11:09 ` Paolo Bonzini 1 sibling, 1 reply; 11+ messages in thread From: Wanpeng Li @ 2019-06-11 7:38 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Liran Alon On Wed, 5 Jun 2019 at 00:53, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 21/05/19 08:06, Wanpeng Li wrote: > > From: Wanpeng Li <wanpengli@tencent.com> > > > > Allow guest reads CORE cstate when exposing host CPU power management capabilities > > to the guest. PKG cstate is restricted to avoid a guest to get the whole package > > information in multi-tenant scenario. > > > > Cc: Paolo Bonzini <pbonzini@redhat.com> > > Cc: Radim Krčmář <rkrcmar@redhat.com> > > Cc: Sean Christopherson <sean.j.christopherson@intel.com> > > Cc: Liran Alon <liran.alon@oracle.com> > > Signed-off-by: Wanpeng Li <wanpengli@tencent.com> > > --- > > v1 -> v2: > > * use a separate bit for KVM_CAP_X86_DISABLE_EXITS > > > > Documentation/virtual/kvm/api.txt | 1 + > > arch/x86/include/asm/kvm_host.h | 1 + > > arch/x86/kvm/vmx/vmx.c | 6 ++++++ > > arch/x86/kvm/x86.c | 5 ++++- > > arch/x86/kvm/x86.h | 5 +++++ > > include/uapi/linux/kvm.h | 4 +++- > > tools/include/uapi/linux/kvm.h | 4 +++- > > 7 files changed, 23 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt > > index 33cd92d..91fd86f 100644 > > --- a/Documentation/virtual/kvm/api.txt > > +++ b/Documentation/virtual/kvm/api.txt > > @@ -4894,6 +4894,7 @@ Valid bits in args[0] are > > #define KVM_X86_DISABLE_EXITS_MWAIT (1 << 0) > > #define KVM_X86_DISABLE_EXITS_HLT (1 << 1) > > #define KVM_X86_DISABLE_EXITS_PAUSE (1 << 2) > > +#define KVM_X86_DISABLE_EXITS_CSTATE (1 << 3) > > > > Enabling this capability on a VM provides userspace with a way to no > > longer intercept some instructions for improved latency in some > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index d5457c7..1ce8289 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -882,6 +882,7 @@ struct kvm_arch { > > bool mwait_in_guest; > > bool hlt_in_guest; > > bool pause_in_guest; > > + bool cstate_in_guest; > > > > unsigned long irq_sources_bitmap; > > s64 kvmclock_offset; > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > > index 0861c71..da24f18 100644 > > --- a/arch/x86/kvm/vmx/vmx.c > > +++ b/arch/x86/kvm/vmx/vmx.c > > @@ -6637,6 +6637,12 @@ static struct kvm_vcpu *vmx_create_vcpu(struct kvm *kvm, unsigned int id) > > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW); > > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW); > > vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW); > > + if (kvm_cstate_in_guest(kvm)) { > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C1_RES, MSR_TYPE_R); > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C3_RESIDENCY, MSR_TYPE_R); > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C6_RESIDENCY, MSR_TYPE_R); > > + vmx_disable_intercept_for_msr(msr_bitmap, MSR_CORE_C7_RESIDENCY, MSR_TYPE_R); > > I think I have changed my mind on the implementation of this, sorry. > > 1) We should emulate these MSRs always, otherwise the guest API changes > between different values of KVM_CAP_X86_DISABLE_EXITS which is not > intended. Also, KVM_CAP_X86_DISABLE_EXITS does not prevent live > migration, so it should be possible to set the MSRs in the host to > change the delta between the host and guest values. > > 2) If both KVM_X86_DISABLE_EXITS_HLT and KVM_X86_DISABLE_EXITS_MWAIT are > disabled (i.e. exit happens), the MSRs will be purely emulated. > C3/C6/C7 residency will never increase (it will remain the value that is > set by the host). When the VM executes an hlt vmexit, it should save > the current TSC. When it comes back, the C1 residency MSR should be > increased by the time that has passed. > > 3) If KVM_X86_DISABLE_EXITS_HLT is enabled but > KVM_X86_DISABLE_EXITS_MWAIT is disabled (i.e. mait exits happen), > C3/C6/C7 residency will also never increase, but the C1 residency value > should be read using rdmsr from the host, with a delta added from the > host value. > > 4) If KVM_X86_DISABLE_EXITS_HLT and KVM_X86_DISABLE_EXITS_MWAIT are both > disabled (i.e. mwait exits do not happen), all four residency values > should be read using rdmsr from the host, with a delta added from the > host value. > > 5) If KVM_X86_DISABLE_EXITS_HLT is disabled and > KVM_X86_DISABLE_EXITS_MWAIT is enabled, the configuration makes no sense > so it's okay not to be very optimized. In this case, the residency > value should be read as in (4), but hlt vmexits will be accounted as in > (2) so we need to be careful not to double-count the residency during > hlt. This means doing four rdmsr before the beginning of the hlt vmexit > and four at the end of the hlt vmexit. MSR_CORE_C1_RES is unreadable except for ATOM platform, so I think we can avoid the complex logic to handle C1 now. :) Regards, Wanpeng Li > > Therefore the data structure should be something like > > struct kvm_residency_msr { > u64 value; > bool delta_from_host; > bool count_with_host; > } > > u64 kvm_residency_read_host(struct kvm_residency_msr *msr) > { > u64 unscaled_value = rdmsrl(msr->index); > // apply TSC scaling... > return ... > } > > u64 kvm_residency_read(struct kvm_residency_msr *msr) > { > return msr->value + > (msr->delta_from_host ? kvm_residency_read_host(msr) : 0); > } > > void kvm_residency_write(struct kvm_residency_msr *msr, > u64 value) > { > msr->value = value - > (msr->delta_from_host ? kvm_residency_read_host(msr) : 0); > } > > // count_with_host is true for C1 iff any of KVM_CAP_DISABLE_EXITS_HLT > // or KVM_CAP_DISABLE_EXITS_MWAIT is set > // count_with_host is true for C3/C6/C7 iff KVM_CAP_DISABLE_EXITS_MWAIT > is set > void kvm_residency_setup(struct kvm_residency_msr *msr, u16 index, > bool count_with_host) > { > /* Preserve value on calls after the first */ > u64 value = msr->index ? kvm_residency_read(msr) : 0; > msr->delta_from_host = msr->count_with_host = count_with_host; > msr->index = index; > kvm_residency_write(msr, value); > } > > // The following functions are called from hlt vmexits. > > void kvm_residency_start_hlt(struct kvm_residency_msr *msr) > { > if (msr->count_with_host) { > WARN_ON(msr->delta_from_host); > msr->value += kvm_residency_read_host(msr); > msr->delta_from_host = false; > } > } > > // host_tsc_waited is 0 except for MSR_CORE_C1_RES > void kvm_residency_end_hlt(struct kvm_residency_msr *msr, > u64 host_tsc_waited) > { > if (msr->count_with_host) { > WARN_ON(!msr->delta_from_host); > msr->value -= kvm_residency_read_host(msr); > msr->delta_from_host = true; > } > if (host_tsc_waited) { > // ... apply TSC scaling to host_tsc_waited ... > msr->value += ...; > } > } > > Thanks, > > Paolo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-06-11 7:38 ` Wanpeng Li @ 2019-06-11 11:09 ` Paolo Bonzini 2019-06-11 11:35 ` Wanpeng Li 2019-06-12 2:20 ` Wanpeng Li 0 siblings, 2 replies; 11+ messages in thread From: Paolo Bonzini @ 2019-06-11 11:09 UTC (permalink / raw) To: Wanpeng Li Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Liran Alon On 11/06/19 09:38, Wanpeng Li wrote: > MSR_CORE_C1_RES is unreadable except for ATOM platform, so I think we > can avoid the complex logic to handle C1 now. :) I disagree. Linux uses it on all platforms is available, and virtual machines that don't pass mwait through _only_ have C1, so it would be less useful to have deep C-state residency MSRs and not C1 residency. But turbostat can get the information from sysfs, so what are these MSRs used for? Paolo ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-06-11 11:09 ` Paolo Bonzini @ 2019-06-11 11:35 ` Wanpeng Li 2019-06-12 2:20 ` Wanpeng Li 1 sibling, 0 replies; 11+ messages in thread From: Wanpeng Li @ 2019-06-11 11:35 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Liran Alon On Tue, 11 Jun 2019 at 19:09, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 11/06/19 09:38, Wanpeng Li wrote: > > MSR_CORE_C1_RES is unreadable except for ATOM platform, so I think we > > can avoid the complex logic to handle C1 now. :) > > I disagree. Linux uses it on all platforms is available, and virtual > machines that don't pass mwait through _only_ have C1, so it would be > less useful to have deep C-state residency MSRs and not C1 residency. > > But turbostat can get the information from sysfs, so what are these MSRs > used for? The sysfs is not accurate, the time which is accumulated in the function cpuidle_enter_state() is the expected cstate we hope to enter instead of the real cstate we finally enter. For example, we found several SKX/CLX models can't enter deeper cstates in non-root mode, Intel hardware team has been working on this according to our report recently. The bare-metal cstate residency msrs don't increase in non-root mode, however, the time under /sys/devices/system/cpu/cpux/cpuidle/statex/ increase gradually in the guest. Regards, Wanpeng Li ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts 2019-06-11 11:09 ` Paolo Bonzini 2019-06-11 11:35 ` Wanpeng Li @ 2019-06-12 2:20 ` Wanpeng Li 1 sibling, 0 replies; 11+ messages in thread From: Wanpeng Li @ 2019-06-12 2:20 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Liran Alon On Tue, 11 Jun 2019 at 19:09, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 11/06/19 09:38, Wanpeng Li wrote: > > MSR_CORE_C1_RES is unreadable except for ATOM platform, so I think we > > can avoid the complex logic to handle C1 now. :) > > I disagree. Linux uses it on all platforms is available, and virtual > machines that don't pass mwait through _only_ have C1, so it would be > less useful to have deep C-state residency MSRs and not C1 residency. Your design heavily depends on read host MSR_CORE_C1_RES for C1 residency msr emulation, however, the host MSR_CORE_C1_RES is unreadable. #rdmsr 0x660 rdmsr: CPU 0 cannot read MSR 0x00000660 Refer to turbostat codes, https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/power/x86/turbostat/turbostat.c#n1335 C1 is derivated from other parameters. use_c1_residency_msr is true just for ATOM platform. Regards, Wanpeng Li ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit 2019-05-21 6:06 [PATCH 1/3] KVM: Documentation: Add disable pause exits to KVM_CAP_X86_DISABLE_EXITS Wanpeng Li 2019-05-21 6:06 ` [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts Wanpeng Li @ 2019-05-21 6:06 ` Wanpeng Li 2019-06-04 16:59 ` Paolo Bonzini 1 sibling, 1 reply; 11+ messages in thread From: Wanpeng Li @ 2019-05-21 6:06 UTC (permalink / raw) To: linux-kernel, kvm Cc: Paolo Bonzini, Radim Krčmář, Sean Christopherson, Liran Alon, Konrad Rzeszutek Wilk From: Wanpeng Li <wanpengli@tencent.com> MSR IA32_MISC_ENABLE bit 18, according to SDM: | When this bit is set to 0, the MONITOR feature flag is not set (CPUID.01H:ECX[bit 3] = 0). | This indicates that MONITOR/MWAIT are not supported. | | Software attempts to execute MONITOR/MWAIT will cause #UD when this bit is 0. | | When this bit is set to 1 (default), MONITOR/MWAIT are supported (CPUID.01H:ECX[bit 3] = 1). The CPUID.01H:ECX[bit 3] ought to mirror the value of the MSR bit, CPUID.01H:ECX[bit 3] is a better guard than kvm_mwait_in_guest(). kvm_mwait_in_guest() affects the behavior of MONITOR/MWAIT, not its guest visibility. This patch implements toggling of the CPUID bit based on guest writes to the MSR. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Liran Alon <liran.alon@oracle.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> --- v1 -> v2: * hide behind KVM_CAP_DISABLE_QUIRKS arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/cpuid.c | 10 ++++++++++ arch/x86/kvm/x86.c | 10 ++++++++++ 3 files changed, 21 insertions(+) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 7a0e64c..e3ae96b5 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -382,6 +382,7 @@ struct kvm_sync_regs { #define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) #define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2) #define KVM_X86_QUIRK_OUT_7E_INC_RIP (1 << 3) +#define KVM_X86_QUIRK_MISC_ENABLE_MWAIT (1 << 4) #define KVM_STATE_NESTED_GUEST_MODE 0x00000001 #define KVM_STATE_NESTED_RUN_PENDING 0x00000002 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index e18a9f9..f54d266 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -137,6 +137,16 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) (best->eax & (1 << KVM_FEATURE_PV_UNHALT))) best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT); + if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_MWAIT)) { + best = kvm_find_cpuid_entry(vcpu, 0x1, 0); + if (best) { + if (vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT) + best->ecx |= F(MWAIT); + else + best->ecx &= ~F(MWAIT); + } + } + /* Update physical-address width */ vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu); kvm_mmu_reset_context(vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 765fe59..a4eb711 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2547,6 +2547,16 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } break; case MSR_IA32_MISC_ENABLE: + if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_MWAIT) && + ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) { + if ((vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT) && + !(data & MSR_IA32_MISC_ENABLE_MWAIT)) { + if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3)) + return 1; + } + vcpu->arch.ia32_misc_enable_msr = data; + kvm_update_cpuid(vcpu); + } vcpu->arch.ia32_misc_enable_msr = data; break; case MSR_IA32_SMBASE: -- 2.7.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit 2019-05-21 6:06 ` [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit Wanpeng Li @ 2019-06-04 16:59 ` Paolo Bonzini 2019-06-05 11:00 ` Wanpeng Li 0 siblings, 1 reply; 11+ messages in thread From: Paolo Bonzini @ 2019-06-04 16:59 UTC (permalink / raw) To: Wanpeng Li, linux-kernel, kvm Cc: Radim Krčmář, Sean Christopherson, Liran Alon, Konrad Rzeszutek Wilk On 21/05/19 08:06, Wanpeng Li wrote: > > The CPUID.01H:ECX[bit 3] ought to mirror the value of the MSR bit, > CPUID.01H:ECX[bit 3] is a better guard than kvm_mwait_in_guest(). > kvm_mwait_in_guest() affects the behavior of MONITOR/MWAIT, not its > guest visibility. This needs some adjustment so that the default is backwards-compatible: diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index e3ae96b52a16..f9b021e16ebc 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -378,11 +378,11 @@ struct kvm_sync_regs { struct kvm_vcpu_events events; }; -#define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) -#define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) -#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2) -#define KVM_X86_QUIRK_OUT_7E_INC_RIP (1 << 3) -#define KVM_X86_QUIRK_MISC_ENABLE_MWAIT (1 << 4) +#define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) +#define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2) +#define KVM_X86_QUIRK_OUT_7E_INC_RIP (1 << 3) +#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4) #define KVM_STATE_NESTED_GUEST_MODE 0x00000001 #define KVM_STATE_NESTED_RUN_PENDING 0x00000002 diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index f54d266fd3b5..bfa1341ce6f1 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -137,10 +137,10 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) (best->eax & (1 << KVM_FEATURE_PV_UNHALT))) best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT); - if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_MWAIT)) { + if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) { best = kvm_find_cpuid_entry(vcpu, 0x1, 0); if (best) { - if (vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT) + if (vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_NO_MWAIT) best->ecx |= F(MWAIT); else best->ecx &= ~F(MWAIT); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 528935733fe0..0c1498da46c7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2548,17 +2548,15 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } break; case MSR_IA32_MISC_ENABLE: - if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_MWAIT) && - ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) { - if ((vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT) && - !(data & MSR_IA32_MISC_ENABLE_MWAIT)) { - if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3)) - return 1; - } + if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) && + ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_NO_MWAIT)) { + if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3)) + return 1; vcpu->arch.ia32_misc_enable_msr = data; kvm_update_cpuid(vcpu); + } else { + vcpu->arch.ia32_misc_enable_msr = data; } - vcpu->arch.ia32_misc_enable_msr = data; break; case MSR_IA32_SMBASE: if (!msr_info->host_initiated) Please double check, in the meanwhile I've queued the patch. Paolo ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit 2019-06-04 16:59 ` Paolo Bonzini @ 2019-06-05 11:00 ` Wanpeng Li 0 siblings, 0 replies; 11+ messages in thread From: Wanpeng Li @ 2019-06-05 11:00 UTC (permalink / raw) To: Paolo Bonzini Cc: LKML, kvm, Radim Krčmář, Sean Christopherson, Liran Alon, Konrad Rzeszutek Wilk On Wed, 5 Jun 2019 at 00:59, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 21/05/19 08:06, Wanpeng Li wrote: > > > > The CPUID.01H:ECX[bit 3] ought to mirror the value of the MSR bit, > > CPUID.01H:ECX[bit 3] is a better guard than kvm_mwait_in_guest(). > > kvm_mwait_in_guest() affects the behavior of MONITOR/MWAIT, not its > > guest visibility. > > This needs some adjustment so that the default is backwards-compatible: > > diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h > index e3ae96b52a16..f9b021e16ebc 100644 > --- a/arch/x86/include/uapi/asm/kvm.h > +++ b/arch/x86/include/uapi/asm/kvm.h > @@ -378,11 +378,11 @@ struct kvm_sync_regs { > struct kvm_vcpu_events events; > }; > > -#define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) > -#define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) > -#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2) > -#define KVM_X86_QUIRK_OUT_7E_INC_RIP (1 << 3) > -#define KVM_X86_QUIRK_MISC_ENABLE_MWAIT (1 << 4) > +#define KVM_X86_QUIRK_LINT0_REENABLED (1 << 0) > +#define KVM_X86_QUIRK_CD_NW_CLEARED (1 << 1) > +#define KVM_X86_QUIRK_LAPIC_MMIO_HOLE (1 << 2) > +#define KVM_X86_QUIRK_OUT_7E_INC_RIP (1 << 3) > +#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4) > > #define KVM_STATE_NESTED_GUEST_MODE 0x00000001 > #define KVM_STATE_NESTED_RUN_PENDING 0x00000002 > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c > index f54d266fd3b5..bfa1341ce6f1 100644 > --- a/arch/x86/kvm/cpuid.c > +++ b/arch/x86/kvm/cpuid.c > @@ -137,10 +137,10 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu) > (best->eax & (1 << KVM_FEATURE_PV_UNHALT))) > best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT); > > - if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_MWAIT)) { > + if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) { > best = kvm_find_cpuid_entry(vcpu, 0x1, 0); > if (best) { > - if (vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT) > + if (vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_NO_MWAIT) > best->ecx |= F(MWAIT); > else > best->ecx &= ~F(MWAIT); > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 528935733fe0..0c1498da46c7 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2548,17 +2548,15 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > } > break; > case MSR_IA32_MISC_ENABLE: > - if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_MWAIT) && > - ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) { > - if ((vcpu->arch.ia32_misc_enable_msr & MSR_IA32_MISC_ENABLE_MWAIT) && > - !(data & MSR_IA32_MISC_ENABLE_MWAIT)) { > - if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3)) > - return 1; > - } > + if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) && > + ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_NO_MWAIT)) { > + if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3)) > + return 1; > vcpu->arch.ia32_misc_enable_msr = data; > kvm_update_cpuid(vcpu); > + } else { > + vcpu->arch.ia32_misc_enable_msr = data; > } > - vcpu->arch.ia32_misc_enable_msr = data; > break; > case MSR_IA32_SMBASE: > if (!msr_info->host_initiated) > > Please double check, in the meanwhile I've queued the patch. Looks good to me, thanks. :) Regards, Wanpeng Li ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-06-12 2:19 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-05-21 6:06 [PATCH 1/3] KVM: Documentation: Add disable pause exits to KVM_CAP_X86_DISABLE_EXITS Wanpeng Li 2019-05-21 6:06 ` [PATCH v2 2/3] KVM: X86: Provide a capability to disable cstate msr read intercepts Wanpeng Li 2019-06-04 16:53 ` Paolo Bonzini 2019-06-05 10:56 ` Wanpeng Li 2019-06-11 7:38 ` Wanpeng Li 2019-06-11 11:09 ` Paolo Bonzini 2019-06-11 11:35 ` Wanpeng Li 2019-06-12 2:20 ` Wanpeng Li 2019-05-21 6:06 ` [PATCH v2 3/3] KVM: X86: Emulate MSR_IA32_MISC_ENABLE MWAIT bit Wanpeng Li 2019-06-04 16:59 ` Paolo Bonzini 2019-06-05 11:00 ` Wanpeng Li
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).