* [PATCH v3 0/2] Switch MSR_MISC_FEATURES_ENABLES and one optimization
@ 2019-03-25  8:06 Xiaoyao Li
  2019-03-25  8:06 ` [PATCH v3 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest Xiaoyao Li
  2019-03-25  8:06 ` [PATCH v3 2/2] x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch Xiaoyao Li
  0 siblings, 2 replies; 6+ messages in thread
From: Xiaoyao Li @ 2019-03-25  8:06 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, linux-kernel, kvm
  Cc: Xiaoyao Li, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      H. Peter Anvin, x86, chao.gao, Sean Christopherson

Patch 1 switches MSR_MISC_FEATURES_ENABLES between host and guest to
avoid the host's cpuid faulting and ring3mwait settings leaking to the
guest. Cpuid faulting enabled in the host may cause guest boot failure,
and kvm doesn't expose ring3mwait to the guest yet, so neither should
be leaked to the guest.

Patch 2 optimizes the switch of MSR_MISC_FEATURES_ENABLES by avoiding
WRMSR whenever possible to save cycles.

==changelog==
v2->v3:
  - use msr_misc_features_shadow instead of reading the hardware msr,
    from Sean Christopherson
  - avoid WRMSR whenever possible, from Sean Christopherson

v1->v2:
  - move the save/restore of the cpuid faulting bit to
    vmx_prepare_switch_to_guest/vmx_prepare_switch_to_host to avoid an
    RDMSR on every vmentry, based on Paolo's comment.

Xiaoyao Li (2):
  kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest
  x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch

 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kernel/process.c       |  1 +
 arch/x86/kvm/vmx/vmx.c          | 31 +++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              | 15 ++++++++++++---
 4 files changed, 46 insertions(+), 3 deletions(-)

-- 
2.19.1

^ permalink raw reply	[flat|nested] 6+ messages in thread
* [PATCH v3 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest
  2019-03-25  8:06 [PATCH v3 0/2] Switch MSR_MISC_FEATURES_ENABLES and one optimization Xiaoyao Li
@ 2019-03-25  8:06 ` Xiaoyao Li
  2019-03-25 15:33   ` Sean Christopherson
  2019-03-25  8:06 ` [PATCH v3 2/2] x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch Xiaoyao Li
  1 sibling, 1 reply; 6+ messages in thread
From: Xiaoyao Li @ 2019-03-25  8:06 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, linux-kernel, kvm
  Cc: Xiaoyao Li, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      H. Peter Anvin, x86, chao.gao, Sean Christopherson

There are two defined bits in MSR_MISC_FEATURES_ENABLES: bit 0 for cpuid
faulting and bit 1 for ring3mwait.

== cpuid faulting ==
Cpuid faulting is a feature of the CPUID instruction. When cpuid faulting
is enabled, every execution of the CPUID instruction outside
system-management mode (SMM) causes a general-protection fault (#GP) if
the CPL > 0.

Detailed information about this feature can be found at
https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf

Current KVM provides software emulation of this feature for the guest.
However, because cpuid faulting takes priority over the CPUID VM exit
(Intel SDM vol. 3, 25.1.1), there is a risk of leaking cpuid faulting to
the guest when the host enables it. If the host enables cpuid faulting by
setting bit 0 of MSR_MISC_FEATURES_ENABLES, it passes through to the
guest since there is no switch of MSR_MISC_FEATURES_ENABLES yet. As a
result, when the guest executes a CPUID instruction at CPL > 0, it
generates a #GP instead of a CPUID VM exit.

This issue causes guest boot failure when the guest uses *modprobe* to
load modules: *modprobe* executes the CPUID instruction, raising a #GP
in the guest, and since there is no handling of cpuid faulting in the
guest's #GP handler, the guest fails to boot.
== ring3mwait ==
Ring3mwait is a feature specific to the Xeon Phi Product Family x200
series, which allows the MONITOR and MWAIT instructions to be executed
in rings other than ring 0. The feature is enabled by setting bit 1 in
MSR_MISC_FEATURES_ENABLES; the register can also be read to determine
whether the instructions are enabled outside ring 0.

A description of this feature can be found at
https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait

Current kvm doesn't expose ring3mwait to the guest. However, there is
likewise a risk of leaking ring3mwait to the guest if the host enables
it, since there is no switch of MSR_MISC_FEATURES_ENABLES.

== solution ==
From the above analysis, both cpuid faulting and ring3mwait can leak to
the guest. To fix this, MSR_MISC_FEATURES_ENABLES should be switched
between host and guest. Since MSR_MISC_FEATURES_ENABLES is
Intel-specific, this patch implements the switching only in vmx.

Because kvm provides software emulation of cpuid faulting and doesn't
expose ring3mwait to the guest, MSR_MISC_FEATURES_ENABLES can simply be
cleared to zero for the guest when either feature is enabled in the host.
Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
---
 arch/x86/kernel/process.c |  1 +
 arch/x86/kvm/vmx/vmx.c    | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 1bba1a3c0b01..94a566e79b6c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -191,6 +191,7 @@ int set_tsc_mode(unsigned int val)
 }
 
 DEFINE_PER_CPU(u64, msr_misc_features_shadow);
+EXPORT_PER_CPU_SYMBOL_GPL(msr_misc_features_shadow);
 
 static void set_cpuid_faulting(bool on)
 {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 270c6566fd5a..65aa947947ba 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1031,6 +1031,16 @@ static void pt_guest_exit(struct vcpu_vmx *vmx)
 		wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
 }
 
+static void vmx_prepare_guest_misc_features_enables(struct vcpu_vmx *vmx)
+{
+	u64 msrval = this_cpu_read(msr_misc_features_shadow);
+
+	if (!msrval)
+		return;
+
+	wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL);
+}
+
 void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -1064,6 +1074,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	vmx->loaded_cpu_state = vmx->loaded_vmcs;
 	host_state = &vmx->loaded_cpu_state->host_state;
 
+	vmx_prepare_guest_misc_features_enables(vmx);
+
 	/*
 	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
 	 * allow segment selectors with cpl > 0 or ti == 1.
@@ -1120,6 +1132,16 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	}
 }
 
+static void vmx_load_host_misc_features_enables(struct vcpu_vmx *vmx)
+{
+	u64 msrval = this_cpu_read(msr_misc_features_shadow);
+
+	if (!msrval)
+		return;
+
+	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
+}
+
 static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
 {
 	struct vmcs_host_state *host_state;
@@ -1133,6 +1155,8 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
 	++vmx->vcpu.stat.host_state_reload;
 	vmx->loaded_cpu_state = NULL;
 
+	vmx_load_host_misc_features_enables(vmx);
+
 #ifdef CONFIG_X86_64
 	rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
 #endif
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest
  2019-03-25  8:06 ` [PATCH v3 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest Xiaoyao Li
@ 2019-03-25 15:33   ` Sean Christopherson
  2019-03-25 16:38     ` Xiaoyao Li
  0 siblings, 1 reply; 6+ messages in thread
From: Sean Christopherson @ 2019-03-25 15:33 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Radim Krčmář, linux-kernel, kvm,
      Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
      x86, chao.gao

On Mon, Mar 25, 2019 at 04:06:49PM +0800, Xiaoyao Li wrote:
> There are two defined bits in MSR_MISC_FEATURES_ENABLES: bit 0 for cpuid
> faulting and bit 1 for ring3mwait.
> 
> == cpuid faulting ==
> Cpuid faulting is a feature of the CPUID instruction. When cpuid faulting
> is enabled, every execution of the CPUID instruction outside
> system-management mode (SMM) causes a general-protection fault (#GP) if
> the CPL > 0.
> 
> Detailed information about this feature can be found at
> https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> 
> Current KVM provides software emulation of this feature for the guest.
> However, because cpuid faulting takes priority over the CPUID VM exit
> (Intel SDM vol. 3, 25.1.1), there is a risk of leaking cpuid faulting to
> the guest when the host enables it. If the host enables cpuid faulting by
> setting bit 0 of MSR_MISC_FEATURES_ENABLES, it passes through to the
> guest since there is no switch of MSR_MISC_FEATURES_ENABLES yet. As a
> result, when the guest executes a CPUID instruction at CPL > 0, it
> generates a #GP instead of a CPUID VM exit.
> 
> This issue causes guest boot failure when the guest uses *modprobe* to
> load modules: *modprobe* executes the CPUID instruction, raising a #GP
> in the guest, and since there is no handling of cpuid faulting in the
> guest's #GP handler, the guest fails to boot.
> 
> == ring3mwait ==
> Ring3mwait is a feature specific to the Xeon Phi Product Family x200
> series, which allows the MONITOR and MWAIT instructions to be executed
> in rings other than ring 0. The feature is enabled by setting bit 1 in
> MSR_MISC_FEATURES_ENABLES; the register can also be read to determine
> whether the instructions are enabled outside ring 0.
> 
> A description of this feature can be found at
> https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait
> 
> Current kvm doesn't expose ring3mwait to the guest. However, there is
> likewise a risk of leaking ring3mwait to the guest if the host enables
> it, since there is no switch of MSR_MISC_FEATURES_ENABLES.
> 
> == solution ==
> From the above analysis, both cpuid faulting and ring3mwait can leak to
> the guest. To fix this, MSR_MISC_FEATURES_ENABLES should be switched
> between host and guest. Since MSR_MISC_FEATURES_ENABLES is
> Intel-specific, this patch implements the switching only in vmx.
> 
> Because kvm provides software emulation of cpuid faulting and doesn't
> expose ring3mwait to the guest, MSR_MISC_FEATURES_ENABLES can simply be
> cleared to zero for the guest when either feature is enabled in the host.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> ---
>  arch/x86/kernel/process.c |  1 +
>  arch/x86/kvm/vmx/vmx.c    | 24 ++++++++++++++++++++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> index 1bba1a3c0b01..94a566e79b6c 100644
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -191,6 +191,7 @@ int set_tsc_mode(unsigned int val)
>  }
>  
>  DEFINE_PER_CPU(u64, msr_misc_features_shadow);
> +EXPORT_PER_CPU_SYMBOL_GPL(msr_misc_features_shadow);
>  
>  static void set_cpuid_faulting(bool on)
>  {
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 270c6566fd5a..65aa947947ba 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1031,6 +1031,16 @@ static void pt_guest_exit(struct vcpu_vmx *vmx)
>  		wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
>  }
>  
> +static void vmx_prepare_guest_misc_features_enables(struct vcpu_vmx *vmx)

No need for @vmx.  My preference would be to drop the helpers entirely,
e.g. it's two lines of code, three if you count the declaration of msrval.

> +{
> +	u64 msrval = this_cpu_read(msr_misc_features_shadow);
> +
> +	if (!msrval)
> +		return;
> +
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL);
> +}
> +
>  void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -1064,6 +1074,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	vmx->loaded_cpu_state = vmx->loaded_vmcs;
>  	host_state = &vmx->loaded_cpu_state->host_state;
>  
> +	vmx_prepare_guest_misc_features_enables(vmx);
> +
>  	/*
>  	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
>  	 * allow segment selectors with cpl > 0 or ti == 1.
> @@ -1120,6 +1132,16 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  	}
>  }
>  
> +static void vmx_load_host_misc_features_enables(struct vcpu_vmx *vmx)

Likewise, no need for @vmx.

> +{
> +	u64 msrval = this_cpu_read(msr_misc_features_shadow);
> +
> +	if (!msrval)
> +		return;
> +
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
> +}
> +
>  static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
>  {
>  	struct vmcs_host_state *host_state;
> @@ -1133,6 +1155,8 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
>  	++vmx->vcpu.stat.host_state_reload;
>  	vmx->loaded_cpu_state = NULL;
>  
> +	vmx_load_host_misc_features_enables(vmx);
> +
> #ifdef CONFIG_X86_64
>  	rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
> #endif
> -- 
> 2.19.1
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: [PATCH v3 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest
  2019-03-25 15:33   ` Sean Christopherson
@ 2019-03-25 16:38     ` Xiaoyao Li
  0 siblings, 0 replies; 6+ messages in thread
From: Xiaoyao Li @ 2019-03-25 16:38 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Radim Krčmář, linux-kernel, kvm,
      Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
      x86, chao.gao

On Mon, 2019-03-25 at 08:33 -0700, Sean Christopherson wrote:
> On Mon, Mar 25, 2019 at 04:06:49PM +0800, Xiaoyao Li wrote:
> > There are two defined bits in MSR_MISC_FEATURES_ENABLES: bit 0 for
> > cpuid faulting and bit 1 for ring3mwait.
> > 
> > == cpuid faulting ==
> > Cpuid faulting is a feature of the CPUID instruction. When cpuid
> > faulting is enabled, every execution of the CPUID instruction outside
> > system-management mode (SMM) causes a general-protection fault (#GP)
> > if the CPL > 0.
> > 
> > Detailed information about this feature can be found at
> > https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
> > 
> > Current KVM provides software emulation of this feature for the guest.
> > However, because cpuid faulting takes priority over the CPUID VM exit
> > (Intel SDM vol. 3, 25.1.1), there is a risk of leaking cpuid faulting
> > to the guest when the host enables it. If the host enables cpuid
> > faulting by setting bit 0 of MSR_MISC_FEATURES_ENABLES, it passes
> > through to the guest since there is no switch of
> > MSR_MISC_FEATURES_ENABLES yet. As a result, when the guest executes a
> > CPUID instruction at CPL > 0, it generates a #GP instead of a CPUID
> > VM exit.
> > 
> > This issue causes guest boot failure when the guest uses *modprobe*
> > to load modules: *modprobe* executes the CPUID instruction, raising a
> > #GP in the guest, and since there is no handling of cpuid faulting in
> > the guest's #GP handler, the guest fails to boot.
> > 
> > == ring3mwait ==
> > Ring3mwait is a feature specific to the Xeon Phi Product Family x200
> > series, which allows the MONITOR and MWAIT instructions to be
> > executed in rings other than ring 0. The feature is enabled by
> > setting bit 1 in MSR_MISC_FEATURES_ENABLES; the register can also be
> > read to determine whether the instructions are enabled outside ring 0.
> > 
> > A description of this feature can be found at
> > https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait
> > 
> > Current kvm doesn't expose ring3mwait to the guest. However, there is
> > likewise a risk of leaking ring3mwait to the guest if the host
> > enables it, since there is no switch of MSR_MISC_FEATURES_ENABLES.
> > 
> > == solution ==
> > From the above analysis, both cpuid faulting and ring3mwait can leak
> > to the guest. To fix this, MSR_MISC_FEATURES_ENABLES should be
> > switched between host and guest. Since MSR_MISC_FEATURES_ENABLES is
> > Intel-specific, this patch implements the switching only in vmx.
> > 
> > Because kvm provides software emulation of cpuid faulting and doesn't
> > expose ring3mwait to the guest, MSR_MISC_FEATURES_ENABLES can simply
> > be cleared to zero for the guest when either feature is enabled in
> > the host.
> > 
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> > ---
> >  arch/x86/kernel/process.c |  1 +
> >  arch/x86/kvm/vmx/vmx.c    | 24 ++++++++++++++++++++++++
> >  2 files changed, 25 insertions(+)
> > 
> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> > index 1bba1a3c0b01..94a566e79b6c 100644
> > --- a/arch/x86/kernel/process.c
> > +++ b/arch/x86/kernel/process.c
> > @@ -191,6 +191,7 @@ int set_tsc_mode(unsigned int val)
> >  }
> >  
> >  DEFINE_PER_CPU(u64, msr_misc_features_shadow);
> > +EXPORT_PER_CPU_SYMBOL_GPL(msr_misc_features_shadow);
> >  
> >  static void set_cpuid_faulting(bool on)
> >  {
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 270c6566fd5a..65aa947947ba 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1031,6 +1031,16 @@ static void pt_guest_exit(struct vcpu_vmx *vmx)
> >  		wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
> >  }
> >  
> > +static void vmx_prepare_guest_misc_features_enables(struct vcpu_vmx *vmx)
> 
> No need for @vmx.  My preference would be to drop the helpers entirely,
> e.g. it's two lines of code, three if you count the declaration of msrval.

My mistake, @vmx should be removed here and added in the next patch.

Well, my initial thought in using the helper was that the handling of
MSR_MISC_FEATURES_ENABLES might not be so simple once handling of the
ring3mwait bit is added. Although this patch just clears bit 1 of
MSR_MISC_FEATURES_ENABLES, since kvm doesn't expose ring3mwait to the
guest yet and doesn't allow the guest to set the ring3mwait bit, there
indeed is an issue: the guest kernel generates "unchecked MSR access
error: WRMSR to 0x140" when we use `-cpu host` on a Xeon Phi x200 series
host. Since the ring3mwait feature is enumerated by FMS in the current
kernel code, using '-cpu host' makes the guest kernel think it has
ring3mwait too. However, since there is no support for ring3mwait yet,
the guest kernel generates "unchecked MSR access error: WRMSR to 0x140"
when it sets up ring3mwait.

Unfortunately, I didn't provide the emulation and handling of ring3mwait
in this patchset, but kept the helper. You're right, it is really
unnecessary in this patchset. I will remove the helper in the next
version.

> > +{
> > +	u64 msrval = this_cpu_read(msr_misc_features_shadow);
> > +
> > +	if (!msrval)
> > +		return;
> > +
> > +	wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL);
> > +}
> > +
> >  void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> >  {
> >  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > @@ -1064,6 +1074,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu
> > *vcpu)
> >  	vmx->loaded_cpu_state = vmx->loaded_vmcs;
> >  	host_state = &vmx->loaded_cpu_state->host_state;
> >  
> > +	vmx_prepare_guest_misc_features_enables(vmx);
> > +
> >  	/*
> >  	 * Set host fs and gs selectors.  Unfortunately, 22.2.3 does not
> >  	 * allow segment selectors with cpl > 0 or ti == 1.
> > @@ -1120,6 +1132,16 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu
> > *vcpu)
> >  	}
> >  }
> >  
> > +static void vmx_load_host_misc_features_enables(struct vcpu_vmx *vmx)
> 
> Likewise, no need for @vmx.
> 
> > +{
> > +	u64 msrval = this_cpu_read(msr_misc_features_shadow);
> > +
> > +	if (!msrval)
> > +		return;
> > +
> > +	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
> > +}
> > +
> >  static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
> >  {
> >  	struct vmcs_host_state *host_state;
> > @@ -1133,6 +1155,8 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx
> > *vmx)
> >  	++vmx->vcpu.stat.host_state_reload;
> >  	vmx->loaded_cpu_state = NULL;
> >  
> > +	vmx_load_host_misc_features_enables(vmx);
> > +
> > #ifdef CONFIG_X86_64
> >  	rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
> > #endif
> > -- 
> > 2.19.1
> > 

^ permalink raw reply	[flat|nested] 6+ messages in thread
* [PATCH v3 2/2] x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch
  2019-03-25  8:06 [PATCH v3 0/2] Switch MSR_MISC_FEATURES_ENABLES and one optimization Xiaoyao Li
  2019-03-25  8:06 ` [PATCH v3 1/2] kvm/vmx: Switch MSR_MISC_FEATURES_ENABLES between host and guest Xiaoyao Li
@ 2019-03-25  8:06 ` Xiaoyao Li
  2019-03-25 15:47   ` Sean Christopherson
  1 sibling, 1 reply; 6+ messages in thread
From: Xiaoyao Li @ 2019-03-25  8:06 UTC (permalink / raw)
  To: Paolo Bonzini, Radim Krčmář, linux-kernel, kvm
  Cc: Xiaoyao Li, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
      H. Peter Anvin, x86, chao.gao, Sean Christopherson

KVM needs to switch MSR_MISC_FEATURES_ENABLES between host and guest on
every pcpu/vcpu context switch. Since WRMSR is expensive, this patch
tries to save cycles by avoiding a WRMSR to MSR_MISC_FEATURES_ENABLES
whenever possible.

If the host's value is zero, nothing needs to be done, since the guest
can use kvm's emulated cpuid faulting.

If the host's value is non-zero, MSR_MISC_FEATURES_ENABLES need not be
cleared unconditionally. When the guest's value equals the host's,
hardware cpuid faulting can be used directly, avoiding the WRMSR to
MSR_MISC_FEATURES_ENABLES.

Since hardware cpuid faulting takes priority over the CPUID VM exit,
the guest's value should be propagated to hardware on a guest WRMSR
while hardware cpuid faulting is in use for the guest.
Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/vmx/vmx.c          | 13 ++++++++++---
 arch/x86/kvm/x86.c              | 15 ++++++++++++---
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2c53df4a5a2a..9bcb444b903d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1343,6 +1343,8 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
 void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l);
 int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr);
 
+int kvm_supported_msr_misc_features_enables(struct kvm_vcpu *vcpu, u64 data);
+
 int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
 int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 65aa947947ba..73bb11f74b36 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1035,10 +1035,10 @@ static void vmx_prepare_guest_misc_features_enables(struct vcpu_vmx *vmx)
 {
 	u64 msrval = this_cpu_read(msr_misc_features_shadow);
 
-	if (!msrval)
+	if (!msrval || msrval == vmx->vcpu.arch.msr_misc_features_enables)
 		return;
 
-	wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL);
+	wrmsrl(MSR_MISC_FEATURES_ENABLES, vmx->vcpu.arch.msr_misc_features_enables);
 }
 
 void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
@@ -1136,7 +1136,7 @@ static void vmx_load_host_misc_features_enables(struct vcpu_vmx *vmx)
 {
 	u64 msrval = this_cpu_read(msr_misc_features_shadow);
 
-	if (!msrval)
+	if (!msrval || msrval == vmx->vcpu.arch.msr_misc_features_enables)
 		return;
 
 	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
@@ -2043,6 +2043,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		else
 			vmx->pt_desc.guest.addr_a[index / 2] = data;
 		break;
+	case MSR_MISC_FEATURES_ENABLES:
+		if (!kvm_supported_msr_misc_features_enables(vcpu, data))
+			return 1;
+		if (this_cpu_read(msr_misc_features_shadow) && vmx->loaded_cpu_state)
+			wrmsrl(MSR_MISC_FEATURES_ENABLES, data);
+		vcpu->arch.msr_misc_features_enables = data;
+		break;
 	case MSR_TSC_AUX:
 		if (!msr_info->host_initiated &&
 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ad1df965574e..d2af90422a51 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2449,6 +2449,17 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
 		&vcpu->arch.st.steal, sizeof(struct kvm_steal_time));
 }
 
+int kvm_supported_msr_misc_features_enables(struct kvm_vcpu *vcpu, u64 data)
+{
+	if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT ||
+	    (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
+	     !supports_cpuid_fault(vcpu)))
+		return 0;
+	else
+		return 1;
+}
+EXPORT_SYMBOL_GPL(kvm_supported_msr_misc_features_enables);
+
 int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	bool pr = false;
@@ -2669,9 +2680,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		vcpu->arch.msr_platform_info = data;
 		break;
 	case MSR_MISC_FEATURES_ENABLES:
-		if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT ||
-		    (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
-		     !supports_cpuid_fault(vcpu)))
+		if (!kvm_supported_msr_misc_features_enables(vcpu, data))
 			return 1;
 		vcpu->arch.msr_misc_features_enables = data;
 		break;
-- 
2.19.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread
* Re: [PATCH v3 2/2] x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch
  2019-03-25  8:06 ` [PATCH v3 2/2] x86/vmx: optimize MSR_MISC_FEATURES_ENABLES switch Xiaoyao Li
@ 2019-03-25 15:47   ` Sean Christopherson
  0 siblings, 0 replies; 6+ messages in thread
From: Sean Christopherson @ 2019-03-25 15:47 UTC (permalink / raw)
  To: Xiaoyao Li
  Cc: Paolo Bonzini, Radim Krčmář, linux-kernel, kvm,
      Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
      x86, chao.gao

On Mon, Mar 25, 2019 at 04:06:50PM +0800, Xiaoyao Li wrote:
> KVM needs to switch MSR_MISC_FEATURES_ENABLES between host and guest on
> every pcpu/vcpu context switch. Since WRMSR is expensive, this patch
> tries to save cycles by avoiding a WRMSR to MSR_MISC_FEATURES_ENABLES
> whenever possible.
> 
> If the host's value is zero, nothing needs to be done, since the guest
> can use kvm's emulated cpuid faulting.
> 
> If the host's value is non-zero, MSR_MISC_FEATURES_ENABLES need not be
> cleared unconditionally. When the guest's value equals the host's,
> hardware cpuid faulting can be used directly, avoiding the WRMSR to
> MSR_MISC_FEATURES_ENABLES.
> 
> Since hardware cpuid faulting takes priority over the CPUID VM exit,
> the guest's value should be propagated to hardware on a guest WRMSR
> while hardware cpuid faulting is in use for the guest.
> 
> Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/vmx/vmx.c          | 13 ++++++++++---
>  arch/x86/kvm/x86.c              | 15 ++++++++++++---
>  3 files changed, 24 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 2c53df4a5a2a..9bcb444b903d 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1343,6 +1343,8 @@ void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw);
>  void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l);
>  int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr);
>  
> +int kvm_supported_msr_misc_features_enables(struct kvm_vcpu *vcpu, u64 data);
> +
>  int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
>  int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr);
>  
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 65aa947947ba..73bb11f74b36 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1035,10 +1035,10 @@ static void vmx_prepare_guest_misc_features_enables(struct vcpu_vmx *vmx)
>  {
>  	u64 msrval = this_cpu_read(msr_misc_features_shadow);
>  
> -	if (!msrval)
> +	if (!msrval || msrval == vmx->vcpu.arch.msr_misc_features_enables)
>  		return;
>  
> -	wrmsrl(MSR_MISC_FEATURES_ENABLES, 0ULL);
> +	wrmsrl(MSR_MISC_FEATURES_ENABLES, vmx->vcpu.arch.msr_misc_features_enables);
>  }
>  
>  void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> @@ -1136,7 +1136,7 @@ static void vmx_load_host_misc_features_enables(struct vcpu_vmx *vmx)
>  {
>  	u64 msrval = this_cpu_read(msr_misc_features_shadow);
>  
> -	if (!msrval)
> +	if (!msrval || msrval == vmx->vcpu.arch.msr_misc_features_enables)
>  		return;
>  
>  	wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
> @@ -2043,6 +2043,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		else
>  			vmx->pt_desc.guest.addr_a[index / 2] = data;
>  		break;
> +	case MSR_MISC_FEATURES_ENABLES:
> +		if (!kvm_supported_msr_misc_features_enables(vcpu, data))
> +			return 1;
> +		if (this_cpu_read(msr_misc_features_shadow) && vmx->loaded_cpu_state)
> +			wrmsrl(MSR_MISC_FEATURES_ENABLES, data);
> +		vcpu->arch.msr_misc_features_enables = data;
> +		break;
>  	case MSR_TSC_AUX:
>  		if (!msr_info->host_initiated &&
>  		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ad1df965574e..d2af90422a51 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2449,6 +2449,17 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>  		&vcpu->arch.st.steal, sizeof(struct kvm_steal_time));
>  }
>  
> +int kvm_supported_msr_misc_features_enables(struct kvm_vcpu *vcpu, u64 data)

The name kvm_supported_msr_misc_features_enables() is a bit confusing,
e.g. I expected it to return a bitmask of the supported features.  Maybe
kvm_misc_features_enables_msr_valid() to be consistent with
vmx_feature_control_msr_valid()?

> +{
> +	if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT ||
> +	    (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
> +	     !supports_cpuid_fault(vcpu)))
> +		return 0;
> +	else
> +		return 1;

No need for the if-else, you can simply do:

	return !((data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT) ||
		 (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
		  !supports_cpuid_fault(vcpu)));

You could even shorten it to:

	return !((data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT) ||
		 (data && !supports_cpuid_fault(vcpu)));

although that might be getting a bit too cute.
> +}
> +EXPORT_SYMBOL_GPL(kvm_supported_msr_misc_features_enables);
> +
>  int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  {
>  	bool pr = false;
> @@ -2669,9 +2680,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  		vcpu->arch.msr_platform_info = data;
>  		break;
>  	case MSR_MISC_FEATURES_ENABLES:
> -		if (data & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT ||
> -		    (data & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT &&
> -		     !supports_cpuid_fault(vcpu)))
> +		if (!kvm_supported_msr_misc_features_enables(vcpu, data))
>  			return 1;
>  		vcpu->arch.msr_misc_features_enables = data;
>  		break;
> -- 
> 2.19.1
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread