* [PATCH 0/3 v4] KVM: nSVM: Check MBZ bits in CR3 and CR4 on vmrun of nested guests

From: Krish Sadhukhan @ 2020-07-08 0:39 UTC
To: kvm; +Cc: pbonzini

v3 -> v4:
    1. In patch# 1, 'guest_cr4_reserved_bits' has been renamed to
       'cr4_guest_rsvd_bits' and it is now located where other
       CR4-related members are.
    2. Rebased to the latest upstream sources.

[PATCH 1/3 v4] KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid()
[PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests
[PATCH 3/3 v4] kvm-unit-tests: nSVM: Test that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            |  2 ++
 arch/x86/kvm/svm/nested.c       | 22 ++++++++++++++++++++--
 arch/x86/kvm/svm/svm.h          |  5 ++++-
 arch/x86/kvm/x86.c              | 27 ++++-----------------------
 arch/x86/kvm/x86.h              | 21 +++++++++++++++++++++
 6 files changed, 52 insertions(+), 26 deletions(-)

Krish Sadhukhan (2):
  KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid()
  nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

 x86/svm.h       |  5 +++
 x86/svm_tests.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 95 insertions(+), 4 deletions(-)

Krish Sadhukhan (1):
  kvm-unit-tests: nSVM: Test that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests
* [PATCH 1/3 v4] KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid()

From: Krish Sadhukhan @ 2020-07-08 0:39 UTC
To: kvm; +Cc: pbonzini

Instead of creating the mask for guest CR4 reserved bits in kvm_valid_cr4(),
do it in kvm_update_cpuid() so that it can be reused instead of creating it
each time kvm_valid_cr4() is called.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/cpuid.c            |  2 ++
 arch/x86/kvm/x86.c              | 24 ++----------------------
 arch/x86/kvm/x86.h              | 21 +++++++++++++++++++++
 4 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be5363b..06eb426 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -580,6 +580,7 @@ struct kvm_vcpu_arch {
 	unsigned long cr3;
 	unsigned long cr4;
 	unsigned long cr4_guest_owned_bits;
+	unsigned long cr4_guest_rsvd_bits;
 	unsigned long cr8;
 	u32 host_pkru;
 	u32 pkru;
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8a294f9..5bec182 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -128,6 +128,8 @@ int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 	kvm_mmu_reset_context(vcpu);
 	kvm_pmu_refresh(vcpu);
+	vcpu->arch.cr4_guest_rsvd_bits =
+	    __cr4_reserved_bits(guest_cpuid_has, vcpu);
 	return 0;
 }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 88c593f..f0335bc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -97,6 +97,7 @@
 #endif

 static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS;
+u64 __guest_cr4_reserved_bits;

 #define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
 				    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
@@ -931,33 +932,12 @@ int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 }
 EXPORT_SYMBOL_GPL(kvm_set_xcr);

-#define __cr4_reserved_bits(__cpu_has, __c)		\
-({							\
-	u64 __reserved_bits = CR4_RESERVED_BITS;	\
-							\
-	if (!__cpu_has(__c, X86_FEATURE_XSAVE))		\
-		__reserved_bits |= X86_CR4_OSXSAVE;	\
-	if (!__cpu_has(__c, X86_FEATURE_SMEP))		\
-		__reserved_bits |= X86_CR4_SMEP;	\
-	if (!__cpu_has(__c, X86_FEATURE_SMAP))		\
-		__reserved_bits |= X86_CR4_SMAP;	\
-	if (!__cpu_has(__c, X86_FEATURE_FSGSBASE))	\
-		__reserved_bits |= X86_CR4_FSGSBASE;	\
-	if (!__cpu_has(__c, X86_FEATURE_PKU))		\
-		__reserved_bits |= X86_CR4_PKE;		\
-	if (!__cpu_has(__c, X86_FEATURE_LA57))		\
-		__reserved_bits |= X86_CR4_LA57;	\
-	if (!__cpu_has(__c, X86_FEATURE_UMIP))		\
-		__reserved_bits |= X86_CR4_UMIP;	\
-	__reserved_bits;				\
-})
-
 static int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	if (cr4 & cr4_reserved_bits)
 		return -EINVAL;

-	if (cr4 & __cr4_reserved_bits(guest_cpuid_has, vcpu))
+	if (cr4 & vcpu->arch.cr4_guest_rsvd_bits)
 		return -EINVAL;

 	return 0;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 6eb62e9..bac8b30 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -366,4 +366,25 @@ static inline bool kvm_dr7_valid(u64 data)
 u64 kvm_spec_ctrl_valid_bits(struct kvm_vcpu *vcpu);
 bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu);

+#define __cr4_reserved_bits(__cpu_has, __c)		\
+({							\
+	u64 __reserved_bits = CR4_RESERVED_BITS;	\
+							\
+	if (!__cpu_has(__c, X86_FEATURE_XSAVE))		\
+		__reserved_bits |= X86_CR4_OSXSAVE;	\
+	if (!__cpu_has(__c, X86_FEATURE_SMEP))		\
+		__reserved_bits |= X86_CR4_SMEP;	\
+	if (!__cpu_has(__c, X86_FEATURE_SMAP))		\
+		__reserved_bits |= X86_CR4_SMAP;	\
+	if (!__cpu_has(__c, X86_FEATURE_FSGSBASE))	\
+		__reserved_bits |= X86_CR4_FSGSBASE;	\
+	if (!__cpu_has(__c, X86_FEATURE_PKU))		\
+		__reserved_bits |= X86_CR4_PKE;		\
+	if (!__cpu_has(__c, X86_FEATURE_LA57))		\
+		__reserved_bits |= X86_CR4_LA57;	\
+	if (!__cpu_has(__c, X86_FEATURE_UMIP))		\
+		__reserved_bits |= X86_CR4_UMIP;	\
+	__reserved_bits;				\
+})
+
 #endif
--
1.8.3.1
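The point of keeping __cr4_reserved_bits() a two-argument macro is that the predicate decides whose capabilities are consulted: per-guest CPUID (guest_cpuid_has) or the host CPU. A minimal standalone sketch of that shape — the feature IDs, bit values, and vcpu struct below are illustrative stand-ins, not the kernel's real definitions, and a function pointer is used where the kernel uses a macro:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for X86_FEATURE_* and X86_CR4_* values. */
enum feature { FEAT_XSAVE, FEAT_SMEP, FEAT_NR };
#define CR4_ALWAYS_RESERVED (1ull << 19)  /* plays the role of CR4_RESERVED_BITS */
#define CR4_OSXSAVE         (1ull << 18)
#define CR4_SMEP            (1ull << 20)

struct vcpu { int feat[FEAT_NR]; };

static int vcpu_has(const struct vcpu *v, enum feature f)
{
	return v->feat[f];
}

/* Same shape as __cr4_reserved_bits(): start from the always-reserved
 * bits and mark as reserved every CR4 bit whose feature the given
 * entity lacks. */
static uint64_t cr4_reserved_bits(int (*has)(const struct vcpu *, enum feature),
				  const struct vcpu *v)
{
	uint64_t r = CR4_ALWAYS_RESERVED;

	if (!has(v, FEAT_XSAVE))
		r |= CR4_OSXSAVE;
	if (!has(v, FEAT_SMEP))
		r |= CR4_SMEP;
	return r;
}
```

Computing the mask once in kvm_update_cpuid() and caching it in cr4_guest_rsvd_bits works because the result only changes when the guest's CPUID changes.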
* Re: [PATCH 1/3 v4] KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid()

From: Paolo Bonzini @ 2020-07-08 9:48 UTC
To: Krish Sadhukhan, kvm

On 08/07/20 02:39, Krish Sadhukhan wrote:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 88c593f..f0335bc 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -97,6 +97,7 @@
>  #endif
>
>  static u64 __read_mostly cr4_reserved_bits = CR4_RESERVED_BITS;
> +u64 __guest_cr4_reserved_bits;
>
>  #define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \
>  				    KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)

Stray line.

Paolo
* [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Krish Sadhukhan @ 2020-07-08 0:39 UTC
To: kvm; +Cc: pbonzini

According to section "Canonicalization and Consistency Checks" in APM vol. 2,
the following guest state is illegal:

    "Any MBZ bit of CR3 is set."
    "Any MBZ bit of CR4 is set."

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
---
 arch/x86/kvm/svm/nested.c | 22 ++++++++++++++++++++--
 arch/x86/kvm/svm/svm.h    |  5 ++++-
 arch/x86/kvm/x86.c        |  3 ++-
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 6bceafb..6d2ac5a 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -222,7 +222,9 @@ static bool nested_vmcb_check_controls(struct vmcb_control_area *control)
 	return true;
 }

-static bool nested_vmcb_checks(struct vmcb *vmcb)
+extern int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+
+static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb)
 {
 	if ((vmcb->save.efer & EFER_SVME) == 0)
 		return false;
@@ -231,6 +233,22 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
 	    (vmcb->save.cr0 & X86_CR0_NW))
 		return false;

+	if (!is_long_mode(&(svm->vcpu))) {
+		if (vmcb->save.cr4 & X86_CR4_PAE) {
+			if (vmcb->save.cr3 & MSR_CR3_LEGACY_PAE_RESERVED_MASK)
+				return false;
+		} else {
+			if (vmcb->save.cr3 & MSR_CR3_LEGACY_RESERVED_MASK)
+				return false;
+		}
+	} else {
+		if ((vmcb->save.cr4 & X86_CR4_PAE) &&
+		    (vmcb->save.cr3 & MSR_CR3_LONG_RESERVED_MASK))
+			return false;
+	}
+	if (kvm_valid_cr4(&(svm->vcpu), vmcb->save.cr4))
+		return false;
+
 	return nested_vmcb_check_controls(&vmcb->control);
 }

@@ -416,7 +434,7 @@ int nested_svm_vmrun(struct vcpu_svm *svm)

 	nested_vmcb = map.hva;

-	if (!nested_vmcb_checks(nested_vmcb)) {
+	if (!nested_vmcb_checks(svm, nested_vmcb)) {
 		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
 		nested_vmcb->control.exit_code_hi = 0;
 		nested_vmcb->control.exit_info_1  = 0;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 6ac4c00..26b14ec 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -346,7 +346,10 @@ static inline bool gif_set(struct vcpu_svm *svm)
 }

 /* svm.c */
-#define MSR_INVALID				0xffffffffU
+#define MSR_CR3_LEGACY_RESERVED_MASK		0xfe7U
+#define MSR_CR3_LEGACY_PAE_RESERVED_MASK	0x7U
+#define MSR_CR3_LONG_RESERVED_MASK		0xfff0000000000fe7U
+#define MSR_INVALID				0xffffffffU

 u32 svm_msrpm_offset(u32 msr);
 void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f0335bc..732ae6a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -932,7 +932,7 @@ int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 }
 EXPORT_SYMBOL_GPL(kvm_set_xcr);

-static int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
 	if (cr4 & cr4_reserved_bits)
 		return -EINVAL;
@@ -942,6 +942,7 @@ static int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)

 	return 0;
 }
+EXPORT_SYMBOL_GPL(kvm_valid_cr4);

 int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
--
1.8.3.1
* Re: [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Paolo Bonzini @ 2020-07-08 10:03 UTC
To: Krish Sadhukhan, kvm

On 08/07/20 02:39, Krish Sadhukhan wrote:
> +extern int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
> +

This should be added in x86.h, not here.

> +static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb)
>  {
>  	if ((vmcb->save.efer & EFER_SVME) == 0)
>  		return false;
> @@ -231,6 +233,22 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
>  	    (vmcb->save.cr0 & X86_CR0_NW))
>  		return false;
>
> +	if (!is_long_mode(&(svm->vcpu))) {
> +		if (vmcb->save.cr4 & X86_CR4_PAE) {
> +			if (vmcb->save.cr3 & MSR_CR3_LEGACY_PAE_RESERVED_MASK)
> +				return false;
> +		} else {
> +			if (vmcb->save.cr3 & MSR_CR3_LEGACY_RESERVED_MASK)
> +				return false;
> +		}
> +	} else {
> +		if ((vmcb->save.cr4 & X86_CR4_PAE) &&
> +		    (vmcb->save.cr3 & MSR_CR3_LONG_RESERVED_MASK))
> +			return false;
> +	}

is_long_mode here is wrong, as it refers to the host.
You need to do something like this:

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 385461496cf5..cbbab83f19cc 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -222,8 +222,9 @@ static bool nested_vmcb_check_controls(struct vmcb_control_area *control)
 	return true;
 }

-static bool nested_vmcb_checks(struct vmcb *vmcb)
+static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb)
 {
+	bool nested_vmcb_lma;
 	if ((vmcb->save.efer & EFER_SVME) == 0)
 		return false;

@@ -234,6 +237,27 @@ static bool nested_vmcb_checks(struct vmcb *vmcb)
 	if (!kvm_dr6_valid(vmcb->save.dr6) || !kvm_dr7_valid(vmcb->save.dr7))
 		return false;

+	nested_vmcb_lma =
+	        (vmcb->save.efer & EFER_LME) &&
+		(vmcb->save.cr0 & X86_CR0_PG);
+
+	if (!nested_vmcb_lma) {
+		if (vmcb->save.cr4 & X86_CR4_PAE) {
+			if (vmcb->save.cr3 & MSR_CR3_LEGACY_PAE_RESERVED_MASK)
+				return false;
+		} else {
+			if (vmcb->save.cr3 & MSR_CR3_LEGACY_RESERVED_MASK)
+				return false;
+		}
+	} else {
+		if (!(vmcb->save.cr4 & X86_CR4_PAE) ||
+		    !(vmcb->save.cr0 & X86_CR0_PE) ||
+		    (vmcb->save.cr3 & MSR_CR3_LONG_RESERVED_MASK))
+			return false;
+	}
+	if (kvm_valid_cr4(&(svm->vcpu), vmcb->save.cr4))
+		return false;
+
 	return nested_vmcb_check_controls(&vmcb->control);
 }

which also takes care of other CR0/CR4 checks in the APM.

I'll test this a bit more and queue it.  Are you also going to add
more checks in svm_set_nested_state?

Paolo
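The key point of the fix above is that long mode is derived from the *nested* VMCB's EFER.LME and CR0.PG rather than from the host's current mode. A compilable model of the proposed CR3 check — mask and flag values copied from the patch, function and constant names illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (1ull << 8)
#define CR0_PE   (1ull << 0)
#define CR0_PG   (1ull << 31)
#define CR4_PAE  (1ull << 5)

/* Reserved-bit masks from the patch. */
#define CR3_LEGACY_RESERVED      0xfe7ull
#define CR3_LEGACY_PAE_RESERVED  0x7ull
#define CR3_LONG_RESERVED        0xfff0000000000fe7ull

/* Mirrors the suggested nested_vmcb_checks() logic: decide the guest's
 * paging mode from the VMCB's own EFER/CR0/CR4 and apply the matching
 * CR3 MBZ mask. */
static bool nested_cr3_valid(uint64_t efer, uint64_t cr0,
			     uint64_t cr4, uint64_t cr3)
{
	bool lma = (efer & EFER_LME) && (cr0 & CR0_PG);

	if (!lma) {
		if (cr4 & CR4_PAE)
			return !(cr3 & CR3_LEGACY_PAE_RESERVED);
		return !(cr3 & CR3_LEGACY_RESERVED);
	}
	/* Long mode additionally requires PAE and protected mode. */
	return (cr4 & CR4_PAE) && (cr0 & CR0_PE) &&
	       !(cr3 & CR3_LONG_RESERVED);
}
```

Note how bit 5 of CR3 flips meaning between modes: it is MBZ in legacy non-PAE mode but a usable page-directory-base bit under PAE, which is exactly why the check must pick the mask from the nested guest's mode.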
* Re: [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Krish Sadhukhan @ 2020-07-08 21:36 UTC
To: Paolo Bonzini, kvm

On 7/8/20 3:03 AM, Paolo Bonzini wrote:
> On 08/07/20 02:39, Krish Sadhukhan wrote:
>> +extern int kvm_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
>> +
> This should be added in x86.h, not here.
>
> is_long_mode here is wrong, as it refers to the host.
> You need to do something like this:
>
> [Paolo's diff deriving nested_vmcb_lma from the VMCB's EFER.LME and
> CR0.PG, adding the CR0.PE check, and calling kvm_valid_cr4()]
>
> which also takes care of other CR0/CR4 checks in the APM.

Thanks for adding additional checks and fixing the long-mode check.

> I'll test this a bit more and queue it.  Are you also going to add
> more checks in svm_set_nested_state?

I will send a patch to cover the missing checks in svm_set_nested_state().

> Paolo
* Re: [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Krish Sadhukhan @ 2020-07-08 22:51 UTC
To: Paolo Bonzini, kvm

On 7/8/20 3:03 AM, Paolo Bonzini wrote:
> You need to do something like this:
>
> +	nested_vmcb_lma =
> +	        (vmcb->save.efer & EFER_LME) &&

Just curious about using LME instead of LMA.  According to APM,

    "The processor behaves as a 32-bit x86 processor in all respects
    until long mode is activated, even if long mode is enabled.  None of
    the new 64-bit data sizes, addressing, or system aspects available in
    long mode can be used until EFER.LMA=1."

Is it possible that L1 sets LME, but not LMA, in L2's VMCB and this
code will execute even if the processor is not in long mode?

> +		(vmcb->save.cr0 & X86_CR0_PG);
> +
> +	if (!nested_vmcb_lma) {
> +		if (vmcb->save.cr4 & X86_CR4_PAE) {
> +			if (vmcb->save.cr3 & MSR_CR3_LEGACY_PAE_RESERVED_MASK)
> +				return false;
> +		} else {
> +			if (vmcb->save.cr3 & MSR_CR3_LEGACY_RESERVED_MASK)
> +				return false;
> +		}
> +	} else {
> +		if (!(vmcb->save.cr4 & X86_CR4_PAE) ||
> +		    !(vmcb->save.cr0 & X86_CR0_PE) ||
> +		    (vmcb->save.cr3 & MSR_CR3_LONG_RESERVED_MASK))
> +			return false;
> +	}
> +	if (kvm_valid_cr4(&(svm->vcpu), vmcb->save.cr4))
> +		return false;
> +
> 	return nested_vmcb_check_controls(&vmcb->control);
> }
>
> which also takes care of other CR0/CR4 checks in the APM.
>
> I'll test this a bit more and queue it.  Are you also going to add
> more checks in svm_set_nested_state?
>
> Paolo
* Re: [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Jim Mattson @ 2020-07-08 23:07 UTC
To: Krish Sadhukhan; +Cc: Paolo Bonzini, kvm list

On Wed, Jul 8, 2020 at 3:51 PM Krish Sadhukhan
<krish.sadhukhan@oracle.com> wrote:
> Just curious about using LME instead of LMA.  According to APM,
>
>     "The processor behaves as a 32-bit x86 processor in all respects
>     until long mode is activated, even if long mode is enabled.  None of
>     the new 64-bit data sizes, addressing, or system aspects available in
>     long mode can be used until EFER.LMA=1."
>
> Is it possible that L1 sets LME, but not LMA, in L2's VMCB and this
> code will execute even if the processor is not in long mode?

No. EFER.LMA is not modifiable through software. It is always
"EFER.LME != 0 && CR0.PG != 0."
* Re: [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Paolo Bonzini @ 2020-07-09 9:36 UTC
To: Jim Mattson, Krish Sadhukhan; +Cc: kvm list

On 09/07/20 01:07, Jim Mattson wrote:
>> Is it possible that L1 sets LME, but not LMA, in L2's VMCB and this
>> code will execute even if the processor is not in long mode?
>
> No. EFER.LMA is not modifiable through software. It is always
> "EFER.LME != 0 && CR0.PG != 0."

In fact, AMD doesn't specify (unlike Intel) that EFER.LME, CR0.PG and
EFER.LMA must be consistent, and for SMM state restore they say that
"The EFER.LMA register bit is set to the value obtained by logically
ANDing the SMRAM values of EFER.LME, CR0.PG, and CR4.PAE".  So it is
plausible that they ignore EFER.LMA in the VMCB completely.  I quickly
tried hacking svm_set_efer to set or clear it, and it works either way.
EFLAGS.VM from the VMCB is also ignored if the processor is in long
mode, just like the APM says in "10.4 Leaving SMM"!

Paolo
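The consistency rule discussed above — LMA is not set directly by software but follows from LME and PG — can be captured in a one-line helper. A minimal sketch with illustrative constant names (the architectural activation of long mode additionally requires CR4.PAE=1, which the VMCB checks enforce separately):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (1ull << 8)
#define CR0_PG   (1ull << 31)

/* EFER.LMA as derived in the thread: long mode is *active* only when
 * it is enabled (EFER.LME=1) and paging is on (CR0.PG=1). */
static bool efer_lma(uint64_t efer, uint64_t cr0)
{
	return (efer & EFER_LME) && (cr0 & CR0_PG);
}
```

This is why the nested checks can safely key off LME && PG from the VMCB instead of trusting an LMA bit that L1 may have set inconsistently.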
* [PATCH 3/3 v4] kvm-unit-tests: nSVM: Test that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Krish Sadhukhan @ 2020-07-08 0:39 UTC
To: kvm; +Cc: pbonzini

According to section "Canonicalization and Consistency Checks" in APM vol. 2,
the following guest state is illegal:

    "Any MBZ bit of CR3 is set."
    "Any MBZ bit of CR4 is set."

Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
---
 x86/svm.h       |  5 +++
 x86/svm_tests.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/x86/svm.h b/x86/svm.h
index 457ce3c..f6b9a31 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -325,6 +325,11 @@ struct __attribute__ ((__packed__)) vmcb {
 #define SVM_CR0_SELECTIVE_MASK (X86_CR0_TS | X86_CR0_MP)
 #define SVM_CR0_RESERVED_MASK			0xffffffff00000000U
+#define SVM_CR3_LEGACY_RESERVED_MASK		0xfe7U
+#define SVM_CR3_LEGACY_PAE_RESERVED_MASK	0x7U
+#define SVM_CR3_LONG_RESERVED_MASK		0xfff0000000000fe7U
+#define SVM_CR4_LEGACY_RESERVED_MASK		0xffbaf000U
+#define SVM_CR4_RESERVED_MASK			0xffffffffffbaf000U
 #define SVM_DR6_RESERVED_MASK			0xffffffffffff1ff0U
 #define SVM_DR7_RESERVED_MASK			0xffffffff0000cc00U
 #define SVM_EFER_RESERVED_MASK			0xffffffffffff0200U
diff --git a/x86/svm_tests.c b/x86/svm_tests.c
index d4d130f..c59e7eb 100644
--- a/x86/svm_tests.c
+++ b/x86/svm_tests.c
@@ -1913,6 +1913,31 @@ static void basic_guest_main(struct svm_test *test)
 	}								\
 }

+#define SVM_TEST_CR_RESERVED_BITS(start, end, inc, cr, val, resv_mask)	\
+{									\
+	u64 tmp, mask;							\
+	int i;								\
+									\
+	for (i = start; i <= end; i = i + inc) {			\
+		mask = 1ull << i;					\
+		if (!(mask & resv_mask))				\
+			continue;					\
+		tmp = val | mask;					\
+		switch (cr) {						\
+		case 0:							\
+			vmcb->save.cr0 = tmp;				\
+			break;						\
+		case 3:							\
+			vmcb->save.cr3 = tmp;				\
+			break;						\
+		case 4:							\
+			vmcb->save.cr4 = tmp;				\
+		}							\
+		report(svm_vmrun() == SVM_EXIT_ERR, "Test CR%d %d:%d: %lx",\
+		    cr, end, start, tmp);				\
+	}								\
+}
+
 static void svm_guest_state_test(void)
 {
 	test_set_guest(basic_guest_main);
@@ -1938,17 +1963,21 @@ static void svm_guest_state_test(void)
 	cr0 |= X86_CR0_CD;
 	cr0 &= ~X86_CR0_NW;
 	vmcb->save.cr0 = cr0;
-	report (svm_vmrun() == SVM_EXIT_VMMCALL, "CR0: %lx", cr0);
+	report (svm_vmrun() == SVM_EXIT_VMMCALL, "Test CR0 CD=1,NW=0: %lx",
+	    cr0);
 	cr0 |= X86_CR0_NW;
 	vmcb->save.cr0 = cr0;
-	report (svm_vmrun() == SVM_EXIT_VMMCALL, "CR0: %lx", cr0);
+	report (svm_vmrun() == SVM_EXIT_VMMCALL, "Test CR0 CD=1,NW=1: %lx",
+	    cr0);
 	cr0 &= ~X86_CR0_NW;
 	cr0 &= ~X86_CR0_CD;
 	vmcb->save.cr0 = cr0;
-	report (svm_vmrun() == SVM_EXIT_VMMCALL, "CR0: %lx", cr0);
+	report (svm_vmrun() == SVM_EXIT_VMMCALL, "Test CR0 CD=0,NW=0: %lx",
+	    cr0);
 	cr0 |= X86_CR0_NW;
 	vmcb->save.cr0 = cr0;
-	report (svm_vmrun() == SVM_EXIT_ERR, "CR0: %lx", cr0);
+	report (svm_vmrun() == SVM_EXIT_ERR, "Test CR0 CD=0,NW=1: %lx",
+	    cr0);
 	vmcb->save.cr0 = cr0_saved;

 	/*
@@ -1961,6 +1990,63 @@ static void svm_guest_state_test(void)
 	vmcb->save.cr0 = cr0_saved;

 	/*
+	 * CR3 MBZ bits based on different modes:
+	 *   [2:0]		      - legacy PAE
+	 *   [2:0], [11:5]	      - legacy non-PAE
+	 *   [2:0], [11:5], [63:52]   - long mode
+	 */
+	u64 cr3_saved = vmcb->save.cr3;
+	u64 cr4_saved = vmcb->save.cr4;
+	u64 cr4 = cr4_saved;
+	efer_saved = vmcb->save.efer;
+	efer = efer_saved;
+
+	efer &= ~EFER_LMA;
+	vmcb->save.efer = efer;
+	cr4 |= X86_CR4_PAE;
+	vmcb->save.cr4 = cr4;
+	SVM_TEST_CR_RESERVED_BITS(0, 2, 1, 3, cr3_saved,
+	    SVM_CR3_LEGACY_PAE_RESERVED_MASK);
+
+	cr4 = cr4_saved & ~X86_CR4_PAE;
+	vmcb->save.cr4 = cr4;
+	SVM_TEST_CR_RESERVED_BITS(0, 11, 2, 3, cr3_saved,
+	    SVM_CR3_LEGACY_RESERVED_MASK);
+
+	cr4 |= X86_CR4_PAE;
+	vmcb->save.cr4 = cr4;
+	efer |= EFER_LMA;
+	vmcb->save.efer = efer;
+	SVM_TEST_CR_RESERVED_BITS(0, 63, 2, 3, cr3_saved,
+	    SVM_CR3_LONG_RESERVED_MASK);
+
+	vmcb->save.cr4 = cr4_saved;
+	vmcb->save.cr3 = cr3_saved;
+	vmcb->save.efer = efer_saved;
+
+	/*
+	 * CR4 MBZ bits based on different modes:
+	 *   [15:12], 17, 19, [31:22] - legacy mode
+	 *   [15:12], 17, 19, [63:22] - long mode
+	 */
+	cr4_saved = vmcb->save.cr4;
+	efer_saved = vmcb->save.efer;
+	efer &= ~EFER_LMA;
+	vmcb->save.efer = efer;
+	SVM_TEST_CR_RESERVED_BITS(12, 31, 2, 4, cr4_saved,
+	    SVM_CR4_LEGACY_RESERVED_MASK);
+
+	efer |= EFER_LMA;
+	vmcb->save.efer = efer;
+	SVM_TEST_CR_RESERVED_BITS(12, 31, 2, 4, cr4_saved,
+	    SVM_CR4_RESERVED_MASK);
+	SVM_TEST_CR_RESERVED_BITS(32, 63, 4, 4, cr4_saved,
+	    SVM_CR4_RESERVED_MASK);
+
+	vmcb->save.cr4 = cr4_saved;
+	vmcb->save.efer = efer_saved;
+
+	/*
 	 * DR6[63:32] and DR7[63:32] are MBZ
 	 */
 	u64 dr_saved = vmcb->save.dr6;
--
1.8.3.1
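The SVM_CR3_*_RESERVED_MASK constants above can be cross-checked against the bit ranges listed in the patch's own comment ([2:0] for legacy PAE; [2:0], [11:5] for legacy non-PAE; [2:0], [11:5], [63:52] for long mode). A small, self-contained arithmetic check — the bits() helper is illustrative, not from the test suite:

```c
#include <assert.h>
#include <stdint.h>

/* Builds the mask for the inclusive bit range [hi:lo].  The hi == 63
 * case is special-cased to avoid an undefined 64-bit shift. */
static uint64_t bits(int hi, int lo)
{
	uint64_t upper = (hi == 63) ? ~0ull : (1ull << (hi + 1)) - 1;

	return upper & ~((1ull << lo) - 1);
}
```

Rebuilding each mask from its documented ranges and comparing against the header constants catches transcription errors in the hex literals.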
* Re: [PATCH 3/3 v4] kvm-unit-tests: nSVM: Test that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests

From: Paolo Bonzini @ 2020-07-08 11:07 UTC
To: Krish Sadhukhan, kvm

On 08/07/20 02:39, Krish Sadhukhan wrote:
> +	SVM_TEST_CR_RESERVED_BITS(0, 2, 1, 3, cr3_saved,
> +	    SVM_CR3_LEGACY_PAE_RESERVED_MASK);
> +
> +	cr4 = cr4_saved & ~X86_CR4_PAE;
> +	vmcb->save.cr4 = cr4;
> +	SVM_TEST_CR_RESERVED_BITS(0, 11, 2, 3, cr3_saved,
> +	    SVM_CR3_LEGACY_RESERVED_MASK);
> +
> +	cr4 |= X86_CR4_PAE;
> +	vmcb->save.cr4 = cr4;
> +	efer |= EFER_LMA;
> +	vmcb->save.efer = efer;
> +	SVM_TEST_CR_RESERVED_BITS(0, 63, 2, 3, cr3_saved,
> +	    SVM_CR3_LONG_RESERVED_MASK);

The test is not covering *non*-reserved bits, so it doesn't catch that
in both cases KVM is checking against long-mode bits.  Doing this would
require setting up the VMCB for immediate VMEXIT (for example, injecting
an event while the IDT limit is zero), so it can be done later.

Instead, you need to set/clear EFER_LME.  Please be more careful to
check that the test is covering what you expect.

Also, the tests show

  PASS: Test CR3 2:0: 641001
  PASS: Test CR3 2:0: 2
  PASS: Test CR3 2:0: 4
  PASS: Test CR3 11:0: 1
  PASS: Test CR3 11:0: 4
  PASS: Test CR3 11:0: 40
  PASS: Test CR3 11:0: 100
  PASS: Test CR3 11:0: 400
  PASS: Test CR3 63:0: 1
  PASS: Test CR3 63:0: 4
  PASS: Test CR3 63:0: 40
  PASS: Test CR3 63:0: 100
  PASS: Test CR3 63:0: 400
  PASS: Test CR3 63:0: 10000000000000
  PASS: Test CR3 63:0: 40000000000000
  PASS: Test CR3 63:0: 100000000000000
  PASS: Test CR3 63:0: 400000000000000
  PASS: Test CR3 63:0: 1000000000000000
  PASS: Test CR3 63:0: 4000000000000000
  PASS: Test CR4 31:12: 0
  PASS: Test CR4 31:12: 0

and then exits.  There is an issue with compiler optimization for which
I've sent a patch, but even after fixing it the premature exit is a
problem: it is caused by a problem in __cr4_reserved_bits and a typo in
the tests:

diff --git a/x86/svm.h b/x86/svm.h
index f6b9a31..58c9069 100644
--- a/x86/svm.h
+++ b/x86/svm.h
@@ -328,8 +328,8 @@ struct __attribute__ ((__packed__)) vmcb {
 #define SVM_CR3_LEGACY_RESERVED_MASK		0xfe7U
 #define SVM_CR3_LEGACY_PAE_RESERVED_MASK	0x7U
 #define SVM_CR3_LONG_RESERVED_MASK		0xfff0000000000fe7U
-#define SVM_CR4_LEGACY_RESERVED_MASK		0xffbaf000U
-#define SVM_CR4_RESERVED_MASK			0xffffffffffbaf000U
+#define SVM_CR4_LEGACY_RESERVED_MASK		0xffcaf000U
+#define SVM_CR4_RESERVED_MASK			0xffffffffffcaf000U
 #define SVM_DR6_RESERVED_MASK			0xffffffffffff1ff0U
 #define SVM_DR7_RESERVED_MASK			0xffffffff0000cc00U
 #define SVM_EFER_RESERVED_MASK			0xffffffffffff0200U

(Also, this kind of problem is made harder to notice by only testing
even bits, which may make sense for high-order bits, but certainly not
for low-order ones.)

All in all, fixing this series has taken me almost 2 hours.  Since I
have done the work I'm queuing it, but I wonder: the compiler
optimization issue could depend on register allocation, but did all of
these issues really happen only on my machine?

Paolo
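The corrected constants match the bit ranges listed in the test's own comment ([15:12], 17, 19, [31:22] for legacy mode; [63:22] added in long mode), whereas the original 0xffbaf000 marked bits 20 and 21 reserved and left bit 22 clear. A self-contained arithmetic cross-check — the bits() helper is illustrative, not part of either codebase:

```c
#include <assert.h>
#include <stdint.h>

/* Builds the mask for the inclusive bit range [hi:lo], special-casing
 * hi == 63 to avoid an undefined 64-bit shift. */
static uint64_t bits(int hi, int lo)
{
	uint64_t upper = (hi == 63) ? ~0ull : (1ull << (hi + 1)) - 1;

	return upper & ~((1ull << lo) - 1);
}
```

Rebuilding the mask from the documented ranges confirms the typo: the legacy mask comes out as 0xffcaf000, not 0xffbaf000.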
* Re: [PATCH 3/3 v4] kvm-unit-tests: nSVM: Test that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests 2020-07-08 11:07 ` Paolo Bonzini @ 2020-07-09 0:01 ` Krish Sadhukhan 2020-07-11 16:12 ` Nadav Amit 0 siblings, 1 reply; 13+ messages in thread From: Krish Sadhukhan @ 2020-07-09 0:01 UTC (permalink / raw) To: Paolo Bonzini, kvm On 7/8/20 4:07 AM, Paolo Bonzini wrote: > On 08/07/20 02:39, Krish Sadhukhan wrote: >> + SVM_TEST_CR_RESERVED_BITS(0, 2, 1, 3, cr3_saved, >> + SVM_CR3_LEGACY_PAE_RESERVED_MASK); >> + >> + cr4 = cr4_saved & ~X86_CR4_PAE; >> + vmcb->save.cr4 = cr4; >> + SVM_TEST_CR_RESERVED_BITS(0, 11, 2, 3, cr3_saved, >> + SVM_CR3_LEGACY_RESERVED_MASK); >> + >> + cr4 |= X86_CR4_PAE; >> + vmcb->save.cr4 = cr4; >> + efer |= EFER_LMA; >> + vmcb->save.efer = efer; >> + SVM_TEST_CR_RESERVED_BITS(0, 63, 2, 3, cr3_saved, >> + SVM_CR3_LONG_RESERVED_MASK); > The test is not covering *non*-reserved bits, so it doesn't catch that > in both cases KVM is checking against long-mode bits. Doing this would > require setting up the VMCB for immediate VMEXIT (for example, injecting > an event while the IDT limit is zero), so it can be done later. > > Instead, you need to set/clear EFER_LME. Please be more careful to > check that the test is covering what you expect. Sorry, I wasn't aware that setting/unsetting EFER.LMA wouldn't work ! Another test case that I missed here is testing the reserved bits in CR3[11:0] in long-mode based on the setting of CR4.PCIDE. 
>
> Also, the tests show
>
> PASS: Test CR3 2:0: 641001
> PASS: Test CR3 2:0: 2
> PASS: Test CR3 2:0: 4
> PASS: Test CR3 11:0: 1
> PASS: Test CR3 11:0: 4
> PASS: Test CR3 11:0: 40
> PASS: Test CR3 11:0: 100
> PASS: Test CR3 11:0: 400
> PASS: Test CR3 63:0: 1
> PASS: Test CR3 63:0: 4
> PASS: Test CR3 63:0: 40
> PASS: Test CR3 63:0: 100
> PASS: Test CR3 63:0: 400
> PASS: Test CR3 63:0: 10000000000000
> PASS: Test CR3 63:0: 40000000000000
> PASS: Test CR3 63:0: 100000000000000
> PASS: Test CR3 63:0: 400000000000000
> PASS: Test CR3 63:0: 1000000000000000
> PASS: Test CR3 63:0: 4000000000000000
> PASS: Test CR4 31:12: 0
> PASS: Test CR4 31:12: 0
>
> and then exits. There is an issue with compiler optimization for which
> I've sent a patch, but even after fixing it the premature exit is a
> problem: it is caused by a problem in __cr4_reserved_bits and a typo in
> the tests:
>
> diff --git a/x86/svm.h b/x86/svm.h
> index f6b9a31..58c9069 100644
> --- a/x86/svm.h
> +++ b/x86/svm.h
> @@ -328,8 +328,8 @@ struct __attribute__ ((__packed__)) vmcb {
>  #define SVM_CR3_LEGACY_RESERVED_MASK		0xfe7U
>  #define SVM_CR3_LEGACY_PAE_RESERVED_MASK	0x7U
>  #define SVM_CR3_LONG_RESERVED_MASK		0xfff0000000000fe7U
> -#define SVM_CR4_LEGACY_RESERVED_MASK		0xffbaf000U
> -#define SVM_CR4_RESERVED_MASK			0xffffffffffbaf000U
> +#define SVM_CR4_LEGACY_RESERVED_MASK		0xffcaf000U
> +#define SVM_CR4_RESERVED_MASK			0xffffffffffcaf000U
>  #define SVM_DR6_RESERVED_MASK			0xffffffffffff1ff0U
>  #define SVM_DR7_RESERVED_MASK			0xffffffff0000cc00U
>  #define SVM_EFER_RESERVED_MASK			0xffffffffffff0200U
>
> (Also, this kind of problem is made harder to notice by only testing
> even bits, which may make sense for high order bits, but certainly not
> for low-order ones).
>
> All in all, fixing this series has taken me almost 2 hours. Since I have
> done the work I'm queuing

Sorry to hear it caused so many issues! Thanks for looking into it!
>
> but, but I wonder: the compiler optimization
> issue could depend on register allocation, but did all of these issues
> really happen only on my machine?

Just as a reference, I compiled it using gcc 4.8.5 and ran it on an AMD
EPYC that was running kernel 5.8.0-rc4+. Surprisingly, all tests passed:

PASS: Test CR3 2:0: 641001
PASS: Test CR3 2:0: 641002
PASS: Test CR3 2:0: 641004
PASS: Test CR3 11:0: 641001
PASS: Test CR3 11:0: 641004
PASS: Test CR3 11:0: 641040
PASS: Test CR3 11:0: 641100
PASS: Test CR3 11:0: 641400
PASS: Test CR3 63:0: 641001
PASS: Test CR3 63:0: 641004
PASS: Test CR3 63:0: 641040
PASS: Test CR3 63:0: 641100
PASS: Test CR3 63:0: 641400
PASS: Test CR3 63:0: 10000000641000
PASS: Test CR3 63:0: 40000000641000
PASS: Test CR3 63:0: 100000000641000
PASS: Test CR3 63:0: 400000000641000
PASS: Test CR3 63:0: 1000000000641000
PASS: Test CR3 63:0: 4000000000641000
PASS: Test CR4 31:12: 1000
PASS: Test CR4 31:12: 4000
PASS: Test CR4 31:12: 100000
PASS: Test CR4 31:12: 1000000
PASS: Test CR4 31:12: 4000000
PASS: Test CR4 31:12: 10000000
PASS: Test CR4 31:12: 40000000
PASS: Test CR4 31:12: 1000
PASS: Test CR4 31:12: 4000
PASS: Test CR4 31:12: 100000
PASS: Test CR4 31:12: 1000000
PASS: Test CR4 31:12: 4000000
PASS: Test CR4 31:12: 10000000
PASS: Test CR4 31:12: 40000000
PASS: Test CR4 63:32: 100000000
PASS: Test CR4 63:32: 1000000000
PASS: Test CR4 63:32: 10000000000
PASS: Test CR4 63:32: 100000000000
PASS: Test CR4 63:32: 1000000000000
PASS: Test CR4 63:32: 10000000000000
PASS: Test CR4 63:32: 100000000000000
PASS: Test CR4 63:32: 1000000000000000

>
> Paolo
>
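[Editorial note: the CR3[11:0]/CR4.PCIDE case Krish says he missed can be sketched as follows. This is a hypothetical helper, not code from the series; the function name is invented, and the bit positions (PCIDE is CR4 bit 17, PWT is CR3 bit 3, PCD is CR3 bit 4) come from the architecture, not from the patches.]

```c
#include <assert.h>
#include <stdint.h>

#define X86_CR4_PCIDE	(1ULL << 17)
#define X86_CR3_PWT	(1ULL << 3)
#define X86_CR3_PCD	(1ULL << 4)

/*
 * Sketch of which low CR3 bits carry meaning in long mode: with
 * CR4.PCIDE = 1, CR3[11:0] holds the PCID, so none of the low bits
 * are reserved; with PCIDE = 0, only PWT and PCD are defined and the
 * remaining low bits are reserved.
 */
static uint64_t cr3_low_defined_bits(uint64_t cr4)
{
	if (cr4 & X86_CR4_PCIDE)
		return 0xfffULL;	/* the whole PCID field */
	return X86_CR3_PWT | X86_CR3_PCD;
}
```

A reserved-bit test would therefore have to run the CR3[11:0] cases twice, once per PCIDE setting, with a different expected mask each time.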
* Re: [PATCH 3/3 v4] kvm-unit-tests: nSVM: Test that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests
  2020-07-09  0:01     ` Krish Sadhukhan
@ 2020-07-11 16:12       ` Nadav Amit
  0 siblings, 0 replies; 13+ messages in thread
From: Nadav Amit @ 2020-07-11 16:12 UTC (permalink / raw)
  To: Krish Sadhukhan; +Cc: Paolo Bonzini, kvm

> On Jul 8, 2020, at 5:01 PM, Krish Sadhukhan <krish.sadhukhan@oracle.com> wrote:
>
> On 7/8/20 4:07 AM, Paolo Bonzini wrote:
>> On 08/07/20 02:39, Krish Sadhukhan wrote:
>>> +	SVM_TEST_CR_RESERVED_BITS(0, 2, 1, 3, cr3_saved,
>>> +	    SVM_CR3_LEGACY_PAE_RESERVED_MASK);
>>> +
>>> +	cr4 = cr4_saved & ~X86_CR4_PAE;
>>> +	vmcb->save.cr4 = cr4;
>>> +	SVM_TEST_CR_RESERVED_BITS(0, 11, 2, 3, cr3_saved,
>>> +	    SVM_CR3_LEGACY_RESERVED_MASK);
>>> +
>>> +	cr4 |= X86_CR4_PAE;
>>> +	vmcb->save.cr4 = cr4;
>>> +	efer |= EFER_LMA;
>>> +	vmcb->save.efer = efer;
>>> +	SVM_TEST_CR_RESERVED_BITS(0, 63, 2, 3, cr3_saved,
>>> +	    SVM_CR3_LONG_RESERVED_MASK);
>> The test is not covering *non*-reserved bits, so it doesn't catch that
>> in both cases KVM is checking against long-mode bits. Doing this would
>> require setting up the VMCB for immediate VMEXIT (for example, injecting
>> an event while the IDT limit is zero), so it can be done later.
>>
>> Instead, you need to set/clear EFER_LME. Please be more careful to
>> check that the test is covering what you expect.
>
> Sorry, I wasn't aware that setting/unsetting EFER.LMA wouldn't work !
>
> Another test case that I missed here is testing the reserved bits in
> CR3[11:0] in long-mode based on the setting of CR4.PCIDE.
>
>> Also, the tests show
>>
>> PASS: Test CR3 2:0: 641001
>> PASS: Test CR3 2:0: 2
>> PASS: Test CR3 2:0: 4
>> PASS: Test CR3 11:0: 1
>> PASS: Test CR3 11:0: 4
>> PASS: Test CR3 11:0: 40
>> PASS: Test CR3 11:0: 100
>> PASS: Test CR3 11:0: 400
>> PASS: Test CR3 63:0: 1
>> PASS: Test CR3 63:0: 4
>> PASS: Test CR3 63:0: 40
>> PASS: Test CR3 63:0: 100
>> PASS: Test CR3 63:0: 400
>> PASS: Test CR3 63:0: 10000000000000
>> PASS: Test CR3 63:0: 40000000000000
>> PASS: Test CR3 63:0: 100000000000000
>> PASS: Test CR3 63:0: 400000000000000
>> PASS: Test CR3 63:0: 1000000000000000
>> PASS: Test CR3 63:0: 4000000000000000
>> PASS: Test CR4 31:12: 0
>> PASS: Test CR4 31:12: 0
>>
>> and then exits. There is an issue with compiler optimization for which
>> I've sent a patch, but even after fixing it the premature exit is a
>> problem: it is caused by a problem in __cr4_reserved_bits and a typo in
>> the tests:
>>
>> diff --git a/x86/svm.h b/x86/svm.h
>> index f6b9a31..58c9069 100644
>> --- a/x86/svm.h
>> +++ b/x86/svm.h
>> @@ -328,8 +328,8 @@ struct __attribute__ ((__packed__)) vmcb {
>>  #define SVM_CR3_LEGACY_RESERVED_MASK		0xfe7U
>>  #define SVM_CR3_LEGACY_PAE_RESERVED_MASK	0x7U
>>  #define SVM_CR3_LONG_RESERVED_MASK		0xfff0000000000fe7U
>> -#define SVM_CR4_LEGACY_RESERVED_MASK		0xffbaf000U
>> -#define SVM_CR4_RESERVED_MASK			0xffffffffffbaf000U
>> +#define SVM_CR4_LEGACY_RESERVED_MASK		0xffcaf000U
>> +#define SVM_CR4_RESERVED_MASK			0xffffffffffcaf000U
>>  #define SVM_DR6_RESERVED_MASK			0xffffffffffff1ff0U
>>  #define SVM_DR7_RESERVED_MASK			0xffffffff0000cc00U
>>  #define SVM_EFER_RESERVED_MASK			0xffffffffffff0200U
>>
>> (Also, this kind of problem is made harder to notice by only testing
>> even bits, which may make sense for high order bits, but certainly not
>> for low-order ones).
>>
>> All in all, fixing this series has taken me almost 2 hours. Since I have
>> done the work I'm queuing
>
> Sorry to hear it caused so many issues ! Thanks for looking into it !
>
>> but, but I wonder: the compiler optimization
>> issue could depend on register allocation, but did all of these issues
>> really happen only on my machine?
>
> Just as a reference, I compiled it using gcc 4.8.5 and ran it on an AMD
> EPYC that was running kernel 5.8.0-rc4+. Surprisingly, all tests passed:
>
> PASS: Test CR3 2:0: 641001
> PASS: Test CR3 2:0: 641002
> PASS: Test CR3 2:0: 641004
> PASS: Test CR3 11:0: 641001
> PASS: Test CR3 11:0: 641004
> PASS: Test CR3 11:0: 641040
> PASS: Test CR3 11:0: 641100
> PASS: Test CR3 11:0: 641400
> PASS: Test CR3 63:0: 641001
> PASS: Test CR3 63:0: 641004
> PASS: Test CR3 63:0: 641040
> PASS: Test CR3 63:0: 641100
> PASS: Test CR3 63:0: 641400

Well, the tests (which I pulled) do not pass on bare metal, so KVM is
even better than bare metal… Here are the results for long mode:

FAIL: Test CR3 63:0: 641001
FAIL: Test CR3 63:0: 641002
FAIL: Test CR3 63:0: 641004
FAIL: Test CR3 63:0: 641020
FAIL: Test CR3 63:0: 641040
FAIL: Test CR3 63:0: 641080
FAIL: Test CR3 63:0: 641100
FAIL: Test CR3 63:0: 641200
FAIL: Test CR3 63:0: 641400
FAIL: Test CR3 63:0: 641800
PASS: Test CR3 63:0: 10000000641000
PASS: Test CR3 63:0: 20000000641000
PASS: Test CR3 63:0: 40000000641000
PASS: Test CR3 63:0: 80000000641000
PASS: Test CR3 63:0: 100000000641000
PASS: Test CR3 63:0: 200000000641000
PASS: Test CR3 63:0: 400000000641000
PASS: Test CR3 63:0: 800000000641000
PASS: Test CR3 63:0: 1000000000641000
PASS: Test CR3 63:0: 2000000000641000
PASS: Test CR3 63:0: 4000000000641000
PASS: Test CR3 63:0: 8000000000641000

The PAE/legacy tests crashed my machine. Presumably the VM PTEs are
completely messed up on PAE/legacy, so a failure can cause a crash. It
would be better to do something smarter to avoid a crash, perhaps by
setting an invalid PDE or something in the guest.

Anyhow, to double-check that the VM-entry was successful, I checked the
exit reason in long mode, and indeed I get a VMMCALL exit, and CR3 of
the guest holds the "illegal" value.
Checking the APM shows that the low bits of CR3 are marked as reserved
but not MBZ. So the condition that the test tries to check, "Any MBZ bit
of CR3 is set", does not apply to the low bits.
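[Editorial note: Nadav's observation amounts to splitting the test's combined mask in two. The sketch below is illustrative, not code from the series: it assumes a 52-bit maximum physical address, so only bits 63:52 are MBZ and can fail VMRUN, while the low 0xfe7 bits are merely reserved/ignored. The `MBZ`/`IGNORED` names are invented.]

```c
#include <assert.h>
#include <stdint.h>

/* The combined mask from x86/svm.h in the series. */
#define SVM_CR3_LONG_RESERVED_MASK	0xfff0000000000fe7ULL

/* Bits above an assumed 52-bit physical address width: truly MBZ. */
#define SVM_CR3_LONG_MBZ_MASK		0xfff0000000000000ULL

/* The remainder is reserved but ignored: setting it must not fail VMRUN. */
#define SVM_CR3_LONG_IGNORED_MASK \
	(SVM_CR3_LONG_RESERVED_MASK & ~SVM_CR3_LONG_MBZ_MASK)

/* Should VMRUN fail the consistency check for this guest CR3? */
static int cr3_violates_mbz(uint64_t cr3)
{
	return (cr3 & SVM_CR3_LONG_MBZ_MASK) != 0;
}
```

This split matches the bare-metal results above: CR3 values like 0x641001 (a low reserved bit set) pass VMRUN, while values with bit 52 or above set, like 0x10000000641000, fail.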
end of thread, other threads:[~2020-07-11 16:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08  0:39 [PATCH 0/3 v4] KVM: nSVM: Check MBZ bits in CR3 and CR4 on vmrun of nested guests Krish Sadhukhan
2020-07-08  0:39 ` [PATCH 1/3 v4] KVM: x86: Create mask for guest CR4 reserved bits in kvm_update_cpuid() Krish Sadhukhan
2020-07-08  9:48   ` Paolo Bonzini
2020-07-08  0:39 ` [PATCH 2/3 v4] KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests Krish Sadhukhan
2020-07-08 10:03   ` Paolo Bonzini
2020-07-08 21:36     ` Krish Sadhukhan
2020-07-08 22:51       ` Krish Sadhukhan
2020-07-08 23:07         ` Jim Mattson
2020-07-09  9:36           ` Paolo Bonzini
2020-07-08  0:39 ` [PATCH 3/3 v4] kvm-unit-tests: nSVM: Test " Krish Sadhukhan
2020-07-08 11:07   ` Paolo Bonzini
2020-07-09  0:01     ` Krish Sadhukhan
2020-07-11 16:12       ` Nadav Amit