* [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes
@ 2020-08-20 13:33 Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu Maxim Levitsky
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

Hi!

This patch series does some refactoring and implements on-demand allocation
of the nested state area. This way, at least guests that don't use nesting
won't waste memory on nested state.

Patches 1-3 are refactoring.

Patches 4 and 5 are new since V1 and implement stricter checking of the SMM
save state area on resume from SMM, to catch guest tampering with this area.

This was done to avoid crashing if the guest sets the 'guest was interrupted'
flag there while we don't have nested state allocated.

Patches 6 and 7 implement the on-demand nested state allocation.

The series was tested with various nested guests, in one case even with
an L3 running, but note that due to an unrelated issue, migration with a
nested guest running didn't work for me with or without this series.
I am currently investigating this.

Best regards,
	Maxim Levitsky

Maxim Levitsky (7):
  KVM: SVM: rename a variable in the svm_create_vcpu
  KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places
  KVM: SVM: refactor msr permission bitmap allocation
  KVM: x86: allow kvm_x86_ops.set_efer to return a value
  KVM: nSVM: more strict smm checks
  KVM: emulator: more strict rsm checks.
  KVM: nSVM: implement ondemand allocation of the nested state

 arch/x86/include/asm/kvm_host.h |   2 +-
 arch/x86/kvm/emulate.c          |  22 ++++--
 arch/x86/kvm/svm/nested.c       |  53 +++++++++++--
 arch/x86/kvm/svm/svm.c          | 130 ++++++++++++++++++--------------
 arch/x86/kvm/svm/svm.h          |  10 ++-
 arch/x86/kvm/vmx/vmx.c          |   5 +-
 arch/x86/kvm/x86.c              |   3 +-
 7 files changed, 151 insertions(+), 74 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  2020-08-20 20:52   ` Jim Mattson
  2020-08-20 13:33 ` [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places Maxim Levitsky
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

The 'page' variable holds the vcpu's VMCB, so name it as such to
avoid confusion.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/svm.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 03dd7bac8034..562a79e3e63a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1171,7 +1171,7 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm;
-	struct page *page;
+	struct page *vmcb_page;
 	struct page *msrpm_pages;
 	struct page *hsave_page;
 	struct page *nested_msrpm_pages;
@@ -1181,8 +1181,8 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	svm = to_svm(vcpu);
 
 	err = -ENOMEM;
-	page = alloc_page(GFP_KERNEL_ACCOUNT);
-	if (!page)
+	vmcb_page = alloc_page(GFP_KERNEL_ACCOUNT);
+	if (!vmcb_page)
 		goto out;
 
 	msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
@@ -1216,9 +1216,9 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	svm->nested.msrpm = page_address(nested_msrpm_pages);
 	svm_vcpu_init_msrpm(svm->nested.msrpm);
 
-	svm->vmcb = page_address(page);
+	svm->vmcb = page_address(vmcb_page);
 	clear_page(svm->vmcb);
-	svm->vmcb_pa = __sme_set(page_to_pfn(page) << PAGE_SHIFT);
+	svm->vmcb_pa = __sme_set(page_to_pfn(vmcb_page) << PAGE_SHIFT);
 	svm->asid_generation = 0;
 	init_vmcb(svm);
 
@@ -1234,7 +1234,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 free_page2:
 	__free_pages(msrpm_pages, MSRPM_ALLOC_ORDER);
 free_page1:
-	__free_page(page);
+	__free_page(vmcb_page);
 out:
 	return err;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  2020-08-20 21:00   ` Jim Mattson
  2020-08-20 13:33 ` [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation Maxim Levitsky
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

No functional changes.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 10 +++++-----
 arch/x86/kvm/svm/svm.c    | 13 +++++++------
 arch/x86/kvm/svm/svm.h    |  2 +-
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index fb68467e6049..f5b17920a2ca 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -431,7 +431,7 @@ int enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 {
 	int ret;
 
-	svm->nested.vmcb = vmcb_gpa;
+	svm->nested.vmcb12_gpa = vmcb_gpa;
 	load_nested_vmcb_control(svm, &nested_vmcb->control);
 	nested_prepare_vmcb_save(svm, nested_vmcb);
 	nested_prepare_vmcb_control(svm);
@@ -568,7 +568,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	struct vmcb *vmcb = svm->vmcb;
 	struct kvm_host_map map;
 
-	rc = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(svm->nested.vmcb), &map);
+	rc = kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(svm->nested.vmcb12_gpa), &map);
 	if (rc) {
 		if (rc == -EINVAL)
 			kvm_inject_gp(&svm->vcpu, 0);
@@ -579,7 +579,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	/* Exit Guest-Mode */
 	leave_guest_mode(&svm->vcpu);
-	svm->nested.vmcb = 0;
+	svm->nested.vmcb12_gpa = 0;
 	WARN_ON_ONCE(svm->nested.nested_run_pending);
 
 	/* in case we halted in L2 */
@@ -1018,7 +1018,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
 
 	/* First fill in the header and copy it out.  */
 	if (is_guest_mode(vcpu)) {
-		kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb;
+		kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
 		kvm_state.size += KVM_STATE_NESTED_SVM_VMCB_SIZE;
 		kvm_state.flags |= KVM_STATE_NESTED_GUEST_MODE;
 
@@ -1128,7 +1128,7 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	copy_vmcb_control_area(&hsave->control, &svm->vmcb->control);
 	hsave->save = save;
 
-	svm->nested.vmcb = kvm_state->hdr.svm.vmcb_pa;
+	svm->nested.vmcb12_gpa = kvm_state->hdr.svm.vmcb_pa;
 	load_nested_vmcb_control(svm, &ctl);
 	nested_prepare_vmcb_control(svm);
 
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 562a79e3e63a..d33013b9b4d7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1102,7 +1102,7 @@ static void init_vmcb(struct vcpu_svm *svm)
 	}
 	svm->asid_generation = 0;
 
-	svm->nested.vmcb = 0;
+	svm->nested.vmcb12_gpa = 0;
 	svm->vcpu.arch.hflags = 0;
 
 	if (!kvm_pause_in_guest(svm->vcpu.kvm)) {
@@ -3884,7 +3884,7 @@ static int svm_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 		/* FED8h - SVM Guest */
 		put_smstate(u64, smstate, 0x7ed8, 1);
 		/* FEE0h - SVM Guest VMCB Physical Address */
-		put_smstate(u64, smstate, 0x7ee0, svm->nested.vmcb);
+		put_smstate(u64, smstate, 0x7ee0, svm->nested.vmcb12_gpa);
 
 		svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
 		svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
@@ -3903,17 +3903,18 @@ static int svm_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 	struct vmcb *nested_vmcb;
 	struct kvm_host_map map;
 	u64 guest;
-	u64 vmcb;
+	u64 vmcb12_gpa;
 	int ret = 0;
 
 	guest = GET_SMSTATE(u64, smstate, 0x7ed8);
-	vmcb = GET_SMSTATE(u64, smstate, 0x7ee0);
+	vmcb12_gpa = GET_SMSTATE(u64, smstate, 0x7ee0);
 
 	if (guest) {
-		if (kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb), &map) == -EINVAL)
+		if (kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb12_gpa), &map) == -EINVAL)
 			return 1;
+
 		nested_vmcb = map.hva;
-		ret = enter_svm_guest_mode(svm, vmcb, nested_vmcb);
+		ret = enter_svm_guest_mode(svm, vmcb12_gpa, nested_vmcb);
 		kvm_vcpu_unmap(&svm->vcpu, &map, true);
 	}
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index a798e1731709..ab913468f9cb 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -85,7 +85,7 @@ struct svm_nested_state {
 	struct vmcb *hsave;
 	u64 hsave_msr;
 	u64 vm_cr_msr;
-	u64 vmcb;
+	u64 vmcb12_gpa;
 	u32 host_intercept_exceptions;
 
 	/* These are the merged vectors */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  2020-08-20 21:26   ` Jim Mattson
  2020-08-20 13:33 ` [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value Maxim Levitsky
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

Replace svm_vcpu_init_msrpm with svm_vcpu_alloc_msrpm, which also allocates
the MSR permission bitmap, and add svm_vcpu_free_msrpm to free it.

This will be used later to move the nested MSR permission bitmap allocation
to nested.c.

No functional change intended.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/svm.c | 45 +++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d33013b9b4d7..7bb094bf6494 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -609,18 +609,29 @@ static void set_msr_interception(u32 *msrpm, unsigned msr,
 	msrpm[offset] = tmp;
 }
 
-static void svm_vcpu_init_msrpm(u32 *msrpm)
+static u32 *svm_vcpu_alloc_msrpm(void)
 {
 	int i;
+	u32 *msrpm;
+	struct page *pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
+
+	if (!pages)
+		return NULL;
 
+	msrpm = page_address(pages);
 	memset(msrpm, 0xff, PAGE_SIZE * (1 << MSRPM_ALLOC_ORDER));
 
 	for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++) {
 		if (!direct_access_msrs[i].always)
 			continue;
-
 		set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
 	}
+	return msrpm;
+}
+
+static void svm_vcpu_free_msrpm(u32 *msrpm)
+{
+	__free_pages(virt_to_page(msrpm), MSRPM_ALLOC_ORDER);
 }
 
 static void add_msr_offset(u32 offset)
@@ -1172,9 +1183,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm;
 	struct page *vmcb_page;
-	struct page *msrpm_pages;
 	struct page *hsave_page;
-	struct page *nested_msrpm_pages;
 	int err;
 
 	BUILD_BUG_ON(offsetof(struct vcpu_svm, vcpu) != 0);
@@ -1185,21 +1194,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	if (!vmcb_page)
 		goto out;
 
-	msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
-	if (!msrpm_pages)
-		goto free_page1;
-
-	nested_msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
-	if (!nested_msrpm_pages)
-		goto free_page2;
-
 	hsave_page = alloc_page(GFP_KERNEL_ACCOUNT);
 	if (!hsave_page)
-		goto free_page3;
+		goto free_page1;
 
 	err = avic_init_vcpu(svm);
 	if (err)
-		goto free_page4;
+		goto free_page2;
 
 	/* We initialize this flag to true to make sure that the is_running
 	 * bit would be set the first time the vcpu is loaded.
@@ -1210,11 +1211,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	svm->nested.hsave = page_address(hsave_page);
 	clear_page(svm->nested.hsave);
 
-	svm->msrpm = page_address(msrpm_pages);
-	svm_vcpu_init_msrpm(svm->msrpm);
+	svm->msrpm = svm_vcpu_alloc_msrpm();
+	if (!svm->msrpm)
+		goto free_page2;
 
-	svm->nested.msrpm = page_address(nested_msrpm_pages);
-	svm_vcpu_init_msrpm(svm->nested.msrpm);
+	svm->nested.msrpm = svm_vcpu_alloc_msrpm();
+	if (!svm->nested.msrpm)
+		goto free_page3;
 
 	svm->vmcb = page_address(vmcb_page);
 	clear_page(svm->vmcb);
@@ -1227,12 +1230,10 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 
 	return 0;
 
-free_page4:
-	__free_page(hsave_page);
 free_page3:
-	__free_pages(nested_msrpm_pages, MSRPM_ALLOC_ORDER);
+	svm_vcpu_free_msrpm(svm->msrpm);
 free_page2:
-	__free_pages(msrpm_pages, MSRPM_ALLOC_ORDER);
+	__free_page(hsave_page);
 free_page1:
 	__free_page(vmcb_page);
 out:
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
                   ` (2 preceding siblings ...)
  2020-08-20 13:33 ` [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  2020-08-20 21:43   ` Jim Mattson
  2020-08-20 13:33 ` [PATCH v2 5/7] KVM: nSVM: more strict smm checks Maxim Levitsky
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

This will be used later to return an error when setting this MSR fails.

VMX already has an error condition when EFER is not in the shared MSR list,
so return an error in that case.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/svm/svm.c          | 3 ++-
 arch/x86/kvm/svm/svm.h          | 2 +-
 arch/x86/kvm/vmx/vmx.c          | 5 +++--
 arch/x86/kvm/x86.c              | 3 ++-
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5ab3af7275d8..bd0519e26053 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1069,7 +1069,7 @@ struct kvm_x86_ops {
 	void (*get_cs_db_l_bits)(struct kvm_vcpu *vcpu, int *db, int *l);
 	void (*set_cr0)(struct kvm_vcpu *vcpu, unsigned long cr0);
 	int (*set_cr4)(struct kvm_vcpu *vcpu, unsigned long cr4);
-	void (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
+	int (*set_efer)(struct kvm_vcpu *vcpu, u64 efer);
 	void (*get_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
 	void (*set_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
 	void (*get_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 7bb094bf6494..f4569899361f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -263,7 +263,7 @@ static int get_max_npt_level(void)
 #endif
 }
 
-void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
+int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 	vcpu->arch.efer = efer;
@@ -283,6 +283,7 @@ void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 
 	svm->vmcb->save.efer = efer | EFER_SVME;
 	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
+	return 0;
 }
 
 static int is_external_interrupt(u32 info)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index ab913468f9cb..468c58a91534 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -349,7 +349,7 @@ static inline bool gif_set(struct vcpu_svm *svm)
 #define MSR_INVALID				0xffffffffU
 
 u32 svm_msrpm_offset(u32 msr);
-void svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
+int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
 void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
 void svm_flush_tlb(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 46ba2e03a892..e90b9e68c7ea 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2862,13 +2862,13 @@ static void enter_rmode(struct kvm_vcpu *vcpu)
 	kvm_mmu_reset_context(vcpu);
 }
 
-void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
+int vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	struct shared_msr_entry *msr = find_msr_entry(vmx, MSR_EFER);
 
 	if (!msr)
-		return;
+		return 1;
 
 	vcpu->arch.efer = efer;
 	if (efer & EFER_LMA) {
@@ -2880,6 +2880,7 @@ void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 		msr->data = efer & ~EFER_LME;
 	}
 	setup_msrs(vmx);
+	return 0;
 }
 
 #ifdef CONFIG_X86_64
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2db369a64f29..cad5d9778a21 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1471,7 +1471,8 @@ static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	efer &= ~EFER_LMA;
 	efer |= vcpu->arch.efer & EFER_LMA;
 
-	kvm_x86_ops.set_efer(vcpu, efer);
+	if (kvm_x86_ops.set_efer(vcpu, efer))
+		return 1;
 
 	/* Update reserved bits */
 	if ((efer ^ old_efer) & EFER_NX)
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 5/7] KVM: nSVM: more strict smm checks
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
                   ` (3 preceding siblings ...)
  2020-08-20 13:33 ` [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 6/7] KVM: emulator: more strict rsm checks Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 7/7] KVM: nSVM: implement ondemand allocation of the nested state Maxim Levitsky
  6 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

* Check that the guest is a 64-bit guest, otherwise the fields in the SMM
  state area are not defined.

* If the SMM state area indicates that SMM interrupted a running guest,
  check that EFER.SVME, which is also saved in this area, is set; otherwise
  the guest might have tampered with the SMM save area, so indicate an
  emulation failure, which should triple fault the guest.

* Check that the guest CPUID supports SVM (due to the same issue as above).

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/svm.c | 29 ++++++++++++++++++-----------
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index f4569899361f..2ac13420055d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3902,22 +3902,29 @@ static int svm_pre_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
 static int svm_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	struct vmcb *nested_vmcb;
 	struct kvm_host_map map;
-	u64 guest;
-	u64 vmcb12_gpa;
 	int ret = 0;
 
-	guest = GET_SMSTATE(u64, smstate, 0x7ed8);
-	vmcb12_gpa = GET_SMSTATE(u64, smstate, 0x7ee0);
+	if (guest_cpuid_has(vcpu, X86_FEATURE_LM)) {
+		u64 saved_efer = GET_SMSTATE(u64, smstate, 0x7ed0);
+		u64 guest = GET_SMSTATE(u64, smstate, 0x7ed8);
+		u64 vmcb12_gpa = GET_SMSTATE(u64, smstate, 0x7ee0);
 
-	if (guest) {
-		if (kvm_vcpu_map(&svm->vcpu, gpa_to_gfn(vmcb12_gpa), &map) == -EINVAL)
-			return 1;
+		if (guest) {
 
-		nested_vmcb = map.hva;
-		ret = enter_svm_guest_mode(svm, vmcb12_gpa, nested_vmcb);
-		kvm_vcpu_unmap(&svm->vcpu, &map, true);
+			if (!guest_cpuid_has(vcpu, X86_FEATURE_SVM))
+				return 1;
+
+			if (!(saved_efer & EFER_SVME))
+				return 1;
+
+			if (kvm_vcpu_map(&svm->vcpu,
+					 gpa_to_gfn(vmcb12_gpa), &map) == -EINVAL)
+				return 1;
+
+			ret = enter_svm_guest_mode(svm, vmcb12_gpa, map.hva);
+			kvm_vcpu_unmap(&svm->vcpu, &map, true);
+		}
 	}
 
 	return ret;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 6/7] KVM: emulator: more strict rsm checks.
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
                   ` (4 preceding siblings ...)
  2020-08-20 13:33 ` [PATCH v2 5/7] KVM: nSVM: more strict smm checks Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  2020-08-20 13:33 ` [PATCH v2 7/7] KVM: nSVM: implement ondemand allocation of the nested state Maxim Levitsky
  6 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

Don't ignore return values in rsm_load_state_64/32, to avoid
loading invalid state from the SMM state area if it was tampered with
by the guest.

This is primarily intended to keep the guest from sneaking bad bits into
EFER (like EFER.SVME when nesting is disabled) by manipulating the SMM
save area.

The DR checks are probably redundant, since the code sets these registers
to fixed values anyway, but it won't hurt to have them.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/emulate.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index d0e2825ae617..1d450d7710d6 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2505,9 +2505,14 @@ static int rsm_load_state_32(struct x86_emulate_ctxt *ctxt,
 		*reg_write(ctxt, i) = GET_SMSTATE(u32, smstate, 0x7fd0 + i * 4);
 
 	val = GET_SMSTATE(u32, smstate, 0x7fcc);
-	ctxt->ops->set_dr(ctxt, 6, (val & DR6_VOLATILE) | DR6_FIXED_1);
+
+	if (ctxt->ops->set_dr(ctxt, 6, (val & DR6_VOLATILE) | DR6_FIXED_1))
+		return X86EMUL_UNHANDLEABLE;
+
 	val = GET_SMSTATE(u32, smstate, 0x7fc8);
-	ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1);
+
+	if (ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1))
+		return X86EMUL_UNHANDLEABLE;
 
 	selector =                 GET_SMSTATE(u32, smstate, 0x7fc4);
 	set_desc_base(&desc,       GET_SMSTATE(u32, smstate, 0x7f64));
@@ -2560,16 +2565,23 @@ static int rsm_load_state_64(struct x86_emulate_ctxt *ctxt,
 	ctxt->eflags = GET_SMSTATE(u32, smstate, 0x7f70) | X86_EFLAGS_FIXED;
 
 	val = GET_SMSTATE(u32, smstate, 0x7f68);
-	ctxt->ops->set_dr(ctxt, 6, (val & DR6_VOLATILE) | DR6_FIXED_1);
+
+	if (ctxt->ops->set_dr(ctxt, 6, (val & DR6_VOLATILE) | DR6_FIXED_1))
+		return X86EMUL_UNHANDLEABLE;
+
 	val = GET_SMSTATE(u32, smstate, 0x7f60);
-	ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1);
+
+	if (ctxt->ops->set_dr(ctxt, 7, (val & DR7_VOLATILE) | DR7_FIXED_1))
+		return X86EMUL_UNHANDLEABLE;
 
 	cr0 =                       GET_SMSTATE(u64, smstate, 0x7f58);
 	cr3 =                       GET_SMSTATE(u64, smstate, 0x7f50);
 	cr4 =                       GET_SMSTATE(u64, smstate, 0x7f48);
 	ctxt->ops->set_smbase(ctxt, GET_SMSTATE(u32, smstate, 0x7f00));
 	val =                       GET_SMSTATE(u64, smstate, 0x7ed0);
-	ctxt->ops->set_msr(ctxt, MSR_EFER, val & ~EFER_LMA);
+
+	if (ctxt->ops->set_msr(ctxt, MSR_EFER, val & ~EFER_LMA))
+		return X86EMUL_UNHANDLEABLE;
 
 	selector =                  GET_SMSTATE(u32, smstate, 0x7e90);
 	rsm_set_desc_flags(&desc,   GET_SMSTATE(u32, smstate, 0x7e92) << 8);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 7/7] KVM: nSVM: implement ondemand allocation of the nested state
  2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
                   ` (5 preceding siblings ...)
  2020-08-20 13:33 ` [PATCH v2 6/7] KVM: emulator: more strict rsm checks Maxim Levitsky
@ 2020-08-20 13:33 ` Maxim Levitsky
  6 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-20 13:33 UTC (permalink / raw)
  To: kvm
  Cc: maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Jim Mattson, Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini, Maxim Levitsky

This way we don't waste memory on VMs which don't enable
nested virtualization.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 43 ++++++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c    | 56 +++++++++++++++++++++------------------
 arch/x86/kvm/svm/svm.h    |  6 +++++
 3 files changed, 79 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index f5b17920a2ca..c7c7df58ef38 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -473,6 +473,9 @@ int nested_svm_vmrun(struct vcpu_svm *svm)
 
 	nested_vmcb = map.hva;
 
+	if (WARN_ON(!svm->nested.initialized))
+		return 1;
+
 	if (!nested_vmcb_checks(svm, nested_vmcb)) {
 		nested_vmcb->control.exit_code    = SVM_EXIT_ERR;
 		nested_vmcb->control.exit_code_hi = 0;
@@ -686,6 +689,46 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	return 0;
 }
 
+int svm_allocate_nested(struct vcpu_svm *svm)
+{
+	struct page *hsave_page;
+
+	if (svm->nested.initialized)
+		return 0;
+
+	hsave_page = alloc_page(GFP_KERNEL_ACCOUNT);
+	if (!hsave_page)
+		goto free_page1;
+
+	svm->nested.hsave = page_address(hsave_page);
+	clear_page(svm->nested.hsave);
+
+	svm->nested.msrpm = svm_vcpu_alloc_msrpm();
+	if (!svm->nested.msrpm)
+		goto free_page2;
+
+	svm->nested.initialized = true;
+	return 0;
+free_page2:
+	__free_page(hsave_page);
+free_page1:
+	return 1;
+}
+
+void svm_free_nested(struct vcpu_svm *svm)
+{
+	if (!svm->nested.initialized)
+		return;
+
+	svm_vcpu_free_msrpm(svm->nested.msrpm);
+	svm->nested.msrpm = NULL;
+
+	__free_page(virt_to_page(svm->nested.hsave));
+	svm->nested.hsave = NULL;
+
+	svm->nested.initialized = false;
+}
+
 /*
  * Forcibly leave nested mode in order to be able to reset the VCPU later on.
  */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2ac13420055d..fdd41053c42b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -266,6 +266,7 @@ static int get_max_npt_level(void)
 int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 old_efer = vcpu->arch.efer;
 	vcpu->arch.efer = efer;
 
 	if (!npt_enabled) {
@@ -276,14 +277,31 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 			efer &= ~EFER_LME;
 	}
 
-	if (!(efer & EFER_SVME)) {
-		svm_leave_nested(svm);
-		svm_set_gif(svm, true);
+	if ((old_efer & EFER_SVME) != (efer & EFER_SVME)) {
+		if (!(efer & EFER_SVME)) {
+			svm_leave_nested(svm);
+			svm_set_gif(svm, true);
+
+			/*
+			 * Free the nested state unless we are in SMM, in which
+			 * case the exit from SVM mode is only for duration of the SMI
+			 * handler
+			 */
+			if (!is_smm(&svm->vcpu))
+				svm_free_nested(svm);
+
+		} else {
+			if (svm_allocate_nested(svm))
+				goto error;
+		}
 	}
 
 	svm->vmcb->save.efer = efer | EFER_SVME;
 	vmcb_mark_dirty(svm->vmcb, VMCB_CR);
 	return 0;
+error:
+	vcpu->arch.efer = old_efer;
+	return 1;
 }
 
 static int is_external_interrupt(u32 info)
@@ -610,7 +628,7 @@ static void set_msr_interception(u32 *msrpm, unsigned msr,
 	msrpm[offset] = tmp;
 }
 
-static u32 *svm_vcpu_alloc_msrpm(void)
+u32 *svm_vcpu_alloc_msrpm(void)
 {
 	int i;
 	u32 *msrpm;
@@ -630,7 +648,7 @@ static u32 *svm_vcpu_alloc_msrpm(void)
 	return msrpm;
 }
 
-static void svm_vcpu_free_msrpm(u32 *msrpm)
+void svm_vcpu_free_msrpm(u32 *msrpm)
 {
 	__free_pages(virt_to_page(msrpm), MSRPM_ALLOC_ORDER);
 }
@@ -1184,7 +1202,6 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm;
 	struct page *vmcb_page;
-	struct page *hsave_page;
 	int err;
 
 	BUILD_BUG_ON(offsetof(struct vcpu_svm, vcpu) != 0);
@@ -1195,13 +1212,9 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	if (!vmcb_page)
 		goto out;
 
-	hsave_page = alloc_page(GFP_KERNEL_ACCOUNT);
-	if (!hsave_page)
-		goto free_page1;
-
 	err = avic_init_vcpu(svm);
 	if (err)
-		goto free_page2;
+		goto out;
 
 	/* We initialize this flag to true to make sure that the is_running
 	 * bit would be set the first time the vcpu is loaded.
@@ -1209,16 +1222,9 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	if (irqchip_in_kernel(vcpu->kvm) && kvm_apicv_activated(vcpu->kvm))
 		svm->avic_is_running = true;
 
-	svm->nested.hsave = page_address(hsave_page);
-	clear_page(svm->nested.hsave);
-
 	svm->msrpm = svm_vcpu_alloc_msrpm();
 	if (!svm->msrpm)
-		goto free_page2;
-
-	svm->nested.msrpm = svm_vcpu_alloc_msrpm();
-	if (!svm->nested.msrpm)
-		goto free_page3;
+		goto free_page;
 
 	svm->vmcb = page_address(vmcb_page);
 	clear_page(svm->vmcb);
@@ -1231,11 +1237,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 
 	return 0;
 
-free_page3:
-	svm_vcpu_free_msrpm(svm->msrpm);
-free_page2:
-	__free_page(hsave_page);
-free_page1:
+free_page:
 	__free_page(vmcb_page);
 out:
 	return err;
@@ -1260,10 +1262,10 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
 	 */
 	svm_clear_current_vmcb(svm->vmcb);
 
+	svm_free_nested(svm);
+
 	__free_page(pfn_to_page(__sme_clr(svm->vmcb_pa) >> PAGE_SHIFT));
 	__free_pages(virt_to_page(svm->msrpm), MSRPM_ALLOC_ORDER);
-	__free_page(virt_to_page(svm->nested.hsave));
-	__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
 }
 
 static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
@@ -3922,6 +3924,8 @@ static int svm_pre_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
 					 gpa_to_gfn(vmcb12_gpa), &map) == -EINVAL)
 				return 1;
 
+			svm_allocate_nested(svm);
+
 			ret = enter_svm_guest_mode(svm, vmcb12_gpa, map.hva);
 			kvm_vcpu_unmap(&svm->vcpu, &map, true);
 		}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 468c58a91534..eae7e3a8752f 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -97,6 +97,8 @@ struct svm_nested_state {
 
 	/* cache for control fields of the guest */
 	struct vmcb_control_area ctl;
+
+	bool initialized;
 };
 
 struct vcpu_svm {
@@ -349,6 +351,8 @@ static inline bool gif_set(struct vcpu_svm *svm)
 #define MSR_INVALID				0xffffffffU
 
 u32 svm_msrpm_offset(u32 msr);
+u32 *svm_vcpu_alloc_msrpm(void);
+void svm_vcpu_free_msrpm(u32 *msrpm);
 int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer);
 void svm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0);
 int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
@@ -390,6 +394,8 @@ static inline bool nested_exit_on_nmi(struct vcpu_svm *svm)
 int enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
 			 struct vmcb *nested_vmcb);
 void svm_leave_nested(struct vcpu_svm *svm);
+void svm_free_nested(struct vcpu_svm *svm);
+int svm_allocate_nested(struct vcpu_svm *svm);
 int nested_svm_vmrun(struct vcpu_svm *svm);
 void nested_svm_vmloadsave(struct vmcb *from_vmcb, struct vmcb *to_vmcb);
 int nested_svm_vmexit(struct vcpu_svm *svm);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu
  2020-08-20 13:33 ` [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu Maxim Levitsky
@ 2020-08-20 20:52   ` Jim Mattson
  0 siblings, 0 replies; 16+ messages in thread
From: Jim Mattson @ 2020-08-20 20:52 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini

On Thu, Aug 20, 2020 at 6:33 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> The 'page' is to hold the vcpu's vmcb so name it as such to
> avoid confusion.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places
  2020-08-20 13:33 ` [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places Maxim Levitsky
@ 2020-08-20 21:00   ` Jim Mattson
  2020-08-24 11:37     ` Maxim Levitsky
  0 siblings, 1 reply; 16+ messages in thread
From: Jim Mattson @ 2020-08-20 21:00 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini

On Thu, Aug 20, 2020 at 6:33 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> No functional changes.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/nested.c | 10 +++++-----
>  arch/x86/kvm/svm/svm.c    | 13 +++++++------
>  arch/x86/kvm/svm/svm.h    |  2 +-
>  3 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index fb68467e6049..f5b17920a2ca 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -431,7 +431,7 @@ int enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
For consistency, should the vmcb_gpa argument be renamed to vmcb12_gpa as well?


> @@ -579,7 +579,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
>
>         /* Exit Guest-Mode */
>         leave_guest_mode(&svm->vcpu);
> -       svm->nested.vmcb = 0;
> +       svm->nested.vmcb12_gpa = 0;
Perhaps in a follow-up change, this could be set to an illegal value
rather than 0?
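
Something like the following, just to illustrate (the sentinel name here is
made up):

	/* hypothetical sentinel meaning "no vmcb12 in use"; any impossible GPA works */
	#define INVALID_VMCB12_GPA	(~0ULL)

	svm->nested.vmcb12_gpa = INVALID_VMCB12_GPA;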


> @@ -1018,7 +1018,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
>
>         /* First fill in the header and copy it out.  */
>         if (is_guest_mode(vcpu)) {
> -               kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb;
> +               kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
It's unfortunate that we have "_pa" on the LHS and "_gpa" on the RHS. Oh, well.


> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 562a79e3e63a..d33013b9b4d7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1102,7 +1102,7 @@ static void init_vmcb(struct vcpu_svm *svm)
>         }
>         svm->asid_generation = 0;
>
> -       svm->nested.vmcb = 0;
> +       svm->nested.vmcb12_gpa = 0;
Here, too, perhaps this could be changed from 0 to an illegal value in
a follow-up change.

Reviewed-by: Jim Mattson <jmattson@google.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation
  2020-08-20 13:33 ` [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation Maxim Levitsky
@ 2020-08-20 21:26   ` Jim Mattson
  2020-08-24 11:43     ` Maxim Levitsky
  0 siblings, 1 reply; 16+ messages in thread
From: Jim Mattson @ 2020-08-20 21:26 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini

On Thu, Aug 20, 2020 at 6:34 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> Replace svm_vcpu_init_msrpm with svm_vcpu_alloc_msrpm, that also allocates
> the msr bitmap and add svm_vcpu_free_msrpm to free it.
>
> This will be used later to move the nested msr permission bitmap allocation
> to nested.c
>
> No functional change intended.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---
>  arch/x86/kvm/svm/svm.c | 45 +++++++++++++++++++++---------------------
>  1 file changed, 23 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index d33013b9b4d7..7bb094bf6494 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -609,18 +609,29 @@ static void set_msr_interception(u32 *msrpm, unsigned msr,
>         msrpm[offset] = tmp;
>  }
>
> -static void svm_vcpu_init_msrpm(u32 *msrpm)
> +static u32 *svm_vcpu_alloc_msrpm(void)

I prefer the original name, since this function does more than allocation.

>  {
>         int i;
> +       u32 *msrpm;
> +       struct page *pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
> +
> +       if (!pages)
> +               return NULL;
>
> +       msrpm = page_address(pages);
>         memset(msrpm, 0xff, PAGE_SIZE * (1 << MSRPM_ALLOC_ORDER));
>
>         for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++) {
>                 if (!direct_access_msrs[i].always)
>                         continue;
> -
>                 set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
>         }
> +       return msrpm;
> +}
> +
> +static void svm_vcpu_free_msrpm(u32 *msrpm)
> +{
> +       __free_pages(virt_to_page(msrpm), MSRPM_ALLOC_ORDER);
>  }
>
>  static void add_msr_offset(u32 offset)
> @@ -1172,9 +1183,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
>  {
>         struct vcpu_svm *svm;
>         struct page *vmcb_page;
> -       struct page *msrpm_pages;
>         struct page *hsave_page;
> -       struct page *nested_msrpm_pages;
>         int err;
>
>         BUILD_BUG_ON(offsetof(struct vcpu_svm, vcpu) != 0);
> @@ -1185,21 +1194,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
>         if (!vmcb_page)
>                 goto out;
>
> -       msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
> -       if (!msrpm_pages)
> -               goto free_page1;
> -
> -       nested_msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
> -       if (!nested_msrpm_pages)
> -               goto free_page2;
> -

Reordering the allocations does seem like a functional change to me,
albeit one that should (hopefully) be benign. For example, if the
MSRPM_ALLOC_ORDER allocations fail, in the new version of the code,
the hsave_page will be cleared, but in the old version of the code, no
page would be cleared.

>         hsave_page = alloc_page(GFP_KERNEL_ACCOUNT);

Speaking of clearing pages, why not add __GFP_ZERO to the flags above
and skip the clear_page() call below?
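
Something along these lines (untested sketch):

	hsave_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
	if (!hsave_page)
		goto free_page1;
	...
	svm->nested.hsave = page_address(hsave_page);
	/* no clear_page() needed, the page is already zeroed */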

>         if (!hsave_page)
> -               goto free_page3;
> +               goto free_page1;
>
>         err = avic_init_vcpu(svm);
>         if (err)
> -               goto free_page4;
> +               goto free_page2;
>
>         /* We initialize this flag to true to make sure that the is_running
>          * bit would be set the first time the vcpu is loaded.
> @@ -1210,11 +1211,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
>         svm->nested.hsave = page_address(hsave_page);
>         clear_page(svm->nested.hsave);
>
> -       svm->msrpm = page_address(msrpm_pages);
> -       svm_vcpu_init_msrpm(svm->msrpm);
> +       svm->msrpm = svm_vcpu_alloc_msrpm();
> +       if (!svm->msrpm)
> +               goto free_page2;
>
> -       svm->nested.msrpm = page_address(nested_msrpm_pages);
> -       svm_vcpu_init_msrpm(svm->nested.msrpm);
> +       svm->nested.msrpm = svm_vcpu_alloc_msrpm();
> +       if (!svm->nested.msrpm)
> +               goto free_page3;
>
>         svm->vmcb = page_address(vmcb_page);
>         clear_page(svm->vmcb);
> @@ -1227,12 +1230,10 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
>
>         return 0;
>
> -free_page4:
> -       __free_page(hsave_page);
>  free_page3:
> -       __free_pages(nested_msrpm_pages, MSRPM_ALLOC_ORDER);
> +       svm_vcpu_free_msrpm(svm->msrpm);
>  free_page2:
> -       __free_pages(msrpm_pages, MSRPM_ALLOC_ORDER);
> +       __free_page(hsave_page);
>  free_page1:
>         __free_page(vmcb_page);
>  out:

While you're here, could you improve these labels? Coding-style.rst says:

Choose label names which say what the goto does or why the goto exists.  An
example of a good name could be ``out_free_buffer:`` if the goto frees
``buffer``.
Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as you would have to
renumber them if you ever add or remove exit paths, and they make correctness
difficult to verify anyway.
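
For this function that could look roughly like this (the names are only a
suggestion, and the gotos would be updated to match):

free_msrpm:
	svm_vcpu_free_msrpm(svm->msrpm);
free_hsave:
	__free_page(hsave_page);
free_vmcb:
	__free_page(vmcb_page);
out:
	return err;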

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value
  2020-08-20 13:33 ` [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value Maxim Levitsky
@ 2020-08-20 21:43   ` Jim Mattson
  2020-08-21  0:43     ` Sean Christopherson
  0 siblings, 1 reply; 16+ messages in thread
From: Jim Mattson @ 2020-08-20 21:43 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini

On Thu, Aug 20, 2020 at 6:34 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> This will be used later to return an error when setting this msr fails.
>
> For VMX, it already has an error condition when EFER is
> not in the shared MSR list, so return an error in this case.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> ---

> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1471,7 +1471,8 @@ static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>         efer &= ~EFER_LMA;
>         efer |= vcpu->arch.efer & EFER_LMA;
>
> -       kvm_x86_ops.set_efer(vcpu, efer);
> +       if (kvm_x86_ops.set_efer(vcpu, efer))
> +               return 1;

This seems like a userspace ABI change to me. Previously, it looks
like userspace could always use KVM_SET_MSRS to set MSR_EFER to 0 or
EFER_SCE, and it would always succeed. Now, it looks like it will fail
on CPUs that don't support EFER in hardware. (Perhaps it should fail,
but it didn't before, AFAICT.)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value
  2020-08-20 21:43   ` Jim Mattson
@ 2020-08-21  0:43     ` Sean Christopherson
  2020-08-27 10:23       ` Maxim Levitsky
  0 siblings, 1 reply; 16+ messages in thread
From: Sean Christopherson @ 2020-08-21  0:43 UTC (permalink / raw)
  To: Jim Mattson
  Cc: Maxim Levitsky, kvm list,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Joerg Roedel, Wanpeng Li, Borislav Petkov,
	Vitaly Kuznetsov, Paolo Bonzini

On Thu, Aug 20, 2020 at 02:43:56PM -0700, Jim Mattson wrote:
> On Thu, Aug 20, 2020 at 6:34 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> >
> > This will be used later to return an error when setting this msr fails.
> >
> > For VMX, it already has an error condition when EFER is
> > not in the shared MSR list, so return an error in this case.
> >
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> 
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -1471,7 +1471,8 @@ static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> >         efer &= ~EFER_LMA;
> >         efer |= vcpu->arch.efer & EFER_LMA;
> >
> > -       kvm_x86_ops.set_efer(vcpu, efer);
> > +       if (kvm_x86_ops.set_efer(vcpu, efer))
> > +               return 1;
> 
> This seems like a userspace ABI change to me. Previously, it looks
> like userspace could always use KVM_SET_MSRS to set MSR_EFER to 0 or
> EFER_SCE, and it would always succeed. Now, it looks like it will fail
> on CPUs that don't support EFER in hardware. (Perhaps it should fail,
> but it didn't before, AFAICT.)

KVM emulates SYSCALL, presumably that also works when EFER doesn't exist in
hardware.

The above also adds weirdness to nested VMX as vmx_set_efer() simply can't
fail.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places
  2020-08-20 21:00   ` Jim Mattson
@ 2020-08-24 11:37     ` Maxim Levitsky
  0 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-24 11:37 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini

On Thu, 2020-08-20 at 14:00 -0700, Jim Mattson wrote:
> On Thu, Aug 20, 2020 at 6:33 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > No functional changes.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/svm/nested.c | 10 +++++-----
> >  arch/x86/kvm/svm/svm.c    | 13 +++++++------
> >  arch/x86/kvm/svm/svm.h    |  2 +-
> >  3 files changed, 13 insertions(+), 12 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> > index fb68467e6049..f5b17920a2ca 100644
> > --- a/arch/x86/kvm/svm/nested.c
> > +++ b/arch/x86/kvm/svm/nested.c
> > @@ -431,7 +431,7 @@ int enter_svm_guest_mode(struct vcpu_svm *svm, u64 vmcb_gpa,
> For consistency, should the vmcb_gpa argument be renamed to vmcb12_gpa as well?

I went over all of nested.c and renamed all mentions of vmcb that refer to the guest's vmcb to vmcb12,
and mentions of nested_vmcb to vmcb12 as well. I hope I didn't make this patch too much larger.
I updated the patch subject too.
> 
> 
> > @@ -579,7 +579,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> > 
> >         /* Exit Guest-Mode */
> >         leave_guest_mode(&svm->vcpu);
> > -       svm->nested.vmcb = 0;
> > +       svm->nested.vmcb12_gpa = 0;
> Perhaps in a follow-up change, this could be set to an illegal value
> rather than 0?
Or rather not reset this address at all, as I did later in the
caching patches which I dropped for now.

> 
> 
> > @@ -1018,7 +1018,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
> > 
> >         /* First fill in the header and copy it out.  */
> >         if (is_guest_mode(vcpu)) {
> > -               kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb;
> > +               kvm_state.hdr.svm.vmcb_pa = svm->nested.vmcb12_gpa;
> It's unfortunate that we have "_pa" on the LHS on "_gpa" on the RHS. Oh, well.
I was afraid to touch this struct since it is user visible. I noticed it.

> 
> 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 562a79e3e63a..d33013b9b4d7 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -1102,7 +1102,7 @@ static void init_vmcb(struct vcpu_svm *svm)
> >         }
> >         svm->asid_generation = 0;
> > 
> > -       svm->nested.vmcb = 0;
> > +       svm->nested.vmcb12_gpa = 0;
> Here, too, perhaps this could be changed from 0 to an illegal value in
> a follow-up change.
> 
> Reviewed-by: Jim Mattson <jmattson@google.com>
> 

Thanks for the review,
	Best regards,
		Maxim Levitsky


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation
  2020-08-20 21:26   ` Jim Mattson
@ 2020-08-24 11:43     ` Maxim Levitsky
  0 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-24 11:43 UTC (permalink / raw)
  To: Jim Mattson
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Sean Christopherson, Joerg Roedel, Wanpeng Li,
	Borislav Petkov, Vitaly Kuznetsov, Paolo Bonzini

On Thu, 2020-08-20 at 14:26 -0700, Jim Mattson wrote:
> On Thu, Aug 20, 2020 at 6:34 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > Replace svm_vcpu_init_msrpm with svm_vcpu_alloc_msrpm, that also allocates
> > the msr bitmap and add svm_vcpu_free_msrpm to free it.
> > 
> > This will be used later to move the nested msr permission bitmap allocation
> > to nested.c
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > ---
> >  arch/x86/kvm/svm/svm.c | 45 +++++++++++++++++++++---------------------
> >  1 file changed, 23 insertions(+), 22 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index d33013b9b4d7..7bb094bf6494 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -609,18 +609,29 @@ static void set_msr_interception(u32 *msrpm, unsigned msr,
> >         msrpm[offset] = tmp;
> >  }
> > 
> > -static void svm_vcpu_init_msrpm(u32 *msrpm)
> > +static u32 *svm_vcpu_alloc_msrpm(void)
> 
> I prefer the original name, since this function does more than allocation.
But it also allocates it. I don't mind using the old name though.
> 
> >  {
> >         int i;
> > +       u32 *msrpm;
> > +       struct page *pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
> > +
> > +       if (!pages)
> > +               return NULL;
> > 
> > +       msrpm = page_address(pages);
> >         memset(msrpm, 0xff, PAGE_SIZE * (1 << MSRPM_ALLOC_ORDER));
> > 
> >         for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++) {
> >                 if (!direct_access_msrs[i].always)
> >                         continue;
> > -
> >                 set_msr_interception(msrpm, direct_access_msrs[i].index, 1, 1);
> >         }
> > +       return msrpm;
> > +}
> > +
> > +static void svm_vcpu_free_msrpm(u32 *msrpm)
> > +{
> > +       __free_pages(virt_to_page(msrpm), MSRPM_ALLOC_ORDER);
> >  }
> > 
> >  static void add_msr_offset(u32 offset)
> > @@ -1172,9 +1183,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
> >  {
> >         struct vcpu_svm *svm;
> >         struct page *vmcb_page;
> > -       struct page *msrpm_pages;
> >         struct page *hsave_page;
> > -       struct page *nested_msrpm_pages;
> >         int err;
> > 
> >         BUILD_BUG_ON(offsetof(struct vcpu_svm, vcpu) != 0);
> > @@ -1185,21 +1194,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
> >         if (!vmcb_page)
> >                 goto out;
> > 
> > -       msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
> > -       if (!msrpm_pages)
> > -               goto free_page1;
> > -
> > -       nested_msrpm_pages = alloc_pages(GFP_KERNEL_ACCOUNT, MSRPM_ALLOC_ORDER);
> > -       if (!nested_msrpm_pages)
> > -               goto free_page2;
> > -
> 
> Reordering the allocations does seem like a functional change to me,
> albeit one that should (hopefully) be benign. For example, if the
> MSRPM_ALLOC_ORDER allocations fail, in the new version of the code,
> the hsave_page will be cleared, but in the old version of the code, no
> page would be cleared.
Noted.
> 
> >         hsave_page = alloc_page(GFP_KERNEL_ACCOUNT);
> 
> Speaking of clearing pages, why not add __GFP_ZERO to the flags above
> and skip the clear_page() call below?
I hadn't thought about it; I don't see a reason not to use __GFP_ZERO,
but this is how the old code was.

> 
> >         if (!hsave_page)
> > -               goto free_page3;
> > +               goto free_page1;
> > 
> >         err = avic_init_vcpu(svm);
> >         if (err)
> > -               goto free_page4;
> > +               goto free_page2;
> > 
> >         /* We initialize this flag to true to make sure that the is_running
> >          * bit would be set the first time the vcpu is loaded.
> > @@ -1210,11 +1211,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
> >         svm->nested.hsave = page_address(hsave_page);
> >         clear_page(svm->nested.hsave);
> > 
> > -       svm->msrpm = page_address(msrpm_pages);
> > -       svm_vcpu_init_msrpm(svm->msrpm);
> > +       svm->msrpm = svm_vcpu_alloc_msrpm();
> > +       if (!svm->msrpm)
> > +               goto free_page2;
> > 
> > -       svm->nested.msrpm = page_address(nested_msrpm_pages);
> > -       svm_vcpu_init_msrpm(svm->nested.msrpm);
> > +       svm->nested.msrpm = svm_vcpu_alloc_msrpm();
> > +       if (!svm->nested.msrpm)
> > +               goto free_page3;
> > 
> >         svm->vmcb = page_address(vmcb_page);
> >         clear_page(svm->vmcb);
> > @@ -1227,12 +1230,10 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
> > 
> >         return 0;
> > 
> > -free_page4:
> > -       __free_page(hsave_page);
> >  free_page3:
> > -       __free_pages(nested_msrpm_pages, MSRPM_ALLOC_ORDER);
> > +       svm_vcpu_free_msrpm(svm->msrpm);
> >  free_page2:
> > -       __free_pages(msrpm_pages, MSRPM_ALLOC_ORDER);
> > +       __free_page(hsave_page);
> >  free_page1:
> >         __free_page(vmcb_page);
> >  out:
> 
> While you're here, could you improve these labels? Coding-style.rst says:
> 
> Choose label names which say what the goto does or why the goto exists.  An
> example of a good name could be ``out_free_buffer:`` if the goto frees
> ``buffer``.
> Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as you would have to
> renumber them if you ever add or remove exit paths, and they make correctness
> difficult to verify anyway.
I noticed that and I agree. I'll do this in a follow-up patch.

Thanks for review,
	Best regards,
		Maxim Levitsky


> 



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value
  2020-08-21  0:43     ` Sean Christopherson
@ 2020-08-27 10:23       ` Maxim Levitsky
  0 siblings, 0 replies; 16+ messages in thread
From: Maxim Levitsky @ 2020-08-27 10:23 UTC (permalink / raw)
  To: Sean Christopherson, Jim Mattson
  Cc: kvm list, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	Ingo Molnar, Thomas Gleixner,
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Joerg Roedel, Wanpeng Li, Borislav Petkov,
	Vitaly Kuznetsov, Paolo Bonzini

On Thu, 2020-08-20 at 17:43 -0700, Sean Christopherson wrote:
> On Thu, Aug 20, 2020 at 02:43:56PM -0700, Jim Mattson wrote:
> > On Thu, Aug 20, 2020 at 6:34 AM Maxim Levitsky <mlevitsk@redhat.com> wrote:
> > > This will be used later to return an error when setting this msr fails.
> > > 
> > > For VMX, it already has an error condition when EFER is
> > > not in the shared MSR list, so return an error in this case.
> > > 
> > > Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
> > > ---
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -1471,7 +1471,8 @@ static int set_efer(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > >         efer &= ~EFER_LMA;
> > >         efer |= vcpu->arch.efer & EFER_LMA;
> > > 
> > > -       kvm_x86_ops.set_efer(vcpu, efer);
> > > +       if (kvm_x86_ops.set_efer(vcpu, efer))
> > > +               return 1;
> > 
> > This seems like a userspace ABI change to me. Previously, it looks
> > like userspace could always use KVM_SET_MSRS to set MSR_EFER to 0 or
> > EFER_SCE, and it would always succeed. Now, it looks like it will fail
> > on CPUs that don't support EFER in hardware. (Perhaps it should fail,
> > but it didn't before, AFAICT.)
> 
> KVM emulates SYSCALL, presumably that also works when EFER doesn't exist in
> hardware.

This is a fair point.
How about checking the return value only when '!msr_info->host_initiated' in set_efer?

This way a userspace-initiated EFER write will work as it did before,
but a guest-initiated write will fail
(and set_efer already checks and fails for many cases).
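
I.e. roughly this in set_efer() (untested sketch):

	if (kvm_x86_ops.set_efer(vcpu, efer) && !msr_info->host_initiated)
		return 1;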

I also dug a bit into the failure check in VMX, the 'find_msr_entry(vmx, MSR_EFER)'.
This one, if I am not mistaken, will only fail when the host doesn't support EFER.
I don't mind ignoring this error as well, as was done before.

> 
> The above also adds weirdness to nested VMX as vmx_set_efer() simply can't
> fail.
It will now fail on non-64-bit Intel CPUs that support VMX. I do think that
we had these for a while. As I said, I'll return 0 when find_msr_entry fails,
thus restoring this behavior as it was on Intel.
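
I.e. something like this in vmx_set_efer() (rough, untested sketch):

	struct shared_msr_entry *msr = find_msr_entry(vmx, MSR_EFER);

	/* keep the old behavior: silently ignore a missing EFER slot */
	if (!msr)
		return 0;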

Best regards,
	Maxim Levitsky



^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-08-27 10:23 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-20 13:33 [PATCH v2 0/7] KVM: nSVM: ondemand nested state allocation + smm fixes Maxim Levitsky
2020-08-20 13:33 ` [PATCH v2 1/7] KVM: SVM: rename a variable in the svm_create_vcpu Maxim Levitsky
2020-08-20 20:52   ` Jim Mattson
2020-08-20 13:33 ` [PATCH v2 2/7] KVM: nSVM: rename nested 'vmcb' to vmcb12_gpa in few places Maxim Levitsky
2020-08-20 21:00   ` Jim Mattson
2020-08-24 11:37     ` Maxim Levitsky
2020-08-20 13:33 ` [PATCH v2 3/7] KVM: SVM: refactor msr permission bitmap allocation Maxim Levitsky
2020-08-20 21:26   ` Jim Mattson
2020-08-24 11:43     ` Maxim Levitsky
2020-08-20 13:33 ` [PATCH v2 4/7] KVM: x86: allow kvm_x86_ops.set_efer to return a value Maxim Levitsky
2020-08-20 21:43   ` Jim Mattson
2020-08-21  0:43     ` Sean Christopherson
2020-08-27 10:23       ` Maxim Levitsky
2020-08-20 13:33 ` [PATCH v2 5/7] KVM: nSVM: more strict smm checks Maxim Levitsky
2020-08-20 13:33 ` [PATCH v2 6/7] KVM: emulator: more strict rsm checks Maxim Levitsky
2020-08-20 13:33 ` [PATCH v2 7/7] KVM: nSVM: implement ondemand allocation of the nested state Maxim Levitsky
