kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration
@ 2021-04-26 11:13 Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 1/6] KVM: nSVM: refactor the CR3 reload on migration Maxim Levitsky
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

This patch set aims to fix few flaws that were discovered
in KVM_{GET|SET}_SREGS on x86:

* There is no support for reading/writing PDPTRs although
  these are considered to be part of the guest state.

* There is useless interrupt bitmap which isn't needed

* No support for future extensions (via flags and such)

Also if the user doesn't use the new SREG2 api, the PDPTR
load after migration is now done on KVM_REQ_GET_NESTED_STATE_PAGES
to at least read them correctly in cases when guest memory
map is not up to date when nested state is loaded.

This patch series was tested by doing nested migration test
of 32 bit PAE L1 + 32 bit PAE L2 on AMD and Intel and by
nested migration test of 64 bit L1 + 32 bit PAE L2 on AMD.
The later test currently fails on Intel (regardless of my patches).

Changes from V1:
  - move only PDPTRS load to KVM_REQ_GET_NESTED_STATE_PAGES on VMX
  - rebase on top of kvm/queue
  - improve the KVM_GET_SREGS2 to have flag for PDPTRS
    and remove padding

Patches to qemu will be send soon as well.

Best regards,
        Maxim Levitskky

Maxim Levitsky (6):
  KVM: nSVM: refactor the CR3 reload on migration
  KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES
  KVM: x86: introduce kvm_register_clear_available
  KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2
  KVM: nSVM: avoid loading PDPTRs after migration when possible
  KVM: nVMX: avoid loading PDPTRs after migration when possible

 Documentation/virt/kvm/api.rst  |  48 +++++++++++
 arch/x86/include/asm/kvm_host.h |   7 ++
 arch/x86/include/uapi/asm/kvm.h |  13 +++
 arch/x86/kvm/kvm_cache_regs.h   |  12 +++
 arch/x86/kvm/svm/nested.c       |  33 ++++++--
 arch/x86/kvm/svm/svm.c          |   6 +-
 arch/x86/kvm/vmx/nested.c       |  24 ++++--
 arch/x86/kvm/x86.c              | 139 ++++++++++++++++++++++++++------
 include/uapi/linux/kvm.h        |   4 +
 9 files changed, 246 insertions(+), 40 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/6] KVM: nSVM: refactor the CR3 reload on migration
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
@ 2021-04-26 11:13 ` Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 2/6] KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES Maxim Levitsky
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

Document the actual reason why we need to do it
on migration and move the call to svm_set_nested_state
to be closer to VMX code.

To avoid loading the PDPTRs from possibly not up to date memory map,
in nested_svm_load_cr3 after the move, move this code to
.get_nested_state_pages.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 540d43ba2cf4..ceb37ed1dc98 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -385,12 +385,12 @@ static inline bool nested_npt_enabled(struct vcpu_svm *svm)
  * if we are emulating VM-Entry into a guest with NPT enabled.
  */
 static int nested_svm_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
-			       bool nested_npt)
+			       bool nested_npt, bool reload_pdptrs)
 {
 	if (CC(kvm_vcpu_is_illegal_gpa(vcpu, cr3)))
 		return -EINVAL;
 
-	if (!nested_npt && is_pae_paging(vcpu) &&
+	if (reload_pdptrs && !nested_npt && is_pae_paging(vcpu) &&
 	    (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu))) {
 		if (CC(!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3)))
 			return -EINVAL;
@@ -576,7 +576,7 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
 	nested_vmcb02_prepare_save(svm, vmcb12);
 
 	ret = nested_svm_load_cr3(&svm->vcpu, vmcb12->save.cr3,
-				  nested_npt_enabled(svm));
+				  nested_npt_enabled(svm), true);
 	if (ret)
 		return ret;
 
@@ -806,7 +806,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 
 	nested_svm_uninit_mmu_context(vcpu);
 
-	rc = nested_svm_load_cr3(vcpu, svm->vmcb->save.cr3, false);
+	rc = nested_svm_load_cr3(vcpu, svm->vmcb->save.cr3, false, true);
 	if (rc)
 		return 1;
 
@@ -1291,6 +1291,19 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
 	    !nested_vmcb_valid_sregs(vcpu, save))
 		goto out_free;
 
+	/*
+	 * While the nested guest CR3 is already checked and set by
+	 * KVM_SET_SREGS, it was set when nested state was yet loaded,
+	 * thus MMU might not be initialized correctly.
+	 * Set it again to fix this.
+	 */
+
+	ret = nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
+				  nested_npt_enabled(svm), false);
+	if (WARN_ON_ONCE(ret))
+		goto out_free;
+
+
 	/*
 	 * All checks done, we can enter guest mode. Userspace provides
 	 * vmcb12.control, which will be combined with L1 and stored into
@@ -1343,9 +1356,14 @@ static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu)
 	if (WARN_ON(!is_guest_mode(vcpu)))
 		return true;
 
-	if (nested_svm_load_cr3(&svm->vcpu, vcpu->arch.cr3,
-				nested_npt_enabled(svm)))
-		return false;
+	if (!nested_npt_enabled(svm) && is_pae_paging(vcpu))
+		/*
+		 * Reload the guest's PDPTRs since after a migration
+		 * the guest CR3 might be restored prior to setting the nested
+		 * state which can lead to a load of wrong PDPTRs.
+		 */
+		if (CC(!load_pdptrs(vcpu, vcpu->arch.walk_mmu, vcpu->arch.cr3)))
+			return false;
 
 	if (!nested_svm_vmrun_msrpm(svm)) {
 		vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/6] KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 1/6] KVM: nSVM: refactor the CR3 reload on migration Maxim Levitsky
@ 2021-04-26 11:13 ` Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 3/6] KVM: x86: introduce kvm_register_clear_available Maxim Levitsky
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

Similar to the rest of guest page accesses after a migration,
this access should be delayed to KVM_REQ_GET_NESTED_STATE_PAGES.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 00339d624c92..764781ab805b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1105,7 +1105,8 @@ static bool nested_vmx_transition_mmu_sync(struct kvm_vcpu *vcpu)
  * Exit Qualification (for a VM-Entry consistency check VM-Exit) is assigned to
  * @entry_failure_code.
  */
-static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool nested_ept,
+static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3,
+			       bool nested_ept, bool reload_pdptrs,
 			       enum vm_entry_failure_code *entry_failure_code)
 {
 	if (CC(kvm_vcpu_is_illegal_gpa(vcpu, cr3))) {
@@ -1117,7 +1118,7 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne
 	 * If PAE paging and EPT are both on, CR3 is not used by the CPU and
 	 * must not be dereferenced.
 	 */
-	if (!nested_ept && is_pae_paging(vcpu) &&
+	if (reload_pdptrs && !nested_ept && is_pae_paging(vcpu) &&
 	    (cr3 != kvm_read_cr3(vcpu) || pdptrs_changed(vcpu))) {
 		if (CC(!load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))) {
 			*entry_failure_code = ENTRY_FAIL_PDPTE;
@@ -2488,6 +2489,7 @@ static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
  * is assigned to entry_failure_code on failure.
  */
 static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
+			  bool from_vmentry,
 			  enum vm_entry_failure_code *entry_failure_code)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -2572,7 +2574,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
 
 	/* Shadow page tables on either EPT or shadow page tables. */
 	if (nested_vmx_load_cr3(vcpu, vmcs12->guest_cr3, nested_cpu_has_ept(vmcs12),
-				entry_failure_code))
+			from_vmentry, entry_failure_code))
 		return -EINVAL;
 
 	/*
@@ -3120,6 +3122,17 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
 	struct page *page;
 	u64 hpa;
 
+	if (!nested_cpu_has_ept(vmcs12) && is_pae_paging(vcpu)) {
+		/*
+		 * Reload the guest's PDPTRs since after a migration
+		 * the guest CR3 might be restored prior to setting the nested
+		 * state which can lead to a load of wrong PDPTRs.
+		 */
+		if (CC(!load_pdptrs(vcpu, vcpu->arch.walk_mmu, vcpu->arch.cr3)))
+			return false;
+	}
+
+
 	if (nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) {
 		/*
 		 * Translate L1 physical address to host physical
@@ -3356,7 +3369,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
 	if (vmcs12->cpu_based_vm_exec_control & CPU_BASED_USE_TSC_OFFSETTING)
 		vcpu->arch.tsc_offset += vmcs12->tsc_offset;
 
-	if (prepare_vmcs02(vcpu, vmcs12, &entry_failure_code)) {
+	if (prepare_vmcs02(vcpu, vmcs12, from_vmentry, &entry_failure_code)) {
 		exit_reason.basic = EXIT_REASON_INVALID_STATE;
 		vmcs12->exit_qualification = entry_failure_code;
 		goto vmentry_fail_vmexit_guest_mode;
@@ -4205,7 +4218,7 @@ static void load_vmcs12_host_state(struct kvm_vcpu *vcpu,
 	 * Only PDPTE load can fail as the value of cr3 was checked on entry and
 	 * couldn't have changed.
 	 */
-	if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, &ignored))
+	if (nested_vmx_load_cr3(vcpu, vmcs12->host_cr3, false, true, &ignored))
 		nested_vmx_abort(vcpu, VMX_ABORT_LOAD_HOST_PDPTE_FAIL);
 
 	nested_vmx_transition_tlb_flush(vcpu, vmcs12, false);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/6] KVM: x86: introduce kvm_register_clear_available
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 1/6] KVM: nSVM: refactor the CR3 reload on migration Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 2/6] KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES Maxim Levitsky
@ 2021-04-26 11:13 ` Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2 Maxim Levitsky
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

Small refactoring that will be used in the next patch.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/kvm_cache_regs.h | 7 +++++++
 arch/x86/kvm/svm/svm.c        | 6 ++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 3db5c42c9ecd..b8559b83c4ca 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -55,6 +55,13 @@ static inline void kvm_register_mark_available(struct kvm_vcpu *vcpu,
 	__set_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
 }
 
+static inline void kvm_register_clear_available(struct kvm_vcpu *vcpu,
+					       enum kvm_reg reg)
+{
+	__clear_bit(reg, (unsigned long *)&vcpu->arch.regs_avail);
+	__clear_bit(reg, (unsigned long *)&vcpu->arch.regs_dirty);
+}
+
 static inline void kvm_register_mark_dirty(struct kvm_vcpu *vcpu,
 					   enum kvm_reg reg)
 {
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 14ff7f0963e9..c596bf1a4ae8 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3868,10 +3868,8 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 		vcpu->arch.apf.host_apf_flags =
 			kvm_read_and_reset_apf_flags();
 
-	if (npt_enabled) {
-		vcpu->arch.regs_avail &= ~(1 << VCPU_EXREG_PDPTR);
-		vcpu->arch.regs_dirty &= ~(1 << VCPU_EXREG_PDPTR);
-	}
+	if (npt_enabled)
+		kvm_register_clear_available(vcpu, VCPU_EXREG_PDPTR);
 
 	/*
 	 * We need to handle MC intercepts here before the vcpu has a chance to
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
                   ` (2 preceding siblings ...)
  2021-04-26 11:13 ` [PATCH v2 3/6] KVM: x86: introduce kvm_register_clear_available Maxim Levitsky
@ 2021-04-26 11:13 ` Maxim Levitsky
  2021-04-26 12:32   ` Paolo Bonzini
  2021-04-26 11:13 ` [PATCH v2 5/6] KVM: nSVM: avoid loading PDPTRs after migration when possible Maxim Levitsky
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

This is a new version of KVM_GET_SREGS / KVM_SET_SREGS.

It has the following changes:
   * Has flags for future extensions
   * Has vcpu's PDPTS, which allows to save/restore them on migration.
   * Lacks obsolete interrupt bitmap (done now via KVM_SET_VCPU_EVENTS)

New capability, KVM_CAP_SREGS2 is added to signal
the userspace of this ioctl.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 Documentation/virt/kvm/api.rst  |  48 +++++++++++
 arch/x86/include/asm/kvm_host.h |   7 ++
 arch/x86/include/uapi/asm/kvm.h |  13 +++
 arch/x86/kvm/kvm_cache_regs.h   |   5 ++
 arch/x86/kvm/x86.c              | 139 ++++++++++++++++++++++++++------
 include/uapi/linux/kvm.h        |   4 +
 6 files changed, 192 insertions(+), 24 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 8d614a577e43..6781e0a4bfa5 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5021,6 +5021,54 @@ see KVM_XEN_VCPU_SET_ATTR above.
 The KVM_XEN_VCPU_ATTR_TYPE_RUNSTATE_ADJUST type may not be used
 with the KVM_XEN_VCPU_GET_ATTR ioctl.
 
+
+4.131 KVM_GET_SREGS2
+------------------
+
+:Capability: KVM_CAP_SREGS2
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_sregs2 (out)
+:Returns: 0 on success, -1 on error
+
+Reads special registers from the vcpu.
+This ioctl (when supported) replaces the KVM_GET_SREGS.
+
+::
+
+struct kvm_sregs2 {
+	/* out (KVM_GET_SREGS2) / in (KVM_SET_SREGS2) */
+	struct kvm_segment cs, ds, es, fs, gs, ss;
+	struct kvm_segment tr, ldt;
+	struct kvm_dtable gdt, idt;
+	__u64 cr0, cr2, cr3, cr4, cr8;
+	__u64 efer;
+	__u64 apic_base;
+	__u64 flags;
+	__u64 pdptrs[4];
+};
+
+flags values for ``kvm_sregs2``:
+
+``KVM_SREGS2_FLAGS_PDPTRS_VALID``
+
+  Indicates thats the struct contain valid PDPTR values.
+
+
+4.132 KVM_SET_SREGS2
+------------------
+
+:Capability: KVM_CAP_SREGS2
+:Architectures: x86
+:Type: vcpu ioctl
+:Parameters: struct kvm_sregs2 (in)
+:Returns: 0 on success, -1 on error
+
+Writes special registers into the vcpu.
+See KVM_GET_SREGS2 for the data structures.
+This ioctl (when supported) replaces the KVM_SET_SREGS.
+
+
 5. The kvm_run structure
 ========================
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3e5fc80a35c8..943e19831e32 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -849,6 +849,13 @@ struct kvm_vcpu_arch {
 
 	/* Protected Guests */
 	bool guest_state_protected;
+
+	/*
+	 * Do we need to reload the PDPTRs when entering the nested state?
+	 * Set if the userspace used the KVM_SET_SREGS to set CR3
+	 * and thus didn't load the PDPTRs from the migration state.
+	 */
+	bool pdptr_reload_on_nesting_needed;
 };
 
 struct kvm_lpage_info {
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 5a3022c8af82..46aef200b168 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -159,6 +159,19 @@ struct kvm_sregs {
 	__u64 interrupt_bitmap[(KVM_NR_INTERRUPTS + 63) / 64];
 };
 
+struct kvm_sregs2 {
+	/* out (KVM_GET_SREGS2) / in (KVM_SET_SREGS2) */
+	struct kvm_segment cs, ds, es, fs, gs, ss;
+	struct kvm_segment tr, ldt;
+	struct kvm_dtable gdt, idt;
+	__u64 cr0, cr2, cr3, cr4, cr8;
+	__u64 efer;
+	__u64 apic_base;
+	__u64 flags;
+	__u64 pdptrs[4];
+};
+#define KVM_SREGS2_FLAGS_PDPTRS_VALID 1
+
 /* for KVM_GET_FPU and KVM_SET_FPU */
 struct kvm_fpu {
 	__u8  fpr[8][16];
diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index b8559b83c4ca..842a09711b80 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -125,6 +125,11 @@ static inline u64 kvm_pdptr_read(struct kvm_vcpu *vcpu, int index)
 	return vcpu->arch.walk_mmu->pdptrs[index];
 }
 
+static inline void kvm_pdptr_write(struct kvm_vcpu *vcpu, int index, u64 value)
+{
+	vcpu->arch.walk_mmu->pdptrs[index] = value;
+}
+
 static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
 {
 	ulong tmask = mask & KVM_POSSIBLE_CR0_GUEST_BITS;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3bf52ba5f2bb..967022b983a8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -113,6 +113,9 @@ static void __kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 static void store_regs(struct kvm_vcpu *vcpu);
 static int sync_regs(struct kvm_vcpu *vcpu);
 
+static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
+static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
+
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 EXPORT_SYMBOL_GPL(kvm_x86_ops);
 
@@ -3841,6 +3844,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 #endif
 	case KVM_CAP_VM_COPY_ENC_CONTEXT_FROM:
 	case KVM_CAP_EXIT_HYPERCALL:
+	case KVM_CAP_SREGS2:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
@@ -4759,6 +4763,7 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	void __user *argp = (void __user *)arg;
 	int r;
 	union {
+		struct kvm_sregs2 *sregs2;
 		struct kvm_lapic_state *lapic;
 		struct kvm_xsave *xsave;
 		struct kvm_xcrs *xcrs;
@@ -5131,6 +5136,28 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		break;
 	}
 #endif
+	case KVM_GET_SREGS2: {
+		u.sregs2 = kzalloc(sizeof(struct kvm_sregs2), GFP_KERNEL);
+		r = -ENOMEM;
+		if (!u.sregs2)
+			goto out;
+		__get_sregs2(vcpu, u.sregs2);
+		r = -EFAULT;
+		if (copy_to_user(argp, u.sregs2, sizeof(struct kvm_sregs2)))
+			goto out;
+		r = 0;
+		break;
+	}
+	case KVM_SET_SREGS2: {
+		u.sregs2 = memdup_user(argp, sizeof(struct kvm_sregs2));
+		if (IS_ERR(u.sregs2)) {
+			r = PTR_ERR(u.sregs2);
+			u.sregs2 = NULL;
+			goto out;
+		}
+		r = __set_sregs2(vcpu, u.sregs2);
+		break;
+	}
 	default:
 		r = -EINVAL;
 	}
@@ -9795,7 +9822,7 @@ void kvm_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
 }
 EXPORT_SYMBOL_GPL(kvm_get_cs_db_l_bits);
 
-static void __get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
+static void __get_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 {
 	struct desc_ptr dt;
 
@@ -9828,14 +9855,32 @@ static void __get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	sregs->cr8 = kvm_get_cr8(vcpu);
 	sregs->efer = vcpu->arch.efer;
 	sregs->apic_base = kvm_get_apic_base(vcpu);
+}
 
-	memset(sregs->interrupt_bitmap, 0, sizeof(sregs->interrupt_bitmap));
+static void __get_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
+{
+	__get_sregs_common(vcpu, sregs);
+
+	if (vcpu->arch.guest_state_protected)
+		return;
 
 	if (vcpu->arch.interrupt.injected && !vcpu->arch.interrupt.soft)
 		set_bit(vcpu->arch.interrupt.nr,
 			(unsigned long *)sregs->interrupt_bitmap);
 }
 
+static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2)
+{
+	int i;
+
+	__get_sregs_common(vcpu, (struct kvm_sregs *)sregs2);
+	if (is_pae_paging(vcpu)) {
+		for (i = 0 ; i < 4 ; i++)
+			sregs2->pdptrs[i] = kvm_pdptr_read(vcpu, i);
+		sregs2->flags |= KVM_SREGS2_FLAGS_PDPTRS_VALID;
+	}
+}
+
 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs)
 {
@@ -9947,11 +9992,9 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	return kvm_is_valid_cr4(vcpu, sregs->cr4);
 }
 
-static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
+static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs, int *mmu_reset_needed)
 {
 	struct msr_data apic_base_msr;
-	int mmu_reset_needed = 0;
-	int pending_vec, max_bits, idx;
 	struct desc_ptr dt;
 	int ret = -EINVAL;
 
@@ -9963,8 +10006,9 @@ static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	if (kvm_set_apic_base(vcpu, &apic_base_msr))
 		goto out;
 
+	ret = 0;
 	if (vcpu->arch.guest_state_protected)
-		goto skip_protected_regs;
+		goto out;
 
 	dt.size = sregs->idt.limit;
 	dt.address = sregs->idt.base;
@@ -9974,32 +10018,22 @@ static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	static_call(kvm_x86_set_gdt)(vcpu, &dt);
 
 	vcpu->arch.cr2 = sregs->cr2;
-	mmu_reset_needed |= kvm_read_cr3(vcpu) != sregs->cr3;
+	*mmu_reset_needed |= kvm_read_cr3(vcpu) != sregs->cr3;
 	vcpu->arch.cr3 = sregs->cr3;
 	kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);
 
 	kvm_set_cr8(vcpu, sregs->cr8);
 
-	mmu_reset_needed |= vcpu->arch.efer != sregs->efer;
+	*mmu_reset_needed |= vcpu->arch.efer != sregs->efer;
 	static_call(kvm_x86_set_efer)(vcpu, sregs->efer);
 
-	mmu_reset_needed |= kvm_read_cr0(vcpu) != sregs->cr0;
+	*mmu_reset_needed |= kvm_read_cr0(vcpu) != sregs->cr0;
 	static_call(kvm_x86_set_cr0)(vcpu, sregs->cr0);
 	vcpu->arch.cr0 = sregs->cr0;
 
-	mmu_reset_needed |= kvm_read_cr4(vcpu) != sregs->cr4;
+	*mmu_reset_needed |= kvm_read_cr4(vcpu) != sregs->cr4;
 	static_call(kvm_x86_set_cr4)(vcpu, sregs->cr4);
 
-	idx = srcu_read_lock(&vcpu->kvm->srcu);
-	if (is_pae_paging(vcpu)) {
-		load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu));
-		mmu_reset_needed = 1;
-	}
-	srcu_read_unlock(&vcpu->kvm->srcu, idx);
-
-	if (mmu_reset_needed)
-		kvm_mmu_reset_context(vcpu);
-
 	kvm_set_segment(vcpu, &sregs->cs, VCPU_SREG_CS);
 	kvm_set_segment(vcpu, &sregs->ds, VCPU_SREG_DS);
 	kvm_set_segment(vcpu, &sregs->es, VCPU_SREG_ES);
@@ -10017,8 +10051,39 @@ static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	    sregs->cs.selector == 0xf000 && sregs->cs.base == 0xffff0000 &&
 	    !is_protmode(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+out:
+	return ret;
+}
+
+static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
+{
+	int idx, pending_vec, max_bits;
+	int mmu_reset_needed = 0;
+	int ret = __set_sregs_common(vcpu, sregs, &mmu_reset_needed);
+
+	if (ret)
+		return ret;
+
+	if (vcpu->arch.guest_state_protected)
+		return 0;
+
+	idx = srcu_read_lock(&vcpu->kvm->srcu);
+	if (is_pae_paging(vcpu)) {
+		load_pdptrs(vcpu, vcpu->arch.walk_mmu, kvm_read_cr3(vcpu));
+		mmu_reset_needed = 1;
+
+		/* If we are going to enter a nested guest, we had just
+		 * loaded wrong PDPTRs, thus we need to reload them
+		 * on guest mode entry
+		 */
+
+		vcpu->arch.pdptr_reload_on_nesting_needed = true;
+	}
+	srcu_read_unlock(&vcpu->kvm->srcu, idx);
+
+	if (mmu_reset_needed)
+		kvm_mmu_reset_context(vcpu);
 
-skip_protected_regs:
 	max_bits = KVM_NR_INTERRUPTS;
 	pending_vec = find_first_bit(
 		(const unsigned long *)sregs->interrupt_bitmap, max_bits);
@@ -10028,10 +10093,36 @@ static int __set_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 	}
 
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
+	return 0;
+}
 
-	ret = 0;
-out:
-	return ret;
+static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2)
+{
+	int mmu_reset_needed = 0;
+	int i, ret;
+
+	if (sregs2->flags & ~KVM_SREGS2_FLAGS_PDPTRS_VALID)
+		return -EINVAL;
+
+	ret = __set_sregs_common(vcpu, (struct kvm_sregs *)sregs2, &mmu_reset_needed);
+	if (ret || vcpu->arch.guest_state_protected)
+		return ret;
+
+	if (sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID) {
+
+		if (!is_pae_paging(vcpu))
+			return -EINVAL;
+
+		for (i = 0 ; i < 4 ; i++)
+			kvm_pdptr_write(vcpu, i, sregs2->pdptrs[i]);
+
+		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
+		mmu_reset_needed = 1;
+	}
+
+	if (mmu_reset_needed)
+		kvm_mmu_reset_context(vcpu);
+	return 0;
 }
 
 int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 7dc1c217704f..156f2f751893 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1082,6 +1082,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_SGX_ATTRIBUTE 196
 #define KVM_CAP_VM_COPY_ENC_CONTEXT_FROM 197
 #define KVM_CAP_EXIT_HYPERCALL 198
+#define KVM_CAP_SREGS2 199
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1620,6 +1621,9 @@ struct kvm_xen_hvm_attr {
 #define KVM_XEN_VCPU_GET_ATTR	_IOWR(KVMIO, 0xca, struct kvm_xen_vcpu_attr)
 #define KVM_XEN_VCPU_SET_ATTR	_IOW(KVMIO,  0xcb, struct kvm_xen_vcpu_attr)
 
+#define KVM_GET_SREGS2             _IOR(KVMIO,  0xcc, struct kvm_sregs2)
+#define KVM_SET_SREGS2             _IOW(KVMIO,  0xcd, struct kvm_sregs2)
+
 struct kvm_xen_vcpu_attr {
 	__u16 type;
 	__u16 pad[3];
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 5/6] KVM: nSVM: avoid loading PDPTRs after migration when possible
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
                   ` (3 preceding siblings ...)
  2021-04-26 11:13 ` [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2 Maxim Levitsky
@ 2021-04-26 11:13 ` Maxim Levitsky
  2021-04-26 11:13 ` [PATCH v2 6/6] KVM: nVMX: " Maxim Levitsky
  2021-05-26 18:01 ` [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Sean Christopherson
  6 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

if new KVM_*_SREGS2 ioctls are used, the PDPTRs are
a part of the migration state and are correctly
restored by those ioctls.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/svm/nested.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index ceb37ed1dc98..9624fa81e830 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1356,7 +1356,8 @@ static bool svm_get_nested_state_pages(struct kvm_vcpu *vcpu)
 	if (WARN_ON(!is_guest_mode(vcpu)))
 		return true;
 
-	if (!nested_npt_enabled(svm) && is_pae_paging(vcpu))
+	if (vcpu->arch.pdptr_reload_on_nesting_needed &&
+	    !nested_npt_enabled(svm) && is_pae_paging(vcpu))
 		/*
 		 * Reload the guest's PDPTRs since after a migration
 		 * the guest CR3 might be restored prior to setting the nested
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 6/6] KVM: nVMX: avoid loading PDPTRs after migration when possible
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
                   ` (4 preceding siblings ...)
  2021-04-26 11:13 ` [PATCH v2 5/6] KVM: nSVM: avoid loading PDPTRs after migration when possible Maxim Levitsky
@ 2021-04-26 11:13 ` Maxim Levitsky
  2021-05-26 18:01 ` [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Sean Christopherson
  6 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 11:13 UTC (permalink / raw)
  To: kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel,
	Maxim Levitsky

if new KVM_*_SREGS2 ioctls are used, the PDPTRs are
a part of the migration state and are correctly
restored by those ioctls.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
---
 arch/x86/kvm/vmx/nested.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 764781ab805b..743a832d6caf 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3122,7 +3122,8 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
 	struct page *page;
 	u64 hpa;
 
-	if (!nested_cpu_has_ept(vmcs12) && is_pae_paging(vcpu)) {
+	if (vcpu->arch.pdptr_reload_on_nesting_needed &&
+	    !nested_cpu_has_ept(vmcs12) && is_pae_paging(vcpu)) {
 		/*
 		 * Reload the guest's PDPTRs since after a migration
 		 * the guest CR3 might be restored prior to setting the nested
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2
  2021-04-26 11:13 ` [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2 Maxim Levitsky
@ 2021-04-26 12:32   ` Paolo Bonzini
  2021-04-26 12:56     ` Maxim Levitsky
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2021-04-26 12:32 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel

On 26/04/21 13:13, Maxim Levitsky wrote:
> +	if (sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID) {
> +
> +		if (!is_pae_paging(vcpu))
> +			return -EINVAL;
> +
> +		for (i = 0 ; i < 4 ; i++)
> +			kvm_pdptr_write(vcpu, i, sregs2->pdptrs[i]);
> +
> +		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
> +		mmu_reset_needed = 1;
> +	}

I think this should also have

	else {
		if (is_pae_paging(vcpu))
			return -EINVAL;
	}

but perhaps even better, check it at the beginning:

	if ((sregs->cr4 & X86_CR4_PAE) &&
             !!(sregs->efer & EFER_LMA) == !!(sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID))
		return -EINVAL;

which technically means the flag is redundant, but there is some value in
having the flag and not allowing the user to shoot itself in the foot.

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2
  2021-04-26 12:32   ` Paolo Bonzini
@ 2021-04-26 12:56     ` Maxim Levitsky
  2021-04-26 13:28       ` Paolo Bonzini
  0 siblings, 1 reply; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 12:56 UTC (permalink / raw)
  To: Paolo Bonzini, kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel

On Mon, 2021-04-26 at 14:32 +0200, Paolo Bonzini wrote:
> On 26/04/21 13:13, Maxim Levitsky wrote:
> > +	if (sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID) {
> > +
> > +		if (!is_pae_paging(vcpu))
> > +			return -EINVAL;
> > +
> > +		for (i = 0 ; i < 4 ; i++)
> > +			kvm_pdptr_write(vcpu, i, sregs2->pdptrs[i]);
> > +
> > +		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
> > +		mmu_reset_needed = 1;
> > +	}
> 
> I think this should also have
> 
> 	else {
> 		if (is_pae_paging(vcpu))
> 			return -EINVAL;
> 	}


What about the case when we migrate from qemu that doesn't use
this ioctl to qemu that does? 

In this case assuming that the new qemu does use SREGS2 ioctls,
the PDPTR data will not be present
in the migration stream and thus qemu will call this ioctl without this flag
set.

I think I should in this case load the pdptrs from memory,
Or I should make qemu not use this ioctl in this.
What do you prefer?

Thanks for pointing this bug out though!
I haven't thought about this case well enough.

Best regards,
	Maxim Levitsky

> 
> but perhaps even better, check it at the beginning:
> 
> 	if ((sregs->cr4 & X86_CR4_PAE) &&
>              !!(sregs->efer & EFER_LMA) == !!(sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID))
> 		return -EINVAL;
> 
> which technically means the flag is redundant, but there is some value in
> having the flag and not allowing the user to shoot itself in the foot.
> 
> Paolo
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2
  2021-04-26 12:56     ` Maxim Levitsky
@ 2021-04-26 13:28       ` Paolo Bonzini
  2021-04-26 13:31         ` Maxim Levitsky
  0 siblings, 1 reply; 13+ messages in thread
From: Paolo Bonzini @ 2021-04-26 13:28 UTC (permalink / raw)
  To: Maxim Levitsky, kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel

On 26/04/21 14:56, Maxim Levitsky wrote:
> On Mon, 2021-04-26 at 14:32 +0200, Paolo Bonzini wrote:
>> On 26/04/21 13:13, Maxim Levitsky wrote:
>>> +	if (sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID) {
>>> +
>>> +		if (!is_pae_paging(vcpu))
>>> +			return -EINVAL;
>>> +
>>> +		for (i = 0 ; i < 4 ; i++)
>>> +			kvm_pdptr_write(vcpu, i, sregs2->pdptrs[i]);
>>> +
>>> +		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
>>> +		mmu_reset_needed = 1;
>>> +	}
>>
>> I think this should also have
>>
>> 	else {
>> 		if (is_pae_paging(vcpu))
>> 			return -EINVAL;
>> 	}
> 
> 
> What about the case when we migrate from qemu that doesn't use
> this ioctl to qemu that does?

Right, that makes sense but then the "else" branch should do the same as 
KVM_SET_SREGS.  Maybe add a "load_pdptrs" bool to __set_sregs_common?

Paolo


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2
  2021-04-26 13:28       ` Paolo Bonzini
@ 2021-04-26 13:31         ` Maxim Levitsky
  0 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-04-26 13:31 UTC (permalink / raw)
  To: Paolo Bonzini, kvm
  Cc: open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Sean Christopherson, Joerg Roedel

On Mon, 2021-04-26 at 15:28 +0200, Paolo Bonzini wrote:
> On 26/04/21 14:56, Maxim Levitsky wrote:
> > On Mon, 2021-04-26 at 14:32 +0200, Paolo Bonzini wrote:
> > > On 26/04/21 13:13, Maxim Levitsky wrote:
> > > > +	if (sregs2->flags & KVM_SREGS2_FLAGS_PDPTRS_VALID) {
> > > > +
> > > > +		if (!is_pae_paging(vcpu))
> > > > +			return -EINVAL;
> > > > +
> > > > +		for (i = 0 ; i < 4 ; i++)
> > > > +			kvm_pdptr_write(vcpu, i, sregs2->pdptrs[i]);
> > > > +
> > > > +		kvm_register_mark_dirty(vcpu, VCPU_EXREG_PDPTR);
> > > > +		mmu_reset_needed = 1;
> > > > +	}
> > > 
> > > I think this should also have
> > > 
> > > 	else {
> > > 		if (is_pae_paging(vcpu))
> > > 			return -EINVAL;
> > > 	}
> > 
> > What about the case when we migrate from qemu that doesn't use
> > this ioctl to qemu that does?
> 
> Right, that makes sense but then the "else" branch should do the same as 
> KVM_SET_SREGS.  Maybe add a "load_pdptrs" bool to __set_sregs_common?

Yes, I'll do something like that.
Thanks,
	Best regards,
		Maxim Levitsky

> 
> Paolo
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration
  2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
                   ` (5 preceding siblings ...)
  2021-04-26 11:13 ` [PATCH v2 6/6] KVM: nVMX: " Maxim Levitsky
@ 2021-05-26 18:01 ` Sean Christopherson
  2021-05-29 17:49   ` Maxim Levitsky
  6 siblings, 1 reply; 13+ messages in thread
From: Sean Christopherson @ 2021-05-26 18:01 UTC (permalink / raw)
  To: Maxim Levitsky
  Cc: kvm, open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Joerg Roedel

On Mon, Apr 26, 2021, Maxim Levitsky wrote:
> This patch set aims to fix few flaws that were discovered
> in KVM_{GET|SET}_SREGS on x86:
> 
> * There is no support for reading/writing PDPTRs although
>   these are considered to be part of the guest state.
> 
> * There is useless interrupt bitmap which isn't needed
> 
> * No support for future extensions (via flags and such)
> 
> Also if the user doesn't use the new SREG2 api, the PDPTR
> load after migration is now done on KVM_REQ_GET_NESTED_STATE_PAGES
> to at least read them correctly in cases when guest memory
> map is not up to date when nested state is loaded.
> 
> This patch series was tested by doing nested migration test
> of 32 bit PAE L1 + 32 bit PAE L2 on AMD and Intel and by
> nested migration test of 64 bit L1 + 32 bit PAE L2 on AMD.
> The later test currently fails on Intel (regardless of my patches).
> 
> Changes from V1:
>   - move only PDPTRS load to KVM_REQ_GET_NESTED_STATE_PAGES on VMX
>   - rebase on top of kvm/queue
>   - improve the KVM_GET_SREGS2 to have flag for PDPTRS
>     and remove padding
> 
> Patches to qemu will be send soon as well.

How did you want to handle integration with the removal of pdptrs_changed()?

https://lkml.kernel.org/r/68ff1249-2902-43d5-3dfd-35b1f14c4f90@redhat.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration
  2021-05-26 18:01 ` [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Sean Christopherson
@ 2021-05-29 17:49   ` Maxim Levitsky
  0 siblings, 0 replies; 13+ messages in thread
From: Maxim Levitsky @ 2021-05-29 17:49 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, open list:DOCUMENTATION, Thomas Gleixner, Jonathan Corbet,
	Paolo Bonzini, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	open list:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Ingo Molnar, Borislav Petkov, Wanpeng Li,
	Vitaly Kuznetsov, Jim Mattson, Joerg Roedel

On Wed, 2021-05-26 at 18:01 +0000, Sean Christopherson wrote:
> On Mon, Apr 26, 2021, Maxim Levitsky wrote:
> > This patch set aims to fix few flaws that were discovered
> > in KVM_{GET|SET}_SREGS on x86:
> > 
> > * There is no support for reading/writing PDPTRs although
> >   these are considered to be part of the guest state.
> > 
> > * There is useless interrupt bitmap which isn't needed
> > 
> > * No support for future extensions (via flags and such)
> > 
> > Also if the user doesn't use the new SREG2 api, the PDPTR
> > load after migration is now done on KVM_REQ_GET_NESTED_STATE_PAGES
> > to at least read them correctly in cases when guest memory
> > map is not up to date when nested state is loaded.
> > 
> > This patch series was tested by doing nested migration test
> > of 32 bit PAE L1 + 32 bit PAE L2 on AMD and Intel and by
> > nested migration test of 64 bit L1 + 32 bit PAE L2 on AMD.
> > The later test currently fails on Intel (regardless of my patches).
> > 
> > Changes from V1:
> >   - move only PDPTRS load to KVM_REQ_GET_NESTED_STATE_PAGES on VMX
> >   - rebase on top of kvm/queue
> >   - improve the KVM_GET_SREGS2 to have flag for PDPTRS
> >     and remove padding
> > 
> > Patches to qemu will be send soon as well.
> 
> How did you want to handle integration with the removal of
> pdptrs_changed()?
> 
> https://lkml.kernel.org/r/68ff1249-2902-43d5-3dfd-35b1f14c4f90@redhat.com
> 

Hi!
Sorry that I missed your mail. I will take a look in a day or so at
this, and I don't envision any significant trouble with removal of
pdptrs_changed, since it is only an optimization anyway.

Thanks,
	Best regards,
		Maxim Levitsky


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-05-29 17:50 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-26 11:13 [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Maxim Levitsky
2021-04-26 11:13 ` [PATCH v2 1/6] KVM: nSVM: refactor the CR3 reload on migration Maxim Levitsky
2021-04-26 11:13 ` [PATCH v2 2/6] KVM: nVMX: delay loading of PDPTRs to KVM_REQ_GET_NESTED_STATE_PAGES Maxim Levitsky
2021-04-26 11:13 ` [PATCH v2 3/6] KVM: x86: introduce kvm_register_clear_available Maxim Levitsky
2021-04-26 11:13 ` [PATCH v2 4/6] KVM: x86: Introduce KVM_GET_SREGS2 / KVM_SET_SREGS2 Maxim Levitsky
2021-04-26 12:32   ` Paolo Bonzini
2021-04-26 12:56     ` Maxim Levitsky
2021-04-26 13:28       ` Paolo Bonzini
2021-04-26 13:31         ` Maxim Levitsky
2021-04-26 11:13 ` [PATCH v2 5/6] KVM: nSVM: avoid loading PDPTRs after migration when possible Maxim Levitsky
2021-04-26 11:13 ` [PATCH v2 6/6] KVM: nVMX: " Maxim Levitsky
2021-05-26 18:01 ` [PATCH v2 0/6] Introduce KVM_{GET|SET}_SREGS2 and fix PDPTR migration Sean Christopherson
2021-05-29 17:49   ` Maxim Levitsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).