All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation"
@ 2021-09-21  0:02 Sean Christopherson
  2021-09-21  0:02 ` [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation Sean Christopherson
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Add dedicated helpers to emulate RESET (or at least, the parts of RESET
that KVM actually emulates) instead of having the relevant code scattered
throughout vcpu_create() and vcpu_reset().

Vitaly's innocuous suggestion to WARN on non-zero CR0 led to a wonderful
bit of git archaeology after, to my complete surprise, it fired on RESET
due to fx_init() setting CR0.ET (and XCR0.FP).

v2:
 - Add patches to drop defunct fx_init() code.
 - Add patch to zero CR3 at RESET/INIT. 
 - Add patch to mark all regs avail/dirty at creation, not at RESET/INIT.
 - Add patch to WARN if CRs are non-zero at RESET. [Vitaly]

v1: https://lkml.kernel.org/r/20210914230840.3030620-1-seanjc@google.com

Sean Christopherson (10):
  KVM: x86: Mark all registers as avail/dirty at vCPU creation
  KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
  KVM: x86: Do not mark all registers as avail/dirty during RESET/INIT
  KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU
    create
  KVM: x86: Remove defunct setting of XCR0 for guest during vCPU create
  KVM: x86: Fold fx_init() into kvm_arch_vcpu_create()
  KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation
  KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  KVM: SVM: Move RESET emulation to svm_vcpu_reset()
  KVM: x86: WARN on non-zero CRs at RESET to detect improper
    initalization

 arch/x86/kvm/svm/sev.c |  6 ++--
 arch/x86/kvm/svm/svm.c | 29 ++++++++++--------
 arch/x86/kvm/svm/svm.h |  2 +-
 arch/x86/kvm/vmx/vmx.c | 68 ++++++++++++++++++++----------------------
 arch/x86/kvm/x86.c     | 44 +++++++++++++--------------
 5 files changed, 76 insertions(+), 73 deletions(-)

-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
@ 2021-09-21  0:02 ` Sean Christopherson
  2021-09-21 13:40   ` Vitaly Kuznetsov
  2021-09-21  0:02 ` [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT Sean Christopherson
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Mark all registers as available and dirty at vCPU creation, as the vCPU has
obviously not been loaded into hardware, let alone been given the chance to
be modified in hardware.  On SVM, reading from "uninitialized" hardware is
a non-issue as VMCBs are zero allocated (thus not truly uninitialized) and
hardware does not allow for arbitrary field encoding schemes.

On VMX, backing memory for VMCSes is also zero allocated, but true
initialization of the VMCS _technically_ requires VMWRITEs, as the VMX
architectural specification technically allows CPU implementations to
encode fields with arbitrary schemes.  E.g. a CPU could theoretically store
the inverted value of every field, which would result in VMREAD to a
zero-allocated field returns all ones.

In practice, only the AR_BYTES fields are known to be manipulated by
hardware during VMREAD/VMREAD; no known hardware or VMM (for nested VMX)
does fancy encoding of cacheable field values (CR0, CR3, CR4, etc...).  In
other words, this is technically a bug fix, but practically speakings it's
a glorified nop.

Failure to mark registers as available has been a lurking bug for quite
some time.  The original register caching supported only GPRs (+RIP, which
is kinda sorta a GPR), with the masks initialized at ->vcpu_reset().  That
worked because the two cacheable registers, RIP and RSP, are generally
speaking not read as side effects in other flows.

Arguably, commit aff48baa34c0 ("KVM: Fetch guest cr3 from hardware on
demand") was the first instance of failure to mark regs available.  While
_just_ marking CR3 available during vCPU creation wouldn't have fixed the
VMREAD from an uninitialized VMCS bug because ept_update_paging_mode_cr0()
unconditionally read vmcs.GUEST_CR3, marking CR3 _and_ intentionally not
reading GUEST_CR3 when it's available would have avoided VMREAD to a
technically-uninitialized VMCS.

Fixes: aff48baa34c0 ("KVM: Fetch guest cr3 from hardware on demand")
Fixes: 6de4f3ada40b ("KVM: Cache pdptrs")
Fixes: 6de12732c42c ("KVM: VMX: Optimize vmx_get_rflags()")
Fixes: 2fb92db1ec08 ("KVM: VMX: Cache vmcs segment fields")
Fixes: bd31fe495d0d ("KVM: VMX: Add proper cache tracking for CR0")
Fixes: f98c1e77127d ("KVM: VMX: Add proper cache tracking for CR4")
Fixes: 5addc235199f ("KVM: VMX: Cache vmcs.EXIT_QUALIFICATION using arch avail_reg flags")
Fixes: 8791585837f6 ("KVM: VMX: Cache vmcs.EXIT_INTR_INFO using arch avail_reg flags")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 86539c1686fa..e77a5bf2d940 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10656,6 +10656,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 	int r;
 
 	vcpu->arch.last_vmentry_cpu = -1;
+	vcpu->arch.regs_avail = ~0;
+	vcpu->arch.regs_dirty = ~0;
 
 	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
 		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
  2021-09-21  0:02 ` [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation Sean Christopherson
@ 2021-09-21  0:02 ` Sean Christopherson
  2021-09-21 13:52   ` Vitaly Kuznetsov
  2021-09-21 17:35   ` Paolo Bonzini
  2021-09-21  0:02 ` [PATCH v2 03/10] KVM: x86: Do not mark all registers as avail/dirty during RESET/INIT Sean Christopherson
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT.
Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT.  For
RESET, this is a nop as vcpu is zero-allocated.  For INIT, the bug has
likely escaped notice because no firmware/kernel puts its page tables root
at PA=0, let alone relies on INIT to get the desired CR3 for such page
tables.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e77a5bf2d940..2cb38c67ed43 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10899,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
 	kvm_rip_write(vcpu, 0xfff0);
 
+	vcpu->arch.cr3 = 0;
+	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
+
 	/*
 	 * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
 	 * of Intel's SDM list CD/NW as being set on INIT, but they contradict
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 03/10] KVM: x86: Do not mark all registers as avail/dirty during RESET/INIT
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
  2021-09-21  0:02 ` [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation Sean Christopherson
  2021-09-21  0:02 ` [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT Sean Christopherson
@ 2021-09-21  0:02 ` Sean Christopherson
  2021-09-21  0:02 ` [PATCH v2 04/10] KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU create Sean Christopherson
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Do not blindly mark all registers as available+dirty at RESET/INIT, and
instead rely on writes to registers to go through the proper mutators or
to explicitly mark registers as dirty.  INIT in particular does not blindly
overwrite all registers, e.g. select bits in CR0 are preserved across INIT,
thus marking registers available+dirty without first reading the register
from hardware is incorrect.

In practice this is a benign bug as KVM doesn't let the guest control CR0
bits that are preserved across INIT, and all other true registers are
explicitly written during the RESET/INIT flows.  The PDPTRs and EX_INFO
"registers" are not explicitly written, but accessing those values during
RESET/INIT is nonsensical and would be a KVM bug regardless of register
caching.

Fixes: 66f7b72e1171 ("KVM: x86: Make register state after reset conform to specification")
[sean: !!! NOT FOR STABLE !!!]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 1 +
 arch/x86/kvm/x86.c     | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fada1055f325..d44d07d5a02f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4448,6 +4448,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	kvm_set_cr8(vcpu, 0);
 
 	vmx_segment_cache_clear(vmx);
+	kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);
 
 	seg_setup(VCPU_SREG_CS);
 	vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2cb38c67ed43..ab907a0b9eeb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10876,9 +10876,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		vcpu->arch.xcr0 = XFEATURE_MASK_FP;
 	}
 
+	/* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
 	memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
-	vcpu->arch.regs_avail = ~0;
-	vcpu->arch.regs_dirty = ~0;
+	kvm_register_mark_dirty(vcpu, VCPU_REGS_RSP);
 
 	/*
 	 * Fall back to KVM's default Family/Model/Stepping of 0x600 (P6/Athlon)
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 04/10] KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU create
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (2 preceding siblings ...)
  2021-09-21  0:02 ` [PATCH v2 03/10] KVM: x86: Do not mark all registers as avail/dirty during RESET/INIT Sean Christopherson
@ 2021-09-21  0:02 ` Sean Christopherson
  2021-09-21 14:23   ` Vitaly Kuznetsov
  2021-09-21  0:02 ` [PATCH v2 05/10] KVM: x86: Remove defunct setting of XCR0 for guest " Sean Christopherson
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Drop code to set CR0.ET for the guest during initialization of the guest
FPU.  The code was added as a misguided bug fix by commit 380102c8e431
("KVM Set the ET flag in CR0 after initializing FX") to resolve an issue
where vcpu->cr0 (now vcpu->arch.cr0) was not correctly initialized on SVM
systems.  While init_vmcb() did set CR0.ET, it only did so in the VMCB,
and subtly did not update vcpu->cr0.  Stuffing CR0.ET worked around the
immediate problem, but did not fix the real bug of vcpu->cr0 and the VMCB
being out of sync.  That underlying bug was eventually remedied by commit
18fa000ae453 ("KVM: SVM: Reset cr0 properly on vcpu reset").

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ab907a0b9eeb..e0bff5473813 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10628,8 +10628,6 @@ static void fx_init(struct kvm_vcpu *vcpu)
 	 * Ensure guest xcr0 is valid for loading
 	 */
 	vcpu->arch.xcr0 = XFEATURE_MASK_FP;
-
-	vcpu->arch.cr0 |= X86_CR0_ET;
 }
 
 void kvm_free_guest_fpu(struct kvm_vcpu *vcpu)
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 05/10] KVM: x86: Remove defunct setting of XCR0 for guest during vCPU create
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (3 preceding siblings ...)
  2021-09-21  0:02 ` [PATCH v2 04/10] KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU create Sean Christopherson
@ 2021-09-21  0:02 ` Sean Christopherson
  2021-09-21 14:37   ` Vitaly Kuznetsov
  2021-09-21  0:02 ` [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create() Sean Christopherson
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Drop code to initialize XCR0 during fx_init(), a.k.a. vCPU creation, as
XCR0 has been initialized during kvm_vcpu_reset() (for RESET) since
commit a554d207dc46 ("KVM: X86: Processor States following Reset or INIT").

Back when XCR0 support was added by commit 2acf923e38fb ("KVM: VMX:
Enable XSAVE/XRSTOR for guest"), KVM didn't differentiate between RESET
and INIT.  Ignoring the fact that calling fx_init() for INIT is obviously
wrong, e.g. FPU state after INIT is not the same as after RESET, setting
XCR0 in fx_init() was correct.

Eventually fx_init() got moved to kvm_arch_vcpu_init(), a.k.a. vCPU
creation (ignore the terrible name) by commit 0ee6a5172573 ("x86/fpu,
kvm: Simplify fx_init()").  Finally, commit 95a0d01eef7a ("KVM: x86: Move
all vcpu init code into kvm_arch_vcpu_create()") killed off
kvm_arch_vcpu_init(), leaving behind the oddity of redundant setting of
guest state during vCPU creation.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e0bff5473813..6fd3fe21863e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -993,7 +993,7 @@ static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
 	/*
 	 * Do not allow the guest to set bits that we do not support
 	 * saving.  However, xcr0 bit 0 is always set, even if the
-	 * emulated CPU does not support XSAVE (see fx_init).
+	 * emulated CPU does not support XSAVE (see kvm_vcpu_reset()).
 	 */
 	valid_bits = vcpu->arch.guest_supported_xcr0 | XFEATURE_MASK_FP;
 	if (xcr0 & ~valid_bits)
@@ -10623,11 +10623,6 @@ static void fx_init(struct kvm_vcpu *vcpu)
 	if (boot_cpu_has(X86_FEATURE_XSAVES))
 		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
 			host_xcr0 | XSTATE_COMPACTION_ENABLED;
-
-	/*
-	 * Ensure guest xcr0 is valid for loading
-	 */
-	vcpu->arch.xcr0 = XFEATURE_MASK_FP;
 }
 
 void kvm_free_guest_fpu(struct kvm_vcpu *vcpu)
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create()
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (4 preceding siblings ...)
  2021-09-21  0:02 ` [PATCH v2 05/10] KVM: x86: Remove defunct setting of XCR0 for guest " Sean Christopherson
@ 2021-09-21  0:02 ` Sean Christopherson
  2021-09-21 14:52   ` Vitaly Kuznetsov
  2021-09-21  0:03 ` [PATCH v2 07/10] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Move the few bits of relevant fx_init() code into kvm_arch_vcpu_create(),
dropping the superfluous check on vcpu->arch.guest_fpu that was blindly
and wrongly added by commit ed02b213098a ("KVM: SVM: Guest FPU state
save/restore not needed for SEV-ES guest").

Note, KVM currently allocates and then frees FPU state for SEV-ES guests,
rather than avoid the allocation in the first place.  While that approach
is inarguably inefficient and unnecessary, it's a cleanup for the future.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6fd3fe21863e..ec61b90d9b73 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10614,17 +10614,6 @@ static int sync_regs(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static void fx_init(struct kvm_vcpu *vcpu)
-{
-	if (!vcpu->arch.guest_fpu)
-		return;
-
-	fpstate_init(&vcpu->arch.guest_fpu->state);
-	if (boot_cpu_has(X86_FEATURE_XSAVES))
-		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
-			host_xcr0 | XSTATE_COMPACTION_ENABLED;
-}
-
 void kvm_free_guest_fpu(struct kvm_vcpu *vcpu)
 {
 	if (vcpu->arch.guest_fpu) {
@@ -10703,7 +10692,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 		pr_err("kvm: failed to allocate vcpu's fpu\n");
 		goto free_user_fpu;
 	}
-	fx_init(vcpu);
+	fpstate_init(&vcpu->arch.guest_fpu->state);
+	if (boot_cpu_has(X86_FEATURE_XSAVES))
+		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
+			host_xcr0 | XSTATE_COMPACTION_ENABLED;
 
 	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
 	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 07/10] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (5 preceding siblings ...)
  2021-09-21  0:02 ` [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create() Sean Christopherson
@ 2021-09-21  0:03 ` Sean Christopherson
  2021-09-21 15:02   ` Vitaly Kuznetsov
  2021-09-21  0:03 ` [PATCH v2 08/10] KVM: VMX: Move RESET emulation to vmx_vcpu_reset() Sean Christopherson
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Don't zero out user return and nested MSRs during vCPU creation, and
instead rely on vcpu_vmx being zero-allocated.  Explicitly zeroing MSRs
is not wrong, and is in fact necessary if KVM ever emulates vCPU RESET
outside of vCPU creation, but zeroing only a subset of MSRs is confusing.

Poking directly into KVM's backing is also undesirable in that it doesn't
scale and is error prone.  Ideally KVM would have a common RESET path for
all MSRs, e.g. by expanding kvm_set_msr(), which would obviate the need
for this out-of-bad code (to support standalone RESET).

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d44d07d5a02f..8d14066db3ea 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6819,10 +6819,8 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vpid;
 	}
 
-	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
-		vmx->guest_uret_msrs[i].data = 0;
+	for (i = 0; i < kvm_nr_uret_msrs; ++i)
 		vmx->guest_uret_msrs[i].mask = -1ull;
-	}
 	if (boot_cpu_has(X86_FEATURE_RTM)) {
 		/*
 		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
@@ -6879,8 +6877,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 
 	if (nested)
 		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
-	else
-		memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
 
 	vcpu_setup_sgx_lepubkeyhash(vcpu);
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 08/10] KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (6 preceding siblings ...)
  2021-09-21  0:03 ` [PATCH v2 07/10] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
@ 2021-09-21  0:03 ` Sean Christopherson
  2021-09-21  0:03 ` [PATCH v2 09/10] KVM: SVM: Move RESET emulation to svm_vcpu_reset() Sean Christopherson
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Move vCPU RESET emulation, including initializating of select VMCS state,
to vmx_vcpu_reset().  Drop the open coded "vCPU load" sequence, as
->vcpu_reset() is invoked while the vCPU is properly loaded (which is
kind of the point of ->vcpu_reset()...).  Hopefully KVM will someday
expose a dedicated RESET ioctl(), and in the meantime separating "create"
from "RESET" is a nice cleanup.

Deferring VMCS initialization is effectively a nop as it's impossible to
safely access the VMCS between the current call site and its new home, as
both the vCPU and the pCPU are put immediately after init_vmcs(), i.e.
the VMCS isn't guaranteed to be loaded.

Note, task preemption is not a problem as vmx_sched_in() _can't_ touch
the VMCS as ->sched_in() is invoked before the vCPU, and thus VMCS, is
reloaded.  I.e. the preemption path also can't consume VMCS state.

Cc: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 61 +++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 8d14066db3ea..dc0831054319 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4327,10 +4327,6 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 
 #define VMX_XSS_EXIT_BITMAP 0
 
-/*
- * Noting that the initialization of Guest-state Area of VMCS is in
- * vmx_vcpu_reset().
- */
 static void init_vmcs(struct vcpu_vmx *vmx)
 {
 	if (nested)
@@ -4435,10 +4431,39 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 	vmx_setup_uret_msrs(vmx);
 }
 
+static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	init_vmcs(vmx);
+
+	if (nested)
+		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
+
+	vcpu_setup_sgx_lepubkeyhash(vcpu);
+
+	vmx->nested.posted_intr_nv = -1;
+	vmx->nested.current_vmptr = -1ull;
+	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
+
+	vcpu->arch.microcode_version = 0x100000000ULL;
+	vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
+
+	/*
+	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
+	 * or POSTED_INTR_WAKEUP_VECTOR.
+	 */
+	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
+	vmx->pi_desc.sn = 1;
+}
+
 static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (!init_event)
+		__vmx_vcpu_reset(vcpu);
+
 	vmx->rmode.vm86_active = 0;
 	vmx->spec_ctrl = 0;
 
@@ -6798,7 +6823,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vmx_uret_msr *tsx_ctrl;
 	struct vcpu_vmx *vmx;
-	int i, cpu, err;
+	int i, err;
 
 	BUILD_BUG_ON(offsetof(struct vcpu_vmx, vcpu) != 0);
 	vmx = to_vmx(vcpu);
@@ -6857,12 +6882,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 	}
 
 	vmx->loaded_vmcs = &vmx->vmcs01;
-	cpu = get_cpu();
-	vmx_vcpu_load(vcpu, cpu);
-	vcpu->cpu = cpu;
-	init_vmcs(vmx);
-	vmx_vcpu_put(vcpu);
-	put_cpu();
+
 	if (cpu_need_virtualize_apic_accesses(vcpu)) {
 		err = alloc_apic_access_page(vcpu->kvm);
 		if (err)
@@ -6875,25 +6895,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vmcs;
 	}
 
-	if (nested)
-		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
-
-	vcpu_setup_sgx_lepubkeyhash(vcpu);
-
-	vmx->nested.posted_intr_nv = -1;
-	vmx->nested.current_vmptr = -1ull;
-	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
-
-	vcpu->arch.microcode_version = 0x100000000ULL;
-	vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
-
-	/*
-	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
-	 * or POSTED_INTR_WAKEUP_VECTOR.
-	 */
-	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
-	vmx->pi_desc.sn = 1;
-
 	return 0;
 
 free_vmcs:
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 09/10] KVM: SVM: Move RESET emulation to svm_vcpu_reset()
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (7 preceding siblings ...)
  2021-09-21  0:03 ` [PATCH v2 08/10] KVM: VMX: Move RESET emulation to vmx_vcpu_reset() Sean Christopherson
@ 2021-09-21  0:03 ` Sean Christopherson
  2021-09-21  0:03 ` [PATCH v2 10/10] KVM: x86: WARN on non-zero CRs at RESET to detect improper initalization Sean Christopherson
  2021-09-23 16:23 ` [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Paolo Bonzini
  10 siblings, 0 replies; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Move RESET emulation for SVM vCPUs to svm_vcpu_reset(), and drop an extra
init_vmcb() from svm_create_vcpu() in the process.  Hopefully KVM will
someday expose a dedicated RESET ioctl(), and in the meantime separating
"create" from "RESET" is a nice cleanup.

Keep the call to svm_switch_vmcb() so that misuse of svm->vmcb at worst
breaks the guest, e.g. premature accesses doesn't cause a NULL pointer
dereference.

Cc: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/sev.c |  6 +++---
 arch/x86/kvm/svm/svm.c | 29 +++++++++++++++++------------
 arch/x86/kvm/svm/svm.h |  2 +-
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 75e0b21ad07c..4cf40021dde4 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2599,11 +2599,11 @@ void sev_es_init_vmcb(struct vcpu_svm *svm)
 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
 }
 
-void sev_es_create_vcpu(struct vcpu_svm *svm)
+void sev_es_vcpu_reset(struct vcpu_svm *svm)
 {
 	/*
-	 * Set the GHCB MSR value as per the GHCB specification when creating
-	 * a vCPU for an SEV-ES guest.
+	 * Set the GHCB MSR value as per the GHCB specification when emulating
+	 * vCPU RESET for an SEV-ES guest.
 	 */
 	set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(GHCB_VERSION_MAX,
 					    GHCB_VERSION_MIN,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1a70e11f0487..bb79ae8e8bcd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1303,6 +1303,19 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 
 }
 
+static void __svm_vcpu_reset(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm_vcpu_init_msrpm(vcpu, svm->msrpm);
+
+	svm_init_osvw(vcpu);
+	vcpu->arch.microcode_version = 0x01000065;
+
+	if (sev_es_guest(vcpu->kvm))
+		sev_es_vcpu_reset(svm);
+}
+
 static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1311,6 +1324,9 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	svm->virt_spec_ctrl = 0;
 
 	init_vmcb(vcpu);
+
+	if (!init_event)
+		__svm_vcpu_reset(vcpu);
 }
 
 void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb)
@@ -1370,24 +1386,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 
 	svm->vmcb01.ptr = page_address(vmcb01_page);
 	svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
+	svm_switch_vmcb(svm, &svm->vmcb01);
 
 	if (vmsa_page)
 		svm->vmsa = page_address(vmsa_page);
 
 	svm->guest_state_loaded = false;
 
-	svm_switch_vmcb(svm, &svm->vmcb01);
-	init_vmcb(vcpu);
-
-	svm_vcpu_init_msrpm(vcpu, svm->msrpm);
-
-	svm_init_osvw(vcpu);
-	vcpu->arch.microcode_version = 0x01000065;
-
-	if (sev_es_guest(vcpu->kvm))
-		/* Perform SEV-ES specific VMCB creation updates */
-		sev_es_create_vcpu(svm);
-
 	return 0;
 
 error_free_vmsa_page:
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 524d943f3efc..001698919148 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -561,7 +561,7 @@ void sev_free_vcpu(struct kvm_vcpu *vcpu);
 int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_init_vmcb(struct vcpu_svm *svm);
-void sev_es_create_vcpu(struct vcpu_svm *svm);
+void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v2 10/10] KVM: x86: WARN on non-zero CRs at RESET to detect improper initalization
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (8 preceding siblings ...)
  2021-09-21  0:03 ` [PATCH v2 09/10] KVM: SVM: Move RESET emulation to svm_vcpu_reset() Sean Christopherson
@ 2021-09-21  0:03 ` Sean Christopherson
  2021-09-21 13:59   ` Vitaly Kuznetsov
  2021-09-23 16:23 ` [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Paolo Bonzini
  10 siblings, 1 reply; 22+ messages in thread
From: Sean Christopherson @ 2021-09-21  0:03 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

WARN if CR0, CR3, or CR4 are non-zero at RESET, which given the current
KVM implementation, really means WARN if they're not zeroed at vCPU
creation.  VMX in particular has several ->set_*() flows that read other
registers to handle side effects, and because those flows are common to
RESET and INIT, KVM subtly relies on emulated/virtualized registers to be
zeroed at vCPU creation in order to do the right thing at RESET.

Use CRs as a sentinel because they are most likely to be written as side
effects, and because KVM specifically needs CR0.PG and CR0.PE to be '0'
to correctly reflect the state of the vCPU's MMU.  CRs are also loaded
and stored from/to the VMCS, and so adds some level of coverage to verify
that KVM doesn't conflate zero-allocating the VMCS with properly
initializing the VMCS with VMWRITEs.

Note, '0' is somewhat arbitrary, vCPU creation can technically stuff any
value for a register so long as it's coherent with respect to the current
vCPU state.  In practice, '0' works for all registers and is convenient.

Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ec61b90d9b73..4e25baac3977 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10800,6 +10800,16 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	unsigned long new_cr0;
 	u32 eax, dummy;
 
+	/*
+	 * Several of the "set" flows, e.g. ->set_cr0(), read other registers
+	 * to handle side effects.  RESET emulation hits those flows and relies
+	 * on emulated/virtualized registers, including those that are loaded
+	 * into hardware, to be zeroed at vCPU creation.  Use CRs as a sentinel
+	 * to detect improper or missing initialization.
+	 */
+	WARN_ON_ONCE(!init_event &&
+		     (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
+
 	kvm_lapic_reset(vcpu, init_event);
 
 	vcpu->arch.hflags = 0;
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation
  2021-09-21  0:02 ` [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation Sean Christopherson
@ 2021-09-21 13:40   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 13:40 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Mark all registers as available and dirty at vCPU creation, as the vCPU has
> obviously not been loaded into hardware, let alone been given the chance to
> be modified in hardware.  On SVM, reading from "uninitialized" hardware is
> a non-issue as VMCBs are zero allocated (thus not truly uninitialized) and
> hardware does not allow for arbitrary field encoding schemes.
>
> On VMX, backing memory for VMCSes is also zero allocated, but true
> initialization of the VMCS _technically_ requires VMWRITEs, as the VMX
> architectural specification technically allows CPU implementations to
> encode fields with arbitrary schemes.  E.g. a CPU could theoretically store
> the inverted value of every field, which would result in VMREAD to a
> zero-allocated field returns all ones.
>
> In practice, only the AR_BYTES fields are known to be manipulated by
> hardware during VMREAD/VMREAD; no known hardware or VMM (for nested VMX)
> does fancy encoding of cacheable field values (CR0, CR3, CR4, etc...).  In
> other words, this is technically a bug fix, but practically speakings it's
> a glorified nop.

Just to make the picture complete, according to TLFS, Enlightened VMCS
must also be zero allocated and the encoding is known. Still a nop ;-)

>
> Failure to mark registers as available has been a lurking bug for quite
> some time.  The original register caching supported only GPRs (+RIP, which
> is kinda sorta a GPR), with the masks initialized at ->vcpu_reset().  That
> worked because the two cacheable registers, RIP and RSP, are generally
> speaking not read as side effects in other flows.
>
> Arguably, commit aff48baa34c0 ("KVM: Fetch guest cr3 from hardware on
> demand") was the first instance of failure to mark regs available.  While
> _just_ marking CR3 available during vCPU creation wouldn't have fixed the
> VMREAD from an uninitialized VMCS bug because ept_update_paging_mode_cr0()
> unconditionally read vmcs.GUEST_CR3, marking CR3 _and_ intentionally not
> reading GUEST_CR3 when it's available would have avoided VMREAD to a
> technically-uninitialized VMCS.
>
> Fixes: aff48baa34c0 ("KVM: Fetch guest cr3 from hardware on demand")
> Fixes: 6de4f3ada40b ("KVM: Cache pdptrs")
> Fixes: 6de12732c42c ("KVM: VMX: Optimize vmx_get_rflags()")
> Fixes: 2fb92db1ec08 ("KVM: VMX: Cache vmcs segment fields")
> Fixes: bd31fe495d0d ("KVM: VMX: Add proper cache tracking for CR0")
> Fixes: f98c1e77127d ("KVM: VMX: Add proper cache tracking for CR4")
> Fixes: 5addc235199f ("KVM: VMX: Cache vmcs.EXIT_QUALIFICATION using arch avail_reg flags")
> Fixes: 8791585837f6 ("KVM: VMX: Cache vmcs.EXIT_INTR_INFO using arch avail_reg flags")
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 86539c1686fa..e77a5bf2d940 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10656,6 +10656,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  	int r;
>  
>  	vcpu->arch.last_vmentry_cpu = -1;
> +	vcpu->arch.regs_avail = ~0;
> +	vcpu->arch.regs_dirty = ~0;
>  
>  	if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
>  		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
  2021-09-21  0:02 ` [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT Sean Christopherson
@ 2021-09-21 13:52   ` Vitaly Kuznetsov
  2021-09-21 13:55     ` Vitaly Kuznetsov
  2021-09-21 17:35   ` Paolo Bonzini
  1 sibling, 1 reply; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 13:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT.
> Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT.  For
> RESET, this is a nop as vcpu is zero-allocated.  For INIT, the bug has
> likely escaped notice because no firmware/kernel puts its page tables root
> at PA=0, let alone relies on INIT to get the desired CR3 for such page
> tables.
>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e77a5bf2d940..2cb38c67ed43 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10899,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>  	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
>  	kvm_rip_write(vcpu, 0xfff0);
>  
> +	vcpu->arch.cr3 = 0;
> +	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);

kvm_register_mark_dirty() is redundant here as PATCH1 does

      vcpu->arch.regs_avail = ~0;
      vcpu->arch.regs_dirty = ~0;

just a few lines above. The dependency is, however, implicit and this
patch is marked for stable@ (well, PATCH1 has 8 Fixes: tags so I'd
expect it to get picked by everyone too, especially by robots) and
flipping two bits is cheap.

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

> +
>  	/*
>  	 * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
>  	 * of Intel's SDM list CD/NW as being set on INIT, but they contradict

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
  2021-09-21 13:52   ` Vitaly Kuznetsov
@ 2021-09-21 13:55     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 13:55 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Vitaly Kuznetsov <vkuznets@redhat.com> writes:

> Sean Christopherson <seanjc@google.com> writes:
>
>> Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT.
>> Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT.  For
>> RESET, this is a nop as vcpu is zero-allocated.  For INIT, the bug has
>> likely escaped notice because no firmware/kernel puts its page tables root
>> at PA=0, let alone relies on INIT to get the desired CR3 for such page
>> tables.
>>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Sean Christopherson <seanjc@google.com>
>> ---
>>  arch/x86/kvm/x86.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index e77a5bf2d940..2cb38c67ed43 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10899,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>>  	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
>>  	kvm_rip_write(vcpu, 0xfff0);
>>  
>> +	vcpu->arch.cr3 = 0;
>> +	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
>
> kvm_register_mark_dirty() is redundant here as PATCH1 does
>
>       vcpu->arch.regs_avail = ~0;
>       vcpu->arch.regs_dirty = ~0;
>
> just a few lines above. The dependency is, however, implicit and this
> patch is marked for stable@ (well, PATCH1 has 8 Fixes: tags so I'd
> expect it to get picked by everyone too, especially by robots) and
> flipping two bits is cheap.

Scratch that, kvm_vcpu_reset() and kvm_arch_vcpu_create() got mixed up
in my head :-(

>
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>
>> +
>>  	/*
>>  	 * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
>>  	 * of Intel's SDM list CD/NW as being set on INIT, but they contradict

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 10/10] KVM: x86: WARN on non-zero CRs at RESET to detect improper initalization
  2021-09-21  0:03 ` [PATCH v2 10/10] KVM: x86: WARN on non-zero CRs at RESET to detect improper initalization Sean Christopherson
@ 2021-09-21 13:59   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 13:59 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> WARN if CR0, CR3, or CR4 are non-zero at RESET, which given the current
> KVM implementation, really means WARN if they're not zeroed at vCPU
> creation.  VMX in particular has several ->set_*() flows that read other
> registers to handle side effects, and because those flows are common to
> RESET and INIT, KVM subtly relies on emulated/virtualized registers to be
> zeroed at vCPU creation in order to do the right thing at RESET.
>
> Use CRs as a sentinel because they are most likely to be written as side
> effects, and because KVM specifically needs CR0.PG and CR0.PE to be '0'
> to correctly reflect the state of the vCPU's MMU.  CRs are also loaded
> and stored from/to the VMCS, and so adds some level of coverage to verify
> that KVM doesn't conflate zero-allocating the VMCS with properly
> initializing the VMCS with VMWRITEs.
>
> Note, '0' is somewhat arbitrary, vCPU creation can technically stuff any
> value for a register so long as it's coherent with respect to the current
> vCPU state.  In practice, '0' works for all registers and is convenient.
>
> Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ec61b90d9b73..4e25baac3977 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10800,6 +10800,16 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>  	unsigned long new_cr0;
>  	u32 eax, dummy;
>  
> +	/*
> +	 * Several of the "set" flows, e.g. ->set_cr0(), read other registers
> +	 * to handle side effects.  RESET emulation hits those flows and relies
> +	 * on emulated/virtualized registers, including those that are loaded
> +	 * into hardware, to be zeroed at vCPU creation.  Use CRs as a sentinel
> +	 * to detect improper or missing initialization.
> +	 */
> +	WARN_ON_ONCE(!init_event &&
> +		     (old_cr0 || kvm_read_cr3(vcpu) || kvm_read_cr4(vcpu)));
> +
>  	kvm_lapic_reset(vcpu, init_event);
>  
>  	vcpu->arch.hflags = 0;

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 04/10] KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU create
  2021-09-21  0:02 ` [PATCH v2 04/10] KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU create Sean Christopherson
@ 2021-09-21 14:23   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 14:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Drop code to set CR0.ET for the guest during initialization of the guest
> FPU.  The code was added as a misguided bug fix by commit 380102c8e431
> ("KVM Set the ET flag in CR0 after initializing FX") to resolve an issue
> where vcpu->cr0 (now vcpu->arch.cr0) was not correctly initialized on SVM
> systems.  While init_vmcb() did set CR0.ET, it only did so in the VMCB,
> and subtly did not update vcpu->cr0.  Stuffing CR0.ET worked around the
> immediate problem, but did not fix the real bug of vcpu->cr0 and the VMCB
> being out of sync.  That underlying bug was eventually remedied by commit
> 18fa000ae453 ("KVM: SVM: Reset cr0 properly on vcpu reset").
>
> No functional change intended.

fx_init() is only called from kvm_arch_vcpu_create() (and inlined later
in the series) a few lines before kvm_vcpu_reset() which stuffs CR0 with 
X86_CR0_ET too and it doesn't seem that arch.cr0 value is important in
between.

>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index ab907a0b9eeb..e0bff5473813 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10628,8 +10628,6 @@ static void fx_init(struct kvm_vcpu *vcpu)
>  	 * Ensure guest xcr0 is valid for loading
>  	 */
>  	vcpu->arch.xcr0 = XFEATURE_MASK_FP;
> -
> -	vcpu->arch.cr0 |= X86_CR0_ET;
>  }
>  
>  void kvm_free_guest_fpu(struct kvm_vcpu *vcpu)

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 05/10] KVM: x86: Remove defunct setting of XCR0 for guest during vCPU create
  2021-09-21  0:02 ` [PATCH v2 05/10] KVM: x86: Remove defunct setting of XCR0 for guest " Sean Christopherson
@ 2021-09-21 14:37   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 14:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Drop code to initialize XCR0 during fx_init(), a.k.a. vCPU creation, as
> XCR0 has been initialized during kvm_vcpu_reset() (for RESET) since
> commit a554d207dc46 ("KVM: X86: Processor States following Reset or INIT").
>
> Back when XCR0 support was added by commit 2acf923e38fb ("KVM: VMX:
> Enable XSAVE/XRSTOR for guest"), KVM didn't differentiate between RESET
> and INIT.  Ignoring the fact that calling fx_init() for INIT is obviously
> wrong, e.g. FPU state after INIT is not the same as after RESET, setting
> XCR0 in fx_init() was correct.
>
> Eventually fx_init() got moved to kvm_arch_vcpu_init(), a.k.a. vCPU
> creation (ignore the terrible name) by commit 0ee6a5172573 ("x86/fpu,
> kvm: Simplify fx_init()").  Finally, commit 95a0d01eef7a ("KVM: x86: Move
> all vcpu init code into kvm_arch_vcpu_create()") killed off
> kvm_arch_vcpu_init(),

Technically, empty kvm_arch_vcpu_init() was still alive for a few more
commits and only ddd259c9aaba ("KVM: Drop kvm_arch_vcpu_init() and
kvm_arch_vcpu_uninit()") killed it for real but a curious reader can
find all these gory details himself)

> leaving behind the oddity of redundant setting of
> guest state during vCPU creation.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e0bff5473813..6fd3fe21863e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -993,7 +993,7 @@ static int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
>  	/*
>  	 * Do not allow the guest to set bits that we do not support
>  	 * saving.  However, xcr0 bit 0 is always set, even if the
> -	 * emulated CPU does not support XSAVE (see fx_init).
> +	 * emulated CPU does not support XSAVE (see kvm_vcpu_reset()).
>  	 */
>  	valid_bits = vcpu->arch.guest_supported_xcr0 | XFEATURE_MASK_FP;
>  	if (xcr0 & ~valid_bits)
> @@ -10623,11 +10623,6 @@ static void fx_init(struct kvm_vcpu *vcpu)
>  	if (boot_cpu_has(X86_FEATURE_XSAVES))
>  		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
>  			host_xcr0 | XSTATE_COMPACTION_ENABLED;
> -
> -	/*
> -	 * Ensure guest xcr0 is valid for loading
> -	 */
> -	vcpu->arch.xcr0 = XFEATURE_MASK_FP;
>  }
>  
>  void kvm_free_guest_fpu(struct kvm_vcpu *vcpu)

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create()
  2021-09-21  0:02 ` [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create() Sean Christopherson
@ 2021-09-21 14:52   ` Vitaly Kuznetsov
  2021-10-06 23:04     ` Sean Christopherson
  0 siblings, 1 reply; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 14:52 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Move the few bits of relevant fx_init() code into kvm_arch_vcpu_create(),
> dropping the superfluous check on vcpu->arch.guest_fpu that was blindly
> and wrongly added by commit ed02b213098a ("KVM: SVM: Guest FPU state
> save/restore not needed for SEV-ES guest").

I have more questions to the above mentioned commit: why is it OK to
'return 0' from kvm_vcpu_ioctl_x86_set_xsave() without writing anything
to 'guest_xsave'? Same goes to kvm_arch_vcpu_ioctl_get_fpu(). Whould't
it be better to throw an error as we can't actually get this information
for encrypted guests? It's probably too late to change this now I
suppose ...

>
> Note, KVM currently allocates and then frees FPU state for SEV-ES guests,
> rather than avoid the allocation in the first place.  While that approach
> is inarguably inefficient and unnecessary, it's a cleanup for the future.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 16 ++++------------
>  1 file changed, 4 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 6fd3fe21863e..ec61b90d9b73 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10614,17 +10614,6 @@ static int sync_regs(struct kvm_vcpu *vcpu)
>  	return 0;
>  }
>  
> -static void fx_init(struct kvm_vcpu *vcpu)
> -{
> -	if (!vcpu->arch.guest_fpu)
> -		return;
> -
> -	fpstate_init(&vcpu->arch.guest_fpu->state);
> -	if (boot_cpu_has(X86_FEATURE_XSAVES))
> -		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
> -			host_xcr0 | XSTATE_COMPACTION_ENABLED;
> -}
> -
>  void kvm_free_guest_fpu(struct kvm_vcpu *vcpu)
>  {
>  	if (vcpu->arch.guest_fpu) {
> @@ -10703,7 +10692,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>  		pr_err("kvm: failed to allocate vcpu's fpu\n");
>  		goto free_user_fpu;
>  	}
> -	fx_init(vcpu);
> +	fpstate_init(&vcpu->arch.guest_fpu->state);
> +	if (boot_cpu_has(X86_FEATURE_XSAVES))
> +		vcpu->arch.guest_fpu->state.xsave.header.xcomp_bv =
> +			host_xcr0 | XSTATE_COMPACTION_ENABLED;
>  
>  	vcpu->arch.maxphyaddr = cpuid_query_maxphyaddr(vcpu);
>  	vcpu->arch.reserved_gpa_bits = kvm_vcpu_reserved_gpa_bits_raw(vcpu);

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 07/10] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation
  2021-09-21  0:03 ` [PATCH v2 07/10] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
@ 2021-09-21 15:02   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-21 15:02 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Don't zero out user return and nested MSRs during vCPU creation, and
> instead rely on vcpu_vmx being zero-allocated.  Explicitly zeroing MSRs
> is not wrong, and is in fact necessary if KVM ever emulates vCPU RESET
> outside of vCPU creation, but zeroing only a subset of MSRs is confusing.
>
> Poking directly into KVM's backing is also undesirable in that it doesn't
> scale and is error prone.  Ideally KVM would have a common RESET path for
> all MSRs, e.g. by expanding kvm_set_msr(), which would obviate the need
> for this out-of-bad code (to support standalone RESET).

Just thinking out loud: we can probably enhance KVM_SET_MSRS making it
possible for userspace VMM to set the default (as not every setting need
to survive RESET). Conveniently enough, 'struct kvm_msr_entry' has 32
bits reserved and we can bite off one for 'default' flag.

>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d44d07d5a02f..8d14066db3ea 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6819,10 +6819,8 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  			goto free_vpid;
>  	}
>  
> -	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
> -		vmx->guest_uret_msrs[i].data = 0;
> +	for (i = 0; i < kvm_nr_uret_msrs; ++i)
>  		vmx->guest_uret_msrs[i].mask = -1ull;
> -	}
>  	if (boot_cpu_has(X86_FEATURE_RTM)) {
>  		/*
>  		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
> @@ -6879,8 +6877,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  
>  	if (nested)
>  		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
> -	else
> -		memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
>  
>  	vcpu_setup_sgx_lepubkeyhash(vcpu);

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
  2021-09-21  0:02 ` [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT Sean Christopherson
  2021-09-21 13:52   ` Vitaly Kuznetsov
@ 2021-09-21 17:35   ` Paolo Bonzini
  1 sibling, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2021-09-21 17:35 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

On 21/09/21 02:02, Sean Christopherson wrote:
> Explicitly zero the guest's CR3 and mark it available+dirty at RESET/INIT.
> Per Intel's SDM and AMD's APM, CR3 is zeroed at both RESET and INIT.  For
> RESET, this is a nop as vcpu is zero-allocated.  For INIT, the bug has
> likely escaped notice because no firmware/kernel puts its page tables root
> at PA=0, let alone relies on INIT to get the desired CR3 for such page
> tables.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/x86.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e77a5bf2d940..2cb38c67ed43 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10899,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>   	kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
>   	kvm_rip_write(vcpu, 0xfff0);
>   
> +	vcpu->arch.cr3 = 0;
> +	kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
> +
>   	/*
>   	 * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
>   	 * of Intel's SDM list CD/NW as being set on INIT, but they contradict
> 

Queued patches 1-2 for 5.15-rc3, thanks.

Paolo


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation"
  2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (9 preceding siblings ...)
  2021-09-21  0:03 ` [PATCH v2 10/10] KVM: x86: WARN on non-zero CRs at RESET to detect improper initalization Sean Christopherson
@ 2021-09-23 16:23 ` Paolo Bonzini
  10 siblings, 0 replies; 22+ messages in thread
From: Paolo Bonzini @ 2021-09-23 16:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

On 21/09/21 02:02, Sean Christopherson wrote:
> Add dedicated helpers to emulate RESET (or at least, the parts of RESET
> that KVM actually emulates) instead of having the relevant code scattered
> throughout vcpu_create() and vcpu_reset().
> 
> Vitaly's innocuous suggestion to WARN on non-zero CR0 led to a wonderful
> bit of git archaeology after, to my complete surprise, it fired on RESET
> due to fx_init() setting CR0.ET (and XCR0.FP).
> 
> v2:
>   - Add patches to drop defunct fx_init() code.
>   - Add patch to zero CR3 at RESET/INIT.
>   - Add patch to mark all regs avail/dirty at creation, not at RESET/INIT.
>   - Add patch to WARN if CRs are non-zero at RESET. [Vitaly]
> 
> v1: https://lkml.kernel.org/r/20210914230840.3030620-1-seanjc@google.com
> 
> Sean Christopherson (10):
>    KVM: x86: Mark all registers as avail/dirty at vCPU creation
>    KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT
>    KVM: x86: Do not mark all registers as avail/dirty during RESET/INIT
>    KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU
>      create
>    KVM: x86: Remove defunct setting of XCR0 for guest during vCPU create
>    KVM: x86: Fold fx_init() into kvm_arch_vcpu_create()
>    KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation
>    KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
>    KVM: SVM: Move RESET emulation to svm_vcpu_reset()
>    KVM: x86: WARN on non-zero CRs at RESET to detect improper
>      initalization
> 
>   arch/x86/kvm/svm/sev.c |  6 ++--
>   arch/x86/kvm/svm/svm.c | 29 ++++++++++--------
>   arch/x86/kvm/svm/svm.h |  2 +-
>   arch/x86/kvm/vmx/vmx.c | 68 ++++++++++++++++++++----------------------
>   arch/x86/kvm/x86.c     | 44 +++++++++++++--------------
>   5 files changed, 76 insertions(+), 73 deletions(-)
> 

Queued 3-10 too now, thanks.

Paolo


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create()
  2021-09-21 14:52   ` Vitaly Kuznetsov
@ 2021-10-06 23:04     ` Sean Christopherson
  0 siblings, 0 replies; 22+ messages in thread
From: Sean Christopherson @ 2021-10-06 23:04 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

On Tue, Sep 21, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> 
> > Move the few bits of relevant fx_init() code into kvm_arch_vcpu_create(),
> > dropping the superfluous check on vcpu->arch.guest_fpu that was blindly
> > and wrongly added by commit ed02b213098a ("KVM: SVM: Guest FPU state
> > save/restore not needed for SEV-ES guest").
> 
> I have more questions to the above mentioned commit: why is it OK to
> 'return 0' from kvm_vcpu_ioctl_x86_set_xsave() without writing anything
> to 'guest_xsave'? Same goes to kvm_arch_vcpu_ioctl_get_fpu(). Whould't
> it be better to throw an error as we can't actually get this information
> for encrypted guests? It's probably too late to change this now I
> suppose ...

Yep, I would also have preferred that KVM force userspace to avoid ioctls() that
cannot work, but that ship has sailed.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-10-06 23:04 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-21  0:02 [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Sean Christopherson
2021-09-21  0:02 ` [PATCH v2 01/10] KVM: x86: Mark all registers as avail/dirty at vCPU creation Sean Christopherson
2021-09-21 13:40   ` Vitaly Kuznetsov
2021-09-21  0:02 ` [PATCH v2 02/10] KVM: x86: Clear KVM's cached guest CR3 at RESET/INIT Sean Christopherson
2021-09-21 13:52   ` Vitaly Kuznetsov
2021-09-21 13:55     ` Vitaly Kuznetsov
2021-09-21 17:35   ` Paolo Bonzini
2021-09-21  0:02 ` [PATCH v2 03/10] KVM: x86: Do not mark all registers as avail/dirty during RESET/INIT Sean Christopherson
2021-09-21  0:02 ` [PATCH v2 04/10] KVM: x86: Remove defunct setting of CR0.ET for guests during vCPU create Sean Christopherson
2021-09-21 14:23   ` Vitaly Kuznetsov
2021-09-21  0:02 ` [PATCH v2 05/10] KVM: x86: Remove defunct setting of XCR0 for guest " Sean Christopherson
2021-09-21 14:37   ` Vitaly Kuznetsov
2021-09-21  0:02 ` [PATCH v2 06/10] KVM: x86: Fold fx_init() into kvm_arch_vcpu_create() Sean Christopherson
2021-09-21 14:52   ` Vitaly Kuznetsov
2021-10-06 23:04     ` Sean Christopherson
2021-09-21  0:03 ` [PATCH v2 07/10] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
2021-09-21 15:02   ` Vitaly Kuznetsov
2021-09-21  0:03 ` [PATCH v2 08/10] KVM: VMX: Move RESET emulation to vmx_vcpu_reset() Sean Christopherson
2021-09-21  0:03 ` [PATCH v2 09/10] KVM: SVM: Move RESET emulation to svm_vcpu_reset() Sean Christopherson
2021-09-21  0:03 ` [PATCH v2 10/10] KVM: x86: WARN on non-zero CRs at RESET to detect improper initalization Sean Christopherson
2021-09-21 13:59   ` Vitaly Kuznetsov
2021-09-23 16:23 ` [PATCH v2 00/10] KVM: x86: Clean up RESET "emulation" Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.