* [PATCH 0/3] KVM: x86: Clean up RESET "emulation"
@ 2021-09-14 23:08 Sean Christopherson
  2021-09-14 23:08 ` [PATCH 1/3] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Sean Christopherson @ 2021-09-14 23:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Add dedicated helpers to emulate RESET instead of having the relevant code
scattered through vcpu_create() and vcpu_reset().  Paolo, I think this is
what you meant by "have init_vmcb/svm_vcpu_reset look more like the VMX
code"[*].

Patch 01 is a bit odd; it's essentially an explicit acknowledgement that
KVM's RESET emulation is far from complete.  It caught my eye when auditing the
"create" flows to ensure they didn't touch guest state, which should be
handled by "reset".  I waffled between deleting it outright and moving it
to the new __vmx_vcpu_reset(), and opted to delete outright to discourage
ad hoc clearing of MSRs during RESET, which isn't a maintainable approach.

[*] https://lore.kernel.org/all/c3563870-62c3-897d-3148-e48bb755310c@redhat.com/

Sean Christopherson (3):
  KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation
  KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  KVM: SVM: Move RESET emulation to svm_vcpu_reset()

 arch/x86/kvm/svm/sev.c |  6 ++--
 arch/x86/kvm/svm/svm.c | 29 ++++++++++--------
 arch/x86/kvm/svm/svm.h |  2 +-
 arch/x86/kvm/vmx/vmx.c | 67 ++++++++++++++++++++----------------------
 4 files changed, 53 insertions(+), 51 deletions(-)

-- 
2.33.0.309.g3052b89438-goog



* [PATCH 1/3] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation
  2021-09-14 23:08 [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Sean Christopherson
@ 2021-09-14 23:08 ` Sean Christopherson
  2021-09-14 23:08 ` [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset() Sean Christopherson
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2021-09-14 23:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Don't zero out user return and nested MSRs during vCPU creation, and
instead rely on vcpu_vmx being zero-allocated.  Explicitly zeroing MSRs
is not wrong, and is in fact necessary if KVM ever emulates vCPU RESET
outside of vCPU creation, but zeroing only a subset of MSRs is confusing.

Poking directly into KVM's backing store is also undesirable in that it
doesn't scale and is error prone.  Ideally, KVM would have a common RESET
path for all MSRs, e.g. by expanding kvm_set_msr(), which would obviate the
need for this out-of-band code (to support standalone RESET).

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fada1055f325..dc274b4c9912 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6818,10 +6818,8 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vpid;
 	}
 
-	for (i = 0; i < kvm_nr_uret_msrs; ++i) {
-		vmx->guest_uret_msrs[i].data = 0;
+	for (i = 0; i < kvm_nr_uret_msrs; ++i)
 		vmx->guest_uret_msrs[i].mask = -1ull;
-	}
 	if (boot_cpu_has(X86_FEATURE_RTM)) {
 		/*
 		 * TSX_CTRL_CPUID_CLEAR is handled in the CPUID interception.
@@ -6878,8 +6876,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 
 	if (nested)
 		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
-	else
-		memset(&vmx->nested.msrs, 0, sizeof(vmx->nested.msrs));
 
 	vcpu_setup_sgx_lepubkeyhash(vcpu);
 
-- 
2.33.0.309.g3052b89438-goog



* [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  2021-09-14 23:08 [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Sean Christopherson
  2021-09-14 23:08 ` [PATCH 1/3] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
@ 2021-09-14 23:08 ` Sean Christopherson
  2021-09-15 10:30   ` Vitaly Kuznetsov
  2021-09-14 23:08 ` [PATCH 3/3] KVM: SVM: Move RESET emulation to svm_vcpu_reset() Sean Christopherson
  2021-09-17 16:15 ` [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Paolo Bonzini
  3 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2021-09-14 23:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Move vCPU RESET emulation, including initialization of select VMCS state,
to vmx_vcpu_reset().  Drop the open coded "vCPU load" sequence, as
->vcpu_reset() is invoked while the vCPU is properly loaded (which is
kind of the point of ->vcpu_reset()...).  Hopefully KVM will someday
expose a dedicated RESET ioctl(), and in the meantime separating "create"
from "RESET" is a nice cleanup.

Deferring VMCS initialization is effectively a nop as it's impossible to
safely access the VMCS between the current call site and its new home, as
both the vCPU and the pCPU are put immediately after init_vmcs(), i.e.
the VMCS isn't guaranteed to be loaded.

Note, task preemption is not a problem as vmx_sched_in() _can't_ touch
the VMCS as ->sched_in() is invoked before the vCPU, and thus VMCS, is
reloaded.  I.e. the preemption path also can't consume VMCS state.

Cc: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 61 +++++++++++++++++++++---------------------
 1 file changed, 31 insertions(+), 30 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dc274b4c9912..629427cf8f4e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4327,10 +4327,6 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 
 #define VMX_XSS_EXIT_BITMAP 0
 
-/*
- * Noting that the initialization of Guest-state Area of VMCS is in
- * vmx_vcpu_reset().
- */
 static void init_vmcs(struct vcpu_vmx *vmx)
 {
 	if (nested)
@@ -4435,10 +4431,39 @@ static void init_vmcs(struct vcpu_vmx *vmx)
 	vmx_setup_uret_msrs(vmx);
 }
 
+static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	init_vmcs(vmx);
+
+	if (nested)
+		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
+
+	vcpu_setup_sgx_lepubkeyhash(vcpu);
+
+	vmx->nested.posted_intr_nv = -1;
+	vmx->nested.current_vmptr = -1ull;
+	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
+
+	vcpu->arch.microcode_version = 0x100000000ULL;
+	vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
+
+	/*
+	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
+	 * or POSTED_INTR_WAKEUP_VECTOR.
+	 */
+	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
+	vmx->pi_desc.sn = 1;
+}
+
 static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 
+	if (!init_event)
+		__vmx_vcpu_reset(vcpu);
+
 	vmx->rmode.vm86_active = 0;
 	vmx->spec_ctrl = 0;
 
@@ -6797,7 +6822,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 {
 	struct vmx_uret_msr *tsx_ctrl;
 	struct vcpu_vmx *vmx;
-	int i, cpu, err;
+	int i, err;
 
 	BUILD_BUG_ON(offsetof(struct vcpu_vmx, vcpu) != 0);
 	vmx = to_vmx(vcpu);
@@ -6856,12 +6881,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 	}
 
 	vmx->loaded_vmcs = &vmx->vmcs01;
-	cpu = get_cpu();
-	vmx_vcpu_load(vcpu, cpu);
-	vcpu->cpu = cpu;
-	init_vmcs(vmx);
-	vmx_vcpu_put(vcpu);
-	put_cpu();
+
 	if (cpu_need_virtualize_apic_accesses(vcpu)) {
 		err = alloc_apic_access_page(vcpu->kvm);
 		if (err)
@@ -6874,25 +6894,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
 			goto free_vmcs;
 	}
 
-	if (nested)
-		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
-
-	vcpu_setup_sgx_lepubkeyhash(vcpu);
-
-	vmx->nested.posted_intr_nv = -1;
-	vmx->nested.current_vmptr = -1ull;
-	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
-
-	vcpu->arch.microcode_version = 0x100000000ULL;
-	vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
-
-	/*
-	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
-	 * or POSTED_INTR_WAKEUP_VECTOR.
-	 */
-	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
-	vmx->pi_desc.sn = 1;
-
 	return 0;
 
 free_vmcs:
-- 
2.33.0.309.g3052b89438-goog



* [PATCH 3/3] KVM: SVM: Move RESET emulation to svm_vcpu_reset()
  2021-09-14 23:08 [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Sean Christopherson
  2021-09-14 23:08 ` [PATCH 1/3] KVM: VMX: Drop explicit zeroing of MSR guest values at vCPU creation Sean Christopherson
  2021-09-14 23:08 ` [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset() Sean Christopherson
@ 2021-09-14 23:08 ` Sean Christopherson
  2021-09-17 16:15 ` [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Paolo Bonzini
  3 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2021-09-14 23:08 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Reiji Watanabe

Move RESET emulation for SVM vCPUs to svm_vcpu_reset(), and drop an extra
init_vmcb() from svm_create_vcpu() in the process.  Hopefully KVM will
someday expose a dedicated RESET ioctl(), and in the meantime separating
"create" from "RESET" is a nice cleanup.

Keep the call to svm_switch_vmcb() so that misuse of svm->vmcb at worst
breaks the guest, e.g. premature accesses don't cause a NULL pointer
dereference.

Cc: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/sev.c |  6 +++---
 arch/x86/kvm/svm/svm.c | 29 +++++++++++++++++------------
 arch/x86/kvm/svm/svm.h |  2 +-
 3 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 75e0b21ad07c..4cf40021dde4 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2599,11 +2599,11 @@ void sev_es_init_vmcb(struct vcpu_svm *svm)
 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_LASTINTTOIP, 1, 1);
 }
 
-void sev_es_create_vcpu(struct vcpu_svm *svm)
+void sev_es_vcpu_reset(struct vcpu_svm *svm)
 {
 	/*
-	 * Set the GHCB MSR value as per the GHCB specification when creating
-	 * a vCPU for an SEV-ES guest.
+	 * Set the GHCB MSR value as per the GHCB specification when emulating
+	 * vCPU RESET for an SEV-ES guest.
 	 */
 	set_ghcb_msr(svm, GHCB_MSR_SEV_INFO(GHCB_VERSION_MAX,
 					    GHCB_VERSION_MIN,
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 1a70e11f0487..bb79ae8e8bcd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1303,6 +1303,19 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 
 }
 
+static void __svm_vcpu_reset(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	svm_vcpu_init_msrpm(vcpu, svm->msrpm);
+
+	svm_init_osvw(vcpu);
+	vcpu->arch.microcode_version = 0x01000065;
+
+	if (sev_es_guest(vcpu->kvm))
+		sev_es_vcpu_reset(svm);
+}
+
 static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1311,6 +1324,9 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	svm->virt_spec_ctrl = 0;
 
 	init_vmcb(vcpu);
+
+	if (!init_event)
+		__svm_vcpu_reset(vcpu);
 }
 
 void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb)
@@ -1370,24 +1386,13 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 
 	svm->vmcb01.ptr = page_address(vmcb01_page);
 	svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
+	svm_switch_vmcb(svm, &svm->vmcb01);
 
 	if (vmsa_page)
 		svm->vmsa = page_address(vmsa_page);
 
 	svm->guest_state_loaded = false;
 
-	svm_switch_vmcb(svm, &svm->vmcb01);
-	init_vmcb(vcpu);
-
-	svm_vcpu_init_msrpm(vcpu, svm->msrpm);
-
-	svm_init_osvw(vcpu);
-	vcpu->arch.microcode_version = 0x01000065;
-
-	if (sev_es_guest(vcpu->kvm))
-		/* Perform SEV-ES specific VMCB creation updates */
-		sev_es_create_vcpu(svm);
-
 	return 0;
 
 error_free_vmsa_page:
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 524d943f3efc..001698919148 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -561,7 +561,7 @@ void sev_free_vcpu(struct kvm_vcpu *vcpu);
 int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_init_vmcb(struct vcpu_svm *svm);
-void sev_es_create_vcpu(struct vcpu_svm *svm);
+void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
-- 
2.33.0.309.g3052b89438-goog



* Re: [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  2021-09-14 23:08 ` [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset() Sean Christopherson
@ 2021-09-15 10:30   ` Vitaly Kuznetsov
  2021-09-15 17:34     ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-15 10:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> Move vCPU RESET emulation, including initialization of select VMCS state,
> to vmx_vcpu_reset().  Drop the open coded "vCPU load" sequence, as
> ->vcpu_reset() is invoked while the vCPU is properly loaded (which is
> kind of the point of ->vcpu_reset()...).  Hopefully KVM will someday
> expose a dedicated RESET ioctl(), and in the meantime separating "create"
> from "RESET" is a nice cleanup.
>
> Deferring VMCS initialization is effectively a nop as it's impossible to
> safely access the VMCS between the current call site and its new home, as
> both the vCPU and the pCPU are put immediately after init_vmcs(), i.e.
> the VMCS isn't guaranteed to be loaded.
>
> Note, task preemption is not a problem as vmx_sched_in() _can't_ touch
> the VMCS as ->sched_in() is invoked before the vCPU, and thus VMCS, is
> reloaded.  I.e. the preemption path also can't consume VMCS state.
>
> Cc: Reiji Watanabe <reijiw@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/vmx/vmx.c | 61 +++++++++++++++++++++---------------------
>  1 file changed, 31 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index dc274b4c9912..629427cf8f4e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4327,10 +4327,6 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
>  
>  #define VMX_XSS_EXIT_BITMAP 0
>  
> -/*
> - * Noting that the initialization of Guest-state Area of VMCS is in
> - * vmx_vcpu_reset().
> - */
>  static void init_vmcs(struct vcpu_vmx *vmx)
>  {
>  	if (nested)
> @@ -4435,10 +4431,39 @@ static void init_vmcs(struct vcpu_vmx *vmx)
>  	vmx_setup_uret_msrs(vmx);
>  }
>  
> +static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
> +{
> +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +
> +	init_vmcs(vmx);
> +
> +	if (nested)
> +		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
> +
> +	vcpu_setup_sgx_lepubkeyhash(vcpu);
> +
> +	vmx->nested.posted_intr_nv = -1;
> +	vmx->nested.current_vmptr = -1ull;
> +	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;

What would happen in the (hypothetical) case when enlightened VMCS is
currently in use? If we zap 'hv_evmcs_vmptr' here, the consequent
nested_release_evmcs() (called from
nested_vmx_handle_enlightened_vmptrld(), for example) will not do 
kvm_vcpu_unmap() while it should.

This, however, got me thinking: should we free all-things-nested with
free_nested()/nested_vmx_free_vcpu() upon vcpu reset? I can't seem to
find us doing that... (I do remember that INIT is blocked in VMX-root
mode and nobody else besides kvm_arch_vcpu_create()/
kvm_apic_accept_events() seems to call kvm_vcpu_reset()) but maybe we
should at least add a WARN_ON() guardian here?

Side topic: we don't seem to reset Hyper-V specific MSRs either,
probably relying on userspace VMM doing the right thing but this is also
not ideal I believe.

> +
> +	vcpu->arch.microcode_version = 0x100000000ULL;
> +	vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
> +
> +	/*
> +	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
> +	 * or POSTED_INTR_WAKEUP_VECTOR.
> +	 */
> +	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
> +	vmx->pi_desc.sn = 1;
> +}
> +
>  static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>  
> +	if (!init_event)
> +		__vmx_vcpu_reset(vcpu);
> +
>  	vmx->rmode.vm86_active = 0;
>  	vmx->spec_ctrl = 0;
>  
> @@ -6797,7 +6822,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  {
>  	struct vmx_uret_msr *tsx_ctrl;
>  	struct vcpu_vmx *vmx;
> -	int i, cpu, err;
> +	int i, err;
>  
>  	BUILD_BUG_ON(offsetof(struct vcpu_vmx, vcpu) != 0);
>  	vmx = to_vmx(vcpu);
> @@ -6856,12 +6881,7 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  	}
>  
>  	vmx->loaded_vmcs = &vmx->vmcs01;
> -	cpu = get_cpu();
> -	vmx_vcpu_load(vcpu, cpu);
> -	vcpu->cpu = cpu;
> -	init_vmcs(vmx);
> -	vmx_vcpu_put(vcpu);
> -	put_cpu();
> +
>  	if (cpu_need_virtualize_apic_accesses(vcpu)) {
>  		err = alloc_apic_access_page(vcpu->kvm);
>  		if (err)
> @@ -6874,25 +6894,6 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
>  			goto free_vmcs;
>  	}
>  
> -	if (nested)
> -		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
> -
> -	vcpu_setup_sgx_lepubkeyhash(vcpu);
> -
> -	vmx->nested.posted_intr_nv = -1;
> -	vmx->nested.current_vmptr = -1ull;
> -	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
> -
> -	vcpu->arch.microcode_version = 0x100000000ULL;
> -	vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
> -
> -	/*
> -	 * Enforce invariant: pi_desc.nv is always either POSTED_INTR_VECTOR
> -	 * or POSTED_INTR_WAKEUP_VECTOR.
> -	 */
> -	vmx->pi_desc.nv = POSTED_INTR_VECTOR;
> -	vmx->pi_desc.sn = 1;
> -
>  	return 0;
>  
>  free_vmcs:

-- 
Vitaly



* Re: [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  2021-09-15 10:30   ` Vitaly Kuznetsov
@ 2021-09-15 17:34     ` Sean Christopherson
  2021-09-16  7:19       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2021-09-15 17:34 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

On Wed, Sep 15, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> > +static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
> > +{
> > +	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > +
> > +	init_vmcs(vmx);
> > +
> > +	if (nested)
> > +		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
> > +
> > +	vcpu_setup_sgx_lepubkeyhash(vcpu);
> > +
> > +	vmx->nested.posted_intr_nv = -1;
> > +	vmx->nested.current_vmptr = -1ull;
> > +	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
> 
> What would happen in the (hypothetical) case when enlightened VMCS is
> currently in use? If we zap 'hv_evmcs_vmptr' here, the consequent
> nested_release_evmcs() (called from
> nested_vmx_handle_enlightened_vmptrld(), for example) will not do 
> kvm_vcpu_unmap() while it should.

The short answer is that there's a lot of stuff that needs to be addressed before
KVM can expose a RESET ioctl().  My goal with these patches is to carve out the
stubs and move the few bits of RESET emulation into the "stubs".  This is the same
answer for the MSR question/comment at the end.

> This, however, got me thinking: should we free all-things-nested with
> free_nested()/nested_vmx_free_vcpu() upon vcpu reset? I can't seem to
> find us doing that... (I do remember that INIT is blocked in VMX-root
> mode and nobody else besides kvm_arch_vcpu_create()/
> kvm_apic_accept_events() seems to call kvm_vcpu_reset()) but maybe we
> should at least add a WARN_ON() guardian here?

I think that makes sense.  Maybe use CR0 as a sentinel since it has a non-zero
RESET value?  E.g. WARN if CR0 is non-zero at RESET.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 86539c1686fa..3ac074376821 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10813,6 +10813,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
        unsigned long new_cr0;
        u32 eax, dummy;

+       /*
+        * <comment about KVM not supporting arbitrary RESET>
+        */
+       WARN_ON_ONCE(!init_event && old_cr0);
+
        kvm_lapic_reset(vcpu, init_event);

        vcpu->arch.hflags = 0;

Huh, typing that out made me realize commit 0aa1837533e5 ("KVM: x86: Properly
reset MMU context at vCPU RESET/INIT") technically introduced a bug.  kvm_vcpu_reset()
does kvm_read_cr0() and thus reads vmcs.GUEST_CR0 because vcpu->arch.regs_avail is
(correctly) not stuffed to ALL_ONES until later in kvm_vcpu_reset().  init_vmcs()
doesn't explicitly zero vmcs.GUEST_CR0 (along with many other guest fields), and
so VMREAD(GUEST_CR0) is technically consuming garbage.  In practice, it's consuming
'0' because no known CPU or VMM inverts values in the VMCS, i.e. zero allocating
the VMCS is functionally equivalent to writing '0' to all fields via VMWRITE.

And staring more at kvm_vcpu_reset(), this code is terrifying for INIT

	memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
	vcpu->arch.regs_avail = ~0;
	vcpu->arch.regs_dirty = ~0;

because it means cr0 and cr4 are marked available+dirty without immediately writing
vcpu->arch.cr0/cr4.  And VMX subtly relies on that, as vmx_set_cr0() grabs CR0.PG
via kvm_read_cr0_bits(), i.e. zeroing vcpu->arch.cr0 would "break" the INIT flow.
Ignoring for the moment that CR0.PG is never guest-owned and thus never stale in
vcpu->arch.cr0, KVM is also technically relying on the earlier kvm_read_cr0() in
kvm_vcpu_reset() to ensure vcpu->arch.cr0 is fresh.

Stuffing regs_avail technically means vmx_set_rflags() -> vmx_get_rflags() is
consuming stale data.  It doesn't matter in practice because the old value is
only used to re-evaluate vmx->emulation_required, which is guaranteed to be
refreshed by vmx_set_cr0() and friends.  PDPTRs and EXIT_INFO are in a similar
boat; KVM shouldn't be reading those fields (CR0.PG=0, not an exit path), but
marking them dirty without actually updating the cached values is wrong.

There's also one concrete bug: KVM doesn't set vcpu->arch.cr3=0 on RESET/INIT.
That bug has gone unnoticed because no real world BIOS/kernel is going to rely on
INIT to set CR3=0.  

I'm strongly leaning towards stuffing regs_avail/dirty in kvm_arch_vcpu_create(),
and relying on explicit kvm_register_mark_dirty() calls for the 3 or so cases where
x86 code writes a vcpu->arch register directly.  That would fix the CR0 read bug
and also prevent subtle bugs from sneaking in.  Adding new EXREGS would be slightly
more costly, but IMO that's a good thing as it would force us to actually think about
how to handle each register.

E.g. implement this over 2-3 patches:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 114847253e0a..743146ac8307 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4385,6 +4385,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
        kvm_set_cr8(vcpu, 0);

        vmx_segment_cache_clear(vmx);
+       kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);

        seg_setup(VCPU_SREG_CS);
        vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 86539c1686fa..ab907a0b9eeb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10656,6 +10656,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
        int r;

        vcpu->arch.last_vmentry_cpu = -1;
+       vcpu->arch.regs_avail = ~0;
+       vcpu->arch.regs_dirty = ~0;

        if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
                vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
@@ -10874,9 +10876,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
                vcpu->arch.xcr0 = XFEATURE_MASK_FP;
        }

+       /* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
        memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
-       vcpu->arch.regs_avail = ~0;
-       vcpu->arch.regs_dirty = ~0;
+       kvm_register_mark_dirty(vcpu, VCPU_REGS_RSP);

        /*
         * Fall back to KVM's default Family/Model/Stepping of 0x600 (P6/Athlon)
@@ -10897,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
        kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
        kvm_rip_write(vcpu, 0xfff0);

+       vcpu->arch.cr3 = 0;
+       kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
+
        /*
         * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
         * of Intel's SDM list CD/NW as being set on INIT, but they contradict

> Side topic: we don't seem to reset Hyper-V specific MSRs either,
> probably relying on userspace VMM doing the right thing but this is also
> not ideal I believe.



* Re: [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  2021-09-15 17:34     ` Sean Christopherson
@ 2021-09-16  7:19       ` Vitaly Kuznetsov
  2021-09-16 19:01         ` Sean Christopherson
  0 siblings, 1 reply; 11+ messages in thread
From: Vitaly Kuznetsov @ 2021-09-16  7:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

Sean Christopherson <seanjc@google.com> writes:

> On Wed, Sep 15, 2021, Vitaly Kuznetsov wrote:
>> Sean Christopherson <seanjc@google.com> writes:
>> > +static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
>> > +{
>> > +	struct vcpu_vmx *vmx = to_vmx(vcpu);
>> > +
>> > +	init_vmcs(vmx);
>> > +
>> > +	if (nested)
>> > +		memcpy(&vmx->nested.msrs, &vmcs_config.nested, sizeof(vmx->nested.msrs));
>> > +
>> > +	vcpu_setup_sgx_lepubkeyhash(vcpu);
>> > +
>> > +	vmx->nested.posted_intr_nv = -1;
>> > +	vmx->nested.current_vmptr = -1ull;
>> > +	vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
>> 
>> What would happen in the (hypothetical) case when enlightened VMCS is
>> currently in use? If we zap 'hv_evmcs_vmptr' here, the consequent
>> nested_release_evmcs() (called from
>> nested_vmx_handle_enlightened_vmptrld(), for example) will not do 
>> kvm_vcpu_unmap() while it should.
>
> The short answer is that there's a lot of stuff that needs to be addressed before
> KVM can expose a RESET ioctl().  My goal with these patches is to carve out the
> stubs and move the few bits of RESET emulation into the "stubs".  This is the same
> answer for the MSR question/comment at the end.
>
>> This, however, got me thinking: should we free all-things-nested with
>> free_nested()/nested_vmx_free_vcpu() upon vcpu reset? I can't seem to
>> find us doing that... (I do remember that INIT is blocked in VMX-root
>> mode and nobody else besides kvm_arch_vcpu_create()/
>> kvm_apic_accept_events() seems to call kvm_vcpu_reset()) but maybe we
>> should at least add a WARN_ON() guardian here?
>
> I think that makes sense.  Maybe use CR0 as a sentinel since it has a non-zero
> RESET value?  E.g. WARN if CR0 is non-zero at RESET.
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 86539c1686fa..3ac074376821 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10813,6 +10813,11 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>         unsigned long new_cr0;
>         u32 eax, dummy;
>
> +       /*
> +        * <comment about KVM not supporting arbitrary RESET>
> +        */
> +       WARN_ON_ONCE(!init_event && old_cr0);
> +
>         kvm_lapic_reset(vcpu, init_event);
>
>         vcpu->arch.hflags = 0;
>

Should work (assuming nothing in VMX would want to call vmx_vcpu_reset()/
__vmx_vcpu_reset() directly).

> Huh, typing that out made me realize commit 0aa1837533e5 ("KVM: x86: Properly
> reset MMU context at vCPU RESET/INIT") technically introduced a bug.  kvm_vcpu_reset()
> does kvm_read_cr0() and thus reads vmcs.GUEST_CR0 because vcpu->arch.regs_avail is
> (correctly) not stuffed to ALL_ONES until later in kvm_vcpu_reset().  init_vmcs()
> doesn't explicitly zero vmcs.GUEST_CR0 (along with many other guest fields), and
> so VMREAD(GUEST_CR0) is technically consuming garbage.  In practice, it's consuming
> '0' because no known CPU or VMM inverts values in the VMCS, i.e. zero allocating
> the VMCS is functionally equivalent to writing '0' to all fields via VMWRITE.
>
> And staring more at kvm_vcpu_reset(), this code is terrifying for INIT
>
> 	memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
> 	vcpu->arch.regs_avail = ~0;
> 	vcpu->arch.regs_dirty = ~0;
>
> because it means cr0 and cr4 are marked available+dirty without immediately writing
> vcpu->arch.cr0/cr4.  And VMX subtly relies on that, as vmx_set_cr0() grabs CR0.PG
> via kvm_read_cr0_bits(), i.e. zeroing vcpu->arch.cr0 would "break" the INIT flow.
> Ignoring for the moment that CR0.PG is never guest-owned and thus never stale in
> vcpu->arch.cr0, KVM is also technically relying on the earlier kvm_read_cr0() in
> kvm_vcpu_reset() to ensure vcpu->arch.cr0 is fresh.
>
> Stuffing regs_avail technically means vmx_set_rflags() -> vmx_get_rflags() is
> consuming stale data.  It doesn't matter in practice because the old value is
> only used to re-evaluate vmx->emulation_required, which is guaranteed to be
> refreshed by vmx_set_cr0() and friends.  PDPTRs and EXIT_INFO are in a similar
> boat; KVM shouldn't be reading those fields (CR0.PG=0, not an exit path), but
> marking them dirty without actually updating the cached values is wrong.
>
> There's also one concrete bug: KVM doesn't set vcpu->arch.cr3=0 on RESET/INIT.
> That bug has gone unnoticed because no real world BIOS/kernel is going to rely on
> INIT to set CR3=0.  
>
> I'm strongly leaning towards stuffing regs_avail/dirty in kvm_arch_vcpu_create(),
> and relying on explicit kvm_register_mark_dirty() calls for the 3 or so cases where
> x86 code writes a vcpu->arch register directly.  That would fix the CR0 read bug
> and also prevent subtle bugs from sneaking in.  Adding new EXREGS would be slightly
> more costly, but IMO that's a good thing as it would force us to actually think about
> how to handle each register.
>
> E.g. implement this over 2-3 patches:
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 114847253e0a..743146ac8307 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4385,6 +4385,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>         kvm_set_cr8(vcpu, 0);
>
>         vmx_segment_cache_clear(vmx);
> +       kvm_register_mark_available(vcpu, VCPU_EXREG_SEGMENTS);
>
>         seg_setup(VCPU_SREG_CS);
>         vmcs_write16(GUEST_CS_SELECTOR, 0xf000);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 86539c1686fa..ab907a0b9eeb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10656,6 +10656,8 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
>         int r;
>
>         vcpu->arch.last_vmentry_cpu = -1;
> +       vcpu->arch.regs_avail = ~0;
> +       vcpu->arch.regs_dirty = ~0;
>
>         if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
>                 vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
> @@ -10874,9 +10876,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>                 vcpu->arch.xcr0 = XFEATURE_MASK_FP;
>         }
>
> +       /* All GPRs except RDX (handled below) are zeroed on RESET/INIT. */
>         memset(vcpu->arch.regs, 0, sizeof(vcpu->arch.regs));
> -       vcpu->arch.regs_avail = ~0;
> -       vcpu->arch.regs_dirty = ~0;
> +       kvm_register_mark_dirty(vcpu, VCPU_REGS_RSP);
>
>         /*
>          * Fall back to KVM's default Family/Model/Stepping of 0x600 (P6/Athlon)
> @@ -10897,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
>         kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
>         kvm_rip_write(vcpu, 0xfff0);
>
> +       vcpu->arch.cr3 = 0;
> +       kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
> +
>         /*
>          * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
>          * of Intel's SDM list CD/NW as being set on INIT, but they contradict
>

A selftest for vCPU create/reset would be really helpful. I can even
volunteer to [eventually] write one :-)

-- 
Vitaly



* Re: [PATCH 2/3] KVM: VMX: Move RESET emulation to vmx_vcpu_reset()
  2021-09-16  7:19       ` Vitaly Kuznetsov
@ 2021-09-16 19:01         ` Sean Christopherson
  0 siblings, 0 replies; 11+ messages in thread
From: Sean Christopherson @ 2021-09-16 19:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Paolo Bonzini, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	linux-kernel, Reiji Watanabe

On Thu, Sep 16, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> > @@ -10897,6 +10899,9 @@ void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
> >         kvm_set_rflags(vcpu, X86_EFLAGS_FIXED);
> >         kvm_rip_write(vcpu, 0xfff0);
> >
> > +       vcpu->arch.cr3 = 0;
> > +       kvm_register_mark_dirty(vcpu, VCPU_EXREG_CR3);
> > +
> >         /*
> >          * CR0.CD/NW are set on RESET, preserved on INIT.  Note, some versions
> >          * of Intel's SDM list CD/NW as being set on INIT, but they contradict
> >
> 
> A selftest for vCPU create/reset would be really helpful. I can even
> volunteer to [eventually] write one :-)

Hmm, I wonder if it would be possible to share code/infrastructure with Erdem's
in-progress TDX selftest framework[*].  TDX forces vCPUs to start at the legacy
reset vector with paging disabled, so it needs a lot of the same glue code as a
from-RESET test would need.  TDX forces 32-bit PM instead of RM, but it should
be easy enough to allow an optional opening sequence to get into 32-bit PM.

We could also test INIT without much trouble since INIT to the BSP will send it
back to the reset vector, e.g. set a flag somewhere to avoid an infinite loop and
INIT self.

Let me work with Erdem to see if we can concoct something that will work for
both TDX and tests that want to take control at RESET.

[*] https://lkml.kernel.org/r/20210726183816.1343022-3-erdemaktas@google.com


* Re: [PATCH 0/3] KVM: x86: Clean up RESET "emulation"
  2021-09-14 23:08 [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Sean Christopherson
                   ` (2 preceding siblings ...)
  2021-09-14 23:08 ` [PATCH 3/3] KVM: SVM: Move RESET emulation to svm_vcpu_reset() Sean Christopherson
@ 2021-09-17 16:15 ` Paolo Bonzini
  2021-09-17 17:34   ` Sean Christopherson
  3 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2021-09-17 16:15 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	Kernel Mailing List, Linux, Reiji Watanabe

On Wed, Sep 15, 2021 at 1:08 AM Sean Christopherson <seanjc@google.com> wrote:
> Add dedicated helpers to emulate RESET instead of having the relevant code
> scattered through vcpu_create() and vcpu_reset().  Paolo, I think this is
> what you meant by "have init_vmcb/svm_vcpu_reset look more like the VMX
> code"[*].
>
> [*] https://lore.kernel.org/all/c3563870-62c3-897d-3148-e48bb755310c@redhat.com/

That assumes that I remember what I meant :) but I do like it, so yes,
that was it. Especially the fact that init_vmcb now has a single
caller. I would further consider moving save area initialization to
*_vcpu_reset, and keeping the control fields in init_vmcb/vmcs. That
would make it easier to relate the two functions to separate parts of
the manuals.

I should go back to KVM next week. Context switching with KVM Forum
and Kangrejos this week made everything so much slower than I'd have liked.

Paolo



* Re: [PATCH 0/3] KVM: x86: Clean up RESET "emulation"
  2021-09-17 16:15 ` [PATCH 0/3] KVM: x86: Clean up RESET "emulation" Paolo Bonzini
@ 2021-09-17 17:34   ` Sean Christopherson
  2021-09-17 17:37     ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Sean Christopherson @ 2021-09-17 17:34 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	Kernel Mailing List, Linux, Reiji Watanabe

On Fri, Sep 17, 2021, Paolo Bonzini wrote:
> On Wed, Sep 15, 2021 at 1:08 AM Sean Christopherson <seanjc@google.com> wrote:
> > Add dedicated helpers to emulate RESET instead of having the relevant code
> > scattered through vcpu_create() and vcpu_reset().  Paolo, I think this is
> > what you meant by "have init_vmcb/svm_vcpu_reset look more like the VMX
> > code"[*].
> >
> > [*] https://lore.kernel.org/all/c3563870-62c3-897d-3148-e48bb755310c@redhat.com/
> 
> That assumes that I remember what I meant :)

Ha!  That's why I write changelogs with --verbose :-)

> but I do like it so yes, that was it. Especially the fact that init_vmcb now
> has a single caller. I would further consider moving save area initialization
> to *_vcpu_reset, and keeping the control fields in init_vmcb/vmcs. That would
> make it easier to relate the two functions to separate parts of the manuals.

I like the idea, but I think I'd prefer to tackle that at the same time as generic
support for handling MSRs at RESET/INIT.  E.g. instead of manually writing
vmcs.GUEST_SYSENTER_* at RESET, provide infrastructure to automagically run through
all emulated/virtualized MSRs at RESET and/or INIT as appropriate to initialize the
guest value.
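
For illustration only, here is a very rough sketch of the table-driven shape
this could take.  The struct and helper names (kvm_msr_reset_entry,
kvm_reset_msr_defaults()) are made up for the example; only kvm_set_msr() and
the MSR_IA32_SYSENTER_* indices are existing code, and error handling is
omitted:

/* Illustrative sketch: hypothetical table of RESET/INIT MSR defaults. */
struct kvm_msr_reset_entry {
	u32 index;
	u64 reset_value;
	bool preserve_on_init;	/* INIT leaves the MSR untouched */
};

static const struct kvm_msr_reset_entry msr_reset_defaults[] = {
	{ MSR_IA32_SYSENTER_CS,  0, false },
	{ MSR_IA32_SYSENTER_ESP, 0, false },
	{ MSR_IA32_SYSENTER_EIP, 0, false },
	/* ... one entry per emulated/virtualized MSR ... */
};

static void kvm_reset_msr_defaults(struct kvm_vcpu *vcpu, bool init_event)
{
	int i;

	for (i = 0; i < ARRAY_SIZE(msr_reset_defaults); i++) {
		const struct kvm_msr_reset_entry *e = &msr_reset_defaults[i];

		/* Skip MSRs whose values are preserved across INIT. */
		if (init_event && e->preserve_on_init)
			continue;

		/* Reuse the normal WRMSR emulation path to set the value. */
		kvm_set_msr(vcpu, e->index, e->reset_value);
	}
}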


* Re: [PATCH 0/3] KVM: x86: Clean up RESET "emulation"
  2021-09-17 17:34   ` Sean Christopherson
@ 2021-09-17 17:37     ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2021-09-17 17:37 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel, kvm,
	Kernel Mailing List, Linux, Reiji Watanabe

On 17/09/21 19:34, Sean Christopherson wrote:
>> but I do like it so yes, that was it. Especially the fact that init_vmcb now
>> has a single caller. I would further consider moving save area initialization
>> to *_vcpu_reset, and keeping the control fields in init_vmcb/vmcs. That would
>> make it easier to relate the two functions to separate parts of the manuals.
>
> I like the idea, but I think I'd prefer to tackle that at the same time as generic
> support for handling MSRs at RESET/INIT.

No problem, just roughly sketching some ideas for the future.  But 
you're absolutely right that some MSRs have effects on the control areas 
rather than the save area (and some have effects on neither).

Thanks,

Paolo

> E.g. instead of manually writing
> vmcs.GUEST_SYSENTER_* at RESET, provide infrastructure to automagically run through
> all emulated/virtualized MSRs at RESET and/or INIT as appropriate to initialize the
> guest value.
> 



