From: Paolo Bonzini <pbonzini@redhat.com>
To: Lai Jiangshan <laijs@linux.alibaba.com>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH 13/15] KVM: SVM: Add and use svm_register_cache_reset()
Date: Thu, 18 Nov 2021 18:54:56 +0100
Message-ID: <f2a99afc-6ce6-459d-05d5-a2e396af96d4@redhat.com>
In-Reply-To: <55654594-9967-37d2-335b-5035f99212fe@linux.alibaba.com>

On 11/18/21 17:28, Lai Jiangshan wrote:
> Using VMX_REGS_DIRTY_SET and SVM_REGS_DIRTY_SET and making the code
> similar is my intent for patches 12 and 13.  If it causes confusion, I
> will give it a second thought.  SVM_REGS_DIRTY_SET is indeed special
> in SVM, where VCPU_EXREG_CR3 is in it by definition, but it is not
> added to SVM_REGS_DIRTY_SET in the patch, purely as an optimization
> that lets the compiler optimize that line of code out.

I think this is where we disagree.  In my opinion it is enough to
document that CR3 _can_ be out of date, but it doesn't have to be marked
dirty because its dirty bit is effectively KVM_REQ_LOAD_MMU_PGD.
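
To spell that out, here is a rough sketch of how a CR3 change propagates
without ever setting a dirty bit for VCPU_EXREG_CR3 (the example_*
helpers are hypothetical and this glosses over the NPT vs. shadow-paging
details, so don't read it as the actual call chain):

	static void example_set_guest_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
	{
		vcpu->arch.cr3 = cr3;
		kvm_register_mark_available(vcpu, VCPU_EXREG_CR3);

		/* The "dirty bit" for CR3 is effectively this request. */
		kvm_make_request(KVM_REQ_LOAD_MMU_PGD, vcpu);
	}

	/* ...which is serviced before the next VM entry, roughly like: */
	static void example_svm_load_mmu_pgd(struct kvm_vcpu *vcpu)
	{
		struct vcpu_svm *svm = to_svm(vcpu);

		svm->vmcb->save.cr3 = kvm_read_cr3(vcpu);
		vmcb_mark_dirty(svm->vmcb, VMCB_CR);
	}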

For VMX, it is important to clear VCPU_EXREG_CR3 because the combination
"avail=0, dirty=1" is nonsensical:

	avail  dirty
	  0      0     in VMCS
	  0      1     *INVALID*
	  1      0     in vcpu->arch
	  1      1     in vcpu->arch, needs store

But on SVM, VCPU_EXREG_CR3 is always available.
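
(For reference, the avail bit is what drives the lazy load, along the
lines of kvm_register_read_raw() in kvm_cache_regs.h; this is a
simplified sketch, not the exact source:)

	static unsigned long example_register_read(struct kvm_vcpu *vcpu, int reg)
	{
		/* Fill the cache from the VMCS/VMCB on first use. */
		if (!kvm_register_is_available(vcpu, reg))
			static_call(kvm_x86_cache_reg)(vcpu, reg);

		return vcpu->arch.regs[reg];
	}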

Thinking more about it, it makes more sense for VMX to reset _all_
dirty bits to 0, just as it did before your change, but to do so even
earlier, in vmx_vcpu_run.

I appreciate that VMX_REGS_LAZY_UPDATE_SET is useful for documentation,
but it's also important that the values in avail/dirty make sense as
a pair.
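
If we wanted to make that invariant explicit, it would boil down to a
sanity check like this (purely illustrative, not something I'm asking
you to add):

	/* A register may only be marked dirty if it is also available. */
	WARN_ON_ONCE(vcpu->arch.regs_dirty & ~vcpu->arch.regs_avail);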

So here is what I would do:

diff --git a/arch/x86/kvm/kvm_cache_regs.h b/arch/x86/kvm/kvm_cache_regs.h
index 6e6d0d01f18d..ac3d3bd662f4 100644
--- a/arch/x86/kvm/kvm_cache_regs.h
+++ b/arch/x86/kvm/kvm_cache_regs.h
@@ -43,6 +43,13 @@ BUILD_KVM_GPR_ACCESSORS(r14, R14)
  BUILD_KVM_GPR_ACCESSORS(r15, R15)
  #endif
  
+/*
+ * avail  dirty
+ * 0	  0	  register in VMCS/VMCB
+ * 0	  1	  *INVALID*
+ * 1	  0	  register in vcpu->arch
+ * 1	  1	  register in vcpu->arch, needs to be stored back
+ */
  static inline bool kvm_register_is_available(struct kvm_vcpu *vcpu,
  					     enum kvm_reg reg)
  {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6fce61fc98e3..72ae67e214b5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6635,6 +6635,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
  		vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);
  	if (kvm_register_is_dirty(vcpu, VCPU_REGS_RIP))
  		vmcs_writel(GUEST_RIP, vcpu->arch.regs[VCPU_REGS_RIP]);
+	vcpu->arch.regs_dirty = 0;
  
  	cr3 = __get_current_cr3_fast();
  	if (unlikely(cr3 != vmx->loaded_vmcs->host_state.cr3)) {
@@ -6729,7 +6730,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
  	loadsegment(es, __USER_DS);
  #endif
  
-	vmx_register_cache_reset(vcpu);
+	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
  
  	pt_guest_exit(vmx);
  
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 4df2ac24ffc1..f978699480e3 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -473,19 +473,21 @@ BUILD_CONTROLS_SHADOW(pin, PIN_BASED_VM_EXEC_CONTROL)
  BUILD_CONTROLS_SHADOW(exec, CPU_BASED_VM_EXEC_CONTROL)
  BUILD_CONTROLS_SHADOW(secondary_exec, SECONDARY_VM_EXEC_CONTROL)
  
-static inline void vmx_register_cache_reset(struct kvm_vcpu *vcpu)
-{
-	vcpu->arch.regs_avail = ~((1 << VCPU_REGS_RIP) | (1 << VCPU_REGS_RSP)
-				  | (1 << VCPU_EXREG_RFLAGS)
-				  | (1 << VCPU_EXREG_PDPTR)
-				  | (1 << VCPU_EXREG_SEGMENTS)
-				  | (1 << VCPU_EXREG_CR0)
-				  | (1 << VCPU_EXREG_CR3)
-				  | (1 << VCPU_EXREG_CR4)
-				  | (1 << VCPU_EXREG_EXIT_INFO_1)
-				  | (1 << VCPU_EXREG_EXIT_INFO_2));
-	vcpu->arch.regs_dirty = 0;
-}
+/*
+ * VMX_REGS_LAZY_LOAD_SET - The set of registers that will be updated in the
+ * cache on demand.  Other registers not listed here are synced to
+ * the cache immediately after VM-Exit.
+ */
+#define VMX_REGS_LAZY_LOAD_SET	((1 << VCPU_REGS_RIP) |         \
+				(1 << VCPU_REGS_RSP) |          \
+				(1 << VCPU_EXREG_RFLAGS) |      \
+				(1 << VCPU_EXREG_PDPTR) |       \
+				(1 << VCPU_EXREG_SEGMENTS) |    \
+				(1 << VCPU_EXREG_CR0) |         \
+				(1 << VCPU_EXREG_CR3) |         \
+				(1 << VCPU_EXREG_CR4) |         \
+				(1 << VCPU_EXREG_EXIT_INFO_1) | \
+				(1 << VCPU_EXREG_EXIT_INFO_2))
  
  static inline struct kvm_vmx *to_kvm_vmx(struct kvm *kvm)
  {

and likewise for SVM:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index eb2a2609cae8..4b22aa7d55d0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3944,6 +3944,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
  		vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
  		vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
  	}
+	vcpu->arch.regs_dirty = 0;
  
  	if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
  		kvm_before_interrupt(vcpu);
@@ -3978,7 +3979,7 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
  		vcpu->arch.apf.host_apf_flags =
  			kvm_read_and_reset_apf_flags();
  
-	kvm_register_clear_available(vcpu, VCPU_EXREG_PDPTR);
+	vcpu->arch.regs_avail &= ~SVM_REGS_LAZY_LOAD_SET;
  
  	/*
  	 * We need to handle MC intercepts here before the vcpu has a chance to
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 32769d227860..b3c3c3098216 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -321,6 +321,16 @@ static inline bool vmcb_is_dirty(struct vmcb *vmcb, int bit)
          return !test_bit(bit, (unsigned long *)&vmcb->control.clean);
  }
  
+/*
+ * Only the PDPTRs are loaded on demand into the shadow MMU.  All other
+ * fields are synchronized in handle_exit, because accessing the VMCB is cheap.
+ *
+ * CR3 might be out of date in the VMCB but it is not marked dirty; instead,
+ * KVM_REQ_LOAD_MMU_PGD is always requested when the cached vcpu->arch.cr3
+ * is changed.  svm_load_mmu_pgd() then syncs the new CR3 value into the VMCB.
+ */
+#define SVM_REGS_LAZY_LOAD_SET	(1 << VCPU_EXREG_PDPTR)
+
  static inline struct vcpu_svm *to_svm(struct kvm_vcpu *vcpu)
  {
  	return container_of(vcpu, struct vcpu_svm, vcpu);


