* [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes
@ 2022-07-28 22:17 Sean Christopherson
  2022-07-28 22:17 ` [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init Sean Christopherson
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:17 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

Fix two bugs I introduced when adding the enable_mmio_caching module param.

Bug #1 is that KVM unintentionally makes disabling MMIO caching due to a
config incompatibility "sticky", e.g. disabling caching because there are
no reserved PA bits prevents KVM from re-enabling it when "switching" to
an EPT config (which doesn't rely on PA bits) or when SVM adjusts the
MMIO masks to account for C-bit shenanigans (even with MAXPHYADDR=52 and
C-bit=51, there can be reserved PA bits because the "real" MAXPHYADDR is
reduced).

Bug #2 is that KVM doesn't explicitly check that MMIO caching is enabled
when doing SEV-ES setup.  Prior to the module param, MMIO caching was
guaranteed when SEV-ES could be enabled as SEV-ES-capable CPUs effectively
guarantee there will be at least one reserved PA bit (see above).  With
the module param, userspace can explicitly disable MMIO caching, thus
silently breaking SEV-ES.

I believe I tested all combinations of things getting disabled and enabled
by hacking kvm_mmu_reset_all_pte_masks() to disable MMIO caching on a much
lower MAXPHYADDR, e.g. 43 instead of 52.  That said, definitely wait for a
thumbs up from the AMD folks before queueing.

Sean Christopherson (4):
  KVM: x86: Tag kvm_mmu_x86_module_init() with __init
  KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup
  KVM: SVM: Disable SEV-ES support if MMIO caching is disabled

 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu.h              |  2 ++
 arch/x86/kvm/mmu/mmu.c          |  6 +++++-
 arch/x86/kvm/mmu/spte.c         | 20 ++++++++++++++++++++
 arch/x86/kvm/mmu/spte.h         |  3 +--
 arch/x86/kvm/svm/sev.c          | 10 ++++++++++
 arch/x86/kvm/svm/svm.c          |  9 ++++++---
 7 files changed, 45 insertions(+), 7 deletions(-)


base-commit: 1a4d88a361af4f2e91861d632c6a1fe87a9665c2
-- 
2.37.1.455.g008518b4e5-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init
  2022-07-28 22:17 [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Sean Christopherson
@ 2022-07-28 22:17 ` Sean Christopherson
  2022-07-29  2:14   ` Kai Huang
  2022-07-28 22:17 ` [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change Sean Christopherson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:17 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

Mark kvm_mmu_x86_module_init() with __init; the entire reason it exists
is to initialize variables when kvm.ko is loaded, i.e. it must never be
called after module initialization.

Fixes: 1d0e84806047 ("KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/mmu/mmu.c          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e8281d64a431..5ffa578cafe1 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1704,7 +1704,7 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
 #define kvm_arch_pmi_in_guest(vcpu) \
 	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
 
-void kvm_mmu_x86_module_init(void);
+void __init kvm_mmu_x86_module_init(void);
 int kvm_mmu_vendor_module_init(void);
 void kvm_mmu_vendor_module_exit(void);
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 8e477333a263..2975fcb14c86 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6700,7 +6700,7 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
  * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as
  * its default value of -1 is technically undefined behavior for a boolean.
  */
-void kvm_mmu_x86_module_init(void)
+void __init kvm_mmu_x86_module_init(void)
 {
 	if (nx_huge_pages == -1)
 		__set_nx_huge_pages(get_nx_auto_mode());
-- 
2.37.1.455.g008518b4e5-goog



* [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-07-28 22:17 [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Sean Christopherson
  2022-07-28 22:17 ` [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init Sean Christopherson
@ 2022-07-28 22:17 ` Sean Christopherson
  2022-07-29  2:39   ` Kai Huang
  2022-07-28 22:17 ` [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup Sean Christopherson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:17 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

Fully re-evaluate whether or not MMIO caching can be enabled when SPTE
masks change; simply clearing enable_mmio_caching when a configuration
isn't compatible with caching fails to handle the scenario where the
masks are updated, e.g. by VMX for EPT or by SVM to account for the C-bit
location, and toggle compatibility from false=>true.

Snapshot the original module param so that re-evaluating MMIO caching
preserves userspace's desire to allow caching.  Use a snapshot approach
so that enable_mmio_caching still reflects KVM's actual behavior.

Fixes: 8b9e74bfbf8c ("KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled")
Reported-by: Michael Roth <michael.roth@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu/mmu.c  |  4 ++++
 arch/x86/kvm/mmu/spte.c | 19 +++++++++++++++++++
 arch/x86/kvm/mmu/spte.h |  1 +
 3 files changed, 24 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 2975fcb14c86..660f58928252 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6699,11 +6699,15 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
 /*
  * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as
  * its default value of -1 is technically undefined behavior for a boolean.
+ * Forward the module init call to SPTE code so that it too can handle module
+ * params that need to be resolved/snapshot.
  */
 void __init kvm_mmu_x86_module_init(void)
 {
 	if (nx_huge_pages == -1)
 		__set_nx_huge_pages(get_nx_auto_mode());
+
+	kvm_mmu_spte_module_init();
 }
 
 /*
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 7314d27d57a4..66f76f5a15bd 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -20,6 +20,7 @@
 #include <asm/vmx.h>
 
 bool __read_mostly enable_mmio_caching = true;
+static bool __ro_after_init allow_mmio_caching;
 module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
 
 u64 __read_mostly shadow_host_writable_mask;
@@ -43,6 +44,18 @@ u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
 
 u8 __read_mostly shadow_phys_bits;
 
+void __init kvm_mmu_spte_module_init(void)
+{
+	/*
+	 * Snapshot userspace's desire to allow MMIO caching.  Whether or not
+	 * KVM can actually enable MMIO caching depends on vendor-specific
+	 * hardware capabilities and other module params that can't be resolved
+	 * until the vendor module is loaded, i.e. enable_mmio_caching can and
+	 * will change when the vendor module is (re)loaded.
+	 */
+	allow_mmio_caching = enable_mmio_caching;
+}
+
 static u64 generation_mmio_spte_mask(u64 gen)
 {
 	u64 mask;
@@ -340,6 +353,12 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
 	BUG_ON((u64)(unsigned)access_mask != access_mask);
 	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
 
+	/*
+	 * Reset to the original module param value to honor userspace's desire
+	 * to (dis)allow MMIO caching.  Update the param itself so that
+	 * userspace can see whether or not KVM is actually using MMIO caching.
+	 */
+	enable_mmio_caching = allow_mmio_caching;
 	if (!enable_mmio_caching)
 		mmio_value = 0;
 
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index cabe3fbb4f39..26b144ffd146 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -450,6 +450,7 @@ static inline u64 restore_acc_track_spte(u64 spte)
 
 u64 kvm_mmu_changed_pte_notifier_make_spte(u64 old_spte, kvm_pfn_t new_pfn);
 
+void __init kvm_mmu_spte_module_init(void);
 void kvm_mmu_reset_all_pte_masks(void);
 
 #endif
-- 
2.37.1.455.g008518b4e5-goog



* [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup
  2022-07-28 22:17 [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Sean Christopherson
  2022-07-28 22:17 ` [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init Sean Christopherson
  2022-07-28 22:17 ` [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change Sean Christopherson
@ 2022-07-28 22:17 ` Sean Christopherson
  2022-07-29  2:06   ` Kai Huang
  2022-07-28 22:17 ` [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled Sean Christopherson
  2022-07-29  1:09 ` [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Michael Roth
  4 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:17 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

Adjust KVM's MMIO masks to account for the C-bit location prior to doing
SEV(-ES) setup.  A future patch will consume enable_mmio_caching during
SEV setup, as SEV-ES _requires_ MMIO caching, i.e. KVM needs to disallow
SEV-ES if MMIO caching is disabled.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aef63aae922d..62e89db83bc1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -5034,13 +5034,16 @@ static __init int svm_hardware_setup(void)
 	/* Setup shadow_me_value and shadow_me_mask */
 	kvm_mmu_set_me_spte_mask(sme_me_mask, sme_me_mask);
 
-	/* Note, SEV setup consumes npt_enabled. */
+	svm_adjust_mmio_mask();
+
+	/*
+	 * Note, SEV setup consumes npt_enabled and enable_mmio_caching (which
+	 * may be modified by svm_adjust_mmio_mask()).
+	 */
 	sev_hardware_setup();
 
 	svm_hv_hardware_setup();
 
-	svm_adjust_mmio_mask();
-
 	for_each_possible_cpu(cpu) {
 		r = svm_cpu_init(cpu);
 		if (r)
-- 
2.37.1.455.g008518b4e5-goog



* [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled
  2022-07-28 22:17 [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Sean Christopherson
                   ` (2 preceding siblings ...)
  2022-07-28 22:17 ` [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup Sean Christopherson
@ 2022-07-28 22:17 ` Sean Christopherson
  2022-07-29  2:12   ` Kai Huang
  2022-07-29  1:09 ` [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Michael Roth
  4 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-07-28 22:17 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

Disable SEV-ES if MMIO caching is disabled as SEV-ES relies on MMIO SPTEs
generating #NPF(RSVD), which are reflected by the CPU into the guest as
a #VC.  With SEV-ES, the untrusted host, a.k.a. KVM, doesn't have access
to the guest instruction stream or register state and so can't directly
emulate in response to a #NPF on an emulated MMIO GPA.  Disabling MMIO
caching means guest accesses to emulated MMIO ranges cause #NPF(!PRESENT),
and those flavors of #NPF cause automatic VM-Exits, not #VC.

Fixes: b09763da4dd8 ("KVM: x86/mmu: Add module param to disable MMIO caching (for testing)")
Reported-by: Michael Roth <michael.roth@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/mmu.h      |  2 ++
 arch/x86/kvm/mmu/spte.c |  1 +
 arch/x86/kvm/mmu/spte.h |  2 --
 arch/x86/kvm/svm/sev.c  | 10 ++++++++++
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index a99acec925eb..6bdaacb6faa0 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -6,6 +6,8 @@
 #include "kvm_cache_regs.h"
 #include "cpuid.h"
 
+extern bool __read_mostly enable_mmio_caching;
+
 #define PT_WRITABLE_SHIFT 1
 #define PT_USER_SHIFT 2
 
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 66f76f5a15bd..03ca740bf721 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -22,6 +22,7 @@
 bool __read_mostly enable_mmio_caching = true;
 static bool __ro_after_init allow_mmio_caching;
 module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
+EXPORT_SYMBOL_GPL(enable_mmio_caching);
 
 u64 __read_mostly shadow_host_writable_mask;
 u64 __read_mostly shadow_mmu_writable_mask;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 26b144ffd146..9a9414b8d1d6 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -5,8 +5,6 @@
 
 #include "mmu_internal.h"
 
-extern bool __read_mostly enable_mmio_caching;
-
 /*
  * A MMU present SPTE is backed by actual memory and may or may not be present
  * in hardware.  E.g. MMIO SPTEs are not considered present.  Use bit 11, as it
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 309bcdb2f929..05bf6301acac 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -22,6 +22,7 @@
 #include <asm/trapnr.h>
 #include <asm/fpu/xcr.h>
 
+#include "mmu.h"
 #include "x86.h"
 #include "svm.h"
 #include "svm_ops.h"
@@ -2205,6 +2206,15 @@ void __init sev_hardware_setup(void)
 	if (!sev_es_enabled)
 		goto out;
 
+	/*
+	 * SEV-ES requires MMIO caching as KVM doesn't have access to the guest
+	 * instruction stream, i.e. can't emulate in response to a #NPF and
+	 * instead relies on #NPF(RSVD) being reflected into the guest as #VC
+	 * (the guest can then do a #VMGEXIT to request MMIO emulation).
+	 */
+	if (!enable_mmio_caching)
+		goto out;
+
 	/* Does the CPU support SEV-ES? */
 	if (!boot_cpu_has(X86_FEATURE_SEV_ES))
 		goto out;
-- 
2.37.1.455.g008518b4e5-goog



* Re: [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes
  2022-07-28 22:17 [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Sean Christopherson
                   ` (3 preceding siblings ...)
  2022-07-28 22:17 ` [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled Sean Christopherson
@ 2022-07-29  1:09 ` Michael Roth
  4 siblings, 0 replies; 23+ messages in thread
From: Michael Roth @ 2022-07-29  1:09 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Paolo Bonzini, kvm, linux-kernel, Tom Lendacky

On Thu, Jul 28, 2022 at 10:17:55PM +0000, Sean Christopherson wrote:
> Fix two bugs I introduced when adding the enable_mmio_caching module param.
> 
> Bug #1 is that KVM unintentionally makes disabling MMIO caching due to a
> config incompatibility "sticky", e.g. disabling caching because there are
> no reserved PA bits prevents KVM from re-enabling it when "switching" to
> an EPT config (which doesn't rely on PA bits) or when SVM adjusts the
> MMIO masks to account for C-bit shenanigans (even with MAXPHYADDR=52 and
> C-bit=51, there can be reserved PA bits because the "real" MAXPHYADDR is
> reduced).
> 
> Bug #2 is that KVM doesn't explicitly check that MMIO caching is enabled
> when doing SEV-ES setup.  Prior to the module param, MMIO caching was
> guaranteed when SEV-ES could be enabled as SEV-ES-capable CPUs effectively
> guarantee there will be at least one reserved PA bit (see above).  With
> the module param, userspace can explicitly disable MMIO caching, thus
> silently breaking SEV-ES.
> 
> I believe I tested all combinations of things getting disabled and enabled
> by hacking kvm_mmu_reset_all_pte_masks() to disable MMIO caching on a much
> lower MAXPHYADDR, e.g. 43 instead of 52.  That said, definitely wait for a
> thumbs up from the AMD folks before queueing.

I tested the below systems/configurations and everything looks good
to me.  Thanks for the quick fix!

  AMD Milan, MAXPHYADDR = 48 bits, kvm.mmio_caching=on (on by default)
  normal: pass
  SEV:    pass
  SEV-ES: pass
  
  AMD Milan, MAXPHYADDR = 48 bits, kvm.mmio_caching=off
  normal: pass
  SEV:    pass
  SEV-ES: fail (as expected, since kvm_amd.sev_es gets forced to off)
  
  AMD unreleased, MAXPHYADDR = 52 bits, kvm.mmio_caching=on (on by default)
  normal: pass
  SEV:    pass
  SEV-ES: pass
  
  AMD unreleased, MAXPHYADDR = 52 bits, kvm.mmio_caching=off
  normal: pass
  SEV:    pass
  SEV-ES: fail (as expected, since kvm_amd.sev_es gets forced to off)

> 
> Sean Christopherson (4):
>   KVM: x86: Tag kvm_mmu_x86_module_init() with __init
>   KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
>   KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup
>   KVM: SVM: Disable SEV-ES support if MMIO caching is disabled

Series:

Tested-by: Michael Roth <michael.roth@amd.com>

-Mike


* Re: [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup
  2022-07-28 22:17 ` [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup Sean Christopherson
@ 2022-07-29  2:06   ` Kai Huang
  2022-07-29 18:15     ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-07-29  2:06 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> Adjust KVM's MMIO masks to account for the C-bit location prior to doing
> SEV(-ES) setup.  A future patch will consume enable_mmio_caching during
> SEV setup, as SEV-ES _requires_ MMIO caching, i.e. KVM needs to disallow
> SEV-ES if MMIO caching is disabled.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/svm/svm.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index aef63aae922d..62e89db83bc1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -5034,13 +5034,16 @@ static __init int svm_hardware_setup(void)
>  	/* Setup shadow_me_value and shadow_me_mask */
>  	kvm_mmu_set_me_spte_mask(sme_me_mask, sme_me_mask);
>  
> -	/* Note, SEV setup consumes npt_enabled. */
> +	svm_adjust_mmio_mask();
> +
> +	/*
> +	 * Note, SEV setup consumes npt_enabled and enable_mmio_caching (which
> +	 * may be modified by svm_adjust_mmio_mask()).
> +	 */
>  	sev_hardware_setup();

If I am not mistaken, the code in the latest queue branch doesn't consume
enable_mmio_caching.  It is only added in your later patch.

So perhaps adjust the comment, or merge the patches together?

>  
>  	svm_hv_hardware_setup();
>  
> -	svm_adjust_mmio_mask();
> -
>  	for_each_possible_cpu(cpu) {
>  		r = svm_cpu_init(cpu);
>  		if (r)



* Re: [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled
  2022-07-28 22:17 ` [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled Sean Christopherson
@ 2022-07-29  2:12   ` Kai Huang
  2022-07-29 15:21     ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-07-29  2:12 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> Disable SEV-ES if MMIO caching is disabled as SEV-ES relies on MMIO SPTEs
> generating #NPF(RSVD), which are reflected by the CPU into the guest as
> a #VC.  With SEV-ES, the untrusted host, a.k.a. KVM, doesn't have access
> to the guest instruction stream or register state and so can't directly
> emulate in response to a #NPF on an emulated MMIO GPA.  Disabling MMIO
> caching means guest accesses to emulated MMIO ranges cause #NPF(!PRESENT),
> and those flavors of #NPF cause automatic VM-Exits, not #VC.
> 
> Fixes: b09763da4dd8 ("KVM: x86/mmu: Add module param to disable MMIO caching (for testing)")
> Reported-by: Michael Roth <michael.roth@amd.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu.h      |  2 ++
>  arch/x86/kvm/mmu/spte.c |  1 +
>  arch/x86/kvm/mmu/spte.h |  2 --
>  arch/x86/kvm/svm/sev.c  | 10 ++++++++++
>  4 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index a99acec925eb..6bdaacb6faa0 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -6,6 +6,8 @@
>  #include "kvm_cache_regs.h"
>  #include "cpuid.h"
>  
> +extern bool __read_mostly enable_mmio_caching;
> +
>  #define PT_WRITABLE_SHIFT 1
>  #define PT_USER_SHIFT 2
>  
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 66f76f5a15bd..03ca740bf721 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -22,6 +22,7 @@
>  bool __read_mostly enable_mmio_caching = true;
>  static bool __ro_after_init allow_mmio_caching;
>  module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
> +EXPORT_SYMBOL_GPL(enable_mmio_caching);
>  
>  u64 __read_mostly shadow_host_writable_mask;
>  u64 __read_mostly shadow_mmu_writable_mask;
> diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
> index 26b144ffd146..9a9414b8d1d6 100644
> --- a/arch/x86/kvm/mmu/spte.h
> +++ b/arch/x86/kvm/mmu/spte.h
> @@ -5,8 +5,6 @@
>  
>  #include "mmu_internal.h"
>  
> -extern bool __read_mostly enable_mmio_caching;
> -
>  /*
>   * A MMU present SPTE is backed by actual memory and may or may not be present
>   * in hardware.  E.g. MMIO SPTEs are not considered present.  Use bit 11, as it
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 309bcdb2f929..05bf6301acac 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -22,6 +22,7 @@
>  #include <asm/trapnr.h>
>  #include <asm/fpu/xcr.h>
>  
> +#include "mmu.h"
>  #include "x86.h"
>  #include "svm.h"
>  #include "svm_ops.h"
> @@ -2205,6 +2206,15 @@ void __init sev_hardware_setup(void)
>  	if (!sev_es_enabled)
>  		goto out;
>  
> +	/*
> +	 * SEV-ES requires MMIO caching as KVM doesn't have access to the guest
> +	 * instruction stream, i.e. can't emulate in response to a #NPF and
> +	 * instead relies on #NPF(RSVD) being reflected into the guest as #VC
> +	 * (the guest can then do a #VMGEXIT to request MMIO emulation).
> +	 */
> +	if (!enable_mmio_caching)
> +		goto out;
> +
> 

I am not familiar with SEV, but it looks similar to TDX -- they both cause
a #VE in the guest instead of faulting into KVM, and they both require an
explicit call from the guest to do MMIO.

In this case, does the existing MMIO caching logic still apply to them?
Should we still treat SEV's and TDX's MMIO handling as MMIO caching being
enabled?  Or perhaps use another variable?

-- 
Thanks,
-Kai




* Re: [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init
  2022-07-28 22:17 ` [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init Sean Christopherson
@ 2022-07-29  2:14   ` Kai Huang
  0 siblings, 0 replies; 23+ messages in thread
From: Kai Huang @ 2022-07-29  2:14 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> Mark kvm_mmu_x86_module_init() with __init; the entire reason it exists
> is to initialize variables when kvm.ko is loaded, i.e. it must never be
> called after module initialization.
> 
> Fixes: 1d0e84806047 ("KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded")
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 2 +-
>  arch/x86/kvm/mmu/mmu.c          | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index e8281d64a431..5ffa578cafe1 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1704,7 +1704,7 @@ static inline int kvm_arch_flush_remote_tlb(struct kvm *kvm)
>  #define kvm_arch_pmi_in_guest(vcpu) \
>  	((vcpu) && (vcpu)->arch.handling_intr_from_guest)
>  
> -void kvm_mmu_x86_module_init(void);
> +void __init kvm_mmu_x86_module_init(void);
>  int kvm_mmu_vendor_module_init(void);
>  void kvm_mmu_vendor_module_exit(void);
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 8e477333a263..2975fcb14c86 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6700,7 +6700,7 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
>   * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as
>   * its default value of -1 is technically undefined behavior for a boolean.
>   */
> -void kvm_mmu_x86_module_init(void)
> +void __init kvm_mmu_x86_module_init(void)
>  {
>  	if (nx_huge_pages == -1)
>  		__set_nx_huge_pages(get_nx_auto_mode());

Reviewed-by: Kai Huang <kai.huang@intel.com>

-- 
Thanks,
-Kai




* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-07-28 22:17 ` [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change Sean Christopherson
@ 2022-07-29  2:39   ` Kai Huang
  2022-07-29 15:07     ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-07-29  2:39 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: kvm, linux-kernel, Michael Roth, Tom Lendacky

On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> Fully re-evaluate whether or not MMIO caching can be enabled when SPTE
> masks change; simply clearing enable_mmio_caching when a configuration
> isn't compatible with caching fails to handle the scenario where the
> masks are updated, e.g. by VMX for EPT or by SVM to account for the C-bit
> location, and toggle compatibility from false=>true.
> 
> Snapshot the original module param so that re-evaluating MMIO caching
> preserves userspace's desire to allow caching.  Use a snapshot approach
> so that enable_mmio_caching still reflects KVM's actual behavior.
> 
> Fixes: 8b9e74bfbf8c ("KVM: x86/mmu: Use enable_mmio_caching to track if MMIO caching is enabled")
> Reported-by: Michael Roth <michael.roth@amd.com>
> Cc: Tom Lendacky <thomas.lendacky@amd.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c  |  4 ++++
>  arch/x86/kvm/mmu/spte.c | 19 +++++++++++++++++++
>  arch/x86/kvm/mmu/spte.h |  1 +
>  3 files changed, 24 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 2975fcb14c86..660f58928252 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6699,11 +6699,15 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
>  /*
>   * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as
>   * its default value of -1 is technically undefined behavior for a boolean.
> + * Forward the module init call to SPTE code so that it too can handle module
> + * params that need to be resolved/snapshot.
>   */
>  void __init kvm_mmu_x86_module_init(void)
>  {
>  	if (nx_huge_pages == -1)
>  		__set_nx_huge_pages(get_nx_auto_mode());
> +
> +	kvm_mmu_spte_module_init();
>  }
>  
>  /*
> diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
> index 7314d27d57a4..66f76f5a15bd 100644
> --- a/arch/x86/kvm/mmu/spte.c
> +++ b/arch/x86/kvm/mmu/spte.c
> @@ -20,6 +20,7 @@
>  #include <asm/vmx.h>
>  
>  bool __read_mostly enable_mmio_caching = true;
> +static bool __ro_after_init allow_mmio_caching;
>  module_param_named(mmio_caching, enable_mmio_caching, bool, 0444);
>  
>  u64 __read_mostly shadow_host_writable_mask;
> @@ -43,6 +44,18 @@ u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
>  
>  u8 __read_mostly shadow_phys_bits;
>  
> +void __init kvm_mmu_spte_module_init(void)
> +{
> +	/*
> +	 * Snapshot userspace's desire to allow MMIO caching.  Whether or not
> +	 * KVM can actually enable MMIO caching depends on vendor-specific
> +	 * hardware capabilities and other module params that can't be resolved
> +	 * until the vendor module is loaded, i.e. enable_mmio_caching can and
> +	 * will change when the vendor module is (re)loaded.
> +	 */
> +	allow_mmio_caching = enable_mmio_caching;
> +}
> +
>  static u64 generation_mmio_spte_mask(u64 gen)
>  {
>  	u64 mask;
> @@ -340,6 +353,12 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
>  	BUG_ON((u64)(unsigned)access_mask != access_mask);
>  	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
>  
> +	/*
> +	 * Reset to the original module param value to honor userspace's desire
> +	 * to (dis)allow MMIO caching.  Update the param itself so that
> +	 * userspace can see whether or not KVM is actually using MMIO caching.
> +	 */
> +	enable_mmio_caching = allow_mmio_caching;

I think the problem comes from the MMIO caching mask/value first being set
in kvm_mmu_reset_all_pte_masks() (which calls kvm_mmu_set_mmio_spte_mask()
and may change enable_mmio_caching), while later vendor-specific code _may_
or _may_not_ call kvm_mmu_set_mmio_spte_mask() again to adjust the
mask/value.  And when it does, that second call from vendor-specific code
shouldn't depend on the 'enable_mmio_caching' value calculated by the first
call in kvm_mmu_reset_all_pte_masks().

Instead of using 'allow_mmio_caching', should we just remove the
kvm_mmu_set_mmio_spte_mask() call from kvm_mmu_reset_all_pte_masks() and
require vendor-specific code to always call kvm_mmu_set_mmio_spte_mask(),
based on whatever hardware feature the vendor uses?

I am suggesting this because in Isaku's TDX patch

[PATCH v7 037/102] KVM: x86/mmu: Track shadow MMIO value/mask on a per-VM basis

we will enable a per-VM MMIO mask/value, which will remove the global
shadow_mmio_mask/shadow_mmio_value, and I have already suggested something
similar:

https://lore.kernel.org/all/20220719084737.GU1379820@ls.amr.corp.intel.com/



* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-07-29  2:39   ` Kai Huang
@ 2022-07-29 15:07     ` Sean Christopherson
  2022-08-01  9:24       ` Kai Huang
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-07-29 15:07 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Fri, Jul 29, 2022, Kai Huang wrote:
> On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> > Fully re-evaluate whether or not MMIO caching can be enabled when SPTE
> > masks change; simply clearing enable_mmio_caching when a configuration
> > isn't compatible with caching fails to handle the scenario where the
> > masks are updated, e.g. by VMX for EPT or by SVM to account for the C-bit
> > location, and toggle compatibility from false=>true.
> > 
> > Snapshot the original module param so that re-evaluating MMIO caching
> > preserves userspace's desire to allow caching.  Use a snapshot approach
> > so that enable_mmio_caching still reflects KVM's actual behavior.
> > 

..

> > @@ -340,6 +353,12 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> >  	BUG_ON((u64)(unsigned)access_mask != access_mask);
> >  	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
> >  
> > +	/*
> > +	 * Reset to the original module param value to honor userspace's desire
> > +	 * to (dis)allow MMIO caching.  Update the param itself so that
> > +	 * userspace can see whether or not KVM is actually using MMIO caching.
> > +	 */
> > +	enable_mmio_caching = allow_mmio_caching;
> 
> I think the problem comes from the MMIO caching mask/value first being set in
> kvm_mmu_reset_all_pte_masks() (which calls kvm_mmu_set_mmio_spte_mask() and may
> change enable_mmio_caching), while vendor-specific code _may_ or _may_not_
> call kvm_mmu_set_mmio_spte_mask() again later to adjust the mask/value.  And
> when it does, the second call from vendor-specific code shouldn't depend on the
> 'enable_mmio_caching' value calculated by the first call in
> kvm_mmu_reset_all_pte_masks().

Correct.

> Instead of using 'allow_mmio_caching', should we just remove the
> kvm_mmu_set_mmio_spte_mask() call from kvm_mmu_reset_all_pte_masks() and
> require vendor-specific code to always call kvm_mmu_set_mmio_spte_mask() based
> on whatever hardware feature the vendor uses?

Hmm, I'd rather not force vendor code to duplicate the "basic" functionality.
It's somewhat silly to preserve the common path since both SVM and VMX need to
override it, but on the other hand those overrides are conditional.

Case in point, if I'm reading the below patch correctly, svm_shadow_mmio_mask will
be left '0' if the platform doesn't support memory encryption (svm_adjust_mmio_mask()
will bail early).  That's a solvable problem, but then I think KVM just ends up
punting this issue to SVM to some extent.

Another flaw in the below patch is that enable_mmio_caching doesn't need to be
tracked on a per-VM basis.  VMX with EPT can have different masks, but barring a
massive change in KVM or hardware, there will never be a scenario where caching is
enabled for one VM but not another.

And isn't the below patch also broken for TDX?  For TDX, unless things have changed,
the mask+value is supposed to be SUPPRESS_VE==0 && RWX==0.  So either KVM is generating
the wrong mask (MAXPHYADDR < 51), or KVM is incorrectly marking MMIO caching as disabled
in the TDX case.

Lastly, in preparation for TDX, enable_mmio_caching should be changed to key off
of the _mask_, not the value.  E.g. for TDX, the value will be '0', but the mask
should be SUPPRESS_VE | RWX.
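
To make that concrete, here is a minimal sketch (illustrative only, not actual
KVM code; the helper name is made up) of what keying the enablement decision
off the mask instead of the value might look like:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative sketch, not KVM code: decide whether MMIO caching is
 * possible for a proposed mask/value pair.  Keying off the mask handles
 * the TDX case, where the desired value is 0 (SUPPRESS_VE==0 && RWX==0)
 * but the mask is SUPPRESS_VE | RWX.
 */
static bool mmio_caching_possible(uint64_t mmio_value, uint64_t mmio_mask)
{
	/* No mask bits means MMIO SPTEs can't be identified at all. */
	if (!mmio_mask)
		return false;

	/* Value bits outside the mask would never survive a masked compare. */
	if (mmio_value & ~mmio_mask)
		return false;

	return true;
}
```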

> I am suggesting this way because in Isaku's TDX patch
> 
> [PATCH v7 037/102] KVM: x86/mmu: Track shadow MMIO value/mask on a per-VM basis
> 
> we will enable per-VM MMIO mask/value, which will remove global
> shadow_mmio_mask/shadow_mmio_value, and I am already suggesting something
> similar:
> 
> https://lore.kernel.org/all/20220719084737.GU1379820@ls.amr.corp.intel.com/
> 


* Re: [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled
  2022-07-29  2:12   ` Kai Huang
@ 2022-07-29 15:21     ` Sean Christopherson
  2022-08-01  9:30       ` Kai Huang
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-07-29 15:21 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Fri, Jul 29, 2022, Kai Huang wrote:
> On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> > Disable SEV-ES if MMIO caching is disabled as SEV-ES relies on MMIO SPTEs
> > generating #NPF(RSVD), which are reflected by the CPU into the guest as
> > a #VC.  With SEV-ES, the untrusted host, a.k.a. KVM, doesn't have access
> > to the guest instruction stream or register state and so can't directly
> > emulate in response to a #NPF on an emulated MMIO GPA.  Disabling MMIO
> > caching means guest accesses to emulated MMIO ranges cause #NPF(!PRESENT),
> > and those flavors of #NPF cause automatic VM-Exits, not #VC.
> > 
> > Fixes: b09763da4dd8 ("KVM: x86/mmu: Add module param to disable MMIO caching (for testing)")
> > Reported-by: Michael Roth <michael.roth@amd.com>
> > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---

...

> > +	/*
> > +	 * SEV-ES requires MMIO caching as KVM doesn't have access to the guest
> > +	 * instruction stream, i.e. can't emulate in response to a #NPF and
> > +	 * instead relies on #NPF(RSVD) being reflected into the guest as #VC
> > +	 * (the guest can then do a #VMGEXIT to request MMIO emulation).
> > +	 */
> > +	if (!enable_mmio_caching)
> > +		goto out;
> > +
> > 
> 
> I am not familiar with SEV, but it looks similar to TDX -- they both cause a
> #VE in the guest instead of faulting into KVM.  And they both require an
> explicit call from the guest to do MMIO.
> 
> In this case, does the existing MMIO caching logic still apply to them?

Yes, because TDX/SEV-ES+ need to generate #VE/#VC on emulated MMIO so that legacy
(or intentionally unenlightened) software in the guest doesn't simply hang/die on
memory accesses to emulated MMIO (as opposed to direct TDVMCALL/#VMGEXIT).

> Should we still treat SEV and TDX's MMIO handling as MMIO caching being
> enabled?  Or perhaps another variable?

I don't think a separate variable is necessary.  At its core, KVM is still caching
MMIO GPAs via magic SPTE values.  The fact that it's required for functionality
doesn't make the name wrong.

SEV-ES+ in particular doesn't have a strong guarantee that inducing #VC via #NPF(RSVD)
is always possible.  Theoretically, an SEV-ES+ capable CPU could ship with an effective
MAXPHYADDR=51 (after reducing the raw MAXPHYADDR) and C-bit=51, in which case there are
no reserved PA bits and thus no reserved PTE bits at all.  That's obviously unlikely to
happen, but if it does come up, then disabling SEV-ES+ due to MMIO caching not being
possible is the desired behavior, e.g. either the CPU configuration is bad or KVM is
lacking support for a newfangled way to support emulated MMIO (in a future theoretical
SEV-* product).
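
The corner case above can be sketched with a toy calculation (illustrative
only, not KVM code; it assumes PA bits top out at bit 51, per the 52-bit
MAXPHYADDR discussed in this thread):

```c
/*
 * Illustrative sketch, not KVM code: count the PA bits that can serve as
 * reserved PTE bits, given the effective (possibly reduced) MAXPHYADDR
 * and the C-bit position.  Bits at and above the effective MAXPHYADDR up
 * through bit 51 are reserved, except the C-bit itself, which hardware
 * repurposes for encryption and therefore can't fault as reserved.
 */
static int nr_reserved_pa_bits(int effective_maxphyaddr, int cbit)
{
	int nr = 0;

	for (int bit = effective_maxphyaddr; bit <= 51; bit++) {
		if (bit != cbit)
			nr++;
	}

	/* effective_maxphyaddr == 51 && cbit == 51 yields 0: no #NPF(RSVD). */
	return nr;
}
```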


* Re: [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup
  2022-07-29  2:06   ` Kai Huang
@ 2022-07-29 18:15     ` Sean Christopherson
  0 siblings, 0 replies; 23+ messages in thread
From: Sean Christopherson @ 2022-07-29 18:15 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Fri, Jul 29, 2022, Kai Huang wrote:
> On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> > Adjust KVM's MMIO masks to account for the C-bit location prior to doing
> > SEV(-ES) setup.  A future patch will consume enable_mmio_caching during
> > SEV setup as SEV-ES _requires_ MMIO caching, i.e. KVM needs to disallow
> > SEV-ES if MMIO caching is disabled.
> > 
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  arch/x86/kvm/svm/svm.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index aef63aae922d..62e89db83bc1 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -5034,13 +5034,16 @@ static __init int svm_hardware_setup(void)
> >  	/* Setup shadow_me_value and shadow_me_mask */
> >  	kvm_mmu_set_me_spte_mask(sme_me_mask, sme_me_mask);
> >  
> > -	/* Note, SEV setup consumes npt_enabled. */
> > +	svm_adjust_mmio_mask();
> > +
> > +	/*
> > +	 * Note, SEV setup consumes npt_enabled and enable_mmio_caching (which
> > +	 * may be modified by svm_adjust_mmio_mask()).
> > +	 */
> >  	sev_hardware_setup();
> 
> If I am not seeing mistakenly, the code in latest queue branch doesn't consume
> enable_mmio_caching.  It is only added in your later patch.
> 
> So perhaps adjust the comment or merge patches together?

Oooh, I see what you're saying.  I split the patches so that if this movement turns
out to break something then bisection will point directly here, but that's a pretty
weak argument since both patches are tiny.  And taking patch 4 without patch 3,
e.g. in the unlikely event this movement needs to be reverted, is probably worse
than not having patch 4 at all, i.e. having somewhat obvious breakage is better.

So yeah, I'll squash this with patch 4.

Thanks!


* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-07-29 15:07     ` Sean Christopherson
@ 2022-08-01  9:24       ` Kai Huang
  2022-08-01 14:15         ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-08-01  9:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Fri, 2022-07-29 at 15:07 +0000, Sean Christopherson wrote:
> On Fri, Jul 29, 2022, Kai Huang wrote:
> > On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> > > Fully re-evaluate whether or not MMIO caching can be enabled when SPTE
> > > masks change; simply clearing enable_mmio_caching when a configuration
> > > isn't compatible with caching fails to handle the scenario where the
> > > masks are updated, e.g. by VMX for EPT or by SVM to account for the C-bit
> > > location, and toggle compatibility from false=>true.
> > > 
> > > Snapshot the original module param so that re-evaluating MMIO caching
> > > preserves userspace's desire to allow caching.  Use a snapshot approach
> > > so that enable_mmio_caching still reflects KVM's actual behavior.
> > > 
> 
> ..
> 
> > > @@ -340,6 +353,12 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
> > >  	BUG_ON((u64)(unsigned)access_mask != access_mask);
> > >  	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);
> > >  
> > > +	/*
> > > +	 * Reset to the original module param value to honor userspace's desire
> > > +	 * to (dis)allow MMIO caching.  Update the param itself so that
> > > +	 * userspace can see whether or not KVM is actually using MMIO caching.
> > > +	 */
> > > +	enable_mmio_caching = allow_mmio_caching;
> > 
> > I think the problem comes from the MMIO caching mask/value first being set in
> > kvm_mmu_reset_all_pte_masks() (which calls kvm_mmu_set_mmio_spte_mask() and may
> > change enable_mmio_caching), while vendor-specific code _may_ or _may_not_
> > call kvm_mmu_set_mmio_spte_mask() again later to adjust the mask/value.  And
> > when it does, the second call from vendor-specific code shouldn't depend on the
> > 'enable_mmio_caching' value calculated by the first call in
> > kvm_mmu_reset_all_pte_masks().
> 
> Correct.
> 
> > Instead of using 'allow_mmio_caching', should we just remove the
> > kvm_mmu_set_mmio_spte_mask() call from kvm_mmu_reset_all_pte_masks() and
> > require vendor-specific code to always call kvm_mmu_set_mmio_spte_mask() based
> > on whatever hardware feature the vendor uses?
> 
> Hmm, I'd rather not force vendor code to duplicate the "basic" functionality.
> It's somewhat silly to preserve the common path since both SVM and VMX need to
> override it, but on the other hand those overrides are conditional.

OK. 

> 
> Case in point, if I'm reading the below patch correctly, svm_shadow_mmio_mask will
> be left '0' if the platform doesn't support memory encryption (svm_adjust_mmio_mask()
> will bail early).  That's a solvable problem, but then I think KVM just ends up
> punting this issue to SVM to some extent.

That patch is not supposed to introduce any functional change for AMD.  Yes,
this needs to be fixed if we go with this solution.

> 
> Another flaw in the below patch is that enable_mmio_caching doesn't need to be
> tracked on a per-VM basis.  VMX with EPT can have different masks, but barring a
> massive change in KVM or hardware, there will never be a scenario where caching is
> enabled for one VM but not another.

Yeah, it looks that way, if we always treat TDX guests as requiring
enable_mmio_caching to be enabled (reading your reply to patch 4).

> 
> And isn't the below patch also broken for TDX?  For TDX, unless things have changed,
> the mask+value is supposed to be SUPPRESS_VE==0 && RWX==0.  So either KVM is generating
> the wrong mask (MAXPHYADDR < 51), or KVM is incorrectly marking MMIO caching as disabled
> in the TDX case.

The MMIO mask/value for TDX will be set in another patch.  My suggestion is to
treat this patch as an infrastructural patch that doesn't bring any functional
change.
 
> 
> Lastly, in preparation for TDX, enable_mmio_caching should be changed to key off
> of the _mask_, not the value.  E.g. for TDX, the value will be '0', but the mask
> should be SUPPRESS_VE | RWX.

Agreed.  But perhaps in another patch.  We need to re-define what the
mask/value mean for enable_mmio_caching.

-- 
Thanks,
-Kai




* Re: [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disabled
  2022-07-29 15:21     ` Sean Christopherson
@ 2022-08-01  9:30       ` Kai Huang
  0 siblings, 0 replies; 23+ messages in thread
From: Kai Huang @ 2022-08-01  9:30 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Fri, 2022-07-29 at 15:21 +0000, Sean Christopherson wrote:
> On Fri, Jul 29, 2022, Kai Huang wrote:
> > On Thu, 2022-07-28 at 22:17 +0000, Sean Christopherson wrote:
> > > Disable SEV-ES if MMIO caching is disabled as SEV-ES relies on MMIO SPTEs
> > > generating #NPF(RSVD), which are reflected by the CPU into the guest as
> > > a #VC.  With SEV-ES, the untrusted host, a.k.a. KVM, doesn't have access
> > > to the guest instruction stream or register state and so can't directly
> > > emulate in response to a #NPF on an emulated MMIO GPA.  Disabling MMIO
> > > caching means guest accesses to emulated MMIO ranges cause #NPF(!PRESENT),
> > > and those flavors of #NPF cause automatic VM-Exits, not #VC.
> > > 
> > > Fixes: b09763da4dd8 ("KVM: x86/mmu: Add module param to disable MMIO caching (for testing)")
> > > Reported-by: Michael Roth <michael.roth@amd.com>
> > > Cc: Tom Lendacky <thomas.lendacky@amd.com>
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > > ---
> 
> ...
> 
> > > +	/*
> > > +	 * SEV-ES requires MMIO caching as KVM doesn't have access to the guest
> > > +	 * instruction stream, i.e. can't emulate in response to a #NPF and
> > > +	 * instead relies on #NPF(RSVD) being reflected into the guest as #VC
> > > +	 * (the guest can then do a #VMGEXIT to request MMIO emulation).
> > > +	 */
> > > +	if (!enable_mmio_caching)
> > > +		goto out;
> > > +
> > > 
> > 
> > I am not familiar with SEV, but it looks similar to TDX -- they both cause a
> > #VE in the guest instead of faulting into KVM.  And they both require an
> > explicit call from the guest to do MMIO.
> > 
> > In this case, does the existing MMIO caching logic still apply to them?
> 
> Yes, because TDX/SEV-ES+ need to generate #VE/#VC on emulated MMIO so that legacy
> (or intentionally unenlightened) software in the guest doesn't simply hang/die on
> memory accesses to emulated MMIO (as opposed to direct TDVMCALL/#VMGEXIT).
> 
> > Should we still treat SEV and TDX's MMIO handling as MMIO caching being
> > enabled?  Or perhaps another variable?
> 
> I don't think a separate variable is necessary.  At its core, KVM is still caching
> MMIO GPAs via magic SPTE values.  The fact that it's required for functionality
> doesn't make the name wrong.

OK.

> 
> SEV-ES+ in particular doesn't have a strong guarantee that inducing #VC via #NPF(RSVD)
> is always possible.  Theoretically, an SEV-ES+ capable CPU could ship with an effective
> MAXPHYADDR=51 (after reducing the raw MAXPHYADDR) and C-bit=51, in which case there are
> no reserved PA bits and thus no reserved PTE bits at all.  That's obviously unlikely to
> happen, but if it does come up, then disabling SEV-ES+ due to MMIO caching not being
> possible is the desired behavior, e.g. either the CPU configuration is bad or KVM is
> lacking support for a newfangled way to support emulated MMIO (in a future theoretical
> SEV-* product).

I bet AMD will see this (your) response and never ship such chips :)

-- 
Thanks,
-Kai




* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-01  9:24       ` Kai Huang
@ 2022-08-01 14:15         ` Sean Christopherson
  2022-08-01 20:46           ` Kai Huang
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-08-01 14:15 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Mon, Aug 01, 2022, Kai Huang wrote:
> On Fri, 2022-07-29 at 15:07 +0000, Sean Christopherson wrote:
> > Lastly, in preparation for TDX, enable_mmio_caching should be changed to key off
> > of the _mask_, not the value.  E.g. for TDX, the value will be '0', but the mask
> > should be SUPPRESS_VE | RWX.
> 
> Agreed.  But perhaps in another patch.  We need to re-define what the
> mask/value mean for enable_mmio_caching.

There's no need to redefine what they mean; the only change that needs to be
made is to handle the scenario where the desired value is '0'.  Maybe that's
all you mean by "redefine"?

Another thing to note is that only the value needs to be per-VM, the mask can be
KVM-wide, i.e. "mask = SUPPRESS_VE | RWX" will work for TDX and non-TDX VMs when
EPT is enabled.


* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-01 14:15         ` Sean Christopherson
@ 2022-08-01 20:46           ` Kai Huang
  2022-08-01 23:20             ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-08-01 20:46 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Mon, 2022-08-01 at 14:15 +0000, Sean Christopherson wrote:
> On Mon, Aug 01, 2022, Kai Huang wrote:
> > On Fri, 2022-07-29 at 15:07 +0000, Sean Christopherson wrote:
> > > Lastly, in preparation for TDX, enable_mmio_caching should be changed to key off
> > > of the _mask_, not the value.  E.g. for TDX, the value will be '0', but the mask
> > > should be SUPPRESS_VE | RWX.
> > 
> > Agreed.  But perhaps in another patch.  We need to re-define what the
> > mask/value mean for enable_mmio_caching.
> 
> There's no need to redefine what they mean; the only change that needs to be
> made is to handle the scenario where the desired value is '0'.  Maybe that's
> all you mean by "redefine"?

My thinking is that enable_mmio_caching is considered disabled only when the
mask and value are both 0.  value=0 is valid when enable_mmio_caching is true,
as you said.

> 
> Another thing to note is that only the value needs to be per-VM, the mask can be
> KVM-wide, i.e. "mask = SUPPRESS_VE | RWX" will work for TDX and non-TDX VMs when
> EPT is enabled.

Yeah, but isn't it more like VMX and TDX both *happen* to have the same mask?
Theoretically, VMX only needs RWX to trigger an EPT misconfiguration and
doesn't need SUPPRESS_VE.

I don't see making the mask/value both per-VM as a big issue?

-- 
Thanks,
-Kai




* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-01 20:46           ` Kai Huang
@ 2022-08-01 23:20             ` Sean Christopherson
  2022-08-02  0:05               ` Kai Huang
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-08-01 23:20 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Tue, Aug 02, 2022, Kai Huang wrote:
> On Mon, 2022-08-01 at 14:15 +0000, Sean Christopherson wrote:
> > Another thing to note is that only the value needs to be per-VM, the mask can be
> > KVM-wide, i.e. "mask = SUPPRESS_VE | RWX" will work for TDX and non-TDX VMs when
> > EPT is enabled.
> 
> Yeah, but isn't it more like VMX and TDX both *happen* to have the same mask?
> Theoretically, VMX only needs RWX to trigger an EPT misconfiguration and
> doesn't need SUPPRESS_VE.

Right, SUPPRESS_VE isn't strictly necessary, but KVM already deliberately avoids
bit 63 because it has meaning, e.g. SUPPRESS_VE for EPT and NX for PAE and 64-bit
paging.  

> I don't see making the mask/value both per-VM as a big issue?

Yes and no.

No, in the sense that it's not a big issue in terms of code.  

Yes, because of the connotations of having a per-VM mask.  While having SUPPRESS_VE
in the mask for non-TDX EPT isn't strictly necessary, it's also not strictly necessary
to _not_ have it in the mask.  In other words, having a per-VM mask incorrectly
implies that TDX _must_ have a different mask.

It's also one more piece of information that developers have to track down and
account for, i.e. one more thing we can screw up.

The other aspect of MMIO SPTEs are that the mask bits must not overlap the generation
bits or shadow-present bit, and changing any of those bits requires careful
consideration, i.e. defining the set of _allowed_ mask bits on a per-VM basis would
incur significant complexity without providing meaningful benefit.  As a result,
it's highly unlikely that we'll ever want to opportunistically "reclaim" bit 63
for MMIO SPTEs, so there's practically zero cost if it's included in the mask for
non-TDX EPT.
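
The overlap rule above could be enforced with a sanity check along these lines
(a sketch only; the bit positions are placeholders loosely modeled on KVM's
SPTE layout, not the real definitions):

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, loosely modeled on KVM's SPTE layout. */
#define SPTE_MMU_PRESENT_BIT	(1ULL << 11)
#define MMIO_SPTE_GEN_LOW_MASK	(0xffULL << 3)
#define MMIO_SPTE_GEN_HIGH_MASK	(0x7ffULL << 52)

/*
 * Sketch of a check kvm_mmu_set_mmio_spte_mask() could perform: reject
 * any MMIO mask that overlaps the generation bits or the shadow-present
 * bit, since those bits have fixed meanings for all SPTEs.
 */
static bool mmio_mask_is_valid(uint64_t mmio_mask)
{
	const uint64_t off_limits = SPTE_MMU_PRESENT_BIT |
				    MMIO_SPTE_GEN_LOW_MASK |
				    MMIO_SPTE_GEN_HIGH_MASK;

	return !(mmio_mask & off_limits);
}
```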


* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-01 23:20             ` Sean Christopherson
@ 2022-08-02  0:05               ` Kai Huang
  2022-08-02 21:15                 ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-08-02  0:05 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Mon, 2022-08-01 at 23:20 +0000, Sean Christopherson wrote:
> On Tue, Aug 02, 2022, Kai Huang wrote:
> > On Mon, 2022-08-01 at 14:15 +0000, Sean Christopherson wrote:
> > > Another thing to note is that only the value needs to be per-VM, the mask can be
> > > KVM-wide, i.e. "mask = SUPPRESS_VE | RWX" will work for TDX and non-TDX VMs when
> > > EPT is enabled.
> > 
> > Yeah, but isn't it more like VMX and TDX both *happen* to have the same mask?
> > Theoretically, VMX only needs RWX to trigger an EPT misconfiguration and
> > doesn't need SUPPRESS_VE.
> 
> Right, SUPPRESS_VE isn't strictly necessary, but KVM already deliberately avoids
> bit 63 because it has meaning, e.g. SUPPRESS_VE for EPT and NX for PAE and 64-bit
> paging.  
> 
> > I don't see making the mask/value both per-VM as a big issue?
> 
> Yes and no.
> 
> No, in the sense that it's not a big issue in terms of code.  
> 
> Yes, because of the connotations of having a per-VM mask.  While having SUPPRESS_VE
> in the mask for non-TDX EPT isn't strictly necessary, it's also not strictly necessary
> to _not_ have it in the mask.  
> 

I think the 'mask' itself is ambiguous, i.e. it doesn't say under what
circumstances a bit should be included in the mask.  My understanding is that
any bit in the 'mask' should at least be related to the 'value' that can enable
MMIO caching.

So if the SUPPRESS_VE bit is not relevant to non-TDX EPT (as we want an EPT
misconfiguration, not an EPT violation), I don't see why we need to include it
in the 'mask'.

> In other words, having a per-VM mask incorrectly
> implies that TDX _must_ have a different mask.

I interpret it as TDX _can_, but not _must_.

> 
> It's also one more piece of information that developers have to track down and
> account for, i.e. one more thing we can screw up.
> 
> The other aspect of MMIO SPTEs are that the mask bits must not overlap the generation
> bits or shadow-present bit, and changing any of those bits requires careful
> consideration, i.e. defining the set of _allowed_ mask bits on a per-VM basis would
> incur significant complexity without providing meaningful benefit.  
> 

Agreed on this.

But we are not checking any of those in kvm_mmu_set_mmio_spte_mask(), right? :)

Also Isaku's patch extends kvm_mmu_set_mmio_spte_mask() to take 'kvm' or 'vcpu'
as parameter so it's easy to check there -- not 100% sure about other places,
though.

> As a result,
> it's highly unlikely that we'll ever want to opportunistically "reclaim" bit 63
> for MMIO SPTEs, so there's practically zero cost if it's included in the mask for
> non-TDX EPT.

Sorry, I don't understand this.  If we will never "reclaim" bit 63 for MMIO
SPTEs (for non-TDX EPT), then why bother including it in the mask?

-- 
Thanks,
-Kai




* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-02  0:05               ` Kai Huang
@ 2022-08-02 21:15                 ` Sean Christopherson
  2022-08-02 22:19                   ` Kai Huang
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-08-02 21:15 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Tue, Aug 02, 2022, Kai Huang wrote:
> On Mon, 2022-08-01 at 23:20 +0000, Sean Christopherson wrote:
> > On Tue, Aug 02, 2022, Kai Huang wrote:
> > > On Mon, 2022-08-01 at 14:15 +0000, Sean Christopherson wrote:
> > > > Another thing to note is that only the value needs to be per-VM, the mask can be
> > > > KVM-wide, i.e. "mask = SUPPRESS_VE | RWX" will work for TDX and non-TDX VMs when
> > > > EPT is enabled.
> > > 
> > > Yeah, but isn't it more like VMX and TDX both *happen* to have the same mask?
> > > Theoretically, VMX only needs RWX to trigger an EPT misconfiguration and
> > > doesn't need SUPPRESS_VE.
> > 
> > Right, SUPPRESS_VE isn't strictly necessary, but KVM already deliberately avoids
> > bit 63 because it has meaning, e.g. SUPPRESS_VE for EPT and NX for PAE and 64-bit
> > paging.  
> > 
> > > I don't see making the mask/value both per-VM as a big issue?
> > 
> > Yes and no.
> > 
> > No, in the sense that it's not a big issue in terms of code.  
> > 
> > Yes, because of the connotations of having a per-VM mask.  While having SUPPRESS_VE
> > in the mask for non-TDX EPT isn't strictly necessary, it's also not strictly necessary
> > to _not_ have it in the mask.  
> > 
> 
> I think the 'mask' itself is ambiguous, i.e. it doesn't say under what
> circumstances a bit should be included in the mask.  My understanding is that
> any bit in the 'mask' should at least be related to the 'value' that can
> enable MMIO caching.

The purpose of the mask isn't ambiguous, though it's definitely not well documented.
The mask defines what bits should be included in the check to identify an MMIO SPTE.
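
In other words (a toy illustration, not KVM's actual helper), identification
reduces to a single masked compare:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy illustration, not KVM code: an SPTE is an MMIO SPTE iff the bits
 * selected by the mask match the configured value.  Bits outside the
 * mask are free for other uses (e.g. the cached GPA and access bits).
 */
static bool is_mmio_spte(uint64_t spte, uint64_t mmio_mask, uint64_t mmio_value)
{
	return (spte & mmio_mask) == mmio_value;
}
```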
 
> So if the SUPPRESS_VE bit is not relevant to non-TDX EPT (as we want an EPT
> misconfiguration, not an EPT violation), I don't see why we need to include it
> in the 'mask'.

Again, it's not strictly necessary, but by doing so we don't need a per-VM mask.
And KVM should also never set SUPPRESS_VE for MMIO SPTEs, i.e. checking that bit
by including it in the mask adds a sanity check (albeit a minuscule one).

> > In other words, having a per-VM mask incorrectly implies that TDX _must_
> > have a different mask.
> 
> I interpret it as TDX _can_, but not _must_.

Right, but if we write the KVM code such that it doesn't have a different mask,
then even that "can" is wrong/misleading.

> > It's also one more piece of information that developers have to track down and
> > account for, i.e. one more thing we can screw up.
> > 
> > The other aspect of MMIO SPTEs are that the mask bits must not overlap the generation
> > bits or shadow-present bit, and changing any of those bits requires careful
> > consideration, i.e. defining the set of _allowed_ mask bits on a per-VM basis would
> > incur significant complexity without providing meaningful benefit.  
> > 
> 
> Agreed on this.
> 
> But we are not checking any of those in kvm_mmu_set_mmio_spte_mask(), right? :)

No, but we really should.

> Also Isaku's patch extends kvm_mmu_set_mmio_spte_mask() to take 'kvm' or 'vcpu'
> as parameter so it's easy to check there -- not 100% sure about other places,
> though.
> 
> > As a result,
> > it's highly unlikely that we'll ever want to opportunistically "reclaim" bit 63
> > for MMIO SPTEs, so there's practically zero cost if it's included in the mask for
> > non-TDX EPT.
> 
> Sorry, I don't understand this.  If we will never "reclaim" bit 63 for MMIO
> SPTEs (for non-TDX EPT), then why bother including it in the mask?

Because then we don't need to track a per-VM mask.


* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-02 21:15                 ` Sean Christopherson
@ 2022-08-02 22:19                   ` Kai Huang
  2022-08-02 23:05                     ` Sean Christopherson
  0 siblings, 1 reply; 23+ messages in thread
From: Kai Huang @ 2022-08-02 22:19 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Tue, 2022-08-02 at 21:15 +0000, Sean Christopherson wrote:
> On Tue, Aug 02, 2022, Kai Huang wrote:
> > On Mon, 2022-08-01 at 23:20 +0000, Sean Christopherson wrote:
> > > On Tue, Aug 02, 2022, Kai Huang wrote:
> > > > On Mon, 2022-08-01 at 14:15 +0000, Sean Christopherson wrote:
> > > > > Another thing to note is that only the value needs to be per-VM, the mask can be
> > > > > KVM-wide, i.e. "mask = SUPPRESS_VE | RWX" will work for TDX and non-TDX VMs when
> > > > > EPT is enabled.
> > > > 
> > > > Yeah, but isn't it more like VMX and TDX both *happen* to have the same mask?
> > > > Theoretically, VMX only needs RWX to trigger an EPT misconfiguration and
> > > > doesn't need SUPPRESS_VE.
> > > 
> > > Right, SUPPRESS_VE isn't strictly necessary, but KVM already deliberately avoids
> > > bit 63 because it has meaning, e.g. SUPPRESS_VE for EPT and NX for PAE and 64-bit
> > > paging.  
> > > 
> > > > I don't see making the mask/value both per-VM as a big issue?
> > > 
> > > Yes and no.
> > > 
> > > No, in the sense that it's not a big issue in terms of code.  
> > > 
> > > Yes, because of the connotations of having a per-VM mask.  While having SUPPRESS_VE
> > > in the mask for non-TDX EPT isn't strictly necessary, it's also not strictly necessary
> > > to _not_ have it in the mask.  
> > > 
> > 
> > I think the 'mask' itself is ambiguous, i.e. it doesn't say under what
> > circumstances a bit should be included in the mask.  My understanding is that
> > any bit in the 'mask' should at least be related to the 'value' that can
> > enable MMIO caching.
> 
> The purpose of the mask isn't ambiguous, though it's definitely not well documented.
> The mask defines what bits should be included in the check to identify an MMIO SPTE.

Yes this is true.

>  
> > So if the SUPPRESS_VE bit is not relevant to non-TDX EPT (as we want an EPT
> > misconfiguration, not an EPT violation), I don't see why we need to include it
> > in the 'mask'.
> 
> Again, it's not strictly necessary, but by doing so we don't need a per-VM mask.
> And KVM should also never set SUPPRESS_VE for MMIO SPTEs, i.e. checking that bit
> by including it in the mask adds a sanity check (albeit a minuscule one).

OK.

> 
> > > In other words, having a per-VM mask incorrectly implies that TDX _must_
> > > have a different mask.
> > 
> > I interpret it as TDX _can_, but not _must_.
> 
> Right, but if we write the KVM code such that it doesn't have a different mask,
> then even that "can" is wrong/misleading.
> 
> > > It's also one more piece of information that developers have to track down and
> > > account for, i.e. one more thing we can screw up.
> > > 
> > > The other aspect of MMIO SPTEs are that the mask bits must not overlap the generation
> > > bits or shadow-present bit, and changing any of those bits requires careful
> > > consideration, i.e. defining the set of _allowed_ mask bits on a per-VM basis would
> > > incur significant complexity without providing meaningful benefit.  
> > > 
> > 
> > Agreed on this.
> > 
> > But we are not checking any of those in kvm_mmu_set_mmio_spte_mask(), right? :)
> 
> No, but we really should.

I can come up with a patch if you are not planning to do so?

> 
> > Also, Isaku's patch extends kvm_mmu_set_mmio_spte_mask() to take 'kvm' or 'vcpu'
> > as a parameter, so it's easy to check there -- not 100% sure about other places,
> > though.
> > 
> > > As a result,
> > > it's highly unlikely that we'll ever want to opportunistically "reclaim" bit 63
> > > for MMIO SPTEs, so there's practically zero cost if it's included in the mask for
> > > non-TDX EPT.
> > 
> > Sorry, I don't understand this.  If we will never "reclaim" bit 63 for MMIO SPTEs
> > (for non-TDX EPT), then why bother including it in the mask?
> 
> Because then we don't need to track a per-VM mask.

OK.

-- 
Thanks,
-Kai



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-02 22:19                   ` Kai Huang
@ 2022-08-02 23:05                     ` Sean Christopherson
  2022-08-02 23:42                       ` Kai Huang
  0 siblings, 1 reply; 23+ messages in thread
From: Sean Christopherson @ 2022-08-02 23:05 UTC (permalink / raw)
  To: Kai Huang; +Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Wed, Aug 03, 2022, Kai Huang wrote:
> On Tue, 2022-08-02 at 21:15 +0000, Sean Christopherson wrote:
> > On Tue, Aug 02, 2022, Kai Huang wrote:
> > > But we are not checking any of those in kvm_mmu_set_mmio_spte_mask(), right? :)
> > 
> > No, but we really should.
> 
> I can come up with a patch if you are not planning to do so?

Hmm, I'll throw one together, it's simple enough (famous last words) and it'll
need to be tested on AMD as well.
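FWIW, the shape of the check I have in mind is roughly the following.  This is
a standalone sketch with placeholder constants, not the actual KVM macros; the
real patch would need to use KVM's generation-bit and MMU-present definitions
from spte.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder layouts; KVM's real generation and present bits differ. */
#define MMIO_SPTE_GEN_MASK   0x3ffff0ULL   /* bits reserved for the MMIO generation */
#define SPTE_MMU_PRESENT_BIT (1ULL << 11)  /* software "shadow-present" bit */

/*
 * Returns true if a proposed MMIO mask/value pair is sane:
 *  - the value must be a subset of the mask (bits outside the mask are
 *    ignored by the MMIO-SPTE compare, so a stray value bit is a bug),
 *  - the mask must not overlap the generation bits,
 *  - the value must not make an MMIO SPTE look shadow-present.
 */
static bool mmio_spte_mask_ok(uint64_t mask, uint64_t value)
{
	if (value & ~mask)
		return false;
	if (mask & MMIO_SPTE_GEN_MASK)
		return false;
	if (value & SPTE_MMU_PRESENT_BIT)
		return false;
	return true;
}
```

In KVM proper this would presumably be a WARN_ON_ONCE() in
kvm_mmu_set_mmio_spte_mask() that bails (disables MMIO caching) instead of
returning a bool, but the conditions are the same.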


* Re: [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change
  2022-08-02 23:05                     ` Sean Christopherson
@ 2022-08-02 23:42                       ` Kai Huang
  0 siblings, 0 replies; 23+ messages in thread
From: Kai Huang @ 2022-08-02 23:42 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, kvm, linux-kernel, Michael Roth, Tom Lendacky

On Tue, 2022-08-02 at 23:05 +0000, Sean Christopherson wrote:
> On Wed, Aug 03, 2022, Kai Huang wrote:
> > On Tue, 2022-08-02 at 21:15 +0000, Sean Christopherson wrote:
> > > On Tue, Aug 02, 2022, Kai Huang wrote:
> > > > But we are not checking any of those in kvm_mmu_set_mmio_spte_mask(), right? :)
> > > 
> > > No, but we really should.
> > 
> > I can come up with a patch if you are not planning to do so?
> 
> Hmm, I'll throw one together, it's simple enough (famous last words) and it'll
> need to be tested on AMD as well.

Sure.

-- 
Thanks,
-Kai




end of thread, other threads:[~2022-08-02 23:43 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-28 22:17 [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Sean Christopherson
2022-07-28 22:17 ` [PATCH 1/4] KVM: x86: Tag kvm_mmu_x86_module_init() with __init Sean Christopherson
2022-07-29  2:14   ` Kai Huang
2022-07-28 22:17 ` [PATCH 2/4] KVM: x86/mmu: Fully re-evaluate MMIO caching when SPTE masks change Sean Christopherson
2022-07-29  2:39   ` Kai Huang
2022-07-29 15:07     ` Sean Christopherson
2022-08-01  9:24       ` Kai Huang
2022-08-01 14:15         ` Sean Christopherson
2022-08-01 20:46           ` Kai Huang
2022-08-01 23:20             ` Sean Christopherson
2022-08-02  0:05               ` Kai Huang
2022-08-02 21:15                 ` Sean Christopherson
2022-08-02 22:19                   ` Kai Huang
2022-08-02 23:05                     ` Sean Christopherson
2022-08-02 23:42                       ` Kai Huang
2022-07-28 22:17 ` [PATCH 3/4] KVM: SVM: Adjust MMIO masks (for caching) before doing SEV(-ES) setup Sean Christopherson
2022-07-29  2:06   ` Kai Huang
2022-07-29 18:15     ` Sean Christopherson
2022-07-28 22:17 ` [PATCH 4/4] KVM: SVM: Disable SEV-ES support if MMIO caching is disable Sean Christopherson
2022-07-29  2:12   ` Kai Huang
2022-07-29 15:21     ` Sean Christopherson
2022-08-01  9:30       ` Kai Huang
2022-07-29  1:09 ` [PATCH 0/4] KVM: x86/mmu: MMIO caching bug fixes Michael Roth
