All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap
@ 2022-12-09  4:38 Alexey Kardashevskiy
  2022-12-09  4:38 ` [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables Alexey Kardashevskiy
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-09  4:38 UTC (permalink / raw)
  To: kvm
  Cc: x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Borislav Petkov,
	Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin,
	Adrian Hunter, Jason A. Donenfeld, H. Peter Anvin,
	Alexey Kardashevskiy

This is to use another AMD SEV-ES hardware assisted register swap,
more detail in 2/3. The patches are fairly independend but required in
this order.

This is based on sha1 0d1409e4ff08 and requires something like this
for X86_FEATURE_NO_NESTED_DATA_BP:
https://lkml.org/lkml/2022/11/29/1229

This patchset is pushed out at
https://github.com/aik/linux/commits/debugswap


Please comment. Thanks.

btw the enormous cc: list came from get_maintainer.pl, keep it like
this or tell the script something to reduce the list?


Alexey Kardashevskiy (3):
  x86/amd: Cache values in percpu variables
  KVM: SEV: Enable data breakpoints in SEV-ES
  x86/sev: Do not handle #VC for DR7 read/write

 arch/x86/include/asm/debugreg.h        |  9 +++-
 arch/x86/include/asm/msr-index.h       |  1 +
 arch/x86/include/asm/svm.h             |  1 +
 arch/x86/kvm/svm/svm.h                 | 16 +++++--
 tools/arch/x86/include/asm/msr-index.h |  1 +
 arch/x86/kernel/cpu/amd.c              | 45 ++++++++++++++------
 arch/x86/kernel/sev.c                  |  6 +++
 arch/x86/kvm/svm/sev.c                 | 29 +++++++++++++
 arch/x86/kvm/svm/svm.c                 |  3 +-
 9 files changed, 90 insertions(+), 21 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables
  2022-12-09  4:38 [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
@ 2022-12-09  4:38 ` Alexey Kardashevskiy
  2023-01-10 16:05   ` Borislav Petkov
  2022-12-09  4:38 ` [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-09  4:38 UTC (permalink / raw)
  To: kvm
  Cc: x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Borislav Petkov,
	Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin,
	Adrian Hunter, Jason A. Donenfeld, H. Peter Anvin,
	Alexey Kardashevskiy

Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
be noticeable with the AMD KVM SEV-ES DebugSwap feature enabled.
KVM is going to store host's DR[0-3] and DR[0-3]_ADDR_MASK before
switching to a guest; the hardware is going to swap these on VMRUN
and VMEXIT.

Store MSR values passsed to set_dr_addr_mask() in percpu values
(when changed) and return them via new amd_get_dr_addr_mask().
The gain here is about 10x.

As amd_set_dr_addr_mask() uses the array too, change the @dr type to
unsigned to avoid checking for <0.

While at it, replace deprecated boot_cpu_has() with cpu_feature_enabled()
in set_dr_addr_mask().

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
Changes:
v2:
* reworked to use arrays
* set() skips wrmsr() when the mask is not changed
* added stub for get_dr_addr_mask()
* changed @dr type to unsigned
* s/boot_cpu_has/cpu_feature_enabled/
* added amd_ prefix
---
 arch/x86/include/asm/debugreg.h |  9 +++-
 arch/x86/kernel/cpu/amd.c       | 45 ++++++++++++++------
 2 files changed, 38 insertions(+), 16 deletions(-)

diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index cfdf307ddc01..59f97ba25d5f 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -126,9 +126,14 @@ static __always_inline void local_db_restore(unsigned long dr7)
 }
 
 #ifdef CONFIG_CPU_SUP_AMD
-extern void set_dr_addr_mask(unsigned long mask, int dr);
+extern void set_dr_addr_mask(unsigned long mask, unsigned int dr);
+extern unsigned long amd_get_dr_addr_mask(unsigned int dr);
 #else
-static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
+static inline void set_dr_addr_mask(unsigned long mask, unsigned int dr) { }
+static inline unsigned long amd_get_dr_addr_mask(unsigned int dr)
+{
+	return 0;
+}
 #endif
 
 #endif /* _ASM_X86_DEBUGREG_H */
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c75d75b9f11a..9ac5a19f89b9 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1158,24 +1158,41 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
 	return false;
 }
 
-void set_dr_addr_mask(unsigned long mask, int dr)
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long[4], amd_dr_addr_mask);
+
+static unsigned int amd_msr_dr_addr_masks[] = {
+	MSR_F16H_DR0_ADDR_MASK,
+	MSR_F16H_DR1_ADDR_MASK - 1 + 1,
+	MSR_F16H_DR1_ADDR_MASK - 1 + 2,
+	MSR_F16H_DR1_ADDR_MASK - 1 + 3
+};
+
+void set_dr_addr_mask(unsigned long mask, unsigned int dr)
 {
-	if (!boot_cpu_has(X86_FEATURE_BPEXT))
+	if (!cpu_feature_enabled(X86_FEATURE_BPEXT))
 		return;
 
-	switch (dr) {
-	case 0:
-		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
-		break;
-	case 1:
-	case 2:
-	case 3:
-		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
-		break;
-	default:
-		break;
-	}
+	if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks)))
+		return;
+
+	if (per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] == mask)
+		return;
+
+	wrmsr(amd_msr_dr_addr_masks[dr], mask, 0);
+	per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] = mask;
+}
+
+unsigned long amd_get_dr_addr_mask(unsigned int dr)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_BPEXT))
+		return 0;
+
+	if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks)))
+		return 0;
+
+	return per_cpu(amd_dr_addr_mask[dr], smp_processor_id());
 }
+EXPORT_SYMBOL_GPL(amd_get_dr_addr_mask);
 
 u32 amd_get_highest_perf(void)
 {
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES
  2022-12-09  4:38 [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
  2022-12-09  4:38 ` [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables Alexey Kardashevskiy
@ 2022-12-09  4:38 ` Alexey Kardashevskiy
  2023-01-10 18:00   ` Borislav Petkov
  2022-12-09  4:38 ` [PATCH kernel v2 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-09  4:38 UTC (permalink / raw)
  To: kvm
  Cc: x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Borislav Petkov,
	Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin,
	Adrian Hunter, Jason A. Donenfeld, H. Peter Anvin,
	Alexey Kardashevskiy

AMD Milan (Fam 19h) introduces support for the swapping, as type 'B',
of DR[0-3] and DR[0-3]_ADDR_MASK registers. Software enables this by
setting SEV_FEATURES[5] (called "DebugSwap") in the VMSA which makes
data breakpoints work in SEV-ES VMs.
For type 'B' swaps the hardware saves/restores the VM state on
VMEXIT/VMRUN in VMSA, and restores the host state on VMEXIT.

Enable DebugSwap in VMSA but only if CPUID Fn80000021_EAX[0]
("NoNestedDataBp", "Processor ignores nested data breakpoints") is
supported by the SOC as otherwise a malicious guest can cause
the infinite #DB loop DoS.

Save DR[0-3] / DR[0-3]_ADDR_MASK in the host save area before VMRUN
as type 'B' swap does not do this part.

Eliminate DR7 and #DB intercepts as:
- they are not needed when DebugSwap is supported;
- #VC for these intercepts is most likely not supported anyway and
kills the VM.
Keep DR7 intercepted unless DebugSwap enabled to prevent
the infinite #DB loop DoS.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
Changes:
v2:
* debug_swap moved from vcpu to module_param
* rewrote commit log

---

"DR7 access must remain intercepted for an SEV-ES guest" - I could not
figure out the exact reasoning why it is there in the first place,
IIUC this is to prevent loop of #DBs in the VM.

---
Tested with:
===
int x;
int main(int argc, char *argv[])
{
        x = 1;
        return 0;
}
===
gcc -g a.c
rsync a.out ruby-954vm:~/
ssh -t ruby-954vm 'gdb -ex "file a.out" -ex "watch x" -ex r'

where ruby-954vm is a VM.

With "/sys/module/kvm_amd/parameters/debug_swap = 0", gdb does not stop
on the watchpoint, with "= 1" - gdb does.
---
 arch/x86/include/asm/svm.h |  1 +
 arch/x86/kvm/svm/svm.h     | 16 ++++++++---
 arch/x86/kvm/svm/sev.c     | 29 ++++++++++++++++++++
 arch/x86/kvm/svm/svm.c     |  3 +-
 4 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 0361626841bc..373a0edda588 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -273,6 +273,7 @@ enum avic_ipi_failure_cause {
 #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
 #define VMCB_AVIC_APIC_BAR_MASK		0xFFFFFFFFFF000ULL
 
+#define SVM_SEV_FEAT_DEBUG_SWAP                        BIT(5)
 
 struct vmcb_seg {
 	u16 selector;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 199a2ecef1ce..0fae611abe4a 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -386,6 +386,8 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3
 	return test_bit(bit, (unsigned long *)&control->intercepts);
 }
 
+extern bool sev_es_is_debug_swap_enabled(void);
+
 static inline void set_dr_intercepts(struct vcpu_svm *svm)
 {
 	struct vmcb *vmcb = svm->vmcb01.ptr;
@@ -407,8 +409,10 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm)
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
 	}
 
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+	if (!sev_es_guest(svm->vcpu.kvm) || !sev_es_is_debug_swap_enabled()) {
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+	}
 
 	recalc_intercepts(svm);
 }
@@ -419,8 +423,12 @@ static inline void clr_dr_intercepts(struct vcpu_svm *svm)
 
 	vmcb->control.intercepts[INTERCEPT_DR] = 0;
 
-	/* DR7 access must remain intercepted for an SEV-ES guest */
-	if (sev_es_guest(svm->vcpu.kvm)) {
+	/*
+	 * DR7 access must remain intercepted for an SEV-ES guest unless DebugSwap
+	 * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM writing to DR7
+	 * from the #DB handler may trigger infinite loop of #DB's.
+	 */
+	if (sev_es_guest(svm->vcpu.kvm) && !sev_es_is_debug_swap_enabled()) {
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
 	}
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index efaaef2b7ae1..800ea2a778cc 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -21,6 +21,7 @@
 #include <asm/pkru.h>
 #include <asm/trapnr.h>
 #include <asm/fpu/xcr.h>
+#include <asm/debugreg.h>
 
 #include "mmu.h"
 #include "x86.h"
@@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444);
 /* enable/disable SEV-ES support */
 static bool sev_es_enabled = true;
 module_param_named(sev_es, sev_es_enabled, bool, 0444);
+
+/* enable/disable SEV-ES DebugSwap support */
+static bool sev_es_debug_swap_enabled = true;
+module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644);
 #else
 #define sev_enabled false
 #define sev_es_enabled false
+#define sev_es_debug_swap false
 #endif /* CONFIG_KVM_AMD_SEV */
 
+bool sev_es_is_debug_swap_enabled(void)
+{
+	return sev_es_debug_swap_enabled;
+}
+
 static u8 sev_enc_bit;
 static DECLARE_RWSEM(sev_deactivate_lock);
 static DEFINE_MUTEX(sev_bitmap_lock);
@@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	save->xss  = svm->vcpu.arch.ia32_xss;
 	save->dr6  = svm->vcpu.arch.dr6;
 
+	if (sev_es_is_debug_swap_enabled())
+		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
+
 	pr_debug("Virtual Machine Save Area (VMSA):\n");
 	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
 
@@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void)
 out:
 	sev_enabled = sev_supported;
 	sev_es_enabled = sev_es_supported;
+	if (sev_es_debug_swap_enabled)
+		sev_es_debug_swap_enabled = sev_es_enabled &&
+			boot_cpu_has(X86_FEATURE_NO_NESTED_DATA_BP);
 #endif
 }
 
@@ -3027,6 +3044,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
 
 	/* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */
 	hostsa->xss = host_xss;
+
+	/* The DebugSwap SEV feature does Type B swaps of DR[0-3] */
+	if (sev_es_is_debug_swap_enabled()) {
+		hostsa->dr0 = native_get_debugreg(0);
+		hostsa->dr1 = native_get_debugreg(1);
+		hostsa->dr2 = native_get_debugreg(2);
+		hostsa->dr3 = native_get_debugreg(3);
+		hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0);
+		hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1);
+		hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
+		hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3);
+	}
 }
 
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ce362e88a567..697804d46545 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1189,7 +1189,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	set_exception_intercept(svm, UD_VECTOR);
 	set_exception_intercept(svm, MC_VECTOR);
 	set_exception_intercept(svm, AC_VECTOR);
-	set_exception_intercept(svm, DB_VECTOR);
+	if (!sev_es_is_debug_swap_enabled())
+		set_exception_intercept(svm, DB_VECTOR);
 	/*
 	 * Guest access to VMware backdoor ports could legitimately
 	 * trigger #GP because of TSS I/O permission bitmap.
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH kernel v2 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-09  4:38 [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
  2022-12-09  4:38 ` [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables Alexey Kardashevskiy
  2022-12-09  4:38 ` [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy
@ 2022-12-09  4:38 ` Alexey Kardashevskiy
  2023-01-10 20:04   ` Borislav Petkov
  2023-01-09  5:20 ` [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
  2023-01-10 17:41 ` Borislav Petkov
  4 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-09  4:38 UTC (permalink / raw)
  To: kvm
  Cc: x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Borislav Petkov,
	Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin,
	Adrian Hunter, Jason A. Donenfeld, H. Peter Anvin,
	Alexey Kardashevskiy

With MSR_AMD64_SEV_DEBUG_SWAP enabled, the VM should not get #VC
events for DR7 read/write which it rather avoided.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
Changes:
v2:
* use new bit definition
---
 arch/x86/include/asm/msr-index.h       | 1 +
 tools/arch/x86/include/asm/msr-index.h | 1 +
 arch/x86/kernel/sev.c                  | 6 ++++++
 3 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4a2af82553e4..979ea2dd3845 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -570,6 +570,7 @@
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
 #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
+#define MSR_AMD64_SEV_DEBUG_SWAP	BIT_ULL(7)
 
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h
index f17ade084720..2264ada2e26a 100644
--- a/tools/arch/x86/include/asm/msr-index.h
+++ b/tools/arch/x86/include/asm/msr-index.h
@@ -570,6 +570,7 @@
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
 #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
+#define MSR_AMD64_SEV_DEBUG_SWAP	BIT_ULL(7)
 
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a428c62330d3..6141c789e3d5 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1618,6 +1618,9 @@ static enum es_result vc_handle_dr7_write(struct ghcb *ghcb,
 	long val, *reg = vc_insn_get_rm(ctxt);
 	enum es_result ret;
 
+	if (sev_status & MSR_AMD64_SEV_DEBUG_SWAP)
+		return ES_VMM_ERROR;
+
 	if (!reg)
 		return ES_DECODE_FAILED;
 
@@ -1655,6 +1658,9 @@ static enum es_result vc_handle_dr7_read(struct ghcb *ghcb,
 	struct sev_es_runtime_data *data = this_cpu_read(runtime_data);
 	long *reg = vc_insn_get_rm(ctxt);
 
+	if (sev_status & MSR_AMD64_SEV_DEBUG_SWAP)
+		return ES_VMM_ERROR;
+
 	if (!reg)
 		return ES_DECODE_FAILED;
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap
  2022-12-09  4:38 [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
                   ` (2 preceding siblings ...)
  2022-12-09  4:38 ` [PATCH kernel v2 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
@ 2023-01-09  5:20 ` Alexey Kardashevskiy
  2023-01-10 17:41 ` Borislav Petkov
  4 siblings, 0 replies; 15+ messages in thread
From: Alexey Kardashevskiy @ 2023-01-09  5:20 UTC (permalink / raw)
  To: kvm
  Cc: x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Borislav Petkov,
	Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin,
	Adrian Hunter, Jason A. Donenfeld, H. Peter Anvin



On 9/12/22 15:38, Alexey Kardashevskiy wrote:
> This is to use another AMD SEV-ES hardware assisted register swap,
> more detail in 2/3. The patches are fairly independend but required in
> this order.
> 
> This is based on sha1 0d1409e4ff08 and requires something like this
> for X86_FEATURE_NO_NESTED_DATA_BP:
> https://lkml.org/lkml/2022/11/29/1229
> 
> This patchset is pushed out at
> https://github.com/aik/linux/commits/debugswap
> 
> 
> Please comment. Thanks.

Ping? Thanks,


> 
> btw the enormous cc: list came from get_maintainer.pl, keep it like
> this or tell the script something to reduce the list?
> 
> 
> Alexey Kardashevskiy (3):
>    x86/amd: Cache values in percpu variables
>    KVM: SEV: Enable data breakpoints in SEV-ES
>    x86/sev: Do not handle #VC for DR7 read/write
> 
>   arch/x86/include/asm/debugreg.h        |  9 +++-
>   arch/x86/include/asm/msr-index.h       |  1 +
>   arch/x86/include/asm/svm.h             |  1 +
>   arch/x86/kvm/svm/svm.h                 | 16 +++++--
>   tools/arch/x86/include/asm/msr-index.h |  1 +
>   arch/x86/kernel/cpu/amd.c              | 45 ++++++++++++++------
>   arch/x86/kernel/sev.c                  |  6 +++
>   arch/x86/kvm/svm/sev.c                 | 29 +++++++++++++
>   arch/x86/kvm/svm/svm.c                 |  3 +-
>   9 files changed, 90 insertions(+), 21 deletions(-)
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables
  2022-12-09  4:38 ` [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables Alexey Kardashevskiy
@ 2023-01-10 16:05   ` Borislav Petkov
  2023-01-12  5:21     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2023-01-10 16:05 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On Fri, Dec 09, 2022 at 03:38:02PM +1100, Alexey Kardashevskiy wrote:

Make that Subject:

"x86/amd: Cache debug register values in percpu variables"

to make it less generic and more specific as to what you're doing.

> Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
> be noticeable with the AMD KVM SEV-ES DebugSwap feature enabled.
> KVM is going to store host's DR[0-3] and DR[0-3]_ADDR_MASK before
> switching to a guest; the hardware is going to swap these on VMRUN
> and VMEXIT.
> 
> Store MSR values passsed to set_dr_addr_mask() in percpu values

s/values/variables/

Unknown word [passsed] in commit message.

Use a spellchecker pls.

> (when changed) and return them via new amd_get_dr_addr_mask().
> The gain here is about 10x.

10x when reading percpu vars vs MSR reads?

Oh well.

> As amd_set_dr_addr_mask() uses the array too, change the @dr type to
> unsigned to avoid checking for <0.

I feel ya but that function will warn once, return 0 when the @dr number is
outta bounds and that 0 will still get used as an address mask.

I think you really wanna return negative on error and the caller should not
continue in that case.

> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index c75d75b9f11a..9ac5a19f89b9 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -1158,24 +1158,41 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>  	return false;
>  }
>  
> -void set_dr_addr_mask(unsigned long mask, int dr)
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long[4], amd_dr_addr_mask);

static

> +
> +static unsigned int amd_msr_dr_addr_masks[] = {
> +	MSR_F16H_DR0_ADDR_MASK,
> +	MSR_F16H_DR1_ADDR_MASK - 1 + 1,

- 1 + 1 ?

Why?

Because of the DR0 and then DR1 being in a different MSR range?

Who cares, make it simple:

	MSR_F16H_DR0_ADDR_MASK,
	MSR_F16H_DR1_ADDR_MASK,
	MSR_F16H_DR1_ADDR_MASK + 1,
	MSR_F16H_DR1_ADDR_MASK + 2

and add a comment if you want to denote the non-contiguous range but meh.

> +	MSR_F16H_DR1_ADDR_MASK - 1 + 2,
> +	MSR_F16H_DR1_ADDR_MASK - 1 + 3
> +};
> +
> +void set_dr_addr_mask(unsigned long mask, unsigned int dr)
>  {
> -	if (!boot_cpu_has(X86_FEATURE_BPEXT))
> +	if (!cpu_feature_enabled(X86_FEATURE_BPEXT))
>  		return;
>  
> -	switch (dr) {
> -	case 0:
> -		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
> -		break;
> -	case 1:
> -	case 2:
> -	case 3:
> -		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
> -		break;
> -	default:
> -		break;
> -	}
> +	if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks)))
> +		return;
> +
> +	if (per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] == mask)

Do that at function entry:

	int cpu = smp_processor_id();

and use cpu here.

> +		return;
> +
> +	wrmsr(amd_msr_dr_addr_masks[dr], mask, 0);
> +	per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] = mask;
> +}

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap
  2022-12-09  4:38 [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
                   ` (3 preceding siblings ...)
  2023-01-09  5:20 ` [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
@ 2023-01-10 17:41 ` Borislav Petkov
  4 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2023-01-10 17:41 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On Fri, Dec 09, 2022 at 03:38:01PM +1100, Alexey Kardashevskiy wrote:
> This is to use another AMD SEV-ES hardware assisted register swap,
> more detail in 2/3. The patches are fairly independend but required in
> this order.
> 
> This is based on sha1 0d1409e4ff08 and requires something like this
> for X86_FEATURE_NO_NESTED_DATA_BP:
> https://lkml.org/lkml/2022/11/29/1229

Please use

lore.kernel.org/r/<Message-ID>

when you want to refer to a mail on a public ML.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES
  2022-12-09  4:38 ` [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy
@ 2023-01-10 18:00   ` Borislav Petkov
  2023-01-10 19:06     ` Tom Lendacky
  2023-01-12  5:45     ` Alexey Kardashevskiy
  0 siblings, 2 replies; 15+ messages in thread
From: Borislav Petkov @ 2023-01-10 18:00 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Tom Lendacky
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On Fri, Dec 09, 2022 at 03:38:03PM +1100, Alexey Kardashevskiy wrote:
> AMD Milan (Fam 19h) introduces support for the swapping, as type 'B',

"type B" means nothing to people who don't have an intimate APM knowledge.

Let's try again, this time with a more accessible formulation:

"The debug registers are handled a bit differently when doing a world switch of a
SEV-ES guest: the guest debug registers values are saved and restored as usual
and as one would expect.

The *host* debug registers are not saved to the host save area so if the
host is doing any debug activity, that host should take care to stash its debug
registers values into the host save area before running guests.

See Table B-3. Swap Types and the AMD APM volume 2."

And now you can go into detail explaining which regs exactly and so on.

> of DR[0-3] and DR[0-3]_ADDR_MASK registers. Software enables this by
> setting SEV_FEATURES[5] (called "DebugSwap") in the VMSA which makes
> data breakpoints work in SEV-ES VMs.
>
> For type 'B' swaps the hardware saves/restores the VM state on
> VMEXIT/VMRUN in VMSA, and restores the host state on VMEXIT.

Yeah, close but I'd prefer a more detailed explanation and a reference to the
APM so that people can follow and read more info if needed.
> 
> Enable DebugSwap in VMSA but only if CPUID Fn80000021_EAX[0]
> ("NoNestedDataBp", "Processor ignores nested data breakpoints") is
> supported by the SOC as otherwise a malicious guest can cause
> the infinite #DB loop DoS.
> 
> Save DR[0-3] / DR[0-3]_ADDR_MASK in the host save area before VMRUN
> as type 'B' swap does not do this part.
> 
> Eliminate DR7 and #DB intercepts as:
> - they are not needed when DebugSwap is supported;
> - #VC for these intercepts is most likely not supported anyway and
> kills the VM.
> Keep DR7 intercepted unless DebugSwap enabled to prevent
> the infinite #DB loop DoS.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
> Changes:
> v2:
> * debug_swap moved from vcpu to module_param
> * rewrote commit log
> 
> ---
> 
> "DR7 access must remain intercepted for an SEV-ES guest" - I could not
> figure out the exact reasoning why it is there in the first place,
> IIUC this is to prevent loop of #DBs in the VM.

Let's ask Mr. Lendacky:

8d4846b9b150 ("KVM: SVM: Prevent debugging under SEV-ES")

> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index efaaef2b7ae1..800ea2a778cc 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -21,6 +21,7 @@
>  #include <asm/pkru.h>
>  #include <asm/trapnr.h>
>  #include <asm/fpu/xcr.h>
> +#include <asm/debugreg.h>
>  
>  #include "mmu.h"
>  #include "x86.h"
> @@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444);
>  /* enable/disable SEV-ES support */
>  static bool sev_es_enabled = true;
>  module_param_named(sev_es, sev_es_enabled, bool, 0444);
> +
> +/* enable/disable SEV-ES DebugSwap support */
> +static bool sev_es_debug_swap_enabled = true;
> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644);
>  #else
>  #define sev_enabled false
>  #define sev_es_enabled false
> +#define sev_es_debug_swap false
>  #endif /* CONFIG_KVM_AMD_SEV */
>  
> +bool sev_es_is_debug_swap_enabled(void)
> +{
> +	return sev_es_debug_swap_enabled;
> +}
> +
>  static u8 sev_enc_bit;
>  static DECLARE_RWSEM(sev_deactivate_lock);
>  static DEFINE_MUTEX(sev_bitmap_lock);
> @@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>  	save->xss  = svm->vcpu.arch.ia32_xss;
>  	save->dr6  = svm->vcpu.arch.dr6;
>  
> +	if (sev_es_is_debug_swap_enabled())
> +		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
> +
>  	pr_debug("Virtual Machine Save Area (VMSA):\n");
>  	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
>  
> @@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void)
>  out:
>  	sev_enabled = sev_supported;
>  	sev_es_enabled = sev_es_supported;
> +	if (sev_es_debug_swap_enabled)
> +		sev_es_debug_swap_enabled = sev_es_enabled &&
> +			boot_cpu_has(X86_FEATURE_NO_NESTED_DATA_BP);

check_for_deprecated_apis: WARNING: arch/x86/kvm/svm/sev.c:2268: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES
  2023-01-10 18:00   ` Borislav Petkov
@ 2023-01-10 19:06     ` Tom Lendacky
  2023-01-12  5:45     ` Alexey Kardashevskiy
  1 sibling, 0 replies; 15+ messages in thread
From: Tom Lendacky @ 2023-01-10 19:06 UTC (permalink / raw)
  To: Borislav Petkov, Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On 1/10/23 12:00, Borislav Petkov wrote:
> On Fri, Dec 09, 2022 at 03:38:03PM +1100, Alexey Kardashevskiy wrote:
>>

>> "DR7 access must remain intercepted for an SEV-ES guest" - I could not
>> figure out the exact reasoning why it is there in the first place,
>> IIUC this is to prevent loop of #DBs in the VM.
> 
> Let's ask Mr. Lendacky:
> 
> 8d4846b9b150 ("KVM: SVM: Prevent debugging under SEV-ES")

The DR7 requirements were to prevent a malicious SEV-ES guest from setting 
up data breakpoints on the #VC IDT entry/stack and causing an infinite loop.

Thanks,
Tom

> 
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index efaaef2b7ae1..800ea2a778cc 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -21,6 +21,7 @@
>>   #include <asm/pkru.h>
>>   #include <asm/trapnr.h>
>>   #include <asm/fpu/xcr.h>
>> +#include <asm/debugreg.h>
>>   
>>   #include "mmu.h"
>>   #include "x86.h"
>> @@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444);
>>   /* enable/disable SEV-ES support */
>>   static bool sev_es_enabled = true;
>>   module_param_named(sev_es, sev_es_enabled, bool, 0444);
>> +
>> +/* enable/disable SEV-ES DebugSwap support */
>> +static bool sev_es_debug_swap_enabled = true;
>> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644);
>>   #else
>>   #define sev_enabled false
>>   #define sev_es_enabled false
>> +#define sev_es_debug_swap false
>>   #endif /* CONFIG_KVM_AMD_SEV */
>>   
>> +bool sev_es_is_debug_swap_enabled(void)
>> +{
>> +	return sev_es_debug_swap_enabled;
>> +}
>> +
>>   static u8 sev_enc_bit;
>>   static DECLARE_RWSEM(sev_deactivate_lock);
>>   static DEFINE_MUTEX(sev_bitmap_lock);
>> @@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>>   	save->xss  = svm->vcpu.arch.ia32_xss;
>>   	save->dr6  = svm->vcpu.arch.dr6;
>>   
>> +	if (sev_es_is_debug_swap_enabled())
>> +		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
>> +
>>   	pr_debug("Virtual Machine Save Area (VMSA):\n");
>>   	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
>>   
>> @@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void)
>>   out:
>>   	sev_enabled = sev_supported;
>>   	sev_es_enabled = sev_es_supported;
>> +	if (sev_es_debug_swap_enabled)
>> +		sev_es_debug_swap_enabled = sev_es_enabled &&
>> +			boot_cpu_has(X86_FEATURE_NO_NESTED_DATA_BP);
> 
> check_for_deprecated_apis: WARNING: arch/x86/kvm/svm/sev.c:2268: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-09  4:38 ` [PATCH kernel v2 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
@ 2023-01-10 20:04   ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2023-01-10 20:04 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On Fri, Dec 09, 2022 at 03:38:04PM +1100, Alexey Kardashevskiy wrote:
> With MSR_AMD64_SEV_DEBUG_SWAP enabled, the VM should not get #VC
> events for DR7 read/write which it rather avoided.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
> Changes:
> v2:
> * use new bit definition
> ---
>  arch/x86/include/asm/msr-index.h       | 1 +
>  tools/arch/x86/include/asm/msr-index.h | 1 +
>  arch/x86/kernel/sev.c                  | 6 ++++++
>  3 files changed, 8 insertions(+)

Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables
  2023-01-10 16:05   ` Borislav Petkov
@ 2023-01-12  5:21     ` Alexey Kardashevskiy
  2023-01-12 11:12       ` Borislav Petkov
  0 siblings, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2023-01-12  5:21 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin



On 11/1/23 03:05, Borislav Petkov wrote:
> On Fri, Dec 09, 2022 at 03:38:02PM +1100, Alexey Kardashevskiy wrote:
> 
> Make that Subject:
> 
> "x86/amd: Cache debug register values in percpu variables"
> 
> to make it less generic and more specific as to what you're doing.
> 
>> Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
>> be noticeable with the AMD KVM SEV-ES DebugSwap feature enabled.
>> KVM is going to store host's DR[0-3] and DR[0-3]_ADDR_MASK before
>> switching to a guest; the hardware is going to swap these on VMRUN
>> and VMEXIT.
>>
>> Store MSR values passsed to set_dr_addr_mask() in percpu values
> 
> s/values/variables/
> 
> Unknown word [passsed] in commit message.
> 
> Use a spellchecker pls.
> 
>> (when changed) and return them via new amd_get_dr_addr_mask().
>> The gain here is about 10x.
> 
> 10x when reading percpu vars vs MSR reads?
> 
> Oh well.
 >
>> As amd_set_dr_addr_mask() uses the array too, change the @dr type to
>> unsigned to avoid checking for <0.
> 
> I feel ya but that function will warn once, return 0 when the @dr number is
> outta bounds and that 0 will still get used as an address mask.

"that function" is set_dr_addr_mask() (btw should I rename it to start 
with amd_? the commit log uses the wrong&longer name) which does not 
return a mask, amd_get_dr_addr_mask() does.

> I think you really wanna return negative on error and the caller should not
> continue in that case.

If it is out of bounds, it won't neither set or get. And these 2 helpers 
are used only by the kernel and the callers pass 0..3 and nothing else. 
BUG_ON() would do too, for example.


>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index c75d75b9f11a..9ac5a19f89b9 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -1158,24 +1158,41 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>>   	return false;
>>   }
>>   
>> -void set_dr_addr_mask(unsigned long mask, int dr)
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long[4], amd_dr_addr_mask);
> 
> static
> 
>> +
>> +static unsigned int amd_msr_dr_addr_masks[] = {
>> +	MSR_F16H_DR0_ADDR_MASK,
>> +	MSR_F16H_DR1_ADDR_MASK - 1 + 1,
> 
> - 1 + 1 ?
> 
> Why?
> 
> Because of the DR0 and then DR1 being in a different MSR range?

Yup.

> 
> Who cares, make it simple:
> 
> 	MSR_F16H_DR0_ADDR_MASK,
> 	MSR_F16H_DR1_ADDR_MASK,
> 	MSR_F16H_DR1_ADDR_MASK + 1,
> 	MSR_F16H_DR1_ADDR_MASK + 2
> 
> and add a comment if you want to denote the non-contiguous range but meh.

imho having 1,2,3 in the code eliminates the need in a comment and 
produces the exact same end result. But since nobody cares, I'll do it 
the shorter way with just +1 and +2.


> >> +	MSR_F16H_DR1_ADDR_MASK - 1 + 2,
>> +	MSR_F16H_DR1_ADDR_MASK - 1 + 3
>> +};
>> +
>> +void set_dr_addr_mask(unsigned long mask, unsigned int dr)
>>   {
>> -	if (!boot_cpu_has(X86_FEATURE_BPEXT))
>> +	if (!cpu_feature_enabled(X86_FEATURE_BPEXT))
>>   		return;
>>   
>> -	switch (dr) {
>> -	case 0:
>> -		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
>> -		break;
>> -	case 1:
>> -	case 2:
>> -	case 3:
>> -		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
>> -		break;
>> -	default:
>> -		break;
>> -	}
>> +	if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks)))
>> +		return;
>> +
>> +	if (per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] == mask)
> 
> Do that at function entry:
> 
> 	int cpu = smp_processor_id();
> 
> and use cpu here.

Sure. Out of curiosity - why?^w^w^w^w^  it reduced the vmlinux size by 
48 bytes, nice.

Thanks for the review!


> 
>> +		return;
>> +
>> +	wrmsr(amd_msr_dr_addr_masks[dr], mask, 0);
>> +	per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] = mask;
>> +}
> 
> Thx.
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES
  2023-01-10 18:00   ` Borislav Petkov
  2023-01-10 19:06     ` Tom Lendacky
@ 2023-01-12  5:45     ` Alexey Kardashevskiy
  2023-01-12 11:28       ` Borislav Petkov
  1 sibling, 1 reply; 15+ messages in thread
From: Alexey Kardashevskiy @ 2023-01-12  5:45 UTC (permalink / raw)
  To: Borislav Petkov, Tom Lendacky
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin



On 11/1/23 05:00, Borislav Petkov wrote:
> On Fri, Dec 09, 2022 at 03:38:03PM +1100, Alexey Kardashevskiy wrote:
>> AMD Milan (Fam 19h) introduces support for the swapping, as type 'B',
> 
> "type B" means nothing to people who don't have an intimate APM knowledge.
> 
> Let's try again, this time with a more accessible formulation:
> 
> "The debug registers are handled a bit differently when doing a world switch of a
> SEV-ES guest: the guest debug registers values are saved and restored as usual
> and as one would expect.

Well, SEV-ES KVM (ES == Encrypted State) does not save/restore them for 
the guest (well, as I would expect) as the guest registers are not 
visible to host to save, they are intercepted and the VM does this GHCB 
dance with VMGEXIT(SVM_EXIT_WRITE_DR7).


> The *host* debug registers are not saved to the host save area so if the
> host is doing any debug activity, that host should take care to stash its debug
> registers values into the host save area before running guests.
> 
> See Table B-3. Swap Types and the AMD APM volume 2."
> 
> And now you can go into detail explaining which regs exactly and so on.
> 
>> of DR[0-3] and DR[0-3]_ADDR_MASK registers. Software enables this by
>> setting SEV_FEATURES[5] (called "DebugSwap") in the VMSA which makes
>> data breakpoints work in SEV-ES VMs.
>>
>> For type 'B' swaps the hardware saves/restores the VM state on
>> VMEXIT/VMRUN in VMSA, and restores the host state on VMEXIT.
> 
> Yeah, close but I'd prefer a more detailed explanation and a reference to the
> APM so that people can follow and read more info if needed.


Well, the only place in APM is that "Table B-3. Swap Types and the AMD 
APM volume 2", and it is pretty brief, do I miss something? Thanks,

> >> Enable DebugSwap in VMSA but only if CPUID Fn80000021_EAX[0]
>> ("NoNestedDataBp", "Processor ignores nested data breakpoints") is
>> supported by the SOC as otherwise a malicious guest can cause
>> the infinite #DB loop DoS.
>>
>> Save DR[0-3] / DR[0-3]_ADDR_MASK in the host save area before VMRUN
>> as type 'B' swap does not do this part.
>>
>> Eliminate DR7 and #DB intercepts as:
>> - they are not needed when DebugSwap is supported;
>> - #VC for these intercepts is most likely not supported anyway and
>> kills the VM.
>> Keep DR7 intercepted unless DebugSwap enabled to prevent
>> the infinite #DB loop DoS.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>> ---
>> Changes:
>> v2:
>> * debug_swap moved from vcpu to module_param
>> * rewrote commit log
>>
>> ---
>>
>> "DR7 access must remain intercepted for an SEV-ES guest" - I could not
>> figure out the exact reasoning why it is there in the first place,
>> IIUC this is to prevent loop of #DBs in the VM.
> 
> Let's ask Mr. Lendacky:
> 
> 8d4846b9b150 ("KVM: SVM: Prevent debugging under SEV-ES")
> 
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index efaaef2b7ae1..800ea2a778cc 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -21,6 +21,7 @@
>>   #include <asm/pkru.h>
>>   #include <asm/trapnr.h>
>>   #include <asm/fpu/xcr.h>
>> +#include <asm/debugreg.h>
>>   
>>   #include "mmu.h"
>>   #include "x86.h"
>> @@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444);
>>   /* enable/disable SEV-ES support */
>>   static bool sev_es_enabled = true;
>>   module_param_named(sev_es, sev_es_enabled, bool, 0444);
>> +
>> +/* enable/disable SEV-ES DebugSwap support */
>> +static bool sev_es_debug_swap_enabled = true;
>> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644);
>>   #else
>>   #define sev_enabled false
>>   #define sev_es_enabled false
>> +#define sev_es_debug_swap false
>>   #endif /* CONFIG_KVM_AMD_SEV */
>>   
>> +bool sev_es_is_debug_swap_enabled(void)
>> +{
>> +	return sev_es_debug_swap_enabled;
>> +}
>> +
>>   static u8 sev_enc_bit;
>>   static DECLARE_RWSEM(sev_deactivate_lock);
>>   static DEFINE_MUTEX(sev_bitmap_lock);
>> @@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>>   	save->xss  = svm->vcpu.arch.ia32_xss;
>>   	save->dr6  = svm->vcpu.arch.dr6;
>>   
>> +	if (sev_es_is_debug_swap_enabled())
>> +		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
>> +
>>   	pr_debug("Virtual Machine Save Area (VMSA):\n");
>>   	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
>>   
>> @@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void)
>>   out:
>>   	sev_enabled = sev_supported;
>>   	sev_es_enabled = sev_es_supported;
>> +	if (sev_es_debug_swap_enabled)
>> +		sev_es_debug_swap_enabled = sev_es_enabled &&
>> +			boot_cpu_has(X86_FEATURE_NO_NESTED_DATA_BP);
> 
> check_for_deprecated_apis: WARNING: arch/x86/kvm/svm/sev.c:2268: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables
  2023-01-12  5:21     ` Alexey Kardashevskiy
@ 2023-01-12 11:12       ` Borislav Petkov
  0 siblings, 0 replies; 15+ messages in thread
From: Borislav Petkov @ 2023-01-12 11:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On Thu, Jan 12, 2023 at 04:21:28PM +1100, Alexey Kardashevskiy wrote:
> "that function" is set_dr_addr_mask() (btw should I rename it to start with
> amd_?

If you really wanna... I mean the code is already doing AMD-specific handling
but sure, it'll be more obvious then that arch_install_hw_breakpoint() does only
AMD-specific masking there under the info->mask test.

> If it is out of bounds, it won't neither set or get. And these 2 helpers are
> used only by the kernel and the callers pass 0..3 and nothing else. BUG_ON()
> would do too, for example.

Yeah, we don't do BUG_ON - look for Linus' colorful explanations why. :)

In any case, I think we should always aim for proper recovery from errors but
this case is not that important so let's leave it at the WARN_ON_ONCE.

> imho having 1,2,3 in the code eliminates the need in a comment and produces
> the exact same end result. But since nobody cares, I'll do it the shorter
> way with just +1 and +2.

Yeah, the more important goal is simplicity. And that pays off when you have to
revisit that code and figure out what it does, later.

> Sure. Out of curiosity - why?^w^w^w^w^  it reduced the vmlinux size by 48
> bytes, nice.

The same answer - simplicity and speed when reading it.

That

	if (per_cpu(amd_dr_addr_mask, smp_processor_id())[dr] == mask)

is a bit harder to parse than

	if (per_cpu(amd_dr_addr_mask, cpu)[dr] == mask)

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES
  2023-01-12  5:45     ` Alexey Kardashevskiy
@ 2023-01-12 11:28       ` Borislav Petkov
  2023-01-12 14:32         ` Tom Lendacky
  0 siblings, 1 reply; 15+ messages in thread
From: Borislav Petkov @ 2023-01-12 11:28 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Tom Lendacky, kvm, x86, linux-kernel, Venu Busireddy, Tony Luck,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On Thu, Jan 12, 2023 at 04:45:34PM +1100, Alexey Kardashevskiy wrote:
> Well, SEV-ES KVM (ES == Encrypted State) does not save/restore them for the
> guest (well, as I would expect) as the guest registers are not visible to
> host to save, they are intercepted and the VM does this GHCB dance with
> VMGEXIT(SVM_EXIT_WRITE_DR7).

But they're saved in the VMSA, as Table B-3 says.

> Well, the only place in APM is that "Table B-3. Swap Types and the AMD APM
> volume 2", and it is pretty brief, do I miss something?

I don't understand that question - please elaborate.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES
  2023-01-12 11:28       ` Borislav Petkov
@ 2023-01-12 14:32         ` Tom Lendacky
  0 siblings, 0 replies; 15+ messages in thread
From: Tom Lendacky @ 2023-01-12 14:32 UTC (permalink / raw)
  To: Borislav Petkov, Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck,
	Thomas Gleixner, Sean Christopherson, Sandipan Das,
	Peter Zijlstra, Pawan Gupta, Paolo Bonzini, Michael Roth,
	Mario Limonciello, Jan Kara, Ingo Molnar, Huang Rui, Dave Hansen,
	Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo,
	Andrew Cooper, Alexander Shishkin, Adrian Hunter,
	Jason A. Donenfeld, H. Peter Anvin

On 1/12/23 05:28, Borislav Petkov wrote:
> On Thu, Jan 12, 2023 at 04:45:34PM +1100, Alexey Kardashevskiy wrote:
>> Well, SEV-ES KVM (ES == Encrypted State) does not save/restore them for the
>> guest (well, as I would expect) as the guest registers are not visible to
>> host to save, they are intercepted and the VM does this GHCB dance with
>> VMGEXIT(SVM_EXIT_WRITE_DR7).
> 
> But they're saved in the VMSA, as Table B-3 says.

Correct, when this feature is enabled, the VMRUN execution will restore 
the guest debug registers on guest entry and save them on guest exit.

Thanks,
Tom

> 
>> Well, the only place in APM is that "Table B-3. Swap Types and the AMD APM
>> volume 2", and it is pretty brief, do I miss something?
> 
> I don't understand that question - please elaborate.
> 
> Thx.
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2023-01-12 14:45 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-09  4:38 [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
2022-12-09  4:38 ` [PATCH kernel v2 1/3] x86/amd: Cache values in percpu variables Alexey Kardashevskiy
2023-01-10 16:05   ` Borislav Petkov
2023-01-12  5:21     ` Alexey Kardashevskiy
2023-01-12 11:12       ` Borislav Petkov
2022-12-09  4:38 ` [PATCH kernel v2 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy
2023-01-10 18:00   ` Borislav Petkov
2023-01-10 19:06     ` Tom Lendacky
2023-01-12  5:45     ` Alexey Kardashevskiy
2023-01-12 11:28       ` Borislav Petkov
2023-01-12 14:32         ` Tom Lendacky
2022-12-09  4:38 ` [PATCH kernel v2 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
2023-01-10 20:04   ` Borislav Petkov
2023-01-09  5:20 ` [PATCH kernel v2 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
2023-01-10 17:41 ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.