All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH kernel 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap
@ 2022-12-01  2:19 Alexey Kardashevskiy
  2022-12-01  2:19 ` [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables Alexey Kardashevskiy
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-01  2:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Borislav Petkov, Andrew Cooper, Jason A. Donenfeld,
	H. Peter Anvin

This is to use another AMD SEV-ES hardware assisted register swap,
more detail in 2/3. The patches are fairly independend but required in
this order.

This is based on sha1 ef4d3ea40565 and requires something like this:
https://lkml.org/lkml/2022/11/29/1229

This patchset is pushed out at
https://github.com/aik/linux/commits/debugswap

Please comment. Thanks.



Alexey Kardashevskiy (3):
  x86/amd/dr_addr_mask: Cache values in percpu variables
  KVM: SEV: Enable DebugSwap
  x86/sev: Do not handle #VC for DR7 read/write

 arch/x86/include/asm/debugreg.h |  1 +
 arch/x86/include/asm/svm.h      |  1 +
 arch/x86/kvm/svm/svm.h          | 18 ++++++++---
 arch/x86/kernel/cpu/amd.c       | 32 ++++++++++++++++++++
 arch/x86/kernel/sev.c           |  6 ++++
 arch/x86/kvm/svm/sev.c          | 22 +++++++++++++-
 arch/x86/kvm/svm/svm.c          |  6 ++--
 7 files changed, 78 insertions(+), 8 deletions(-)

-- 
2.38.1


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-01  2:19 [PATCH kernel 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
@ 2022-12-01  2:19 ` Alexey Kardashevskiy
  2022-12-01 16:58   ` Sean Christopherson
  2022-12-07 18:55   ` Borislav Petkov
  2022-12-01  2:19 ` [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap Alexey Kardashevskiy
  2022-12-01  2:19 ` [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
  2 siblings, 2 replies; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-01  2:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Borislav Petkov, Andrew Cooper, Jason A. Donenfeld,
	H. Peter Anvin

Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
be noticeable when the AMD KVM SEV-ES's DebugSwap feature is enabled and
KVM needs to store these before switching to a guest; the DebugSwitch
hardware support restores them as type B swap.

This stores MSR values from set_dr_addr_mask() in percpu values and
returns them via new get_dr_addr_mask(). The gain here is about 10x.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
 arch/x86/include/asm/debugreg.h |  1 +
 arch/x86/kernel/cpu/amd.c       | 32 ++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
index cfdf307ddc01..c4324d0205b5 100644
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
 
 #ifdef CONFIG_CPU_SUP_AMD
 extern void set_dr_addr_mask(unsigned long mask, int dr);
+extern unsigned long get_dr_addr_mask(int dr);
 #else
 static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
 #endif
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c75d75b9f11a..ec7efcef4e14 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -1158,6 +1158,11 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
 	return false;
 }
 
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr0_addr_mask);
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr1_addr_mask);
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr2_addr_mask);
+DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr3_addr_mask);
+
 void set_dr_addr_mask(unsigned long mask, int dr)
 {
 	if (!boot_cpu_has(X86_FEATURE_BPEXT))
@@ -1166,17 +1171,44 @@ void set_dr_addr_mask(unsigned long mask, int dr)
 	switch (dr) {
 	case 0:
 		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
+		per_cpu(dr0_addr_mask, smp_processor_id()) = mask;
 		break;
 	case 1:
+		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
+		per_cpu(dr1_addr_mask, smp_processor_id()) = mask;
+		break;
 	case 2:
+		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
+		per_cpu(dr2_addr_mask, smp_processor_id()) = mask;
+		break;
 	case 3:
 		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
+		per_cpu(dr3_addr_mask, smp_processor_id()) = mask;
 		break;
 	default:
 		break;
 	}
 }
 
+unsigned long get_dr_addr_mask(int dr)
+{
+	if (!boot_cpu_has(X86_FEATURE_BPEXT))
+		return 0;
+
+	switch (dr) {
+	case 0:
+		return per_cpu(dr0_addr_mask, smp_processor_id());
+	case 1:
+		return per_cpu(dr1_addr_mask, smp_processor_id());
+	case 2:
+		return per_cpu(dr2_addr_mask, smp_processor_id());
+	case 3:
+		return per_cpu(dr3_addr_mask, smp_processor_id());
+	}
+	return 0;
+}
+EXPORT_SYMBOL_GPL(get_dr_addr_mask);
+
 u32 amd_get_highest_perf(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap
  2022-12-01  2:19 [PATCH kernel 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
  2022-12-01  2:19 ` [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables Alexey Kardashevskiy
@ 2022-12-01  2:19 ` Alexey Kardashevskiy
  2022-12-01 17:37   ` Sean Christopherson
  2022-12-01  2:19 ` [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
  2 siblings, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-01  2:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Borislav Petkov, Andrew Cooper, Jason A. Donenfeld,
	H. Peter Anvin

AMD Milan introduces support for the swapping, as type 'B',
of DR[0-3] and DR[0-3]_ADDR_MASK registers. It requires that
SEV_FEATURES[5] be set in the VMSA.

This requires the KVM to eliminate the intercept of #DB. However,
because of the infinite #DB loop DoS that a malicious guest can do,
it can only be eliminated based if CPUID Fn80000021_EAX[0]
(NoNestedDataBp) is set in the host/HV.

This eliminates #DB intercept, DR7 intercept for SEV-ES/SEV-SNP guest.
This saves DR[0-3] / DR[0-3]_ADDR_MASK in the host save area before VMRUN.
This sets SEV_FEATURES[5] in VMSA.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
 arch/x86/include/asm/svm.h |  1 +
 arch/x86/kvm/svm/svm.h     | 18 +++++++++++-----
 arch/x86/kvm/svm/sev.c     | 22 +++++++++++++++++++-
 arch/x86/kvm/svm/svm.c     |  6 ++++--
 4 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 0361626841bc..373a0edda588 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -273,6 +273,7 @@ enum avic_ipi_failure_cause {
 #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
 #define VMCB_AVIC_APIC_BAR_MASK		0xFFFFFFFFFF000ULL
 
+#define SVM_SEV_FEAT_DEBUG_SWAP                        BIT(5)
 
 struct vmcb_seg {
 	u16 selector;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 199a2ecef1ce..4d75b14bffab 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -83,6 +83,7 @@ enum {
 struct kvm_sev_info {
 	bool active;		/* SEV enabled guest */
 	bool es_active;		/* SEV-ES enabled guest */
+	bool debug_swap;        /* SEV-ES Debug swap enabled */
 	unsigned int asid;	/* ASID used for this guest */
 	unsigned int handle;	/* SEV firmware handle */
 	int fd;			/* SEV device fd */
@@ -388,6 +389,7 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3
 
 static inline void set_dr_intercepts(struct vcpu_svm *svm)
 {
+	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
 	struct vmcb *vmcb = svm->vmcb01.ptr;
 
 	if (!sev_es_guest(svm->vcpu.kvm)) {
@@ -407,20 +409,26 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm)
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
 	}
 
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
-	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+	if (!sev->debug_swap) {
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
+		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
+	}
 
 	recalc_intercepts(svm);
 }
 
 static inline void clr_dr_intercepts(struct vcpu_svm *svm)
 {
+	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
 	struct vmcb *vmcb = svm->vmcb01.ptr;
 
 	vmcb->control.intercepts[INTERCEPT_DR] = 0;
 
-	/* DR7 access must remain intercepted for an SEV-ES guest */
-	if (sev_es_guest(svm->vcpu.kvm)) {
+	/*
+	 * DR7 access must remain intercepted for an SEV-ES guest unless
+	 * the DebugSwap feature is set
+	 */
+	if (sev_es_guest(svm->vcpu.kvm) && !sev->debug_swap) {
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
 		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
 	}
@@ -677,7 +685,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
-void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
+void sev_es_prepare_switch_to_guest(struct kvm_vcpu *vcpu, struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index efaaef2b7ae1..fac8b48e3162 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -21,6 +21,7 @@
 #include <asm/pkru.h>
 #include <asm/trapnr.h>
 #include <asm/fpu/xcr.h>
+#include <asm/debugreg.h>
 
 #include "mmu.h"
 #include "x86.h"
@@ -253,6 +254,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
 	if (asid < 0)
 		goto e_no_asid;
 	sev->asid = asid;
+	sev->debug_swap = sev->es_active && kvm_cpu_cap_get(KVM_X86_FEATURE_NO_NESTED_DATA_BP);
 
 	ret = sev_platform_init(&argp->error);
 	if (ret)
@@ -564,6 +566,7 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
 static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 {
 	struct sev_es_save_area *save = svm->sev_es.vmsa;
+	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
 
 	/* Check some debug related fields before encrypting the VMSA */
 	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
@@ -604,6 +607,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	save->xss  = svm->vcpu.arch.ia32_xss;
 	save->dr6  = svm->vcpu.arch.dr6;
 
+	if (sev->debug_swap)
+		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
+
 	pr_debug("Virtual Machine Save Area (VMSA):\n");
 	print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
 
@@ -3010,8 +3016,10 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 					    sev_enc_bit));
 }
 
-void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
+void sev_es_prepare_switch_to_guest(struct kvm_vcpu *vcpu, struct sev_es_save_area *hostsa)
 {
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
+
 	/*
 	 * As an SEV-ES guest, hardware will restore the host state on VMEXIT,
 	 * of which one step is to perform a VMLOAD.  KVM performs the
@@ -3027,6 +3035,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
 
 	/* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */
 	hostsa->xss = host_xss;
+
+	/* The DebugSwap SEV feature does Type B swaps of DR[0-3] */
+	if (sev->debug_swap) {
+		hostsa->dr0 = native_get_debugreg(0);
+		hostsa->dr1 = native_get_debugreg(1);
+		hostsa->dr2 = native_get_debugreg(2);
+		hostsa->dr3 = native_get_debugreg(3);
+		hostsa->dr0_addr_mask = get_dr_addr_mask(0);
+		hostsa->dr1_addr_mask = get_dr_addr_mask(1);
+		hostsa->dr2_addr_mask = get_dr_addr_mask(2);
+		hostsa->dr3_addr_mask = get_dr_addr_mask(3);
+	}
 }
 
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ce362e88a567..ee0e56521d26 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1173,6 +1173,7 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	struct vmcb *vmcb = svm->vmcb01.ptr;
 	struct vmcb_control_area *control = &vmcb->control;
 	struct vmcb_save_area *save = &vmcb->save;
+	struct kvm_sev_info *sev = &to_kvm_svm(vcpu->kvm)->sev_info;
 
 	svm_set_intercept(svm, INTERCEPT_CR0_READ);
 	svm_set_intercept(svm, INTERCEPT_CR3_READ);
@@ -1189,7 +1190,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
 	set_exception_intercept(svm, UD_VECTOR);
 	set_exception_intercept(svm, MC_VECTOR);
 	set_exception_intercept(svm, AC_VECTOR);
-	set_exception_intercept(svm, DB_VECTOR);
+	if (!sev->debug_swap)
+		set_exception_intercept(svm, DB_VECTOR);
 	/*
 	 * Guest access to VMware backdoor ports could legitimately
 	 * trigger #GP because of TSS I/O permission bitmap.
@@ -1461,7 +1463,7 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 		struct sev_es_save_area *hostsa;
 		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
 
-		sev_es_prepare_switch_to_guest(hostsa);
+		sev_es_prepare_switch_to_guest(vcpu, hostsa);
 	}
 
 	if (tsc_scaling)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-01  2:19 [PATCH kernel 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
  2022-12-01  2:19 ` [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables Alexey Kardashevskiy
  2022-12-01  2:19 ` [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap Alexey Kardashevskiy
@ 2022-12-01  2:19 ` Alexey Kardashevskiy
  2022-12-01 17:38   ` Sean Christopherson
  2 siblings, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-01  2:19 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Borislav Petkov, Andrew Cooper, Jason A. Donenfeld,
	H. Peter Anvin

With SVM_SEV_FEAT_DEBUG_SWAP enabled, the VM should not get #VC events
for DR7 read/write which it rather avoided.

Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
---
 arch/x86/kernel/sev.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a428c62330d3..4e91b9f8742c 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1618,6 +1618,9 @@ static enum es_result vc_handle_dr7_write(struct ghcb *ghcb,
 	long val, *reg = vc_insn_get_rm(ctxt);
 	enum es_result ret;
 
+	if ((sev_status >> 2) & SVM_SEV_FEAT_DEBUG_SWAP)
+		return ES_VMM_ERROR;
+
 	if (!reg)
 		return ES_DECODE_FAILED;
 
@@ -1655,6 +1658,9 @@ static enum es_result vc_handle_dr7_read(struct ghcb *ghcb,
 	struct sev_es_runtime_data *data = this_cpu_read(runtime_data);
 	long *reg = vc_insn_get_rm(ctxt);
 
+	if ((sev_status >> 2) & SVM_SEV_FEAT_DEBUG_SWAP)
+		return ES_VMM_ERROR;
+
 	if (!reg)
 		return ES_DECODE_FAILED;
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-01  2:19 ` [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables Alexey Kardashevskiy
@ 2022-12-01 16:58   ` Sean Christopherson
  2022-12-01 19:45     ` Andrew Cooper
  2022-12-06  7:14     ` Alexey Kardashevskiy
  2022-12-07 18:55   ` Borislav Petkov
  1 sibling, 2 replies; 19+ messages in thread
From: Sean Christopherson @ 2022-12-01 16:58 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
> Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
> be noticeable when the AMD KVM SEV-ES's DebugSwap feature is enabled and
> KVM needs to store these before switching to a guest; the DebugSwitch
> hardware support restores them as type B swap.
>
> This stores MSR values from set_dr_addr_mask() in percpu values and
> returns them via new get_dr_addr_mask(). The gain here is about 10x.
>
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
>  arch/x86/include/asm/debugreg.h |  1 +
>  arch/x86/kernel/cpu/amd.c       | 32 ++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
> index cfdf307ddc01..c4324d0205b5 100644
> --- a/arch/x86/include/asm/debugreg.h
> +++ b/arch/x86/include/asm/debugreg.h
> @@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
>  
>  #ifdef CONFIG_CPU_SUP_AMD
>  extern void set_dr_addr_mask(unsigned long mask, int dr);
> +extern unsigned long get_dr_addr_mask(int dr);
>  #else
>  static inline void set_dr_addr_mask(unsigned long mask, int dr) { }

KVM_AMD doesn't depend on CPU_SUP_AMD, i.e. this needs a stub.  Or we need to add
a dependency.

> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index c75d75b9f11a..ec7efcef4e14 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -1158,6 +1158,11 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>  	return false;
>  }
>  
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr0_addr_mask);
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr1_addr_mask);
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr2_addr_mask);
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr3_addr_mask);
> +
>  void set_dr_addr_mask(unsigned long mask, int dr)
>  {
>  	if (!boot_cpu_has(X86_FEATURE_BPEXT))
> @@ -1166,17 +1171,44 @@ void set_dr_addr_mask(unsigned long mask, int dr)
>  	switch (dr) {
>  	case 0:
>  		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);

LOL, I'd love to hear how MSR_F16H_DR0_ADDR_MASK ended up with a completely
different MSR index.

> +		per_cpu(dr0_addr_mask, smp_processor_id()) = mask;

Use an array to avoid the copy+paste?  And if you're going to add a cache, might
as well use it to avoid unnecessary writes.

>  		break;
>  	case 1:

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap
  2022-12-01  2:19 ` [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap Alexey Kardashevskiy
@ 2022-12-01 17:37   ` Sean Christopherson
  2022-12-09  2:28     ` Alexey Kardashevskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2022-12-01 17:37 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
> AMD Milan introduces support for the swapping, as type 'B',

Please make the changelog standalone, i.e. don't rely on the shortlog to provide
context.  "the swapping" is inscrutable without the shortlog.

> of DR[0-3] and DR[0-3]_ADDR_MASK registers. It requires that
> SEV_FEATURES[5] be set in the VMSA.

Avoid pronouns in shortlogs, changelogs, and comments, as pronouns tend to be
ambiguous.  "Software can enable DebugSwap by setting SEV_FEATURE[5] in the VMSA."
isn't much more effort to type.

> 
> This requires the KVM to eliminate the intercept of #DB. However,

Same here, e.g. does "this" mean that the architecture requires DB interception
to be disabled to enable DebugSwap?

> because of the infinite #DB loop DoS that a malicious guest can do,
> it can only be eliminated based if CPUID Fn80000021_EAX[0]

And "it" here.

> (NoNestedDataBp) is set in the host/HV.
> 
> This eliminates #DB intercept, DR7 intercept for SEV-ES/SEV-SNP guest.
> This saves DR[0-3] / DR[0-3]_ADDR_MASK in the host save area before VMRUN.
> This sets SEV_FEATURES[5] in VMSA.

And all of these "this".  Assuming "this" means "this patch", rewrite these
sentences to be commands that state what changes are being done.

> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
>  arch/x86/include/asm/svm.h |  1 +
>  arch/x86/kvm/svm/svm.h     | 18 +++++++++++-----
>  arch/x86/kvm/svm/sev.c     | 22 +++++++++++++++++++-
>  arch/x86/kvm/svm/svm.c     |  6 ++++--
>  4 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 0361626841bc..373a0edda588 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -273,6 +273,7 @@ enum avic_ipi_failure_cause {
>  #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
>  #define VMCB_AVIC_APIC_BAR_MASK		0xFFFFFFFFFF000ULL
>  
> +#define SVM_SEV_FEAT_DEBUG_SWAP                        BIT(5)
>  
>  struct vmcb_seg {
>  	u16 selector;
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 199a2ecef1ce..4d75b14bffab 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -83,6 +83,7 @@ enum {
>  struct kvm_sev_info {
>  	bool active;		/* SEV enabled guest */
>  	bool es_active;		/* SEV-ES enabled guest */
> +	bool debug_swap;        /* SEV-ES Debug swap enabled */
>  	unsigned int asid;	/* ASID used for this guest */
>  	unsigned int handle;	/* SEV firmware handle */
>  	int fd;			/* SEV device fd */
> @@ -388,6 +389,7 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3
>  
>  static inline void set_dr_intercepts(struct vcpu_svm *svm)
>  {
> +	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
>  	struct vmcb *vmcb = svm->vmcb01.ptr;
>  
>  	if (!sev_es_guest(svm->vcpu.kvm)) {
> @@ -407,20 +409,26 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm)
>  		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
>  	}
>  
> -	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
> -	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
> +	if (!sev->debug_swap) {
> +		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
> +		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
> +	}
>  
>  	recalc_intercepts(svm);
>  }
>  
>  static inline void clr_dr_intercepts(struct vcpu_svm *svm)
>  {
> +	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
>  	struct vmcb *vmcb = svm->vmcb01.ptr;
>  
>  	vmcb->control.intercepts[INTERCEPT_DR] = 0;
>  
> -	/* DR7 access must remain intercepted for an SEV-ES guest */
> -	if (sev_es_guest(svm->vcpu.kvm)) {
> +	/*
> +	 * DR7 access must remain intercepted for an SEV-ES guest unless
> +	 * the DebugSwap feature is set

Please explain _why_.  

> +	 */
> +	if (sev_es_guest(svm->vcpu.kvm) && !sev->debug_swap) {
>  		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
>  		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
>  	}
> @@ -677,7 +685,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
>  int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
>  void sev_es_vcpu_reset(struct vcpu_svm *svm);
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
> -void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
> +void sev_es_prepare_switch_to_guest(struct kvm_vcpu *vcpu, struct sev_es_save_area *hostsa);
>  void sev_es_unmap_ghcb(struct vcpu_svm *svm);
>  
>  /* vmenter.S */
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index efaaef2b7ae1..fac8b48e3162 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -21,6 +21,7 @@
>  #include <asm/pkru.h>
>  #include <asm/trapnr.h>
>  #include <asm/fpu/xcr.h>
> +#include <asm/debugreg.h>
>  
>  #include "mmu.h"
>  #include "x86.h"
> @@ -253,6 +254,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  	if (asid < 0)
>  		goto e_no_asid;
>  	sev->asid = asid;
> +	sev->debug_swap = sev->es_active && 

Enabling DebugSwap should be guarded with a module param so that the admin can
disable the feature if necessary.  And then the per-vCPU variable goes away.

> kvm_cpu_cap_get(KVM_X86_FEATURE_NO_NESTED_DATA_BP);

kvm_cpu_cap_has().

And use X86_FEATURE_* directly, which is the whole point of the __feature_translate()
shenanigans.

>  
>  	ret = sev_platform_init(&argp->error);
>  	if (ret)
> @@ -564,6 +566,7 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>  static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>  {
>  	struct sev_es_save_area *save = svm->sev_es.vmsa;
> +	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
>  
>  	/* Check some debug related fields before encrypting the VMSA */
>  	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
> @@ -604,6 +607,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>  	save->xss  = svm->vcpu.arch.ia32_xss;
>  	save->dr6  = svm->vcpu.arch.dr6;
>  
> +	if (sev->debug_swap)
> +		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;

Resurrecting my objection to "AP Creation NAE event"[*], what happens if a hypervisor
supports GHCB_HV_FT_SNP_AP_CREATION but not DebugSwap?  IIUC, a guest can corrupt
host DRs by enabling DebugSwap in the VMSA of an AP vCPU, e.g. the CPU will load
zeros on VM-Exit if the host hasn't stuffed the host save area fields.

KVM can obviously just make sure to save its DRs if hardware supports DebugSwap,
but what if DebugSwap is buggy and needs to be disabled?  And what about the next
feature that can apparently be enabled by the guest?

[*] https://lore.kernel.org/all/YWnbfCet84Vup6q9@google.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-01  2:19 ` [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
@ 2022-12-01 17:38   ` Sean Christopherson
  2022-12-07 19:01     ` Borislav Petkov
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2022-12-01 17:38 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
> With SVM_SEV_FEAT_DEBUG_SWAP enabled, the VM should not get #VC events
> for DR7 read/write which it rather avoided.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
>  arch/x86/kernel/sev.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index a428c62330d3..4e91b9f8742c 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -1618,6 +1618,9 @@ static enum es_result vc_handle_dr7_write(struct ghcb *ghcb,
>  	long val, *reg = vc_insn_get_rm(ctxt);
>  	enum es_result ret;
>  
> +	if ((sev_status >> 2) & SVM_SEV_FEAT_DEBUG_SWAP)

Probably high time to add a helper/macro to convert the SEV_STATUS to the SEV_FEATURES
field.

> +		return ES_VMM_ERROR;
> +
>  	if (!reg)
>  		return ES_DECODE_FAILED;
>  
> @@ -1655,6 +1658,9 @@ static enum es_result vc_handle_dr7_read(struct ghcb *ghcb,
>  	struct sev_es_runtime_data *data = this_cpu_read(runtime_data);
>  	long *reg = vc_insn_get_rm(ctxt);
>  
> +	if ((sev_status >> 2) & SVM_SEV_FEAT_DEBUG_SWAP)
> +		return ES_VMM_ERROR;
> +
>  	if (!reg)
>  		return ES_DECODE_FAILED;
>  
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-01 16:58   ` Sean Christopherson
@ 2022-12-01 19:45     ` Andrew Cooper
  2022-12-06  7:14     ` Alexey Kardashevskiy
  1 sibling, 0 replies; 19+ messages in thread
From: Andrew Cooper @ 2022-12-01 19:45 UTC (permalink / raw)
  To: Sean Christopherson, Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov,
	Jason A. Donenfeld, H. Peter Anvin, Andrew Cooper

On 01/12/2022 16:58, Sean Christopherson wrote:
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index c75d75b9f11a..ec7efcef4e14 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -1158,6 +1158,11 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>>  	return false;
>>  }
>>  
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr0_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr1_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr2_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr3_addr_mask);
>> +
>>  void set_dr_addr_mask(unsigned long mask, int dr)
>>  {
>>  	if (!boot_cpu_has(X86_FEATURE_BPEXT))
>> @@ -1166,17 +1171,44 @@ void set_dr_addr_mask(unsigned long mask, int dr)
>>  	switch (dr) {
>>  	case 0:
>>  		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
> LOL, I'd love to hear how MSR_F16H_DR0_ADDR_MASK ended up with a completely
> different MSR index.

(Very) back in the day, this is was a special for %dr0 only.

When the feature was made architectural (CPUID.80000001.ecx[26] "data
breakpoint extensions"), 3 more registered needed to be allocated. 

There's also CPUID.80000001.ecx[30] "Address Mask Extensions" which mean
"all 32 bits work", where previously only bits above 12 took effect. 
I.e. you can now match within the same page.

And somewhere I'm pretty sure there's another bit (New in Zen2/Rome ?)
saying that all 64 bits work, but I can't actually locate the CPUID bit.

~Andrew

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-01 16:58   ` Sean Christopherson
  2022-12-01 19:45     ` Andrew Cooper
@ 2022-12-06  7:14     ` Alexey Kardashevskiy
  2022-12-06 17:07       ` Sean Christopherson
  1 sibling, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-06  7:14 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin



On 2/12/22 03:58, Sean Christopherson wrote:
> On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
>> Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
>> be noticeable when the AMD KVM SEV-ES's DebugSwap feature is enabled and
>> KVM needs to store these before switching to a guest; the DebugSwitch
>> hardware support restores them as type B swap.
>>
>> This stores MSR values from set_dr_addr_mask() in percpu values and
>> returns them via new get_dr_addr_mask(). The gain here is about 10x.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>> ---
>>   arch/x86/include/asm/debugreg.h |  1 +
>>   arch/x86/kernel/cpu/amd.c       | 32 ++++++++++++++++++++
>>   2 files changed, 33 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
>> index cfdf307ddc01..c4324d0205b5 100644
>> --- a/arch/x86/include/asm/debugreg.h
>> +++ b/arch/x86/include/asm/debugreg.h
>> @@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
>>   
>>   #ifdef CONFIG_CPU_SUP_AMD
>>   extern void set_dr_addr_mask(unsigned long mask, int dr);
>> +extern unsigned long get_dr_addr_mask(int dr);
>>   #else
>>   static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
> 
> KVM_AMD doesn't depend on CPU_SUP_AMD, i.e. this needs a stub.  Or we need to add
> a dependency.

The new symbol is under CONFIG_CPU_SUP_AMD and so is its only user 
arch/x86/kernel/cpu/amd.c:

arch/x86/kernel/cpu/Makefile:
obj-$(CONFIG_CPU_SUP_AMD)               += amd.o


Is this enough dependency or we need something else? (if enough, I'll 
put it into the next commit log).


>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index c75d75b9f11a..ec7efcef4e14 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -1158,6 +1158,11 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>>   	return false;
>>   }
>>   
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr0_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr1_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr2_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr3_addr_mask);
>> +
>>   void set_dr_addr_mask(unsigned long mask, int dr)
>>   {
>>   	if (!boot_cpu_has(X86_FEATURE_BPEXT))
>> @@ -1166,17 +1171,44 @@ void set_dr_addr_mask(unsigned long mask, int dr)
>>   	switch (dr) {
>>   	case 0:
>>   		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
> 
> LOL, I'd love to hear how MSR_F16H_DR0_ADDR_MASK ended up with a completely
> different MSR index.
>> +		per_cpu(dr0_addr_mask, smp_processor_id()) = mask;
> 
> Use an array to avoid the copy+paste?  And if you're going to add a cache, might
> as well use it to avoid unnecessary writes.

I'll do this, I did not realize DEFINE_PER_CPU_READ_MOSTLY can do 
arrays. Thanks,

> 
>>   		break;
>>   	case 1:

-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-06  7:14     ` Alexey Kardashevskiy
@ 2022-12-06 17:07       ` Sean Christopherson
  2022-12-07  0:50         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2022-12-06 17:07 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin

On Tue, Dec 06, 2022, Alexey Kardashevskiy wrote:
> 
> On 2/12/22 03:58, Sean Christopherson wrote:
> > On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
> > > diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
> > > index cfdf307ddc01..c4324d0205b5 100644
> > > --- a/arch/x86/include/asm/debugreg.h
> > > +++ b/arch/x86/include/asm/debugreg.h
> > > @@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
> > >   #ifdef CONFIG_CPU_SUP_AMD
> > >   extern void set_dr_addr_mask(unsigned long mask, int dr);
> > > +extern unsigned long get_dr_addr_mask(int dr);
> > >   #else
> > >   static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
> > 
> > KVM_AMD doesn't depend on CPU_SUP_AMD, i.e. this needs a stub.  Or we need to add
> > a dependency.
> 
> The new symbol is under CONFIG_CPU_SUP_AMD and so is its only user
> arch/x86/kernel/cpu/amd.c:
> 
> arch/x86/kernel/cpu/Makefile:
> obj-$(CONFIG_CPU_SUP_AMD)               += amd.o
> 
> 
> Is this enough dependency or we need something else? (if enough, I'll put it
> into the next commit log).

No, it's actually the opposite, the issue is precisely that the symbol is buried
under CPU_SUP_AMD.  KVM_AMD doesn't currently depend on CPU_SUP_AMD, and so the
usage in sev_es_prepare_switch_to_guest() fails to compile if CPU_SUP_AMD=n and
KVM_AMD={Y,M}.

  config KVM_AMD
	tristate "KVM for AMD processors support"
	depends on KVM

I actually just submitted a patch[*] to "fix" that since KVM requires the CPU vendor
to be AMD or Hygon at runtime.  Although that patch is buried in the middle of a
large series, it doesn't have any dependencies.  So, if this series is routed through
the KVM tree, it should be straightforward to just pull that patch into this series,
and whichever series lands first gets the honor of applying that patch.

If this series is routed through the tip tree, the best option may be to just add
a stub to avoid potential conflicts, and then we can rip the stub out later.

[*] https://lore.kernel.org/all/20221201232655.290720-12-seanjc@google.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-06 17:07       ` Sean Christopherson
@ 2022-12-07  0:50         ` Alexey Kardashevskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-07  0:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin



On 7/12/22 04:07, Sean Christopherson wrote:
> On Tue, Dec 06, 2022, Alexey Kardashevskiy wrote:
>>
>> On 2/12/22 03:58, Sean Christopherson wrote:
>>> On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
>>>> diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
>>>> index cfdf307ddc01..c4324d0205b5 100644
>>>> --- a/arch/x86/include/asm/debugreg.h
>>>> +++ b/arch/x86/include/asm/debugreg.h
>>>> @@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
>>>>    #ifdef CONFIG_CPU_SUP_AMD
>>>>    extern void set_dr_addr_mask(unsigned long mask, int dr);
>>>> +extern unsigned long get_dr_addr_mask(int dr);
>>>>    #else
>>>>    static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
>>>
>>> KVM_AMD doesn't depend on CPU_SUP_AMD, i.e. this needs a stub.  Or we need to add
>>> a dependency.
>>
>> The new symbol is under CONFIG_CPU_SUP_AMD and so is its only user
>> arch/x86/kernel/cpu/amd.c:
>>
>> arch/x86/kernel/cpu/Makefile:
>> obj-$(CONFIG_CPU_SUP_AMD)               += amd.o
>>
>>
>> Is this enough dependency or we need something else? (if enough, I'll put it
>> into the next commit log).
> 
> No, it's actually the opposite, the issue is precisely that the symbol is buried
> under CPU_SUP_AMD.  KVM_AMD doesn't currently depend on CPU_SUP_AMD, and so the
> usage in sev_es_prepare_switch_to_guest() fails to compile if CPU_SUP_AMD=n and
> KVM_AMD={Y,M}.

Ouch, you are one step ahead, 2/3 fails, not this one. My bad. I'll add 
a stub anyway.

btw I want to do "s/int dr/unsigned int dr/" since I am using an array 
now. Does it have to be patch#1 "fix set_dr_addr_mask" and then patch#2 
"add get_dr_addr_mask" or one patch will do? Thanks,


> 
>    config KVM_AMD
> 	tristate "KVM for AMD processors support"
> 	depends on KVM
> 
> I actually just submitted a patch[*] to "fix" that since KVM requires the CPU vendor
> to be AMD or Hygon at runtime.

>  Although that patch is buried in the middle of a
> large series, it doesn't have any dependencies.  So, if this series is routed through
> the KVM tree, it should be straightforward to just pull that patch into this series,
> and whichever series lands first gets the honor of applying that patch.
> 
> If this series is routed through the tip tree, the best option may be to just add
> a stub to avoid potential conflicts, and then we can rip the stub out later.
> 
> [*] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2F20221201232655.290720-12-seanjc%40google.com&amp;data=05%7C01%7CAlexey.Kardashevskiy%40amd.com%7C6ee92cb78f8f446b887908dad7ac6fca%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638059432896741545%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=25f18IS6z9HEf0Qtt%2BFDxZdtbTVPg%2FVulsFGhWLH0Rg%3D&amp;reserved=0

-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-01  2:19 ` [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables Alexey Kardashevskiy
  2022-12-01 16:58   ` Sean Christopherson
@ 2022-12-07 18:55   ` Borislav Petkov
  2022-12-08  6:11     ` Alexey Kardashevskiy
  1 sibling, 1 reply; 19+ messages in thread
From: Borislav Petkov @ 2022-12-07 18:55 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Andrew Cooper, Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 01, 2022 at 01:19:46PM +1100, Alexey Kardashevskiy wrote:
> Subject: Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables

"x86/amd: " is perfectly fine as a prefix.

> Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
> be noticeable when the AMD KVM SEV-ES's DebugSwap feature is enabled and

which does what? I.e., a sort of lazy DR regs swapping...

> KVM needs to store these before switching to a guest; the DebugSwitch
> hardware support restores them as type B swap.

I know this is all clear to you but you should explain what type B
register swap is.

> This stores MSR values from set_dr_addr_mask() in percpu values and

s/This stores/Store/

From Documentation/process/submitting-patches.rst:

 "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
  instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
  to do frotz", as if you are giving orders to the codebase to change
  its behaviour."

Also, do not talk about what your patch does - that should hopefully be
visible in the diff itself. Rather, talk about *why* you're doing what
you're doing.

> returns them via new get_dr_addr_mask(). The gain here is about 10x.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
> ---
>  arch/x86/include/asm/debugreg.h |  1 +
>  arch/x86/kernel/cpu/amd.c       | 32 ++++++++++++++++++++
>  2 files changed, 33 insertions(+)
> 
> diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
> index cfdf307ddc01..c4324d0205b5 100644
> --- a/arch/x86/include/asm/debugreg.h
> +++ b/arch/x86/include/asm/debugreg.h
> @@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
>  
>  #ifdef CONFIG_CPU_SUP_AMD
>  extern void set_dr_addr_mask(unsigned long mask, int dr);
> +extern unsigned long get_dr_addr_mask(int dr);
>  #else
>  static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
>  #endif
> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
> index c75d75b9f11a..ec7efcef4e14 100644
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -1158,6 +1158,11 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>  	return false;
>  }
>  
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr0_addr_mask);
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr1_addr_mask);
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr2_addr_mask);
> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr3_addr_mask);

This BPEXT thing is AMD-only, right?

I guess those should be called amd_drX_addr_mask where X in [0-3].

Yeah yeah, they are used in AMD-only code - svm* - but still.

>  void set_dr_addr_mask(unsigned long mask, int dr)
>  {
>  	if (!boot_cpu_has(X86_FEATURE_BPEXT))
> @@ -1166,17 +1171,44 @@ void set_dr_addr_mask(unsigned long mask, int dr)
>  	switch (dr) {
>  	case 0:
>  		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
> +		per_cpu(dr0_addr_mask, smp_processor_id()) = mask;
>  		break;
>  	case 1:
> +		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
> +		per_cpu(dr1_addr_mask, smp_processor_id()) = mask;
> +		break;
>  	case 2:
> +		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
> +		per_cpu(dr2_addr_mask, smp_processor_id()) = mask;
> +		break;
>  	case 3:
>  		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
> +		per_cpu(dr3_addr_mask, smp_processor_id()) = mask;
>  		break;
>  	default:
>  		break;
>  	}
>  }
>  
> +unsigned long get_dr_addr_mask(int dr)

This function name is too generic for an exported function.

amd_get_dr_addr_mask() I'd say.

> +	if (!boot_cpu_has(X86_FEATURE_BPEXT))

check_for_deprecated_apis: WARNING: arch/x86/kernel/cpu/amd.c:1195: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.

You could fix the above one too, while at it.

> +		return 0;
> +
> +	switch (dr) {
> +	case 0:
> +		return per_cpu(dr0_addr_mask, smp_processor_id());
> +	case 1:
> +		return per_cpu(dr1_addr_mask, smp_processor_id());
> +	case 2:
> +		return per_cpu(dr2_addr_mask, smp_processor_id());
> +	case 3:
> +		return per_cpu(dr3_addr_mask, smp_processor_id());

	default:	
		WARN_ON_ONCE(1);
		break;

Just in case.

And as a matter of fact, make that short and succinct:

        switch (dr) {
        case 0: return per_cpu(dr0_addr_mask, smp_processor_id());
        case 1: return per_cpu(dr1_addr_mask, smp_processor_id());
        case 2: return per_cpu(dr2_addr_mask, smp_processor_id());
        case 3: return per_cpu(dr3_addr_mask, smp_processor_id());
        default: WARN_ON_ONCE(1); break;
        }

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-01 17:38   ` Sean Christopherson
@ 2022-12-07 19:01     ` Borislav Petkov
  2022-12-07 19:07       ` Sean Christopherson
  0 siblings, 1 reply; 19+ messages in thread
From: Borislav Petkov @ 2022-12-07 19:01 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Alexey Kardashevskiy, kvm, x86, linux-kernel, Venu Busireddy,
	Tony Luck, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Andrew Cooper, Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 01, 2022 at 05:38:33PM +0000, Sean Christopherson wrote:
> Probably high time to add a helper/macro to convert the SEV_STATUS to
> the SEV_FEATURES field.

Nah, there's a couple of

MSR_AMD64_SEV*

defines in arch/x86/include/asm/msr-index.h.

Bit 5 should simply be added there.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-07 19:01     ` Borislav Petkov
@ 2022-12-07 19:07       ` Sean Christopherson
  2022-12-08  7:14         ` Alexey Kardashevskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Sean Christopherson @ 2022-12-07 19:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Alexey Kardashevskiy, kvm, x86, linux-kernel, Venu Busireddy,
	Tony Luck, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Andrew Cooper, Jason A. Donenfeld, H. Peter Anvin

On Wed, Dec 07, 2022, Borislav Petkov wrote:
> On Thu, Dec 01, 2022 at 05:38:33PM +0000, Sean Christopherson wrote:
> > Probably high time to add a helper/macro to convert the SEV_STATUS to
> > the SEV_FEATURES field.
> 
> Nah, there's a couple of
> 
> MSR_AMD64_SEV*
> 
> defines in arch/x86/include/asm/msr-index.h.
> 
> Bit 5 should simply be added there.

Ah, yeah, that's much better.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-07 18:55   ` Borislav Petkov
@ 2022-12-08  6:11     ` Alexey Kardashevskiy
  2022-12-08 10:35       ` Borislav Petkov
  0 siblings, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-08  6:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Andrew Cooper, Jason A. Donenfeld, H. Peter Anvin



On 8/12/22 05:55, Borislav Petkov wrote:
> On Thu, Dec 01, 2022 at 01:19:46PM +1100, Alexey Kardashevskiy wrote:
>> Subject: Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
> 
> "x86/amd: " is perfectly fine as a prefix.
> 
>> Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to
>> be noticeable when the AMD KVM SEV-ES's DebugSwap feature is enabled and
> 
> which does what? I.e., a sort of lazy DR regs swapping...
> 
>> KVM needs to store these before switching to a guest; the DebugSwitch
>> hardware support restores them as type B swap.
> 
> I know this is all clear to you but you should explain what type B
> register swap is.
> 
>> This stores MSR values from set_dr_addr_mask() in percpu values and
> 
> s/This stores/Store/
> 
>  From Documentation/process/submitting-patches.rst:
> 
>   "Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
>    instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
>    to do frotz", as if you are giving orders to the codebase to change
>    its behaviour."
> 
> Also, do not talk about what your patch does - that should hopefully be
> visible in the diff itself. Rather, talk about *why* you're doing what
> you're doing.
> 
>> returns them via new get_dr_addr_mask(). The gain here is about 10x.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>> ---
>>   arch/x86/include/asm/debugreg.h |  1 +
>>   arch/x86/kernel/cpu/amd.c       | 32 ++++++++++++++++++++
>>   2 files changed, 33 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
>> index cfdf307ddc01..c4324d0205b5 100644
>> --- a/arch/x86/include/asm/debugreg.h
>> +++ b/arch/x86/include/asm/debugreg.h
>> @@ -127,6 +127,7 @@ static __always_inline void local_db_restore(unsigned long dr7)
>>   
>>   #ifdef CONFIG_CPU_SUP_AMD
>>   extern void set_dr_addr_mask(unsigned long mask, int dr);
>> +extern unsigned long get_dr_addr_mask(int dr);
>>   #else
>>   static inline void set_dr_addr_mask(unsigned long mask, int dr) { }
>>   #endif
>> diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
>> index c75d75b9f11a..ec7efcef4e14 100644
>> --- a/arch/x86/kernel/cpu/amd.c
>> +++ b/arch/x86/kernel/cpu/amd.c
>> @@ -1158,6 +1158,11 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
>>   	return false;
>>   }
>>   
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr0_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr1_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr2_addr_mask);
>> +DEFINE_PER_CPU_READ_MOSTLY(unsigned long, dr3_addr_mask);
> 
> This BPEXT thing is AMD-only, right?
> 
> I guess those should be called amd_drX_addr_mask where X in [0-3].
> 
> Yeah yeah, they are used in AMD-only code - svm* - but still.
> 
>>   void set_dr_addr_mask(unsigned long mask, int dr)
>>   {
>>   	if (!boot_cpu_has(X86_FEATURE_BPEXT))
>> @@ -1166,17 +1171,44 @@ void set_dr_addr_mask(unsigned long mask, int dr)
>>   	switch (dr) {
>>   	case 0:
>>   		wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0);
>> +		per_cpu(dr0_addr_mask, smp_processor_id()) = mask;
>>   		break;
>>   	case 1:
>> +		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
>> +		per_cpu(dr1_addr_mask, smp_processor_id()) = mask;
>> +		break;
>>   	case 2:
>> +		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
>> +		per_cpu(dr2_addr_mask, smp_processor_id()) = mask;
>> +		break;
>>   	case 3:
>>   		wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0);
>> +		per_cpu(dr3_addr_mask, smp_processor_id()) = mask;
>>   		break;
>>   	default:
>>   		break;
>>   	}
>>   }
>>   
>> +unsigned long get_dr_addr_mask(int dr)
> 
> This function name is too generic for an exported function.
> 
> amd_get_dr_addr_mask() I'd say.
> 
>> +	if (!boot_cpu_has(X86_FEATURE_BPEXT))
> 
> check_for_deprecated_apis: WARNING: arch/x86/kernel/cpu/amd.c:1195: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
> 
> You could fix the above one too, while at it.
> 
>> +		return 0;
>> +
>> +	switch (dr) {
>> +	case 0:
>> +		return per_cpu(dr0_addr_mask, smp_processor_id());
>> +	case 1:
>> +		return per_cpu(dr1_addr_mask, smp_processor_id());
>> +	case 2:
>> +		return per_cpu(dr2_addr_mask, smp_processor_id());
>> +	case 3:
>> +		return per_cpu(dr3_addr_mask, smp_processor_id());
> 
> 	default:	
> 		WARN_ON_ONCE(1);
> 		break;
> 
> Just in case.
> 
> And as a matter of fact, make that short and succinct:
> 
>          switch (dr) {
>          case 0: return per_cpu(dr0_addr_mask, smp_processor_id());
>          case 1: return per_cpu(dr1_addr_mask, smp_processor_id());
>          case 2: return per_cpu(dr2_addr_mask, smp_processor_id());
>          case 3: return per_cpu(dr3_addr_mask, smp_processor_id());
>          default: WARN_ON_ONCE(1); break;
>          }


Not an array, as Sean suggested? Uff...


> 
> Thx.
> 

-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-07 19:07       ` Sean Christopherson
@ 2022-12-08  7:14         ` Alexey Kardashevskiy
  2022-12-08 11:01           ` Borislav Petkov
  0 siblings, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-08  7:14 UTC (permalink / raw)
  To: Sean Christopherson, Borislav Petkov
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin



On 8/12/22 06:07, Sean Christopherson wrote:
> On Wed, Dec 07, 2022, Borislav Petkov wrote:
>> On Thu, Dec 01, 2022 at 05:38:33PM +0000, Sean Christopherson wrote:
>>> Probably high time to add a helper/macro to convert the SEV_STATUS to
>>> the SEV_FEATURES field.
>>
>> Nah, there's a couple of
>>
>> MSR_AMD64_SEV*
>>
>> defines in arch/x86/include/asm/msr-index.h.
>>
>> Bit 5 should simply be added there.
> 
> Ah, yeah, that's much better.

Sorry, I am not following. How is moving the bit makes 
SEV_STATUS_TO_FEATURES() not needed?

When I am setting it in VMSA SEV_FEATURES - it is a bit 5.

Inside a SEV VM, it is SEV_STATUS MSR and there it is bit 7. Mentioned 
MSR_AMD64_SEV* are SEV_STATUS MSR bits.

Since the current patch is bad, I'd rather define the bit twice then:

arch/x86/include/asm/msr-index.h:
#define MSR_AMD64_SEV_FEAT_DEBUG_SWAP    BIT_ULL(7)

arch/x86/include/asm/svm.h
#define SVM_SEV_FEAT_DEBUG_SWAP         BIT(5)

as nothing really says that SEV_FEATURES is always going to be 
SEV_STATUS>>2, even though it is now.

Soooo what is acceptable solution here? Thanks,


-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables
  2022-12-08  6:11     ` Alexey Kardashevskiy
@ 2022-12-08 10:35       ` Borislav Petkov
  0 siblings, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2022-12-08 10:35 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Sean Christopherson, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Andrew Cooper, Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 08, 2022 at 05:11:26PM +1100, Alexey Kardashevskiy wrote:
> Not an array, as Sean suggested? Uff...

Sure an array if it makes the code even more readable and clean. With an
array it should become even more compact, I'd say.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write
  2022-12-08  7:14         ` Alexey Kardashevskiy
@ 2022-12-08 11:01           ` Borislav Petkov
  0 siblings, 0 replies; 19+ messages in thread
From: Borislav Petkov @ 2022-12-08 11:01 UTC (permalink / raw)
  To: Alexey Kardashevskiy
  Cc: Sean Christopherson, kvm, x86, linux-kernel, Venu Busireddy,
	Tony Luck, Tom Lendacky, Thomas Gleixner, Peter Zijlstra,
	Paolo Bonzini, Michael Sterritt, Michael Roth, Mario Limonciello,
	Ingo Molnar, Heiko Carstens, Greg Kroah-Hartman, Dave Hansen,
	Andrew Cooper, Jason A. Donenfeld, H. Peter Anvin

On Thu, Dec 08, 2022 at 06:14:01PM +1100, Alexey Kardashevskiy wrote:
> Sorry, I am not following. How is moving the bit makes
> SEV_STATUS_TO_FEATURES() not needed?
> 
> When I am setting it in VMSA SEV_FEATURES - it is a bit 5.
> 
> Inside a SEV VM, it is SEV_STATUS MSR and there it is bit 7. Mentioned
> MSR_AMD64_SEV* are SEV_STATUS MSR bits.
> 
> Since the current patch is bad, I'd rather define the bit twice then:

Yes.

> arch/x86/include/asm/msr-index.h:
> #define MSR_AMD64_SEV_FEAT_DEBUG_SWAP    BIT_ULL(7)
> 
> arch/x86/include/asm/svm.h
> #define SVM_SEV_FEAT_DEBUG_SWAP         BIT(5)
> 
> as nothing really says that SEV_FEATURES is always going to be
> SEV_STATUS>>2, even though it is now.
> 
> Soooo what is acceptable solution here? Thanks,

Right, so since you're testing against sev_status which is a copy of
MSR_AMD64_SEV, then you use bit definitions which are for that MSR as
documented in the respective PPR section for "MSRC001_0131 [SEV Status]
(Core::X86::Msr::SEV_Status)"

When you're setting the VMSA's SEV_FEATURES field, then you need a
different define, ofc.

This also automatically takes care of SEV_FEATURES not being tied to
SEV_STATUS >> 2 forever, as you say.

So yes, do the twice thing.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap
  2022-12-01 17:37   ` Sean Christopherson
@ 2022-12-09  2:28     ` Alexey Kardashevskiy
  0 siblings, 0 replies; 19+ messages in thread
From: Alexey Kardashevskiy @ 2022-12-09  2:28 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, x86, linux-kernel, Venu Busireddy, Tony Luck, Tom Lendacky,
	Thomas Gleixner, Peter Zijlstra, Paolo Bonzini, Michael Sterritt,
	Michael Roth, Mario Limonciello, Ingo Molnar, Heiko Carstens,
	Greg Kroah-Hartman, Dave Hansen, Borislav Petkov, Andrew Cooper,
	Jason A. Donenfeld, H. Peter Anvin



On 2/12/22 04:37, Sean Christopherson wrote:
> On Thu, Dec 01, 2022, Alexey Kardashevskiy wrote:
>> AMD Milan introduces support for the swapping, as type 'B',
> 
> Please make the changelog standalone, i.e. don't rely on the shortlog to provide
> context.  "the swapping" is inscrutable without the shortlog.
> 
>> of DR[0-3] and DR[0-3]_ADDR_MASK registers. It requires that
>> SEV_FEATURES[5] be set in the VMSA.
> 
> Avoid pronouns in shortlogs, changelogs, and comments, as pronouns tend to be
> ambiguous.  "Software can enable DebugSwap by setting SEV_FEATURE[5] in the VMSA."
> isn't much more effort to type.
> 
>>
>> This requires the KVM to eliminate the intercept of #DB. However,
> 
> Same here, e.g. does "this" mean that the architecture requires DB interception
> to be disabled to enable DebugSwap?
> 
>> because of the infinite #DB loop DoS that a malicious guest can do,
>> it can only be eliminated based if CPUID Fn80000021_EAX[0]
> 
> And "it" here.
> 
>> (NoNestedDataBp) is set in the host/HV.
>>
>> This eliminates #DB intercept, DR7 intercept for SEV-ES/SEV-SNP guest.
>> This saves DR[0-3] / DR[0-3]_ADDR_MASK in the host save area before VMRUN.
>> This sets SEV_FEATURES[5] in VMSA.
> 
> And all of these "this".  Assuming "this" means "this patch", rewrite these
> sentences to be commands that state what changes are being done.
> 
>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
>> ---
>>   arch/x86/include/asm/svm.h |  1 +
>>   arch/x86/kvm/svm/svm.h     | 18 +++++++++++-----
>>   arch/x86/kvm/svm/sev.c     | 22 +++++++++++++++++++-
>>   arch/x86/kvm/svm/svm.c     |  6 ++++--
>>   4 files changed, 39 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index 0361626841bc..373a0edda588 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -273,6 +273,7 @@ enum avic_ipi_failure_cause {
>>   #define AVIC_HPA_MASK	~((0xFFFULL << 52) | 0xFFF)
>>   #define VMCB_AVIC_APIC_BAR_MASK		0xFFFFFFFFFF000ULL
>>   
>> +#define SVM_SEV_FEAT_DEBUG_SWAP                        BIT(5)
>>   
>>   struct vmcb_seg {
>>   	u16 selector;
>> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
>> index 199a2ecef1ce..4d75b14bffab 100644
>> --- a/arch/x86/kvm/svm/svm.h
>> +++ b/arch/x86/kvm/svm/svm.h
>> @@ -83,6 +83,7 @@ enum {
>>   struct kvm_sev_info {
>>   	bool active;		/* SEV enabled guest */
>>   	bool es_active;		/* SEV-ES enabled guest */
>> +	bool debug_swap;        /* SEV-ES Debug swap enabled */
>>   	unsigned int asid;	/* ASID used for this guest */
>>   	unsigned int handle;	/* SEV firmware handle */
>>   	int fd;			/* SEV device fd */
>> @@ -388,6 +389,7 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3
>>   
>>   static inline void set_dr_intercepts(struct vcpu_svm *svm)
>>   {
>> +	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
>>   	struct vmcb *vmcb = svm->vmcb01.ptr;
>>   
>>   	if (!sev_es_guest(svm->vcpu.kvm)) {
>> @@ -407,20 +409,26 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm)
>>   		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE);
>>   	}
>>   
>> -	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
>> -	vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
>> +	if (!sev->debug_swap) {
>> +		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
>> +		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
>> +	}
>>   
>>   	recalc_intercepts(svm);
>>   }
>>   
>>   static inline void clr_dr_intercepts(struct vcpu_svm *svm)
>>   {
>> +	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
>>   	struct vmcb *vmcb = svm->vmcb01.ptr;
>>   
>>   	vmcb->control.intercepts[INTERCEPT_DR] = 0;
>>   
>> -	/* DR7 access must remain intercepted for an SEV-ES guest */
>> -	if (sev_es_guest(svm->vcpu.kvm)) {
>> +	/*
>> +	 * DR7 access must remain intercepted for an SEV-ES guest unless
>> +	 * the DebugSwap feature is set
> 
> Please explain _why_.
> 
>> +	 */
>> +	if (sev_es_guest(svm->vcpu.kvm) && !sev->debug_swap) {
>>   		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ);
>>   		vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE);
>>   	}
>> @@ -677,7 +685,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
>>   int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
>>   void sev_es_vcpu_reset(struct vcpu_svm *svm);
>>   void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
>> -void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
>> +void sev_es_prepare_switch_to_guest(struct kvm_vcpu *vcpu, struct sev_es_save_area *hostsa);
>>   void sev_es_unmap_ghcb(struct vcpu_svm *svm);
>>   
>>   /* vmenter.S */
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index efaaef2b7ae1..fac8b48e3162 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -21,6 +21,7 @@
>>   #include <asm/pkru.h>
>>   #include <asm/trapnr.h>
>>   #include <asm/fpu/xcr.h>
>> +#include <asm/debugreg.h>
>>   
>>   #include "mmu.h"
>>   #include "x86.h"
>> @@ -253,6 +254,7 @@ static int sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>   	if (asid < 0)
>>   		goto e_no_asid;
>>   	sev->asid = asid;
>> +	sev->debug_swap = sev->es_active &&
> 
> Enabling DebugSwap should be guarded with a module param so that the admin can
> disable the feature if necessary.  And then the per-vCPU variable goes away.
> 
>> kvm_cpu_cap_get(KVM_X86_FEATURE_NO_NESTED_DATA_BP);
> 
> kvm_cpu_cap_has().
> 
> And use X86_FEATURE_* directly, which is the whole point of the __feature_translate()
> shenanigans.
> 
>>   
>>   	ret = sev_platform_init(&argp->error);
>>   	if (ret)
>> @@ -564,6 +566,7 @@ static int sev_launch_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp)
>>   static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>>   {
>>   	struct sev_es_save_area *save = svm->sev_es.vmsa;
>> +	struct kvm_sev_info *sev = &to_kvm_svm(svm->vcpu.kvm)->sev_info;
>>   
>>   	/* Check some debug related fields before encrypting the VMSA */
>>   	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
>> @@ -604,6 +607,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>>   	save->xss  = svm->vcpu.arch.ia32_xss;
>>   	save->dr6  = svm->vcpu.arch.dr6;
>>   
>> +	if (sev->debug_swap)
>> +		save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP;
> 
> Resurrecting my objection to "AP Creation NAE event"[*], what happens if a hypervisor
> supports GHCB_HV_FT_SNP_AP_CREATION but not DebugSwap?  IIUC, a guest can corrupt
> host DRs by enabling DebugSwap in the VMSA of an AP vCPU, e.g. the CPU will load
> zeros on VM-Exit if the host hasn't stuffed the host save area fields.
> 
> KVM can obviously just make sure to save its DRs if hardware supports DebugSwap,
> but what if DebugSwap is buggy and needs to be disabled?


IIUC KVM will have to save DRs somewhere (not in the host save area) and 
restore on vmexit and enable intercepts for DR accesses. btw buggy how? 
I cannot imagine but can you? :) SEV-ES/SNP so relies on this swapping, 
hard to imagine just DebugSwap being that broken...


> And what about the next
> feature that can apparently be enabled by the guest?
> [*] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fall%2FYWnbfCet84Vup6q9%40google.com&amp;data=05%7C01%7CAlexey.Kardashevskiy%40amd.com%7C73c50a32d6eb4dd8e76c08dad3c2cd04%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638055130890406019%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=0I4WhDhwve4%2FHebrIk2h4MYbrmlCNiif4bDNLpY7DcU%3D&amp;reserved=0

-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-12-09  2:29 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-01  2:19 [PATCH kernel 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy
2022-12-01  2:19 ` [PATCH kernel 1/3] x86/amd/dr_addr_mask: Cache values in percpu variables Alexey Kardashevskiy
2022-12-01 16:58   ` Sean Christopherson
2022-12-01 19:45     ` Andrew Cooper
2022-12-06  7:14     ` Alexey Kardashevskiy
2022-12-06 17:07       ` Sean Christopherson
2022-12-07  0:50         ` Alexey Kardashevskiy
2022-12-07 18:55   ` Borislav Petkov
2022-12-08  6:11     ` Alexey Kardashevskiy
2022-12-08 10:35       ` Borislav Petkov
2022-12-01  2:19 ` [PATCH kernel 2/3] KVM: SEV: Enable DebugSwap Alexey Kardashevskiy
2022-12-01 17:37   ` Sean Christopherson
2022-12-09  2:28     ` Alexey Kardashevskiy
2022-12-01  2:19 ` [PATCH kernel 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy
2022-12-01 17:38   ` Sean Christopherson
2022-12-07 19:01     ` Borislav Petkov
2022-12-07 19:07       ` Sean Christopherson
2022-12-08  7:14         ` Alexey Kardashevskiy
2022-12-08 11:01           ` Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.