* [PATCH kernel v3 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap @ 2023-01-20 3:10 Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 1/3] x86/amd: Cache debug register values in percpu variables Alexey Kardashevskiy ` (2 more replies) 0 siblings, 3 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-20 3:10 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin This is to use another AMD SEV-ES hardware assisted register swap, more detail in 2/3. The patches are fairly independend but required in this order. The previous conversation is here: https://lore.kernel.org/all/20221209043804.942352-1-aik@amd.com/ This is based on sha1 195df42eb64d Ingo Molnar "Merge branch into tip/master: 'x86/platform'" in order to have recently added X86_FEATURE_NO_NESTED_DATA_BP cpu feature. Please comment. Thanks. Alexey Kardashevskiy (3): x86/amd: Cache debug register values in percpu variables KVM: SEV: Enable data breakpoints in SEV-ES x86/sev: Do not handle #VC for DR7 read/write arch/x86/include/asm/debugreg.h | 9 +++- arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/svm.h | 1 + arch/x86/kvm/svm/svm.h | 16 +++++-- tools/arch/x86/include/asm/msr-index.h | 1 + arch/x86/kernel/cpu/amd.c | 47 ++++++++++++++------ arch/x86/kernel/hw_breakpoint.c | 4 +- arch/x86/kernel/sev.c | 6 +++ arch/x86/kvm/svm/sev.c | 29 ++++++++++++ arch/x86/kvm/svm/svm.c | 3 +- 10 files changed, 94 insertions(+), 23 deletions(-) -- 2.38.1 ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH kernel v3 1/3] x86/amd: Cache debug register values in percpu variables 2023-01-20 3:10 [PATCH kernel v3 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy @ 2023-01-20 3:10 ` Alexey Kardashevskiy 2023-01-31 19:27 ` [tip: x86/cpu] " tip-bot2 for Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy 2 siblings, 1 reply; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-20 3:10 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to be noticeable with the AMD KVM SEV-ES DebugSwap feature enabled. KVM is going to store host's DR[0-3] and DR[0-3]_ADDR_MASK before switching to a guest; the hardware is going to swap these on VMRUN and VMEXIT. Store MSR values passed to set_dr_addr_mask() in percpu variables (when changed) and return them via new amd_get_dr_addr_mask(). The gain here is about 10x. As set_dr_addr_mask() uses the array too, change the @dr type to unsigned to avoid checking for <0. And give it the amd_ prefix to match the new helper as the whole DR_ADDR_MASK feature is AMD-specific anyway. While at it, replace deprecated boot_cpu_has() with cpu_feature_enabled() in set_dr_addr_mask(). Signed-off-by: Alexey Kardashevskiy <aik@amd.com> --- Changes: v3: * fixed commit log * amd_msr_dr_addr_masks do not do "-1" * store processor id in a local variable * s/set_dr_addr_mask/amd_set_dr_addr_mask/ v2: * reworked to use arrays * set() skips wrmsr() when the mask is not changed * added stub for get_dr_addr_mask() * changed @dr type to unsigned * s/boot_cpu_has/cpu_feature_enabled/ * added amd_ prefix --- arch/x86/include/asm/debugreg.h | 9 +++- arch/x86/kernel/cpu/amd.c | 47 ++++++++++++++------ arch/x86/kernel/hw_breakpoint.c | 4 +- 3 files changed, 42 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h index b049d950612f..f126b2ee890f 100644 --- a/arch/x86/include/asm/debugreg.h +++ b/arch/x86/include/asm/debugreg.h @@ -126,9 +126,14 @@ static __always_inline void local_db_restore(unsigned long dr7) } #ifdef CONFIG_CPU_SUP_AMD -extern void set_dr_addr_mask(unsigned long mask, int dr); +extern void amd_set_dr_addr_mask(unsigned long mask, unsigned int dr); +extern unsigned long amd_get_dr_addr_mask(unsigned int dr); #else -static inline void set_dr_addr_mask(unsigned long mask, int dr) { } +static inline void amd_set_dr_addr_mask(unsigned long mask, unsigned int dr) { } +static inline unsigned long amd_get_dr_addr_mask(unsigned int dr) +{ + return 0; +} #endif #endif /* _ASM_X86_DEBUGREG_H */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 208c2ce8598a..380753b14cab 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1158,24 +1158,43 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum) return false; } -void set_dr_addr_mask(unsigned long mask, int dr) +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long[4], amd_dr_addr_mask); + +static unsigned int amd_msr_dr_addr_masks[] = { + MSR_F16H_DR0_ADDR_MASK, + MSR_F16H_DR1_ADDR_MASK, + MSR_F16H_DR1_ADDR_MASK + 1, + MSR_F16H_DR1_ADDR_MASK + 2 +}; + +void amd_set_dr_addr_mask(unsigned long mask, unsigned int dr) { - if (!boot_cpu_has(X86_FEATURE_BPEXT)) + int cpu = smp_processor_id(); + + if (!cpu_feature_enabled(X86_FEATURE_BPEXT)) return; - switch (dr) { - case 0: - wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0); - break; - case 1: - case 2: - case 3: - wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0); - break; - default: - break; - } + if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks))) + return; + + if (per_cpu(amd_dr_addr_mask, cpu)[dr] == mask) + return; + + wrmsr(amd_msr_dr_addr_masks[dr], mask, 0); + per_cpu(amd_dr_addr_mask, cpu)[dr] = mask; +} + +unsigned long amd_get_dr_addr_mask(unsigned int dr) +{ + if (!cpu_feature_enabled(X86_FEATURE_BPEXT)) + return 0; + + if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks))) + return 0; + + return per_cpu(amd_dr_addr_mask[dr], smp_processor_id()); } +EXPORT_SYMBOL_GPL(amd_get_dr_addr_mask); u32 amd_get_highest_perf(void) { diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c index bbb0f737aab1..b01644c949b2 100644 --- a/arch/x86/kernel/hw_breakpoint.c +++ b/arch/x86/kernel/hw_breakpoint.c @@ -127,7 +127,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp) set_debugreg(*dr7, 7); if (info->mask) - set_dr_addr_mask(info->mask, i); + amd_set_dr_addr_mask(info->mask, i); return 0; } @@ -166,7 +166,7 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp) set_debugreg(dr7, 7); if (info->mask) - set_dr_addr_mask(0, i); + amd_set_dr_addr_mask(0, i); /* * Ensure the write to cpu_dr7 is after we've set the DR7 register. -- 2.38.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [tip: x86/cpu] x86/amd: Cache debug register values in percpu variables 2023-01-20 3:10 ` [PATCH kernel v3 1/3] x86/amd: Cache debug register values in percpu variables Alexey Kardashevskiy @ 2023-01-31 19:27 ` tip-bot2 for Alexey Kardashevskiy 0 siblings, 0 replies; 28+ messages in thread From: tip-bot2 for Alexey Kardashevskiy @ 2023-01-31 19:27 UTC (permalink / raw) To: linux-tip-commits Cc: Alexey Kardashevskiy, Borislav Petkov (AMD), x86, linux-kernel The following commit has been merged into the x86/cpu branch of tip: Commit-ID: 7914695743d598b189d549f2f57af24aa5633705 Gitweb: https://git.kernel.org/tip/7914695743d598b189d549f2f57af24aa5633705 Author: Alexey Kardashevskiy <aik@amd.com> AuthorDate: Fri, 20 Jan 2023 14:10:45 +11:00 Committer: Borislav Petkov (AMD) <bp@alien8.de> CommitterDate: Tue, 31 Jan 2023 20:09:26 +01:00 x86/amd: Cache debug register values in percpu variables Reading DR[0-3]_ADDR_MASK MSRs takes about 250 cycles which is going to be noticeable with the AMD KVM SEV-ES DebugSwap feature enabled. KVM is going to store host's DR[0-3] and DR[0-3]_ADDR_MASK before switching to a guest; the hardware is going to swap these on VMRUN and VMEXIT. Store MSR values passed to set_dr_addr_mask() in percpu variables (when changed) and return them via new amd_get_dr_addr_mask(). The gain here is about 10x. As set_dr_addr_mask() uses the array too, change the @dr type to unsigned to avoid checking for <0. And give it the amd_ prefix to match the new helper as the whole DR_ADDR_MASK feature is AMD-specific anyway. While at it, replace deprecated boot_cpu_has() with cpu_feature_enabled() in set_dr_addr_mask(). Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230120031047.628097-2-aik@amd.com --- arch/x86/include/asm/debugreg.h | 9 ++++-- arch/x86/kernel/cpu/amd.c | 47 ++++++++++++++++++++++---------- arch/x86/kernel/hw_breakpoint.c | 4 +-- 3 files changed, 42 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h index b049d95..f126b2e 100644 --- a/arch/x86/include/asm/debugreg.h +++ b/arch/x86/include/asm/debugreg.h @@ -126,9 +126,14 @@ static __always_inline void local_db_restore(unsigned long dr7) } #ifdef CONFIG_CPU_SUP_AMD -extern void set_dr_addr_mask(unsigned long mask, int dr); +extern void amd_set_dr_addr_mask(unsigned long mask, unsigned int dr); +extern unsigned long amd_get_dr_addr_mask(unsigned int dr); #else -static inline void set_dr_addr_mask(unsigned long mask, int dr) { } +static inline void amd_set_dr_addr_mask(unsigned long mask, unsigned int dr) { } +static inline unsigned long amd_get_dr_addr_mask(unsigned int dr) +{ + return 0; +} #endif #endif /* _ASM_X86_DEBUGREG_H */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 208c2ce..380753b 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -1158,24 +1158,43 @@ static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum) return false; } -void set_dr_addr_mask(unsigned long mask, int dr) +static DEFINE_PER_CPU_READ_MOSTLY(unsigned long[4], amd_dr_addr_mask); + +static unsigned int amd_msr_dr_addr_masks[] = { + MSR_F16H_DR0_ADDR_MASK, + MSR_F16H_DR1_ADDR_MASK, + MSR_F16H_DR1_ADDR_MASK + 1, + MSR_F16H_DR1_ADDR_MASK + 2 +}; + +void amd_set_dr_addr_mask(unsigned long mask, unsigned int dr) { - if (!boot_cpu_has(X86_FEATURE_BPEXT)) + int cpu = smp_processor_id(); + + if (!cpu_feature_enabled(X86_FEATURE_BPEXT)) return; - switch (dr) { - case 0: - wrmsr(MSR_F16H_DR0_ADDR_MASK, mask, 0); - break; - case 1: - case 2: - case 3: - wrmsr(MSR_F16H_DR1_ADDR_MASK - 1 + dr, mask, 0); - break; - default: - break; - } + if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks))) + return; + + if (per_cpu(amd_dr_addr_mask, cpu)[dr] == mask) + return; + + wrmsr(amd_msr_dr_addr_masks[dr], mask, 0); + per_cpu(amd_dr_addr_mask, cpu)[dr] = mask; +} + +unsigned long amd_get_dr_addr_mask(unsigned int dr) +{ + if (!cpu_feature_enabled(X86_FEATURE_BPEXT)) + return 0; + + if (WARN_ON_ONCE(dr >= ARRAY_SIZE(amd_msr_dr_addr_masks))) + return 0; + + return per_cpu(amd_dr_addr_mask[dr], smp_processor_id()); } +EXPORT_SYMBOL_GPL(amd_get_dr_addr_mask); u32 amd_get_highest_perf(void) { diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c index bbb0f73..b01644c 100644 --- a/arch/x86/kernel/hw_breakpoint.c +++ b/arch/x86/kernel/hw_breakpoint.c @@ -127,7 +127,7 @@ int arch_install_hw_breakpoint(struct perf_event *bp) set_debugreg(*dr7, 7); if (info->mask) - set_dr_addr_mask(info->mask, i); + amd_set_dr_addr_mask(info->mask, i); return 0; } @@ -166,7 +166,7 @@ void arch_uninstall_hw_breakpoint(struct perf_event *bp) set_debugreg(dr7, 7); if (info->mask) - set_dr_addr_mask(0, i); + amd_set_dr_addr_mask(0, i); /* * Ensure the write to cpu_dr7 is after we've set the DR7 register. ^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-01-20 3:10 [PATCH kernel v3 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 1/3] x86/amd: Cache debug register values in percpu variables Alexey Kardashevskiy @ 2023-01-20 3:10 ` Alexey Kardashevskiy 2023-01-31 19:22 ` Borislav Petkov 2023-02-01 2:18 ` Sean Christopherson 2023-01-20 3:10 ` [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy 2 siblings, 2 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-20 3:10 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin Prior to SEV-ES, KVM stored/loaded host debug registers upon switching to/from a VM. Changing those registers inside a running SEV VM triggered #VC exit to KVM. SEV-ES added the encrypted state (ES) which uses an encrypted guest page for the VM state (VMSA). The hardware saves/restores certain registers on VMRUN/VMEXIT according to a swap type (A, B, C), see "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual volume 2. AMD Milan (Fam 19h) introduces support for the debug registers swapping. DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped a type B when SEV_FEATURES[5] ("DebugSwap") is set. Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] ("NoNestedDataBp", "Processor ignores nested data breakpoints") is supported by the SOC as otherwise a malicious SEV-ES guest can set up data breakpoints on the #VC IDT entry/stack and cause an infinite loop. Eliminate DR7 and #DB intercepts as: - they are not needed when DebugSwap is supported; - #VC for these intercepts is most likely not supported anyway and kills the VM. Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB loop DoS. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> --- Changes: v3: * rewrote the commit log again * rebased on tip/master to use recently defined X86_FEATURE_NO_NESTED_DATA_BP * s/boot_cpu_has/cpu_feature_enabled/ v2: * debug_swap moved from vcpu to module_param * rewrote commit log --- Tested with: === int x; int main(int argc, char *argv[]) { x = 1; return 0; } === gcc -g a.c rsync a.out ruby-954vm:~/ ssh -t ruby-954vm 'gdb -ex "file a.out" -ex "watch x" -ex r' where ruby-954vm is a VM. With "/sys/module/kvm_amd/parameters/debug_swap = 0", gdb does not stop on the watchpoint, with "= 1" - gdb does. --- arch/x86/include/asm/svm.h | 1 + arch/x86/kvm/svm/svm.h | 16 ++++++++--- arch/x86/kvm/svm/sev.c | 29 ++++++++++++++++++++ arch/x86/kvm/svm/svm.c | 3 +- 4 files changed, 44 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index cb1ee53ad3b1..665515c7edae 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -278,6 +278,7 @@ enum avic_ipi_failure_cause { #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL +#define SVM_SEV_FEAT_DEBUG_SWAP BIT(5) struct vmcb_seg { u16 selector; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 4826e6cc611b..61f2cad1cbaf 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -389,6 +389,8 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3 return test_bit(bit, (unsigned long *)&control->intercepts); } +extern bool sev_es_is_debug_swap_enabled(void); + static inline void set_dr_intercepts(struct vcpu_svm *svm) { struct vmcb *vmcb = svm->vmcb01.ptr; @@ -410,8 +412,10 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm) vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); } - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); + if (!sev_es_guest(svm->vcpu.kvm) || !sev_es_is_debug_swap_enabled()) { + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); + } recalc_intercepts(svm); } @@ -422,8 +426,12 @@ static inline void clr_dr_intercepts(struct vcpu_svm *svm) vmcb->control.intercepts[INTERCEPT_DR] = 0; - /* DR7 access must remain intercepted for an SEV-ES guest */ - if (sev_es_guest(svm->vcpu.kvm)) { + /* + * DR7 access must remain intercepted for an SEV-ES guest unless DebugSwap + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM writing to DR7 + * from the #DB handler may trigger infinite loop of #DB's. + */ + if (sev_es_guest(svm->vcpu.kvm) && !sev_es_is_debug_swap_enabled()) { vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); } diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index 86d6897f4806..e989a6f4953d 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -21,6 +21,7 @@ #include <asm/pkru.h> #include <asm/trapnr.h> #include <asm/fpu/xcr.h> +#include <asm/debugreg.h> #include "mmu.h" #include "x86.h" @@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444); /* enable/disable SEV-ES support */ static bool sev_es_enabled = true; module_param_named(sev_es, sev_es_enabled, bool, 0444); + +/* enable/disable SEV-ES DebugSwap support */ +static bool sev_es_debug_swap_enabled = true; +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); #else #define sev_enabled false #define sev_es_enabled false +#define sev_es_debug_swap false #endif /* CONFIG_KVM_AMD_SEV */ +bool sev_es_is_debug_swap_enabled(void) +{ + return sev_es_debug_swap_enabled; +} + static u8 sev_enc_bit; static DECLARE_RWSEM(sev_deactivate_lock); static DEFINE_MUTEX(sev_bitmap_lock); @@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm) save->xss = svm->vcpu.arch.ia32_xss; save->dr6 = svm->vcpu.arch.dr6; + if (sev_es_is_debug_swap_enabled()) + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; + pr_debug("Virtual Machine Save Area (VMSA):\n"); print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false); @@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void) out: sev_enabled = sev_supported; sev_es_enabled = sev_es_supported; + if (sev_es_debug_swap_enabled) + sev_es_debug_swap_enabled = sev_es_enabled && + cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP); #endif } @@ -3027,6 +3044,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa) /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */ hostsa->xss = host_xss; + + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ + if (sev_es_is_debug_swap_enabled()) { + hostsa->dr0 = native_get_debugreg(0); + hostsa->dr1 = native_get_debugreg(1); + hostsa->dr2 = native_get_debugreg(2); + hostsa->dr3 = native_get_debugreg(3); + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); + } } void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 60c7c880266b..6c54a3c9d442 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1190,7 +1190,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) set_exception_intercept(svm, UD_VECTOR); set_exception_intercept(svm, MC_VECTOR); set_exception_intercept(svm, AC_VECTOR); - set_exception_intercept(svm, DB_VECTOR); + if (!sev_es_is_debug_swap_enabled()) + set_exception_intercept(svm, DB_VECTOR); /* * Guest access to VMware backdoor ports could legitimately * trigger #GP because of TSS I/O permission bitmap. -- 2.38.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-01-20 3:10 ` [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy @ 2023-01-31 19:22 ` Borislav Petkov 2023-02-01 2:20 ` Sean Christopherson 2023-02-01 2:18 ` Sean Christopherson 1 sibling, 1 reply; 28+ messages in thread From: Borislav Petkov @ 2023-01-31 19:22 UTC (permalink / raw) To: Sean Christopherson Cc: Alexey Kardashevskiy, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin Hey Sean, On Fri, Jan 20, 2023 at 02:10:46PM +1100, Alexey Kardashevskiy wrote: > Prior to SEV-ES, KVM stored/loaded host debug registers upon switching > to/from a VM. Changing those registers inside a running SEV VM > triggered #VC exit to KVM. > > SEV-ES added the encrypted state (ES) which uses an encrypted guest page > for the VM state (VMSA). The hardware saves/restores certain registers on > VMRUN/VMEXIT according to a swap type (A, B, C), see > "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual > volume 2. > > AMD Milan (Fam 19h) introduces support for the debug registers swapping. > DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped > a type B when SEV_FEATURES[5] ("DebugSwap") is set. > > Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] > ("NoNestedDataBp", "Processor ignores nested data breakpoints") is > supported by the SOC as otherwise a malicious SEV-ES guest can set up > data breakpoints on the #VC IDT entry/stack and cause an infinite loop. > > Eliminate DR7 and #DB intercepts as: > - they are not needed when DebugSwap is supported; > - #VC for these intercepts is most likely not supported anyway and > kills the VM. > Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB > loop DoS. ... ok to take this through the tip tree? Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-01-31 19:22 ` Borislav Petkov @ 2023-02-01 2:20 ` Sean Christopherson 2023-02-01 19:32 ` Sean Christopherson 0 siblings, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2023-02-01 2:20 UTC (permalink / raw) To: Borislav Petkov Cc: Alexey Kardashevskiy, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Tue, Jan 31, 2023, Borislav Petkov wrote: > Hey Sean, > > On Fri, Jan 20, 2023 at 02:10:46PM +1100, Alexey Kardashevskiy wrote: > > Prior to SEV-ES, KVM stored/loaded host debug registers upon switching > > to/from a VM. Changing those registers inside a running SEV VM > > triggered #VC exit to KVM. > > > > SEV-ES added the encrypted state (ES) which uses an encrypted guest page > > for the VM state (VMSA). The hardware saves/restores certain registers on > > VMRUN/VMEXIT according to a swap type (A, B, C), see > > "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual > > volume 2. > > > > AMD Milan (Fam 19h) introduces support for the debug registers swapping. > > DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped > > a type B when SEV_FEATURES[5] ("DebugSwap") is set. > > > > Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] > > ("NoNestedDataBp", "Processor ignores nested data breakpoints") is > > supported by the SOC as otherwise a malicious SEV-ES guest can set up > > data breakpoints on the #VC IDT entry/stack and cause an infinite loop. > > > > Eliminate DR7 and #DB intercepts as: > > - they are not needed when DebugSwap is supported; > > - #VC for these intercepts is most likely not supported anyway and > > kills the VM. > > Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB > > loop DoS. > > ... > > ok to take this through the tip tree? I would prefer to take this through KVM, there's enough subtle complexity in this code that it'd be nice to have it close by. If you're happy with patch 1, maybe ack that one and take it through KVM, and route patch 3 through tip? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-01 2:20 ` Sean Christopherson @ 2023-02-01 19:32 ` Sean Christopherson 2023-02-03 12:26 ` Borislav Petkov 0 siblings, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2023-02-01 19:32 UTC (permalink / raw) To: Borislav Petkov Cc: Alexey Kardashevskiy, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Wed, Feb 01, 2023, Sean Christopherson wrote: > On Tue, Jan 31, 2023, Borislav Petkov wrote: > > Hey Sean, > > > > On Fri, Jan 20, 2023 at 02:10:46PM +1100, Alexey Kardashevskiy wrote: > > > Prior to SEV-ES, KVM stored/loaded host debug registers upon switching > > > to/from a VM. Changing those registers inside a running SEV VM > > > triggered #VC exit to KVM. > > > > > > SEV-ES added the encrypted state (ES) which uses an encrypted guest page > > > for the VM state (VMSA). The hardware saves/restores certain registers on > > > VMRUN/VMEXIT according to a swap type (A, B, C), see > > > "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual > > > volume 2. > > > > > > AMD Milan (Fam 19h) introduces support for the debug registers swapping. > > > DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped > > > a type B when SEV_FEATURES[5] ("DebugSwap") is set. > > > > > > Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] > > > ("NoNestedDataBp", "Processor ignores nested data breakpoints") is > > > supported by the SOC as otherwise a malicious SEV-ES guest can set up > > > data breakpoints on the #VC IDT entry/stack and cause an infinite loop. > > > > > > Eliminate DR7 and #DB intercepts as: > > > - they are not needed when DebugSwap is supported; > > > - #VC for these intercepts is most likely not supported anyway and > > > kills the VM. > > > Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB > > > loop DoS. > > > > ... > > > > ok to take this through the tip tree? > > I would prefer to take this through KVM, there's enough subtle complexity in this > code that it'd be nice to have it close by. > > If you're happy with patch 1, maybe ack that one and take it through KVM, and > route patch 3 through tip? Ah, you've already applied 1. That works too. I don't think KVM support for DebugSwap is going to make v6.3 no matter who takes what, so just base on the next version of this patch on tip/x86/cpu and I'll make a mental note to not try to grab this until after v6.3-rc1. ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-01 19:32 ` Sean Christopherson @ 2023-02-03 12:26 ` Borislav Petkov 0 siblings, 0 replies; 28+ messages in thread From: Borislav Petkov @ 2023-02-03 12:26 UTC (permalink / raw) To: Sean Christopherson Cc: Alexey Kardashevskiy, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Wed, Feb 01, 2023 at 07:32:29PM +0000, Sean Christopherson wrote: > Ah, you've already applied 1. That works too. I don't think KVM support for > DebugSwap is going to make v6.3 no matter who takes what, so just base on the > next version of this patch on tip/x86/cpu and I'll make a mental note to not try > to grab this until after v6.3-rc1. Yeah, I was thinking the same thing, seeing how you guys are swamped with stuff. (Who isn't?! ;-\). So yeah, let's do that and if there's change of plans, just ping me. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-01-20 3:10 ` [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy 2023-01-31 19:22 ` Borislav Petkov @ 2023-02-01 2:18 ` Sean Christopherson 2023-02-03 3:37 ` Alexey Kardashevskiy 1 sibling, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2023-02-01 2:18 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Fri, Jan 20, 2023, Alexey Kardashevskiy wrote: > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h > index 4826e6cc611b..61f2cad1cbaf 100644 > --- a/arch/x86/kvm/svm/svm.h > +++ b/arch/x86/kvm/svm/svm.h > @@ -389,6 +389,8 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3 > return test_bit(bit, (unsigned long *)&control->intercepts); > } > > +extern bool sev_es_is_debug_swap_enabled(void); > + > static inline void set_dr_intercepts(struct vcpu_svm *svm) > { > struct vmcb *vmcb = svm->vmcb01.ptr; > @@ -410,8 +412,10 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm) > vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); > } > > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > + if (!sev_es_guest(svm->vcpu.kvm) || !sev_es_is_debug_swap_enabled()) { Looking below, doesn't this do the wrong thing if set_dr_intercepts() is called before SVM_SEV_FEAT_DEBUG_SWAP is set? I.e. when this is called before LAUNCH_UPDATE? Seems like this should check SVM_SEV_FEAT_DEBUG_SWAP in sev_features regardless of when SVM_SEV_FEAT_DEBUG_SWAP is set. And if KVM checks sev_features, then I _think_ we can avoid having to expose sev_es_debug_swap_enabled to svm.{c,h} (though why on earth {set,clr}_dr_intercepts() is in svm.h is another question for the future). Follow-up question: does KVM _have_ to wait until KVM_SEV_LAUNCH_UPDATE_VMSA to set the flag? > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > + } > > recalc_intercepts(svm); > } > @@ -422,8 +426,12 @@ static inline void clr_dr_intercepts(struct vcpu_svm *svm) > > vmcb->control.intercepts[INTERCEPT_DR] = 0; > > - /* DR7 access must remain intercepted for an SEV-ES guest */ > - if (sev_es_guest(svm->vcpu.kvm)) { > + /* > + * DR7 access must remain intercepted for an SEV-ES guest unless DebugSwap > + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM writing to DR7 > + * from the #DB handler may trigger infinite loop of #DB's. > + */ > + if (sev_es_guest(svm->vcpu.kvm) && !sev_es_is_debug_swap_enabled()) { > vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > } > > @@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444); > /* enable/disable SEV-ES support */ > static bool sev_es_enabled = true; > module_param_named(sev_es, sev_es_enabled, bool, 0444); > + > +/* enable/disable SEV-ES DebugSwap support */ > +static bool sev_es_debug_swap_enabled = true; > +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); Module param needs 0444 permissions, i.e. shouldn't be writable after KVM is loaded. Though I don't know that providing a module param is warranted in this case. KVM provides module params for SEV and SEV-ES because there are legitimate reasons to turn them off, but at a glance, I don't see why we'd want that for this feature. > #else > #define sev_enabled false > #define sev_es_enabled false > +#define sev_es_debug_swap false This needs to be sev_es_debug_swap_enabled, otherwise things fall apart with CONFIG_KVM_AMD_SEV=n. arch/x86/kvm/svm/sev.c: In function ‘sev_es_is_debug_swap_enabled’: arch/x86/kvm/svm/sev.c:69:16: error: ‘sev_es_debug_swap_enabled’ undeclared (first use in this function); did you mean ‘sev_es_is_debug_swap_enabled’? 69 | return sev_es_debug_swap_enabled; | ^~~~~~~~~~~~~~~~~~~~~~~~~ | sev_es_is_debug_swap_enabled > #endif /* CONFIG_KVM_AMD_SEV */ > > +bool sev_es_is_debug_swap_enabled(void) > +{ > + return sev_es_debug_swap_enabled; > +} ... > @@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm) > save->xss = svm->vcpu.arch.ia32_xss; > save->dr6 = svm->vcpu.arch.dr6; > > + if (sev_es_is_debug_swap_enabled()) > + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; > + > pr_debug("Virtual Machine Save Area (VMSA):\n"); > print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false); > > @@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void) > out: > sev_enabled = sev_supported; > sev_es_enabled = sev_es_supported; > + if (sev_es_debug_swap_enabled) > + sev_es_debug_swap_enabled = sev_es_enabled && > + cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP); Slight preference for: if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP)) sev_es_debug_swap_enabled = false; KVM does short-circuit some checks on module param values, but usually only to avoid additional setup. > #endif > } > > @@ -3027,6 +3044,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa) > > /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */ > hostsa->xss = host_xss; > + > + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ > + if (sev_es_is_debug_swap_enabled()) { > + hostsa->dr0 = native_get_debugreg(0); > + hostsa->dr1 = native_get_debugreg(1); > + hostsa->dr2 = native_get_debugreg(2); > + hostsa->dr3 = native_get_debugreg(3); > + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); > + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); > + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); > + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); > + } > } > > void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 60c7c880266b..6c54a3c9d442 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -1190,7 +1190,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) > set_exception_intercept(svm, UD_VECTOR); > set_exception_intercept(svm, MC_VECTOR); > set_exception_intercept(svm, AC_VECTOR); > - set_exception_intercept(svm, DB_VECTOR); > + if (!sev_es_is_debug_swap_enabled()) > + set_exception_intercept(svm, DB_VECTOR); This is wrong. KVM needs to intercept #DBs when debugging non-SEV-ES VMs. This _could_ be tied to X86_FEATURE_NO_NESTED_DATA_BP, but the KVM would need to toggle the intercept depending on whether or not userspace wants to debug the guest. Similar to the DR7 interception, can this check sev_features directly? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-01 2:18 ` Sean Christopherson @ 2023-02-03 3:37 ` Alexey Kardashevskiy 2023-02-03 5:14 ` [PATCH kernel v4] " Alexey Kardashevskiy 2023-03-23 16:39 ` [PATCH kernel v3 2/3] " Sean Christopherson 0 siblings, 2 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-02-03 3:37 UTC (permalink / raw) To: Sean Christopherson Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 01/02/2023 13:18, Sean Christopherson wrote: > On Fri, Jan 20, 2023, Alexey Kardashevskiy wrote: >> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h >> index 4826e6cc611b..61f2cad1cbaf 100644 >> --- a/arch/x86/kvm/svm/svm.h >> +++ b/arch/x86/kvm/svm/svm.h >> @@ -389,6 +389,8 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3 >> return test_bit(bit, (unsigned long *)&control->intercepts); >> } >> >> +extern bool sev_es_is_debug_swap_enabled(void); >> + >> static inline void set_dr_intercepts(struct vcpu_svm *svm) >> { >> struct vmcb *vmcb = svm->vmcb01.ptr; >> @@ -410,8 +412,10 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm) >> vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); >> } >> >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> + if (!sev_es_guest(svm->vcpu.kvm) || !sev_es_is_debug_swap_enabled()) { > > Looking below, doesn't this do the wrong thing if set_dr_intercepts() is called > before SVM_SEV_FEAT_DEBUG_SWAP is set? I.e. when this is called before LAUNCH_UPDATE? > Seems like this should check SVM_SEV_FEAT_DEBUG_SWAP in sev_features regardless > of when SVM_SEV_FEAT_DEBUG_SWAP is set. > > And if KVM checks sev_features, then I _think_ we can avoid having to expose > sev_es_debug_swap_enabled to svm.{c,h} (though why on earth {set,clr}_dr_intercepts() > is in svm.h is another question for the future). 883b0a91f41a ("KVM: SVM: Move Nested SVM Implementation to nested.c") did that. Makes sense for things like vmcb_set_intercept() but {set,clr}_dr_intercepts() are still only called from svm.c so I'll move them there (btw do I need a separate patch for that? usually yes) > > Follow-up question: does KVM _have_ to wait until KVM_SEV_LAUNCH_UPDATE_VMSA to > set the flag? Nope. Will repost soon as a reply to this mail. >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> + } >> >> recalc_intercepts(svm); >> } >> @@ -422,8 +426,12 @@ static inline void clr_dr_intercepts(struct vcpu_svm *svm) >> >> vmcb->control.intercepts[INTERCEPT_DR] = 0; >> >> - /* DR7 access must remain intercepted for an SEV-ES guest */ >> - if (sev_es_guest(svm->vcpu.kvm)) { >> + /* >> + * DR7 access must remain intercepted for an SEV-ES guest unless DebugSwap >> + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM writing to DR7 >> + * from the #DB handler may trigger infinite loop of #DB's. >> + */ >> + if (sev_es_guest(svm->vcpu.kvm) && !sev_es_is_debug_swap_enabled()) { >> vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> } >> >> @@ -52,11 +53,21 @@ module_param_named(sev, sev_enabled, bool, 0444); >> /* enable/disable SEV-ES support */ >> static bool sev_es_enabled = true; >> module_param_named(sev_es, sev_es_enabled, bool, 0444); >> + >> +/* enable/disable SEV-ES DebugSwap support */ >> +static bool sev_es_debug_swap_enabled = true; >> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); > > Module param needs 0444 permissions, i.e. shouldn't be writable after KVM is > loaded. Though I don't know that providing a module param is warranted in this > case. > KVM provides module params for SEV and SEV-ES because there are legitimate > reasons to turn them off, but at a glance, I don't see why we'd want that for this > feature. /me confused. You suggested this in the first place for (I think) for the case if the feature is found to be broken later on so admins can disable it. And I was using it to verify "x86/debug: Fix stack recursion caused by DR7 accesses" which is convenient but it is a minor thing. >> #else >> #define sev_enabled false >> #define sev_es_enabled false >> +#define sev_es_debug_swap false > > This needs to be sev_es_debug_swap_enabled, otherwise things fall apart with > CONFIG_KVM_AMD_SEV=n. > > arch/x86/kvm/svm/sev.c: In function ‘sev_es_is_debug_swap_enabled’: > arch/x86/kvm/svm/sev.c:69:16: error: ‘sev_es_debug_swap_enabled’ undeclared (first use in this function); did you mean ‘sev_es_is_debug_swap_enabled’? > 69 | return sev_es_debug_swap_enabled; > | ^~~~~~~~~~~~~~~~~~~~~~~~~ > | sev_es_is_debug_swap_enabled > > >> #endif /* CONFIG_KVM_AMD_SEV */ >> >> +bool sev_es_is_debug_swap_enabled(void) >> +{ >> + return sev_es_debug_swap_enabled; >> +} > > ... > >> @@ -604,6 +615,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm) >> save->xss = svm->vcpu.arch.ia32_xss; >> save->dr6 = svm->vcpu.arch.dr6; >> >> + if (sev_es_is_debug_swap_enabled()) >> + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; >> + >> pr_debug("Virtual Machine Save Area (VMSA):\n"); >> print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false); >> >> @@ -2249,6 +2263,9 @@ void __init sev_hardware_setup(void) >> out: >> sev_enabled = sev_supported; >> sev_es_enabled = sev_es_supported; >> + if (sev_es_debug_swap_enabled) >> + sev_es_debug_swap_enabled = sev_es_enabled && >> + cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP); > > Slight preference for: > > if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP)) > sev_es_debug_swap_enabled = false; > > KVM does short-circuit some checks on module param values, but usually only to > avoid additional setup. > >> #endif >> } >> >> @@ -3027,6 +3044,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa) >> >> /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */ >> hostsa->xss = host_xss; >> + >> + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ >> + if (sev_es_is_debug_swap_enabled()) { >> + hostsa->dr0 = native_get_debugreg(0); >> + hostsa->dr1 = native_get_debugreg(1); >> + hostsa->dr2 = native_get_debugreg(2); >> + hostsa->dr3 = native_get_debugreg(3); >> + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); >> + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); >> + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); >> + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); >> + } >> } >> >> void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) >> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >> index 60c7c880266b..6c54a3c9d442 100644 >> --- a/arch/x86/kvm/svm/svm.c >> +++ b/arch/x86/kvm/svm/svm.c >> @@ -1190,7 +1190,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) >> set_exception_intercept(svm, UD_VECTOR); >> set_exception_intercept(svm, MC_VECTOR); >> set_exception_intercept(svm, AC_VECTOR); >> - set_exception_intercept(svm, DB_VECTOR); >> + if (!sev_es_is_debug_swap_enabled()) >> + set_exception_intercept(svm, DB_VECTOR); > > This is wrong. KVM needs to intercept #DBs when debugging non-SEV-ES VMs. Sorry, not following. The #DB intercept for non-SEV-ES is enabled here. Thanks, > This _could_ be tied to X86_FEATURE_NO_NESTED_DATA_BP, but the KVM would need to > toggle the intercept depending on whether or not userspace wants to debug the > guest. > > Similar to the DR7 interception, can this check sev_features directly? -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH kernel v4] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-03 3:37 ` Alexey Kardashevskiy @ 2023-02-03 5:14 ` Alexey Kardashevskiy 2023-02-21 5:19 ` Alexey Kardashevskiy 2023-03-23 17:40 ` Sean Christopherson 2023-03-23 16:39 ` [PATCH kernel v3 2/3] " Sean Christopherson 1 sibling, 2 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-02-03 5:14 UTC (permalink / raw) To: kvm Cc: x86, linux-kernel, Thomas Gleixner, Sean Christopherson, Paolo Bonzini, Ingo Molnar, Dave Hansen, Borislav Petkov, H. Peter Anvin, Alexey Kardashevskiy Prior to SEV-ES, KVM stored/loaded host debug registers upon switching to/from a VM. Changing those registers inside a running SEV VM triggered #VC exit to KVM. SEV-ES added the encrypted state (ES) which uses an encrypted guest page for the VM state (VMSA). The hardware saves/restores certain registers on VMRUN/VMEXIT according to a swap type (A, B, C), see "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual volume 2. AMD Milan (Fam 19h) introduces support for the debug registers swapping. DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped a type B when SEV_FEATURES[5] ("DebugSwap") is set. Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] ("NoNestedDataBp", "Processor ignores nested data breakpoints") is supported by the SOC as otherwise a malicious SEV-ES guest can set up data breakpoints on the #VC IDT entry/stack and cause an infinite loop. Eliminate DR7 and #DB intercepts as: - they are not needed when DebugSwap is supported; - #VC for these intercepts is most likely not supported anyway and kills the VM. Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB loop DoS. While at this, move set_/clr_dr_intercepts to .c and move #DB intercept next to DR7 intercept. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> --- Changes: v4: * removed sev_es_is_debug_swap_enabled() helper * made sev_es_debug_swap_enabled (module param) static * set sev_feature early in sev_es_init_vmcb() and made intercepts dependend on it vs. module param * move set_/clr_dr_intercepts to .c v3: * rewrote the commit log again * rebased on tip/master to use recently defined X86_FEATURE_NO_NESTED_DATA_BP * s/boot_cpu_has/cpu_feature_enabled/ v2: * debug_swap moved from vcpu to module_param * rewrote commit log --- Tested with: === int x; int main(int argc, char *argv[]) { x = 1; return 0; } === gcc -g a.c rsync a.out ruby-954vm:~/ ssh -t ruby-954vm 'gdb -ex "file a.out" -ex "watch x" -ex r' where ruby-954vm is a VM. With "/sys/module/kvm_amd/parameters/debug_swap = 0", gdb does not stop on the watchpoint, with "= 1" - gdb does. --- arch/x86/include/asm/svm.h | 1 + arch/x86/kvm/svm/svm.h | 42 ------------- arch/x86/kvm/svm/sev.c | 24 ++++++++ arch/x86/kvm/svm/svm.c | 65 +++++++++++++++++++- 4 files changed, 87 insertions(+), 45 deletions(-) diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h index cb1ee53ad3b1..665515c7edae 100644 --- a/arch/x86/include/asm/svm.h +++ b/arch/x86/include/asm/svm.h @@ -278,6 +278,7 @@ enum avic_ipi_failure_cause { #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL +#define SVM_SEV_FEAT_DEBUG_SWAP BIT(5) struct vmcb_seg { u16 selector; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index 4826e6cc611b..653fd09929df 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -389,48 +389,6 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3 return test_bit(bit, (unsigned long *)&control->intercepts); } -static inline void set_dr_intercepts(struct vcpu_svm *svm) -{ - struct vmcb *vmcb = svm->vmcb01.ptr; - - if (!sev_es_guest(svm->vcpu.kvm)) { - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); - } - - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); - - recalc_intercepts(svm); -} - -static inline void clr_dr_intercepts(struct vcpu_svm *svm) -{ - struct vmcb *vmcb = svm->vmcb01.ptr; - - vmcb->control.intercepts[INTERCEPT_DR] = 0; - - /* DR7 access must remain intercepted for an SEV-ES guest */ - if (sev_es_guest(svm->vcpu.kvm)) { - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); - } - - recalc_intercepts(svm); -} - static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit) { struct vmcb *vmcb = svm->vmcb01.ptr; diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index 86d6897f4806..af775410c5eb 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -21,6 +21,7 @@ #include <asm/pkru.h> #include <asm/trapnr.h> #include <asm/fpu/xcr.h> +#include <asm/debugreg.h> #include "mmu.h" #include "x86.h" @@ -52,9 +53,14 @@ module_param_named(sev, sev_enabled, bool, 0444); /* enable/disable SEV-ES support */ static bool sev_es_enabled = true; module_param_named(sev_es, sev_es_enabled, bool, 0444); + +/* enable/disable SEV-ES DebugSwap support */ +static bool sev_es_debug_swap_enabled = true; +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); #else #define sev_enabled false #define sev_es_enabled false +#define sev_es_debug_swap false #endif /* CONFIG_KVM_AMD_SEV */ static u8 sev_enc_bit; @@ -2249,6 +2255,8 @@ void __init sev_hardware_setup(void) out: sev_enabled = sev_supported; sev_es_enabled = sev_es_supported; + if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP)) + sev_es_debug_swap_enabled = false; #endif } @@ -2940,6 +2948,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in) static void sev_es_init_vmcb(struct vcpu_svm *svm) { struct kvm_vcpu *vcpu = &svm->vcpu; + struct sev_es_save_area *save = svm->sev_es.vmsa; svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE; svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; @@ -2988,6 +2997,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP)) svm_clr_intercept(svm, INTERCEPT_RDTSCP); } + + if (sev_es_debug_swap_enabled) + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; } void sev_init_vmcb(struct vcpu_svm *svm) @@ -3027,6 +3039,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa) /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */ hostsa->xss = host_xss; + + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ + if (sev_es_debug_swap_enabled) { + hostsa->dr0 = native_get_debugreg(0); + hostsa->dr1 = native_get_debugreg(1); + hostsa->dr2 = native_get_debugreg(2); + hostsa->dr3 = native_get_debugreg(3); + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); + } } void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 60c7c880266b..f8e222bee22a 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -671,6 +671,65 @@ static int svm_cpu_init(int cpu) } +static void set_dr_intercepts(struct vcpu_svm *svm) +{ + struct vmcb *vmcb = svm->vmcb01.ptr; + bool intercept; + + if (!sev_es_guest(svm->vcpu.kvm)) { + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); + } + + if (sev_es_guest(svm->vcpu.kvm)) { + struct sev_es_save_area *save = svm->sev_es.vmsa; + + intercept = !(save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP); + } else { + intercept = true; + } + + if (intercept) { + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); + set_exception_intercept(svm, DB_VECTOR); + } + + recalc_intercepts(svm); +} + +static void clr_dr_intercepts(struct vcpu_svm *svm) +{ + struct vmcb *vmcb = svm->vmcb01.ptr; + struct sev_es_save_area *save = svm->sev_es.vmsa; + + vmcb->control.intercepts[INTERCEPT_DR] = 0; + + /* + * DR7 access must remain intercepted for an SEV-ES guest unless DebugSwap + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM writing to DR7 + * from the #DB handler may trigger infinite loop of #DB's. + */ + if (sev_es_guest(svm->vcpu.kvm) && (save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP)) { + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); + } + + recalc_intercepts(svm); +} + static int direct_access_msr_slot(u32 msr) { u32 i; @@ -1184,13 +1243,11 @@ static void init_vmcb(struct kvm_vcpu *vcpu) if (!kvm_vcpu_apicv_active(vcpu)) svm_set_intercept(svm, INTERCEPT_CR8_WRITE); - set_dr_intercepts(svm); - set_exception_intercept(svm, PF_VECTOR); set_exception_intercept(svm, UD_VECTOR); set_exception_intercept(svm, MC_VECTOR); set_exception_intercept(svm, AC_VECTOR); - set_exception_intercept(svm, DB_VECTOR); + /* * Guest access to VMware backdoor ports could legitimately * trigger #GP because of TSS I/O permission bitmap. @@ -1308,6 +1365,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) if (sev_guest(vcpu->kvm)) sev_init_vmcb(svm); + set_dr_intercepts(svm); + svm_hv_init_vmcb(vmcb); init_vmcb_after_set_cpuid(vcpu); -- 2.39.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v4] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-03 5:14 ` [PATCH kernel v4] " Alexey Kardashevskiy @ 2023-02-21 5:19 ` Alexey Kardashevskiy 2023-03-14 9:43 ` Alexey Kardashevskiy 2023-03-23 17:40 ` Sean Christopherson 1 sibling, 1 reply; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-02-21 5:19 UTC (permalink / raw) To: Sean Christopherson, Borislav Petkov Cc: x86, linux-kernel, Thomas Gleixner, Paolo Bonzini, Ingo Molnar, Dave Hansen, H. Peter Anvin, kvm Ping? Thanks, On 3/2/23 16:14, Alexey Kardashevskiy wrote: > Prior to SEV-ES, KVM stored/loaded host debug registers upon switching > to/from a VM. Changing those registers inside a running SEV VM > triggered #VC exit to KVM. > > SEV-ES added the encrypted state (ES) which uses an encrypted guest page > for the VM state (VMSA). The hardware saves/restores certain registers on > VMRUN/VMEXIT according to a swap type (A, B, C), see > "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual > volume 2. > > AMD Milan (Fam 19h) introduces support for the debug registers swapping. > DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped > a type B when SEV_FEATURES[5] ("DebugSwap") is set. > > Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] > ("NoNestedDataBp", "Processor ignores nested data breakpoints") is > supported by the SOC as otherwise a malicious SEV-ES guest can set up > data breakpoints on the #VC IDT entry/stack and cause an infinite loop. > > Eliminate DR7 and #DB intercepts as: > - they are not needed when DebugSwap is supported; > - #VC for these intercepts is most likely not supported anyway and > kills the VM. > Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB > loop DoS. > > While at this, move set_/clr_dr_intercepts to .c and move #DB intercept > next to DR7 intercept. > > Signed-off-by: Alexey Kardashevskiy <aik@amd.com> > --- > Changes: > v4: > * removed sev_es_is_debug_swap_enabled() helper > * made sev_es_debug_swap_enabled (module param) static > * set sev_feature early in sev_es_init_vmcb() and made intercepts > dependend on it vs. module param > * move set_/clr_dr_intercepts to .c > > v3: > * rewrote the commit log again > * rebased on tip/master to use recently defined X86_FEATURE_NO_NESTED_DATA_BP > * s/boot_cpu_has/cpu_feature_enabled/ > > v2: > * debug_swap moved from vcpu to module_param > * rewrote commit log > > --- > Tested with: > === > int x; > int main(int argc, char *argv[]) > { > x = 1; > return 0; > } > === > gcc -g a.c > rsync a.out ruby-954vm:~/ > ssh -t ruby-954vm 'gdb -ex "file a.out" -ex "watch x" -ex r' > > where ruby-954vm is a VM. > > With "/sys/module/kvm_amd/parameters/debug_swap = 0", gdb does not stop > on the watchpoint, with "= 1" - gdb does. > --- > arch/x86/include/asm/svm.h | 1 + > arch/x86/kvm/svm/svm.h | 42 ------------- > arch/x86/kvm/svm/sev.c | 24 ++++++++ > arch/x86/kvm/svm/svm.c | 65 +++++++++++++++++++- > 4 files changed, 87 insertions(+), 45 deletions(-) > > diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h > index cb1ee53ad3b1..665515c7edae 100644 > --- a/arch/x86/include/asm/svm.h > +++ b/arch/x86/include/asm/svm.h > @@ -278,6 +278,7 @@ enum avic_ipi_failure_cause { > #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) > #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL > > +#define SVM_SEV_FEAT_DEBUG_SWAP BIT(5) > > struct vmcb_seg { > u16 selector; > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h > index 4826e6cc611b..653fd09929df 100644 > --- a/arch/x86/kvm/svm/svm.h > +++ b/arch/x86/kvm/svm/svm.h > @@ -389,48 +389,6 @@ static inline bool vmcb12_is_intercept(struct vmcb_ctrl_area_cached *control, u3 > return test_bit(bit, (unsigned long *)&control->intercepts); > } > > -static inline void set_dr_intercepts(struct vcpu_svm *svm) > -{ > - struct vmcb *vmcb = svm->vmcb01.ptr; > - > - if (!sev_es_guest(svm->vcpu.kvm)) { > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); > - } > - > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > - > - recalc_intercepts(svm); > -} > - > -static inline void clr_dr_intercepts(struct vcpu_svm *svm) > -{ > - struct vmcb *vmcb = svm->vmcb01.ptr; > - > - vmcb->control.intercepts[INTERCEPT_DR] = 0; > - > - /* DR7 access must remain intercepted for an SEV-ES guest */ > - if (sev_es_guest(svm->vcpu.kvm)) { > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > - } > - > - recalc_intercepts(svm); > -} > - > static inline void set_exception_intercept(struct vcpu_svm *svm, u32 bit) > { > struct vmcb *vmcb = svm->vmcb01.ptr; > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c > index 86d6897f4806..af775410c5eb 100644 > --- a/arch/x86/kvm/svm/sev.c > +++ b/arch/x86/kvm/svm/sev.c > @@ -21,6 +21,7 @@ > #include <asm/pkru.h> > #include <asm/trapnr.h> > #include <asm/fpu/xcr.h> > +#include <asm/debugreg.h> > > #include "mmu.h" > #include "x86.h" > @@ -52,9 +53,14 @@ module_param_named(sev, sev_enabled, bool, 0444); > /* enable/disable SEV-ES support */ > static bool sev_es_enabled = true; > module_param_named(sev_es, sev_es_enabled, bool, 0444); > + > +/* enable/disable SEV-ES DebugSwap support */ > +static bool sev_es_debug_swap_enabled = true; > +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); > #else > #define sev_enabled false > #define sev_es_enabled false > +#define sev_es_debug_swap false > #endif /* CONFIG_KVM_AMD_SEV */ > > static u8 sev_enc_bit; > @@ -2249,6 +2255,8 @@ void __init sev_hardware_setup(void) > out: > sev_enabled = sev_supported; > sev_es_enabled = sev_es_supported; > + if (!sev_es_enabled || !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP)) > + sev_es_debug_swap_enabled = false; > #endif > } > > @@ -2940,6 +2948,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in) > static void sev_es_init_vmcb(struct vcpu_svm *svm) > { > struct kvm_vcpu *vcpu = &svm->vcpu; > + struct sev_es_save_area *save = svm->sev_es.vmsa; > > svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE; > svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; > @@ -2988,6 +2997,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) > if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP)) > svm_clr_intercept(svm, INTERCEPT_RDTSCP); > } > + > + if (sev_es_debug_swap_enabled) > + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; > } > > void sev_init_vmcb(struct vcpu_svm *svm) > @@ -3027,6 +3039,18 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa) > > /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host value */ > hostsa->xss = host_xss; > + > + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ > + if (sev_es_debug_swap_enabled) { > + hostsa->dr0 = native_get_debugreg(0); > + hostsa->dr1 = native_get_debugreg(1); > + hostsa->dr2 = native_get_debugreg(2); > + hostsa->dr3 = native_get_debugreg(3); > + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); > + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); > + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); > + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); > + } > } > > void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 60c7c880266b..f8e222bee22a 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -671,6 +671,65 @@ static int svm_cpu_init(int cpu) > > } > > +static void set_dr_intercepts(struct vcpu_svm *svm) > +{ > + struct vmcb *vmcb = svm->vmcb01.ptr; > + bool intercept; > + > + if (!sev_es_guest(svm->vcpu.kvm)) { > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); > + } > + > + if (sev_es_guest(svm->vcpu.kvm)) { > + struct sev_es_save_area *save = svm->sev_es.vmsa; > + > + intercept = !(save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP); > + } else { > + intercept = true; > + } > + > + if (intercept) { > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > + set_exception_intercept(svm, DB_VECTOR); > + } > + > + recalc_intercepts(svm); > +} > + > +static void clr_dr_intercepts(struct vcpu_svm *svm) > +{ > + struct vmcb *vmcb = svm->vmcb01.ptr; > + struct sev_es_save_area *save = svm->sev_es.vmsa; > + > + vmcb->control.intercepts[INTERCEPT_DR] = 0; > + > + /* > + * DR7 access must remain intercepted for an SEV-ES guest unless DebugSwap > + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM writing to DR7 > + * from the #DB handler may trigger infinite loop of #DB's. > + */ > + if (sev_es_guest(svm->vcpu.kvm) && (save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP)) { > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > + } > + > + recalc_intercepts(svm); > +} > + > static int direct_access_msr_slot(u32 msr) > { > u32 i; > @@ -1184,13 +1243,11 @@ static void init_vmcb(struct kvm_vcpu *vcpu) > if (!kvm_vcpu_apicv_active(vcpu)) > svm_set_intercept(svm, INTERCEPT_CR8_WRITE); > > - set_dr_intercepts(svm); > - > set_exception_intercept(svm, PF_VECTOR); > set_exception_intercept(svm, UD_VECTOR); > set_exception_intercept(svm, MC_VECTOR); > set_exception_intercept(svm, AC_VECTOR); > - set_exception_intercept(svm, DB_VECTOR); > + > /* > * Guest access to VMware backdoor ports could legitimately > * trigger #GP because of TSS I/O permission bitmap. > @@ -1308,6 +1365,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) > if (sev_guest(vcpu->kvm)) > sev_init_vmcb(svm); > > + set_dr_intercepts(svm); > + > svm_hv_init_vmcb(vmcb); > init_vmcb_after_set_cpuid(vcpu); > -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v4] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-21 5:19 ` Alexey Kardashevskiy @ 2023-03-14 9:43 ` Alexey Kardashevskiy 2023-03-21 6:56 ` Alexey Kardashevskiy 0 siblings, 1 reply; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-03-14 9:43 UTC (permalink / raw) To: Sean Christopherson, Borislav Petkov Cc: x86, linux-kernel, Thomas Gleixner, Paolo Bonzini, Ingo Molnar, Dave Hansen, H. Peter Anvin, kvm Ping? Thanks, On 21/2/23 16:19, Alexey Kardashevskiy wrote: > Ping? Thanks, > > On 3/2/23 16:14, Alexey Kardashevskiy wrote: >> Prior to SEV-ES, KVM stored/loaded host debug registers upon switching >> to/from a VM. Changing those registers inside a running SEV VM >> triggered #VC exit to KVM. >> >> SEV-ES added the encrypted state (ES) which uses an encrypted guest page >> for the VM state (VMSA). The hardware saves/restores certain registers on >> VMRUN/VMEXIT according to a swap type (A, B, C), see >> "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual >> volume 2. >> >> AMD Milan (Fam 19h) introduces support for the debug registers swapping. >> DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are swapped >> a type B when SEV_FEATURES[5] ("DebugSwap") is set. >> >> Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] >> ("NoNestedDataBp", "Processor ignores nested data breakpoints") is >> supported by the SOC as otherwise a malicious SEV-ES guest can set up >> data breakpoints on the #VC IDT entry/stack and cause an infinite loop. >> >> Eliminate DR7 and #DB intercepts as: >> - they are not needed when DebugSwap is supported; >> - #VC for these intercepts is most likely not supported anyway and >> kills the VM. >> Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite #DB >> loop DoS. >> >> While at this, move set_/clr_dr_intercepts to .c and move #DB intercept >> next to DR7 intercept. >> >> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> >> --- >> Changes: >> v4: >> * removed sev_es_is_debug_swap_enabled() helper >> * made sev_es_debug_swap_enabled (module param) static >> * set sev_feature early in sev_es_init_vmcb() and made intercepts >> dependend on it vs. module param >> * move set_/clr_dr_intercepts to .c >> >> v3: >> * rewrote the commit log again >> * rebased on tip/master to use recently defined >> X86_FEATURE_NO_NESTED_DATA_BP >> * s/boot_cpu_has/cpu_feature_enabled/ >> >> v2: >> * debug_swap moved from vcpu to module_param >> * rewrote commit log >> >> --- >> Tested with: >> === >> int x; >> int main(int argc, char *argv[]) >> { >> x = 1; >> return 0; >> } >> === >> gcc -g a.c >> rsync a.out ruby-954vm:~/ >> ssh -t ruby-954vm 'gdb -ex "file a.out" -ex "watch x" -ex r' >> >> where ruby-954vm is a VM. >> >> With "/sys/module/kvm_amd/parameters/debug_swap = 0", gdb does not stop >> on the watchpoint, with "= 1" - gdb does. >> --- >> arch/x86/include/asm/svm.h | 1 + >> arch/x86/kvm/svm/svm.h | 42 ------------- >> arch/x86/kvm/svm/sev.c | 24 ++++++++ >> arch/x86/kvm/svm/svm.c | 65 +++++++++++++++++++- >> 4 files changed, 87 insertions(+), 45 deletions(-) >> >> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h >> index cb1ee53ad3b1..665515c7edae 100644 >> --- a/arch/x86/include/asm/svm.h >> +++ b/arch/x86/include/asm/svm.h >> @@ -278,6 +278,7 @@ enum avic_ipi_failure_cause { >> #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) >> #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL >> +#define SVM_SEV_FEAT_DEBUG_SWAP BIT(5) >> struct vmcb_seg { >> u16 selector; >> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h >> index 4826e6cc611b..653fd09929df 100644 >> --- a/arch/x86/kvm/svm/svm.h >> +++ b/arch/x86/kvm/svm/svm.h >> @@ -389,48 +389,6 @@ static inline bool vmcb12_is_intercept(struct >> vmcb_ctrl_area_cached *control, u3 >> return test_bit(bit, (unsigned long *)&control->intercepts); >> } >> -static inline void set_dr_intercepts(struct vcpu_svm *svm) >> -{ >> - struct vmcb *vmcb = svm->vmcb01.ptr; >> - >> - if (!sev_es_guest(svm->vcpu.kvm)) { >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); >> - } >> - >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> - >> - recalc_intercepts(svm); >> -} >> - >> -static inline void clr_dr_intercepts(struct vcpu_svm *svm) >> -{ >> - struct vmcb *vmcb = svm->vmcb01.ptr; >> - >> - vmcb->control.intercepts[INTERCEPT_DR] = 0; >> - >> - /* DR7 access must remain intercepted for an SEV-ES guest */ >> - if (sev_es_guest(svm->vcpu.kvm)) { >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> - } >> - >> - recalc_intercepts(svm); >> -} >> - >> static inline void set_exception_intercept(struct vcpu_svm *svm, u32 >> bit) >> { >> struct vmcb *vmcb = svm->vmcb01.ptr; >> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c >> index 86d6897f4806..af775410c5eb 100644 >> --- a/arch/x86/kvm/svm/sev.c >> +++ b/arch/x86/kvm/svm/sev.c >> @@ -21,6 +21,7 @@ >> #include <asm/pkru.h> >> #include <asm/trapnr.h> >> #include <asm/fpu/xcr.h> >> +#include <asm/debugreg.h> >> #include "mmu.h" >> #include "x86.h" >> @@ -52,9 +53,14 @@ module_param_named(sev, sev_enabled, bool, 0444); >> /* enable/disable SEV-ES support */ >> static bool sev_es_enabled = true; >> module_param_named(sev_es, sev_es_enabled, bool, 0444); >> + >> +/* enable/disable SEV-ES DebugSwap support */ >> +static bool sev_es_debug_swap_enabled = true; >> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); >> #else >> #define sev_enabled false >> #define sev_es_enabled false >> +#define sev_es_debug_swap false >> #endif /* CONFIG_KVM_AMD_SEV */ >> static u8 sev_enc_bit; >> @@ -2249,6 +2255,8 @@ void __init sev_hardware_setup(void) >> out: >> sev_enabled = sev_supported; >> sev_es_enabled = sev_es_supported; >> + if (!sev_es_enabled || >> !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP)) >> + sev_es_debug_swap_enabled = false; >> #endif >> } >> @@ -2940,6 +2948,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int >> size, unsigned int port, int in) >> static void sev_es_init_vmcb(struct vcpu_svm *svm) >> { >> struct kvm_vcpu *vcpu = &svm->vcpu; >> + struct sev_es_save_area *save = svm->sev_es.vmsa; >> svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE; >> svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; >> @@ -2988,6 +2997,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) >> if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP)) >> svm_clr_intercept(svm, INTERCEPT_RDTSCP); >> } >> + >> + if (sev_es_debug_swap_enabled) >> + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; >> } >> void sev_init_vmcb(struct vcpu_svm *svm) >> @@ -3027,6 +3039,18 @@ void sev_es_prepare_switch_to_guest(struct >> sev_es_save_area *hostsa) >> /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host >> value */ >> hostsa->xss = host_xss; >> + >> + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ >> + if (sev_es_debug_swap_enabled) { >> + hostsa->dr0 = native_get_debugreg(0); >> + hostsa->dr1 = native_get_debugreg(1); >> + hostsa->dr2 = native_get_debugreg(2); >> + hostsa->dr3 = native_get_debugreg(3); >> + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); >> + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); >> + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); >> + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); >> + } >> } >> void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) >> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >> index 60c7c880266b..f8e222bee22a 100644 >> --- a/arch/x86/kvm/svm/svm.c >> +++ b/arch/x86/kvm/svm/svm.c >> @@ -671,6 +671,65 @@ static int svm_cpu_init(int cpu) >> } >> +static void set_dr_intercepts(struct vcpu_svm *svm) >> +{ >> + struct vmcb *vmcb = svm->vmcb01.ptr; >> + bool intercept; >> + >> + if (!sev_es_guest(svm->vcpu.kvm)) { >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); >> + } >> + >> + if (sev_es_guest(svm->vcpu.kvm)) { >> + struct sev_es_save_area *save = svm->sev_es.vmsa; >> + >> + intercept = !(save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP); >> + } else { >> + intercept = true; >> + } >> + >> + if (intercept) { >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> + set_exception_intercept(svm, DB_VECTOR); >> + } >> + >> + recalc_intercepts(svm); >> +} >> + >> +static void clr_dr_intercepts(struct vcpu_svm *svm) >> +{ >> + struct vmcb *vmcb = svm->vmcb01.ptr; >> + struct sev_es_save_area *save = svm->sev_es.vmsa; >> + >> + vmcb->control.intercepts[INTERCEPT_DR] = 0; >> + >> + /* >> + * DR7 access must remain intercepted for an SEV-ES guest unless >> DebugSwap >> + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM >> writing to DR7 >> + * from the #DB handler may trigger infinite loop of #DB's. >> + */ >> + if (sev_es_guest(svm->vcpu.kvm) && (save->sev_features & >> SVM_SEV_FEAT_DEBUG_SWAP)) { >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >> + } >> + >> + recalc_intercepts(svm); >> +} >> + >> static int direct_access_msr_slot(u32 msr) >> { >> u32 i; >> @@ -1184,13 +1243,11 @@ static void init_vmcb(struct kvm_vcpu *vcpu) >> if (!kvm_vcpu_apicv_active(vcpu)) >> svm_set_intercept(svm, INTERCEPT_CR8_WRITE); >> - set_dr_intercepts(svm); >> - >> set_exception_intercept(svm, PF_VECTOR); >> set_exception_intercept(svm, UD_VECTOR); >> set_exception_intercept(svm, MC_VECTOR); >> set_exception_intercept(svm, AC_VECTOR); >> - set_exception_intercept(svm, DB_VECTOR); >> + >> /* >> * Guest access to VMware backdoor ports could legitimately >> * trigger #GP because of TSS I/O permission bitmap. >> @@ -1308,6 +1365,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) >> if (sev_guest(vcpu->kvm)) >> sev_init_vmcb(svm); >> + set_dr_intercepts(svm); >> + >> svm_hv_init_vmcb(vmcb); >> init_vmcb_after_set_cpuid(vcpu); > -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v4] KVM: SEV: Enable data breakpoints in SEV-ES 2023-03-14 9:43 ` Alexey Kardashevskiy @ 2023-03-21 6:56 ` Alexey Kardashevskiy 0 siblings, 0 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-03-21 6:56 UTC (permalink / raw) To: Sean Christopherson, Borislav Petkov Cc: x86, linux-kernel, Thomas Gleixner, Paolo Bonzini, Ingo Molnar, Dave Hansen, H. Peter Anvin, kvm Ping? (I am told that pinging once a week is ok) Thanks, On 14/3/23 20:43, Alexey Kardashevskiy wrote: > Ping? Thanks, > > > On 21/2/23 16:19, Alexey Kardashevskiy wrote: >> Ping? Thanks, >> >> On 3/2/23 16:14, Alexey Kardashevskiy wrote: >>> Prior to SEV-ES, KVM stored/loaded host debug registers upon switching >>> to/from a VM. Changing those registers inside a running SEV VM >>> triggered #VC exit to KVM. >>> >>> SEV-ES added the encrypted state (ES) which uses an encrypted guest page >>> for the VM state (VMSA). The hardware saves/restores certain >>> registers on >>> VMRUN/VMEXIT according to a swap type (A, B, C), see >>> "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual >>> volume 2. >>> >>> AMD Milan (Fam 19h) introduces support for the debug registers swapping. >>> DR6 and DR7 are always swapped. DR[0-3] and DR[0-3]_ADDR_MASK are >>> swapped >>> a type B when SEV_FEATURES[5] ("DebugSwap") is set. >>> >>> Enable DebugSwap in VMSA. But only do so if CPUID Fn80000021_EAX[0] >>> ("NoNestedDataBp", "Processor ignores nested data breakpoints") is >>> supported by the SOC as otherwise a malicious SEV-ES guest can set up >>> data breakpoints on the #VC IDT entry/stack and cause an infinite loop. >>> >>> Eliminate DR7 and #DB intercepts as: >>> - they are not needed when DebugSwap is supported; >>> - #VC for these intercepts is most likely not supported anyway and >>> kills the VM. >>> Keep DR7 intercepted unless DebugSwap enabled to prevent the infinite >>> #DB >>> loop DoS. >>> >>> While at this, move set_/clr_dr_intercepts to .c and move #DB intercept >>> next to DR7 intercept. >>> >>> Signed-off-by: Alexey Kardashevskiy <aik@amd.com> >>> --- >>> Changes: >>> v4: >>> * removed sev_es_is_debug_swap_enabled() helper >>> * made sev_es_debug_swap_enabled (module param) static >>> * set sev_feature early in sev_es_init_vmcb() and made intercepts >>> dependend on it vs. module param >>> * move set_/clr_dr_intercepts to .c >>> >>> v3: >>> * rewrote the commit log again >>> * rebased on tip/master to use recently defined >>> X86_FEATURE_NO_NESTED_DATA_BP >>> * s/boot_cpu_has/cpu_feature_enabled/ >>> >>> v2: >>> * debug_swap moved from vcpu to module_param >>> * rewrote commit log >>> >>> --- >>> Tested with: >>> === >>> int x; >>> int main(int argc, char *argv[]) >>> { >>> x = 1; >>> return 0; >>> } >>> === >>> gcc -g a.c >>> rsync a.out ruby-954vm:~/ >>> ssh -t ruby-954vm 'gdb -ex "file a.out" -ex "watch x" -ex r' >>> >>> where ruby-954vm is a VM. >>> >>> With "/sys/module/kvm_amd/parameters/debug_swap = 0", gdb does not stop >>> on the watchpoint, with "= 1" - gdb does. >>> --- >>> arch/x86/include/asm/svm.h | 1 + >>> arch/x86/kvm/svm/svm.h | 42 ------------- >>> arch/x86/kvm/svm/sev.c | 24 ++++++++ >>> arch/x86/kvm/svm/svm.c | 65 +++++++++++++++++++- >>> 4 files changed, 87 insertions(+), 45 deletions(-) >>> >>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h >>> index cb1ee53ad3b1..665515c7edae 100644 >>> --- a/arch/x86/include/asm/svm.h >>> +++ b/arch/x86/include/asm/svm.h >>> @@ -278,6 +278,7 @@ enum avic_ipi_failure_cause { >>> #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF) >>> #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL >>> +#define SVM_SEV_FEAT_DEBUG_SWAP BIT(5) >>> struct vmcb_seg { >>> u16 selector; >>> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h >>> index 4826e6cc611b..653fd09929df 100644 >>> --- a/arch/x86/kvm/svm/svm.h >>> +++ b/arch/x86/kvm/svm/svm.h >>> @@ -389,48 +389,6 @@ static inline bool vmcb12_is_intercept(struct >>> vmcb_ctrl_area_cached *control, u3 >>> return test_bit(bit, (unsigned long *)&control->intercepts); >>> } >>> -static inline void set_dr_intercepts(struct vcpu_svm *svm) >>> -{ >>> - struct vmcb *vmcb = svm->vmcb01.ptr; >>> - >>> - if (!sev_es_guest(svm->vcpu.kvm)) { >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); >>> - } >>> - >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >>> - >>> - recalc_intercepts(svm); >>> -} >>> - >>> -static inline void clr_dr_intercepts(struct vcpu_svm *svm) >>> -{ >>> - struct vmcb *vmcb = svm->vmcb01.ptr; >>> - >>> - vmcb->control.intercepts[INTERCEPT_DR] = 0; >>> - >>> - /* DR7 access must remain intercepted for an SEV-ES guest */ >>> - if (sev_es_guest(svm->vcpu.kvm)) { >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >>> - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >>> - } >>> - >>> - recalc_intercepts(svm); >>> -} >>> - >>> static inline void set_exception_intercept(struct vcpu_svm *svm, >>> u32 bit) >>> { >>> struct vmcb *vmcb = svm->vmcb01.ptr; >>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c >>> index 86d6897f4806..af775410c5eb 100644 >>> --- a/arch/x86/kvm/svm/sev.c >>> +++ b/arch/x86/kvm/svm/sev.c >>> @@ -21,6 +21,7 @@ >>> #include <asm/pkru.h> >>> #include <asm/trapnr.h> >>> #include <asm/fpu/xcr.h> >>> +#include <asm/debugreg.h> >>> #include "mmu.h" >>> #include "x86.h" >>> @@ -52,9 +53,14 @@ module_param_named(sev, sev_enabled, bool, 0444); >>> /* enable/disable SEV-ES support */ >>> static bool sev_es_enabled = true; >>> module_param_named(sev_es, sev_es_enabled, bool, 0444); >>> + >>> +/* enable/disable SEV-ES DebugSwap support */ >>> +static bool sev_es_debug_swap_enabled = true; >>> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); >>> #else >>> #define sev_enabled false >>> #define sev_es_enabled false >>> +#define sev_es_debug_swap false >>> #endif /* CONFIG_KVM_AMD_SEV */ >>> static u8 sev_enc_bit; >>> @@ -2249,6 +2255,8 @@ void __init sev_hardware_setup(void) >>> out: >>> sev_enabled = sev_supported; >>> sev_es_enabled = sev_es_supported; >>> + if (!sev_es_enabled || >>> !cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP)) >>> + sev_es_debug_swap_enabled = false; >>> #endif >>> } >>> @@ -2940,6 +2948,7 @@ int sev_es_string_io(struct vcpu_svm *svm, int >>> size, unsigned int port, int in) >>> static void sev_es_init_vmcb(struct vcpu_svm *svm) >>> { >>> struct kvm_vcpu *vcpu = &svm->vcpu; >>> + struct sev_es_save_area *save = svm->sev_es.vmsa; >>> svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ES_ENABLE; >>> svm->vmcb->control.virt_ext |= LBR_CTL_ENABLE_MASK; >>> @@ -2988,6 +2997,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) >>> if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP)) >>> svm_clr_intercept(svm, INTERCEPT_RDTSCP); >>> } >>> + >>> + if (sev_es_debug_swap_enabled) >>> + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; >>> } >>> void sev_init_vmcb(struct vcpu_svm *svm) >>> @@ -3027,6 +3039,18 @@ void sev_es_prepare_switch_to_guest(struct >>> sev_es_save_area *hostsa) >>> /* MSR_IA32_XSS is restored on VMEXIT, save the currnet host >>> value */ >>> hostsa->xss = host_xss; >>> + >>> + /* The DebugSwap SEV feature does Type B swaps of DR[0-3] */ >>> + if (sev_es_debug_swap_enabled) { >>> + hostsa->dr0 = native_get_debugreg(0); >>> + hostsa->dr1 = native_get_debugreg(1); >>> + hostsa->dr2 = native_get_debugreg(2); >>> + hostsa->dr3 = native_get_debugreg(3); >>> + hostsa->dr0_addr_mask = amd_get_dr_addr_mask(0); >>> + hostsa->dr1_addr_mask = amd_get_dr_addr_mask(1); >>> + hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2); >>> + hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3); >>> + } >>> } >>> void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) >>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >>> index 60c7c880266b..f8e222bee22a 100644 >>> --- a/arch/x86/kvm/svm/svm.c >>> +++ b/arch/x86/kvm/svm/svm.c >>> @@ -671,6 +671,65 @@ static int svm_cpu_init(int cpu) >>> } >>> +static void set_dr_intercepts(struct vcpu_svm *svm) >>> +{ >>> + struct vmcb *vmcb = svm->vmcb01.ptr; >>> + bool intercept; >>> + >>> + if (!sev_es_guest(svm->vcpu.kvm)) { >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); >>> + } >>> + >>> + if (sev_es_guest(svm->vcpu.kvm)) { >>> + struct sev_es_save_area *save = svm->sev_es.vmsa; >>> + >>> + intercept = !(save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP); >>> + } else { >>> + intercept = true; >>> + } >>> + >>> + if (intercept) { >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >>> + set_exception_intercept(svm, DB_VECTOR); >>> + } >>> + >>> + recalc_intercepts(svm); >>> +} >>> + >>> +static void clr_dr_intercepts(struct vcpu_svm *svm) >>> +{ >>> + struct vmcb *vmcb = svm->vmcb01.ptr; >>> + struct sev_es_save_area *save = svm->sev_es.vmsa; >>> + >>> + vmcb->control.intercepts[INTERCEPT_DR] = 0; >>> + >>> + /* >>> + * DR7 access must remain intercepted for an SEV-ES guest unless >>> DebugSwap >>> + * (depends on NO_NESTED_DATA_BP) is enabled as otherwise a VM >>> writing to DR7 >>> + * from the #DB handler may trigger infinite loop of #DB's. >>> + */ >>> + if (sev_es_guest(svm->vcpu.kvm) && (save->sev_features & >>> SVM_SEV_FEAT_DEBUG_SWAP)) { >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); >>> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); >>> + } >>> + >>> + recalc_intercepts(svm); >>> +} >>> + >>> static int direct_access_msr_slot(u32 msr) >>> { >>> u32 i; >>> @@ -1184,13 +1243,11 @@ static void init_vmcb(struct kvm_vcpu *vcpu) >>> if (!kvm_vcpu_apicv_active(vcpu)) >>> svm_set_intercept(svm, INTERCEPT_CR8_WRITE); >>> - set_dr_intercepts(svm); >>> - >>> set_exception_intercept(svm, PF_VECTOR); >>> set_exception_intercept(svm, UD_VECTOR); >>> set_exception_intercept(svm, MC_VECTOR); >>> set_exception_intercept(svm, AC_VECTOR); >>> - set_exception_intercept(svm, DB_VECTOR); >>> + >>> /* >>> * Guest access to VMware backdoor ports could legitimately >>> * trigger #GP because of TSS I/O permission bitmap. >>> @@ -1308,6 +1365,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) >>> if (sev_guest(vcpu->kvm)) >>> sev_init_vmcb(svm); >>> + set_dr_intercepts(svm); >>> + >>> svm_hv_init_vmcb(vmcb); >>> init_vmcb_after_set_cpuid(vcpu); >> > -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v4] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-03 5:14 ` [PATCH kernel v4] " Alexey Kardashevskiy 2023-02-21 5:19 ` Alexey Kardashevskiy @ 2023-03-23 17:40 ` Sean Christopherson 2023-03-29 15:13 ` Tom Lendacky 1 sibling, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2023-03-23 17:40 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Thomas Gleixner, Paolo Bonzini, Ingo Molnar, Dave Hansen, Borislav Petkov, H. Peter Anvin On Fri, Feb 03, 2023, Alexey Kardashevskiy wrote: > While at this, move set_/clr_dr_intercepts to .c and move #DB intercept > next to DR7 intercept. Please do non-trivial code movement in separate patches unless the functional change is trivial. Moving and changing at the same time makes the patch difficult to review. > @@ -52,9 +53,14 @@ module_param_named(sev, sev_enabled, bool, 0444); > /* enable/disable SEV-ES support */ > static bool sev_es_enabled = true; > module_param_named(sev_es, sev_es_enabled, bool, 0444); > + > +/* enable/disable SEV-ES DebugSwap support */ > +static bool sev_es_debug_swap_enabled = true; > +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); Needs to be 0444, otherwise userspace can turn on the knob after KVM is loaded, which would allow enabling the feature on unsupported platforms, amongst many other problems. > void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index 60c7c880266b..f8e222bee22a 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -671,6 +671,65 @@ static int svm_cpu_init(int cpu) > > } > > +static void set_dr_intercepts(struct vcpu_svm *svm) > +{ > + struct vmcb *vmcb = svm->vmcb01.ptr; > + bool intercept; > + > + if (!sev_es_guest(svm->vcpu.kvm)) { > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); > + } > + > + if (sev_es_guest(svm->vcpu.kvm)) { > + struct sev_es_save_area *save = svm->sev_es.vmsa; > + > + intercept = !(save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP); Blech, the VMCB vs. SEV and SEV-ES code is a mess. E.g. init_vmcb() does /* * Guest access to VMware backdoor ports could legitimately * trigger #GP because of TSS I/O permission bitmap. * We intercept those #GP and allow access to them anyway * as VMware does. Don't intercept #GP for SEV guests as KVM can't * decrypt guest memory to decode the faulting instruction. */ if (enable_vmware_backdoor && !sev_guest(vcpu->kvm)) set_exception_intercept(svm, GP_VECTOR); but then sev_es_init_vmcb() also does: /* No support for enable_vmware_backdoor */ clr_exception_intercept(svm, GP_VECTOR); DR interception is a similar trainwreck. svm_sync_dirty_debug_regs() bails if guest_state_protected is true, i.e. is a nop for SEV-ES guests, but only after the vCPU has done LAUNCH_UPDATE_VMSA. IIUC, that's nonsensical because even before guest state is encrypted, #DB will be reflected as #VC into the guest. And, again IIUC, except for DR7, DRs are never intercepted for SEV-ES guests and so trying to debug from the host is futile as the guest can clobber DRs at any time. Similarly, flowing into dr_interception() on an SEV-ES VMGEXITis just dumb. KVM _knows_ it can't give the guest control of DR7, but it mucks with the intercepts anyways. That the GHCB spec even allows SVM_EXIT_{READ,WRITE}_DR7 is just asinine, but that's a moot point. Anyways, the GHCB spec's "suggestion" effectively says KVM's responsibility is purely to make a read of DR7 return the last written value. And of course KVM's disaster of a flow doesn't even do that unless the host is debugging the guest. Currently, hardware debug traps aren’t supported for an SEV-ES guest. The hypervisor must set the intercept for both read and write of the debug control register (DR7). With the intercepts in place, the #VC handler will be invoked when the guest accesses DR7. For a write to DR7, the #VC handler should perform Standard VMGExit processing. The #VC handler must not update the actual DR7 register, but rather it should cache the DR7 value being written. I bring this up because of the subtle dependency that checking SVM_SEV_FEAT_DEBUG_SWAP creates: set_dr_intercepts() needs to be called after sev_init_vmcb(). I believe this approach also fails to handle intrahost migration; at the very least, what exactly will happen when sev_migrate_from() invokes sev_init_vmcb() is unclear. And I really don't want to pile even more gunk on top of the existing mess. So, can you (and by "you" I really mean "the folks at AMD working on SEV stuff") start with the below diff (not intended to be a single patch), disallow kvm_arch_vcpu_ioctl_set_guest_debug() entirely for SEV-ES guests (will likely take some back and forth to figure out how we want to do this), and then fill in the blanks? I.e. get KVM to a state where all the intercept shenanigans for SEV and SEV-ES are reasonably contained in sev.c, and then enable the debug_swap stuff on top? diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index c25aeb550cd9..ff7a4d68731c 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -2968,8 +2968,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) svm_set_intercept(svm, TRAP_CR4_WRITE); svm_set_intercept(svm, TRAP_CR8_WRITE); - /* No support for enable_vmware_backdoor */ - clr_exception_intercept(svm, GP_VECTOR); + <debug register stuff goes here> /* Can't intercept XSETBV, HV can't modify XCR0 directly */ svm_clr_intercept(svm, INTERCEPT_XSETBV); @@ -2996,6 +2995,12 @@ void sev_init_vmcb(struct vcpu_svm *svm) svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ENABLE; clr_exception_intercept(svm, UD_VECTOR); + /* + * Don't intercept #GP for SEV guests, e.g. for the VMware backdoor, as + * KVM can't decrypt guest memory to decode the faulting instruction. + */ + clr_exception_intercept(svm, GP_VECTOR); + if (sev_es_guest(svm->vcpu.kvm)) sev_es_init_vmcb(svm); } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index e0ec95f1f068..89753d7fd821 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1209,10 +1209,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu) * Guest access to VMware backdoor ports could legitimately * trigger #GP because of TSS I/O permission bitmap. * We intercept those #GP and allow access to them anyway - * as VMware does. Don't intercept #GP for SEV guests as KVM can't - * decrypt guest memory to decode the faulting instruction. + * as VMware does. */ - if (enable_vmware_backdoor && !sev_guest(vcpu->kvm)) + if (enable_vmware_backdoor) set_exception_intercept(svm, GP_VECTOR); svm_set_intercept(svm, INTERCEPT_INTR); @@ -1950,7 +1949,7 @@ static void svm_sync_dirty_debug_regs(struct kvm_vcpu *vcpu) { struct vcpu_svm *svm = to_svm(vcpu); - if (vcpu->arch.guest_state_protected) + if (WARN_ON_ONCE(sev_es_guest(vcpu->kvm))) return; get_debugreg(vcpu->arch.db[0], 0); @@ -2681,7 +2680,7 @@ static int dr_interception(struct kvm_vcpu *vcpu) unsigned long val; int err = 0; - if (vcpu->guest_debug == 0) { + if (vcpu->guest_debug == 0 && !sev_es_guest(vcpu->kvm)) { /* * No more DR vmexits; force a reload of the debug registers * and reenter on this instruction. The next vmexit will diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index f44751dd8d5d..7c99a7d55476 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -409,23 +409,25 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm) { struct vmcb *vmcb = svm->vmcb01.ptr; - if (!sev_es_guest(svm->vcpu.kvm)) { - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); + if (sev_es_guest(svm->vcpu.kvm)) { + WARN_ON_ONCE(svm->vcpu.arch.last_vmentry_cpu != -1); + return; } + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); @@ -436,13 +438,13 @@ static inline void clr_dr_intercepts(struct vcpu_svm *svm) { struct vmcb *vmcb = svm->vmcb01.ptr; + if (WARN_ON_ONCE(sev_es_guest(svm->vcpu.kvm))) + return; + vmcb->control.intercepts[INTERCEPT_DR] = 0; - /* DR7 access must remain intercepted for an SEV-ES guest */ - if (sev_es_guest(svm->vcpu.kvm)) { - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); - } + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); recalc_intercepts(svm); } ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v4] KVM: SEV: Enable data breakpoints in SEV-ES 2023-03-23 17:40 ` Sean Christopherson @ 2023-03-29 15:13 ` Tom Lendacky 0 siblings, 0 replies; 28+ messages in thread From: Tom Lendacky @ 2023-03-29 15:13 UTC (permalink / raw) To: Sean Christopherson, Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Thomas Gleixner, Paolo Bonzini, Ingo Molnar, Dave Hansen, Borislav Petkov, H. Peter Anvin On 3/23/23 12:40, Sean Christopherson wrote: > On Fri, Feb 03, 2023, Alexey Kardashevskiy wrote: >> While at this, move set_/clr_dr_intercepts to .c and move #DB intercept >> next to DR7 intercept. > > Please do non-trivial code movement in separate patches unless the functional change > is trivial. Moving and changing at the same time makes the patch difficult to review. > >> @@ -52,9 +53,14 @@ module_param_named(sev, sev_enabled, bool, 0444); >> /* enable/disable SEV-ES support */ >> static bool sev_es_enabled = true; >> module_param_named(sev_es, sev_es_enabled, bool, 0444); >> + >> +/* enable/disable SEV-ES DebugSwap support */ >> +static bool sev_es_debug_swap_enabled = true; >> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); > > Needs to be 0444, otherwise userspace can turn on the knob after KVM is loaded, > which would allow enabling the feature on unsupported platforms, amongst many > other problems. > >> void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector) >> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >> index 60c7c880266b..f8e222bee22a 100644 >> --- a/arch/x86/kvm/svm/svm.c >> +++ b/arch/x86/kvm/svm/svm.c >> @@ -671,6 +671,65 @@ static int svm_cpu_init(int cpu) >> >> } >> >> +static void set_dr_intercepts(struct vcpu_svm *svm) >> +{ >> + struct vmcb *vmcb = svm->vmcb01.ptr; >> + bool intercept; >> + >> + if (!sev_es_guest(svm->vcpu.kvm)) { >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); >> + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); >> + } >> + >> + if (sev_es_guest(svm->vcpu.kvm)) { >> + struct sev_es_save_area *save = svm->sev_es.vmsa; >> + >> + intercept = !(save->sev_features & SVM_SEV_FEAT_DEBUG_SWAP); > > Blech, the VMCB vs. SEV and SEV-ES code is a mess. E.g. init_vmcb() does > > /* > * Guest access to VMware backdoor ports could legitimately > * trigger #GP because of TSS I/O permission bitmap. > * We intercept those #GP and allow access to them anyway > * as VMware does. Don't intercept #GP for SEV guests as KVM can't > * decrypt guest memory to decode the faulting instruction. > */ > if (enable_vmware_backdoor && !sev_guest(vcpu->kvm)) > set_exception_intercept(svm, GP_VECTOR); > > but then sev_es_init_vmcb() also does: > > /* No support for enable_vmware_backdoor */ > clr_exception_intercept(svm, GP_VECTOR); > > DR interception is a similar trainwreck. svm_sync_dirty_debug_regs() bails if > guest_state_protected is true, i.e. is a nop for SEV-ES guests, but only after > the vCPU has done LAUNCH_UPDATE_VMSA. IIUC, that's nonsensical because even before > guest state is encrypted, #DB will be reflected as #VC into the guest. And, again A guest can't run before the LAUNCH_UPDATE process is complete, so there can't be a #VC before guest_state_proteced is true. > IIUC, except for DR7, DRs are never intercepted for SEV-ES guests and so trying > to debug from the host is futile as the guest can clobber DRs at any time. > > Similarly, flowing into dr_interception() on an SEV-ES VMGEXITis just dumb. KVM > _knows_ it can't give the guest control of DR7, but it mucks with the intercepts > anyways. That the GHCB spec even allows SVM_EXIT_{READ,WRITE}_DR7 is just asinine, > but that's a moot point. Anyways, the GHCB spec's "suggestion" effectively says > KVM's responsibility is purely to make a read of DR7 return the last written value. That's not KVM's responsibility, that is the responsibility of the guest #VC handler. So a DR7 read, while intercepted, should never get to KVM. > And of course KVM's disaster of a flow doesn't even do that unless the host is > debugging the guest. > > Currently, hardware debug traps aren’t supported for an SEV-ES guest. The hypervisor > must set the intercept for both read and write of the debug control register (DR7). > With the intercepts in place, the #VC handler will be invoked when the guest accesses > DR7. For a write to DR7, the #VC handler should perform Standard VMGExit processing. > The #VC handler must not update the actual DR7 register, but rather it should cache > the DR7 value being written. > > I bring this up because of the subtle dependency that checking SVM_SEV_FEAT_DEBUG_SWAP > creates: set_dr_intercepts() needs to be called after sev_init_vmcb(). I believe > this approach also fails to handle intrahost migration; at the very least, what > exactly will happen when sev_migrate_from() invokes sev_init_vmcb() is unclear. > And I really don't want to pile even more gunk on top of the existing mess. > > So, can you (and by "you" I really mean "the folks at AMD working on SEV stuff") > start with the below diff (not intended to be a single patch), disallow > kvm_arch_vcpu_ioctl_set_guest_debug() entirely for SEV-ES guests (will likely > take some back and forth to figure out how we want to do this), and then fill > in the blanks? I.e. get KVM to a state where all the intercept shenanigans for > SEV and SEV-ES are reasonably contained in sev.c, and then enable the debug_swap > stuff on top? > > diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c > index c25aeb550cd9..ff7a4d68731c 100644 > --- a/arch/x86/kvm/svm/sev.c > +++ b/arch/x86/kvm/svm/sev.c > @@ -2968,8 +2968,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) > svm_set_intercept(svm, TRAP_CR4_WRITE); > svm_set_intercept(svm, TRAP_CR8_WRITE); > > - /* No support for enable_vmware_backdoor */ > - clr_exception_intercept(svm, GP_VECTOR); > + <debug register stuff goes here> > > /* Can't intercept XSETBV, HV can't modify XCR0 directly */ > svm_clr_intercept(svm, INTERCEPT_XSETBV); > @@ -2996,6 +2995,12 @@ void sev_init_vmcb(struct vcpu_svm *svm) > svm->vmcb->control.nested_ctl |= SVM_NESTED_CTL_SEV_ENABLE; > clr_exception_intercept(svm, UD_VECTOR); > > + /* > + * Don't intercept #GP for SEV guests, e.g. for the VMware backdoor, as > + * KVM can't decrypt guest memory to decode the faulting instruction. > + */ > + clr_exception_intercept(svm, GP_VECTOR); > + > if (sev_es_guest(svm->vcpu.kvm)) > sev_es_init_vmcb(svm); > } > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > index e0ec95f1f068..89753d7fd821 100644 > --- a/arch/x86/kvm/svm/svm.c > +++ b/arch/x86/kvm/svm/svm.c > @@ -1209,10 +1209,9 @@ static void init_vmcb(struct kvm_vcpu *vcpu) > * Guest access to VMware backdoor ports could legitimately > * trigger #GP because of TSS I/O permission bitmap. > * We intercept those #GP and allow access to them anyway > - * as VMware does. Don't intercept #GP for SEV guests as KVM can't > - * decrypt guest memory to decode the faulting instruction. > + * as VMware does. > */ > - if (enable_vmware_backdoor && !sev_guest(vcpu->kvm)) > + if (enable_vmware_backdoor) > set_exception_intercept(svm, GP_VECTOR); > > svm_set_intercept(svm, INTERCEPT_INTR); > @@ -1950,7 +1949,7 @@ static void svm_sync_dirty_debug_regs(struct kvm_vcpu *vcpu) > { > struct vcpu_svm *svm = to_svm(vcpu); > > - if (vcpu->arch.guest_state_protected) > + if (WARN_ON_ONCE(sev_es_guest(vcpu->kvm))) > return; > > get_debugreg(vcpu->arch.db[0], 0); > @@ -2681,7 +2680,7 @@ static int dr_interception(struct kvm_vcpu *vcpu) > unsigned long val; > int err = 0; > > - if (vcpu->guest_debug == 0) { > + if (vcpu->guest_debug == 0 && !sev_es_guest(vcpu->kvm)) { This will change the current flow of an SEV-ES guest. With SEV-ES, vcpu->guest_debug can never be anything other than 0 and currently always takes this path. So what is really needed is: if (vcpu->guest_debug == 0) { if (!sev_es_guest(vcpu->kvm)) { ... } return 1; } > /* > * No more DR vmexits; force a reload of the debug registers > * and reenter on this instruction. The next vmexit will > diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h > index f44751dd8d5d..7c99a7d55476 100644 > --- a/arch/x86/kvm/svm/svm.h > +++ b/arch/x86/kvm/svm/svm.h > @@ -409,23 +409,25 @@ static inline void set_dr_intercepts(struct vcpu_svm *svm) > { > struct vmcb *vmcb = svm->vmcb01.ptr; > > - if (!sev_es_guest(svm->vcpu.kvm)) { > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); > + if (sev_es_guest(svm->vcpu.kvm)) { > + WARN_ON_ONCE(svm->vcpu.arch.last_vmentry_cpu != -1); > + return; > } > > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR0_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR1_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR2_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR3_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR4_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR5_WRITE); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR6_WRITE); > vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > > @@ -436,13 +438,13 @@ static inline void clr_dr_intercepts(struct vcpu_svm *svm) > { > struct vmcb *vmcb = svm->vmcb01.ptr; > > + if (WARN_ON_ONCE(sev_es_guest(svm->vcpu.kvm))) > + return; > + > vmcb->control.intercepts[INTERCEPT_DR] = 0; > > - /* DR7 access must remain intercepted for an SEV-ES guest */ > - if (sev_es_guest(svm->vcpu.kvm)) { > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > - vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); > - } > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_READ); > + vmcb_set_intercept(&vmcb->control, INTERCEPT_DR7_WRITE); If we never call clr_dr_intercepts() anymore for an SEV-ES guest, then the above two lines should be removed. They only were executed for an SEV-ES guest and now they would be executed for any guest. Thanks, Tom > > recalc_intercepts(svm); > } > ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-02-03 3:37 ` Alexey Kardashevskiy 2023-02-03 5:14 ` [PATCH kernel v4] " Alexey Kardashevskiy @ 2023-03-23 16:39 ` Sean Christopherson 2023-03-24 4:05 ` Alexey Kardashevskiy 1 sibling, 1 reply; 28+ messages in thread From: Sean Christopherson @ 2023-03-23 16:39 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Fri, Feb 03, 2023, Alexey Kardashevskiy wrote: > > Follow-up question: does KVM _have_ to wait until KVM_SEV_LAUNCH_UPDATE_VMSA to > > set the flag? > > Nope. Will repost soon as a reply to this mail. Please, please do not post new versions In-Reply-To the previous version, and especially not In-Reply-To a random mail buried deep in the thread. b4, which is imperfect but makes my life sooo much easier, gets confused by all the threading and partial rerolls. The next version also buries _this_ reply, which is partly why I haven't responded until now. I simply missed this the below questions because I saw v4 and assumed all my feedback was addressed, i.e. that I could handle this in the context of 6.4 and not earlier. Continuing on that topic, please do not post a new version until open questions from the previous version are resolved. Posting a new version when there are unresolved questions might feel like it helps things move faster, but more often than not it has the comlete opposite effect. > > > +/* enable/disable SEV-ES DebugSwap support */ > > > +static bool sev_es_debug_swap_enabled = true; > > > +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); > > > > Module param needs 0444 permissions, i.e. shouldn't be writable after KVM is > > loaded. Though I don't know that providing a module param is warranted in this > > case. > > KVM provides module params for SEV and SEV-ES because there are legitimate > > reasons to turn them off, but at a glance, I don't see why we'd want that for this > > feature. > > > /me confused. You suggested this in the first place for (I think) for the > case if the feature is found to be broken later on so admins can disable it. Hrm, so I did. Right, IIUC, this has guest visible effects, i.e. guest can read/write DRs, and so the admin might want the ability to disable the feature. Speaking of past me, no one answered my question about how this will interact with SNP, where the VM can maniuplate the VMSA. : > @@ -604,6 +607,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm) : > save->xss = svm->vcpu.arch.ia32_xss; : > save->dr6 = svm->vcpu.arch.dr6; : > : > + if (sev->debug_swap) : > + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; : : Resurrecting my objection to "AP Creation NAE event"[*], what happens if a hypervisor : supports GHCB_HV_FT_SNP_AP_CREATION but not DebugSwap? IIUC, a guest can corrupt : host DRs by enabling DebugSwap in the VMSA of an AP vCPU, e.g. the CPU will load : zeros on VM-Exit if the host hasn't stuffed the host save area fields. : : KVM can obviously just make sure to save its DRs if hardware supports DebugSwap, : but what if DebugSwap is buggy and needs to be disabled? And what about the next : feature that can apparently be enabled by the guest? : : [*] https://lore.kernel.org/all/YWnbfCet84Vup6q9@google.com > And I was using it to verify "x86/debug: Fix stack recursion caused by DR7 > accesses" which is convenient but it is a minor thing. ... > > > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c > > > index 60c7c880266b..6c54a3c9d442 100644 > > > --- a/arch/x86/kvm/svm/svm.c > > > +++ b/arch/x86/kvm/svm/svm.c > > > @@ -1190,7 +1190,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) > > > set_exception_intercept(svm, UD_VECTOR); > > > set_exception_intercept(svm, MC_VECTOR); > > > set_exception_intercept(svm, AC_VECTOR); > > > - set_exception_intercept(svm, DB_VECTOR); > > > + if (!sev_es_is_debug_swap_enabled()) > > > + set_exception_intercept(svm, DB_VECTOR); > > > > This is wrong. KVM needs to intercept #DBs when debugging non-SEV-ES VMs. > > Sorry, not following. The #DB intercept for non-SEV-ES is enabled here. The helper in this version (v3) just queries whether or not the feature is enabled, it doesn't differentiate between SEV-ES and other VM types. I.e. loading KVM with SEV-ES and DebugSwap enabled would break non-SEV-ES VMs running on the same host. +bool sev_es_is_debug_swap_enabled(void) +{ + return sev_es_debug_swap_enabled; +} Looks like this was fixed in v4. > > This _could_ be tied to X86_FEATURE_NO_NESTED_DATA_BP, but the KVM would need to > > toggle the intercept depending on whether or not userspace wants to debug the > > guest. > > > > Similar to the DR7 interception, can this check sev_features directly? ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES 2023-03-23 16:39 ` [PATCH kernel v3 2/3] " Sean Christopherson @ 2023-03-24 4:05 ` Alexey Kardashevskiy 0 siblings, 0 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-03-24 4:05 UTC (permalink / raw) To: Sean Christopherson Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 24/3/23 03:39, Sean Christopherson wrote: > On Fri, Feb 03, 2023, Alexey Kardashevskiy wrote: >>> Follow-up question: does KVM _have_ to wait until KVM_SEV_LAUNCH_UPDATE_VMSA to >>> set the flag? >> >> Nope. Will repost soon as a reply to this mail. > > Please, please do not post new versions In-Reply-To the previous version, and > especially not In-Reply-To a random mail buried deep in the thread. b4, which > is imperfect but makes my life sooo much easier, gets confused by all the threading > and partial rerolls. The next version also buries _this_ reply, which is partly > why I haven't responded until now. I simply missed this the below questions because > I saw v4 and assumed all my feedback was addressed, i.e. that I could handle this > in the context of 6.4 and not earlier. Ok, noted. Sorry for the mess. > Continuing on that topic, please do not post a new version until open questions > from the previous version are resolved. Posting a new version when there are > unresolved questions might feel like it helps things move faster, but more often > than not it has the comlete opposite effect. Well I thought keeping the history in one place/thread is helping. Oh well... >>>> +/* enable/disable SEV-ES DebugSwap support */ >>>> +static bool sev_es_debug_swap_enabled = true; >>>> +module_param_named(debug_swap, sev_es_debug_swap_enabled, bool, 0644); >>> >>> Module param needs 0444 permissions, i.e. shouldn't be writable after KVM is >>> loaded. Though I don't know that providing a module param is warranted in this >>> case. >>> KVM provides module params for SEV and SEV-ES because there are legitimate >>> reasons to turn them off, but at a glance, I don't see why we'd want that for this >>> feature. >> >> >> /me confused. You suggested this in the first place for (I think) for the >> case if the feature is found to be broken later on so admins can disable it. > > Hrm, so I did. Right, IIUC, this has guest visible effects, i.e. guest can > read/write DRs, and so the admin might want the ability to disable the feature. > > Speaking of past me, no one answered my question about how this will interact > with SNP, where the VM can maniuplate the VMSA. > > : > @@ -604,6 +607,9 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm) > : > save->xss = svm->vcpu.arch.ia32_xss; > : > save->dr6 = svm->vcpu.arch.dr6; > : > > : > + if (sev->debug_swap) > : > + save->sev_features |= SVM_SEV_FEAT_DEBUG_SWAP; > : > : Resurrecting my objection to "AP Creation NAE event"[*], what happens if a hypervisor > : supports GHCB_HV_FT_SNP_AP_CREATION but not DebugSwap? IIUC, a guest can corrupt > : host DRs by enabling DebugSwap in the VMSA of an AP vCPU, e.g. the CPU will load > : zeros on VM-Exit if the host hasn't stuffed the host save area fields. I was convinced it was answered (by Tom). (...10min later...) now I can see it was an internal email :-/ The host will have to save DRs and masks to the host save area for SNP and non-SNP guests (until we get the hw which allows masking features). Which patchset will do this - it depends on what gets accepted first - this or the SNP support. > : KVM can obviously just make sure to save its DRs if hardware supports DebugSwap, > : but what if DebugSwap is buggy and needs to be disabled? And what about the next > : feature that can apparently be enabled by the guest? > : > : [*] https://lore.kernel.org/all/YWnbfCet84Vup6q9@google.com > >> And I was using it to verify "x86/debug: Fix stack recursion caused by DR7 >> accesses" which is convenient but it is a minor thing. > > ... > >>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c >>>> index 60c7c880266b..6c54a3c9d442 100644 >>>> --- a/arch/x86/kvm/svm/svm.c >>>> +++ b/arch/x86/kvm/svm/svm.c >>>> @@ -1190,7 +1190,8 @@ static void init_vmcb(struct kvm_vcpu *vcpu) >>>> set_exception_intercept(svm, UD_VECTOR); >>>> set_exception_intercept(svm, MC_VECTOR); >>>> set_exception_intercept(svm, AC_VECTOR); >>>> - set_exception_intercept(svm, DB_VECTOR); >>>> + if (!sev_es_is_debug_swap_enabled()) >>>> + set_exception_intercept(svm, DB_VECTOR); >>> >>> This is wrong. KVM needs to intercept #DBs when debugging non-SEV-ES VMs. >> >> Sorry, not following. The #DB intercept for non-SEV-ES is enabled here. > > The helper in this version (v3) just queries whether or not the feature is enabled, > it doesn't differentiate between SEV-ES and other VM types. I.e. loading KVM with > SEV-ES and DebugSwap enabled would break non-SEV-ES VMs running on the same host. > > +bool sev_es_is_debug_swap_enabled(void) > +{ > + return sev_es_debug_swap_enabled; > +} > > Looks like this was fixed in v4. Right. I'll try addressing the comments from the other big reply of yours and send the patches. Thanks for all comments! >>> This _could_ be tied to X86_FEATURE_NO_NESTED_DATA_BP, but the KVM would need to >>> toggle the intercept depending on whether or not userspace wants to debug the >>> guest. >>> >>> Similar to the DR7 interception, can this check sev_features directly? -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 3:10 [PATCH kernel v3 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 1/3] x86/amd: Cache debug register values in percpu variables Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy @ 2023-01-20 3:10 ` Alexey Kardashevskiy 2023-01-20 5:12 ` Nikunj A. Dadhania 2023-01-30 0:56 ` [PATCH kernel v4 " Alexey Kardashevskiy 2 siblings, 2 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-20 3:10 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin With MSR_AMD64_SEV_DEBUG_SWAP enabled, the VM should not get #VC events for DR7 read/write which it rather avoided. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> --- Changes: v2: * use new bit definition --- arch/x86/include/asm/msr-index.h | 1 + tools/arch/x86/include/asm/msr-index.h | 1 + arch/x86/kernel/sev.c | 6 ++++++ 3 files changed, 8 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h index cb3d0f6e6ac2..e15afe3500ff 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -574,6 +574,7 @@ #define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT) #define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT) #define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT) +#define MSR_AMD64_SEV_DEBUG_SWAP BIT_ULL(7) #define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f diff --git a/tools/arch/x86/include/asm/msr-index.h b/tools/arch/x86/include/asm/msr-index.h index 37ff47552bcb..27c1c349e49b 100644 --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -565,6 +565,7 @@ #define MSR_AMD64_SEV_ENABLED BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT) #define MSR_AMD64_SEV_ES_ENABLED BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT) #define MSR_AMD64_SEV_SNP_ENABLED BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT) +#define MSR_AMD64_SEV_DEBUG_SWAP BIT_ULL(7) #define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 679026a640ef..8184f8ba4edc 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -1618,6 +1618,9 @@ static enum es_result vc_handle_dr7_write(struct ghcb *ghcb, long val, *reg = vc_insn_get_rm(ctxt); enum es_result ret; + if (sev_status & MSR_AMD64_SEV_DEBUG_SWAP) + return ES_VMM_ERROR; + if (!reg) return ES_DECODE_FAILED; @@ -1655,6 +1658,9 @@ static enum es_result vc_handle_dr7_read(struct ghcb *ghcb, struct sev_es_runtime_data *data = this_cpu_read(runtime_data); long *reg = vc_insn_get_rm(ctxt); + if (sev_status & MSR_AMD64_SEV_DEBUG_SWAP) + return ES_VMM_ERROR; + if (!reg) return ES_DECODE_FAILED; -- 2.38.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 3:10 ` [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy @ 2023-01-20 5:12 ` Nikunj A. Dadhania 2023-01-20 10:23 ` Alexey Kardashevskiy 2023-01-30 0:56 ` [PATCH kernel v4 " Alexey Kardashevskiy 1 sibling, 1 reply; 28+ messages in thread From: Nikunj A. Dadhania @ 2023-01-20 5:12 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 20/01/23 08:40, Alexey Kardashevskiy wrote: > With MSR_AMD64_SEV_DEBUG_SWAP enabled, the VM should not get #VC > events for DR7 read/write which it rather avoided. > SNP guest feature negotiation patch is part of tip now: https://lore.kernel.org/lkml/167414649850.4906.1693185384677559889.tip-bot2@tip-bot2/ MSR_AMD64_SNP_DEBUG_SWAP is already defined. As this requires guest side changes, please add MSR_AMD64_SNP_DEBUG_SWAP as part of SNP_FEATURES_PRESENT bit mask. Regards Nikunj ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 5:12 ` Nikunj A. Dadhania @ 2023-01-20 10:23 ` Alexey Kardashevskiy 2023-01-20 12:06 ` Borislav Petkov 2023-01-24 10:37 ` Nikunj A. Dadhania 0 siblings, 2 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-20 10:23 UTC (permalink / raw) To: Nikunj A. Dadhania Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 20/1/23 16:12, Nikunj A. Dadhania wrote: > On 20/01/23 08:40, Alexey Kardashevskiy wrote: >> With MSR_AMD64_SEV_DEBUG_SWAP enabled, the VM should not get #VC >> events for DR7 read/write which it rather avoided. >> > > SNP guest feature negotiation patch is part of tip now: https://lore.kernel.org/lkml/167414649850.4906.1693185384677559889.tip-bot2@tip-bot2/ Worth mentioning it is tip/x86/urgent (which does not have X86_FEATURE_NO_NESTED_DATA_BP), not tip/master (which has X86_FEATURE_NO_NESTED_DATA_BP). > > MSR_AMD64_SNP_DEBUG_SWAP is already defined. As this requires guest side changes, please add MSR_AMD64_SNP_DEBUG_SWAP as part of SNP_FEATURES_PRESENT bit mask. It is MSR_AMD64_SEV_DEBUG_SWAP (SEV, not SNP), it is an SEV-ES thing. Why is that feature negotiation SNP-only and not SEV? -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 10:23 ` Alexey Kardashevskiy @ 2023-01-20 12:06 ` Borislav Petkov 2023-01-25 3:11 ` Alexey Kardashevskiy 2023-01-24 10:37 ` Nikunj A. Dadhania 1 sibling, 1 reply; 28+ messages in thread From: Borislav Petkov @ 2023-01-20 12:06 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: Nikunj A. Dadhania, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Fri, Jan 20, 2023 at 09:23:48PM +1100, Alexey Kardashevskiy wrote: > Worth mentioning it is tip/x86/urgent (which does not have > X86_FEATURE_NO_NESTED_DATA_BP), not tip/master (which has > X86_FEATURE_NO_NESTED_DATA_BP). Yeah, when you submit patches for tip, you can always use tip/master which has the latest lineup of all branches and should have all the required bits. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 12:06 ` Borislav Petkov @ 2023-01-25 3:11 ` Alexey Kardashevskiy 2023-01-25 5:44 ` Borislav Petkov 0 siblings, 1 reply; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-25 3:11 UTC (permalink / raw) To: Borislav Petkov Cc: Nikunj A. Dadhania, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 20/1/23 23:06, Borislav Petkov wrote: > On Fri, Jan 20, 2023 at 09:23:48PM +1100, Alexey Kardashevskiy wrote: >> Worth mentioning it is tip/x86/urgent (which does not have >> X86_FEATURE_NO_NESTED_DATA_BP), not tip/master (which has >> X86_FEATURE_NO_NESTED_DATA_BP). > > Yeah, when you submit patches for tip, you can always use tip/master which has > the latest lineup of all branches and should have all the required bits. The tip/master's head just a few days ago was 195df42eb64dcb "Merge branch into tip/master: 'x86/platform'" and did have dcf67f724b8ada "x86/cpu, kvm: Add the NO_NESTED_DATA_BP feature" but now tip/master does not have it - 8272720be044 "Merge x86/cache into tip/master". /me confused. What tree is the tree? > > Thx. > -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-25 3:11 ` Alexey Kardashevskiy @ 2023-01-25 5:44 ` Borislav Petkov 0 siblings, 0 replies; 28+ messages in thread From: Borislav Petkov @ 2023-01-25 5:44 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: Nikunj A. Dadhania, kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On Wed, Jan 25, 2023 at 02:11:27PM +1100, Alexey Kardashevskiy wrote: > The tip/master's head just a few days ago was 195df42eb64dcb "Merge branch > into tip/master: 'x86/platform'" and did have dcf67f724b8ada "x86/cpu, kvm: > Add the NO_NESTED_DATA_BP feature" but now tip/master does not have it - Yeah, it had to be backed out. A new version will be queued soon. > 8272720be044 "Merge x86/cache into tip/master". /me confused. What tree is > the tree? It still is *the* tree - just there were some issues with the KVM side of Kim's series so it had to be dropped but he sent a new version which I need to queue soon. Then I'll have a look at yours. Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 10:23 ` Alexey Kardashevskiy 2023-01-20 12:06 ` Borislav Petkov @ 2023-01-24 10:37 ` Nikunj A. Dadhania 2023-01-24 12:37 ` Alexey Kardashevskiy 1 sibling, 1 reply; 28+ messages in thread From: Nikunj A. Dadhania @ 2023-01-24 10:37 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 20/01/23 15:53, Alexey Kardashevskiy wrote: > It is MSR_AMD64_SEV_DEBUG_SWAP (SEV, not SNP), it is an SEV-ES thing. Yes, noticed that, earlier analysis was that Debug Swap shouldn't need any guest side changes, but it does need it. > Why is that feature negotiation SNP-only and not SEV? As per the spec, GHCB termination request: reason code: 0x2 is SNP features specific. Regards Nikunj ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-24 10:37 ` Nikunj A. Dadhania @ 2023-01-24 12:37 ` Alexey Kardashevskiy 2023-01-24 13:17 ` Nikunj A. Dadhania 0 siblings, 1 reply; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-24 12:37 UTC (permalink / raw) To: Nikunj A. Dadhania Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 24/1/23 21:37, Nikunj A. Dadhania wrote:>> It is MSR_AMD64_SEV_DEBUG_SWAP (SEV, not SNP), it is an SEV-ES thing.> Yes, noticed that, earlier analysis was that Debug Swap shouldn't need any guest side changes, but it does need it.>>> Why is that feature negotiation SNP-only and not SEV?> > As per the spec, GHCB termination request: reason code: 0x2 is SNP features specific. Does the guest really need to terminate in such case? A VM could just not do the GHCB thing if it does not want to. -- Alexey ^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-24 12:37 ` Alexey Kardashevskiy @ 2023-01-24 13:17 ` Nikunj A. Dadhania 0 siblings, 0 replies; 28+ messages in thread From: Nikunj A. Dadhania @ 2023-01-24 13:17 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Yury Norov, Venu Busireddy, Tony Luck, Tom Lendacky, Thomas Gleixner, Sean Christopherson, Sandipan Das, Pawan Gupta, Paolo Bonzini, Michael Roth, Mario Limonciello, Kim Phillips, Kees Cook, Juergen Gross, Jakub Kicinski, Ingo Molnar, Dave Hansen, Daniel Sneddon, Brijesh Singh, Borislav Petkov, Arnaldo Carvalho de Melo, Andrew Cooper, Alexander Shishkin, Adrian Hunter, Peter Zijlstra (Intel), Jason A. Donenfeld, H. Peter Anvin On 24/01/23 18:07, Alexey Kardashevskiy wrote: > > > On 24/1/23 21:37, Nikunj A. Dadhania wrote: >> It is MSR_AMD64_SEV_DEBUG_SWAP (SEV, not SNP), it is an SEV-ES thing. > Yes, noticed that, earlier analysis was that Debug Swap shouldn't need any guest side changes, but it does need it. >>> Why is that feature negotiation SNP-only and not SEV? >> As per the spec, GHCB termination request: reason code: 0x2 is SNP features specific. > Does the guest really need to terminate in such case? The termination is from the guest that do not have implementation for the hypervisor enabled feature, in this case DebugSwap. If DebugSwap is enabled by the hypervisor and not handled in guest #VC, then DR7 read/write can be intercepted by the malicious hypervisor, which can return unexpected values. > A VM could just not do the GHCB thing if it does not want to. In that case, the VM can have unexpected failures. Regards Nikunj ^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH kernel v4 3/3] x86/sev: Do not handle #VC for DR7 read/write 2023-01-20 3:10 ` [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy 2023-01-20 5:12 ` Nikunj A. Dadhania @ 2023-01-30 0:56 ` Alexey Kardashevskiy 1 sibling, 0 replies; 28+ messages in thread From: Alexey Kardashevskiy @ 2023-01-30 0:56 UTC (permalink / raw) To: Alexey Kardashevskiy Cc: kvm, x86, linux-kernel, Venu Busireddy, Tom Lendacky, Thomas Gleixner, Nikunj A Dadhania, Michael Roth, Ingo Molnar, Dave Hansen, Brijesh Singh, Borislav Petkov, Jason A. Donenfeld, H. Peter Anvin, Sean Christopherson With MSR_AMD64_SEV_DEBUG_SWAP enabled, the VM should not get #VC events for DR7 read/write which it rather avoided. Update the SNP_FEATURES_PRESENT mask with MSR_AMD64_SNP_DEBUG_SWAP so an SNP guest can gracefully terminate during SNP feature negotiation. For SEV-ES (which does not negotiate features) and enabled DebugSwap, fail to handle DR7's #VC and return en error which in turn causes panic() as there is no goot reason for the HV to keep intercepting DR7. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> --- 1/3 and 2/3 are the same as in v3 and fairly independent from this one. The question now is what should SNP-feature-negotiation-aware guest do when KVM enables DebugSwap. "x86/sev: Add SEV-SNP guest feature negotiation support" is going to reach the upstream long before any of these three from this thread. It does not matter now as there is no SNP in upstream KVM. --- Changes: v4: * rebased on top of SNP feature negotiation v2: * use new bit definition --- arch/x86/boot/compressed/sev.c | 2 +- arch/x86/kernel/sev.c | 6 ++++++ 2 files changed, 7 insertions(+), 1 deletion(-) diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index d63ad8f99f83..ac86f458951d 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -315,7 +315,7 @@ static void enforce_vmpl0(void) * by the guest kernel. As and when a new feature is implemented in the * guest kernel, a corresponding bit should be added to the mask. */ -#define SNP_FEATURES_PRESENT (0) +#define SNP_FEATURES_PRESENT MSR_AMD64_SNP_DEBUG_SWAP void snp_check_features(void) { diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index 679026a640ef..f29e60c496d7 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -1618,6 +1618,9 @@ static enum es_result vc_handle_dr7_write(struct ghcb *ghcb, long val, *reg = vc_insn_get_rm(ctxt); enum es_result ret; + if (sev_status & MSR_AMD64_SNP_DEBUG_SWAP) + return ES_VMM_ERROR; + if (!reg) return ES_DECODE_FAILED; @@ -1655,6 +1658,9 @@ static enum es_result vc_handle_dr7_read(struct ghcb *ghcb, struct sev_es_runtime_data *data = this_cpu_read(runtime_data); long *reg = vc_insn_get_rm(ctxt); + if (sev_status & MSR_AMD64_SNP_DEBUG_SWAP) + return ES_VMM_ERROR; + if (!reg) return ES_DECODE_FAILED; -- 2.39.1 ^ permalink raw reply related [flat|nested] 28+ messages in thread
end of thread, other threads:[~2023-03-29 15:14 UTC | newest] Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-01-20 3:10 [PATCH kernel v3 0/3] KVM: SEV: Enable AMD SEV-ES DebugSwap Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 1/3] x86/amd: Cache debug register values in percpu variables Alexey Kardashevskiy 2023-01-31 19:27 ` [tip: x86/cpu] " tip-bot2 for Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 2/3] KVM: SEV: Enable data breakpoints in SEV-ES Alexey Kardashevskiy 2023-01-31 19:22 ` Borislav Petkov 2023-02-01 2:20 ` Sean Christopherson 2023-02-01 19:32 ` Sean Christopherson 2023-02-03 12:26 ` Borislav Petkov 2023-02-01 2:18 ` Sean Christopherson 2023-02-03 3:37 ` Alexey Kardashevskiy 2023-02-03 5:14 ` [PATCH kernel v4] " Alexey Kardashevskiy 2023-02-21 5:19 ` Alexey Kardashevskiy 2023-03-14 9:43 ` Alexey Kardashevskiy 2023-03-21 6:56 ` Alexey Kardashevskiy 2023-03-23 17:40 ` Sean Christopherson 2023-03-29 15:13 ` Tom Lendacky 2023-03-23 16:39 ` [PATCH kernel v3 2/3] " Sean Christopherson 2023-03-24 4:05 ` Alexey Kardashevskiy 2023-01-20 3:10 ` [PATCH kernel v3 3/3] x86/sev: Do not handle #VC for DR7 read/write Alexey Kardashevskiy 2023-01-20 5:12 ` Nikunj A. Dadhania 2023-01-20 10:23 ` Alexey Kardashevskiy 2023-01-20 12:06 ` Borislav Petkov 2023-01-25 3:11 ` Alexey Kardashevskiy 2023-01-25 5:44 ` Borislav Petkov 2023-01-24 10:37 ` Nikunj A. Dadhania 2023-01-24 12:37 ` Alexey Kardashevskiy 2023-01-24 13:17 ` Nikunj A. Dadhania 2023-01-30 0:56 ` [PATCH kernel v4 " Alexey Kardashevskiy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).