linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] x86/entry: disallow #DB more
@ 2020-05-28 20:19 Peter Zijlstra
  2020-05-28 20:19 ` [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}() Peter Zijlstra
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson

These patches disallow #DB during NMI/#MC and allow removing a lot of fugly code.

Patch #6 should probably wait until we've got the KGDB situation sorted
because applying that makes a bad situation worse.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
@ 2020-05-28 20:19 ` Peter Zijlstra
  2020-05-28 20:52   ` Andrew Cooper
  2020-05-28 20:19 ` [PATCH 2/6] x86/entry, nmi: Disable #DB Peter Zijlstra
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson

In order to allow other exceptions than #DB to disable breakpoints,
provide a common helper.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/debugreg.h |   25 +++++++++++++++++++++++++
 arch/x86/kernel/traps.c         |   18 ++----------------
 2 files changed, 27 insertions(+), 16 deletions(-)

--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -113,6 +113,31 @@ static inline void debug_stack_usage_inc
 static inline void debug_stack_usage_dec(void) { }
 #endif /* X86_64 */
 
+static __always_inline void local_db_save(unsigned long *dr7)
+{
+	get_debugreg(*dr7, 7);
+	if (*dr7)
+		set_debugreg(0, 7);
+	/*
+	 * Ensure the compiler doesn't lower the above statements into
+	 * the critical section; disabling breakpoints late would not
+	 * be good.
+	 */
+	barrier();
+}
+
+static __always_inline void local_db_restore(unsigned long dr7)
+{
+	/*
+	 * Ensure the compiler doesn't raise this statement into
+	 * the critical section; enabling breakpoints early would
+	 * not be good.
+	 */
+	barrier();
+	if (dr7)
+		set_debugreg(dr7, 7);
+}
+
 #ifdef CONFIG_CPU_SUP_AMD
 extern void set_dr_addr_mask(unsigned long mask, int dr);
 #else
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -727,15 +727,7 @@ static __always_inline void debug_enter(
 	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
 	 * includes the entry stack is excluded for everything.
 	 */
-	get_debugreg(*dr7, 7);
-	set_debugreg(0, 7);
-
-	/*
-	 * Ensure the compiler doesn't lower the above statements into
-	 * the critical section; disabling breakpoints late would not
-	 * be good.
-	 */
-	barrier();
+	local_db_save(dr7);
 
 	/*
 	 * The Intel SDM says:
@@ -756,13 +748,7 @@ static __always_inline void debug_enter(
 
 static __always_inline void debug_exit(unsigned long dr7)
 {
-	/*
-	 * Ensure the compiler doesn't raise this statement into
-	 * the critical section; enabling breakpoints early would
-	 * not be good.
-	 */
-	barrier();
-	set_debugreg(dr7, 7);
+	local_db_restore(dr7);
 }
 
 /*



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 2/6] x86/entry, nmi: Disable #DB
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
  2020-05-28 20:19 ` [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}() Peter Zijlstra
@ 2020-05-28 20:19 ` Peter Zijlstra
  2020-05-28 20:19 ` [PATCH 3/6] x86/entry, mce: Disallow #DB during #MC Peter Zijlstra
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson

Instead of playing stupid games with IST stacks, fully disallow #DB
during NMIs. There is absolutely no reason to allow them, and killing
this saves a heap of trouble.

We already disallow #DB on noinstr and CEA, so we can't get #DB before
this, and this ensures we can't get it after this either.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/nmi.c |   55 ++------------------------------------------------
 1 file changed, 3 insertions(+), 52 deletions(-)

--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -478,40 +478,7 @@ enum nmi_states {
 };
 static DEFINE_PER_CPU(enum nmi_states, nmi_state);
 static DEFINE_PER_CPU(unsigned long, nmi_cr2);
-
-#ifdef CONFIG_X86_64
-/*
- * In x86_64, we need to handle breakpoint -> NMI -> breakpoint.  Without
- * some care, the inner breakpoint will clobber the outer breakpoint's
- * stack.
- *
- * If a breakpoint is being processed, and the debug stack is being
- * used, if an NMI comes in and also hits a breakpoint, the stack
- * pointer will be set to the same fixed address as the breakpoint that
- * was interrupted, causing that stack to be corrupted. To handle this
- * case, check if the stack that was interrupted is the debug stack, and
- * if so, change the IDT so that new breakpoints will use the current
- * stack and not switch to the fixed address. On return of the NMI,
- * switch back to the original IDT.
- */
-static DEFINE_PER_CPU(int, update_debug_stack);
-
-static noinstr bool is_debug_stack(unsigned long addr)
-{
-	struct cea_exception_stacks *cs = __this_cpu_read(cea_exception_stacks);
-	unsigned long top = CEA_ESTACK_TOP(cs, DB);
-	unsigned long bot = CEA_ESTACK_BOT(cs, DB1);
-
-	if (__this_cpu_read(debug_stack_usage))
-		return true;
-	/*
-	 * Note, this covers the guard page between DB and DB1 as well to
-	 * avoid two checks. But by all means @addr can never point into
-	 * the guard page.
-	 */
-	return addr >= bot && addr < top;
-}
-#endif
+static DEFINE_PER_CPU(unsigned long, nmi_dr7);
 
 DEFINE_IDTENTRY_NMI(exc_nmi)
 {
@@ -526,18 +493,7 @@ DEFINE_IDTENTRY_NMI(exc_nmi)
 	this_cpu_write(nmi_cr2, read_cr2());
 nmi_restart:
 
-#ifdef CONFIG_X86_64
-	/*
-	 * If we interrupted a breakpoint, it is possible that
-	 * the nmi handler will have breakpoints too. We need to
-	 * change the IDT such that breakpoints that happen here
-	 * continue to use the NMI stack.
-	 */
-	if (unlikely(is_debug_stack(regs->sp))) {
-		debug_stack_set_zero();
-		this_cpu_write(update_debug_stack, 1);
-	}
-#endif
+	local_db_save(this_cpu_ptr(&nmi_dr7));
 
 	nmi_enter();
 
@@ -548,12 +504,7 @@ DEFINE_IDTENTRY_NMI(exc_nmi)
 
 	nmi_exit();
 
-#ifdef CONFIG_X86_64
-	if (unlikely(this_cpu_read(update_debug_stack))) {
-		debug_stack_reset();
-		this_cpu_write(update_debug_stack, 0);
-	}
-#endif
+	local_db_restore(*this_cpu_ptr(&nmi_dr7));
 
 	if (unlikely(this_cpu_read(nmi_cr2) != read_cr2()))
 		write_cr2(this_cpu_read(nmi_cr2));



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 3/6] x86/entry, mce: Disallow #DB during #MC
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
  2020-05-28 20:19 ` [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}() Peter Zijlstra
  2020-05-28 20:19 ` [PATCH 2/6] x86/entry, nmi: Disable #DB Peter Zijlstra
@ 2020-05-28 20:19 ` Peter Zijlstra
  2020-05-28 20:19 ` [PATCH 4/6] x86/entry: Optimize local_db_save() for virt Peter Zijlstra
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson

#MC is fragile as heck, don't tempt fate.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/cpu/mce/core.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

--- a/arch/x86/kernel/cpu/mce/core.c
+++ b/arch/x86/kernel/cpu/mce/core.c
@@ -1943,22 +1943,34 @@ static __always_inline void exc_machine_
 /* MCE hit kernel mode */
 DEFINE_IDTENTRY_MCE(exc_machine_check)
 {
+	unsigned long dr7;
+
+	local_db_save(&dr7);
 	exc_machine_check_kernel(regs);
+	local_db_restore(dr7);
 }
 
 /* The user mode variant. */
 DEFINE_IDTENTRY_MCE_USER(exc_machine_check)
 {
+	unsigned long dr7;
+
+	local_db_save(&dr7);
 	exc_machine_check_user(regs);
+	local_db_restore(dr7);
 }
 #else
 /* 32bit unified entry point */
 DEFINE_IDTENTRY_MCE(exc_machine_check)
 {
+	unsigned long dr7;
+
+	local_db_save(&dr7);
 	if (user_mode(regs))
 		exc_machine_check_user(regs);
 	else
 		exc_machine_check_kernel(regs);
+	local_db_restore(dr7);
 }
 #endif
 



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 4/6] x86/entry: Optimize local_db_save() for virt
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
                   ` (2 preceding siblings ...)
  2020-05-28 20:19 ` [PATCH 3/6] x86/entry, mce: Disallow #DB during #MC Peter Zijlstra
@ 2020-05-28 20:19 ` Peter Zijlstra
  2020-05-29 17:24   ` Sean Christopherson
  2020-05-28 20:19 ` [PATCH 5/6] x86/entry: Remove debug IDT frobbing Peter Zijlstra
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson, Andy Lutomirski

Because DRn access is 'difficult' with virt; but the DR7 read is
cheaper than a cacheline miss on native, add a virt specific
fast path to local_db_save(), such that when breakpoints are not in
use we avoid touching DRn entirely.

Suggested-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/debugreg.h |    6 ++++++
 arch/x86/kernel/hw_breakpoint.c |   26 ++++++++++++++++++++++----
 arch/x86/kvm/vmx/nested.c       |    2 ++
 3 files changed, 30 insertions(+), 4 deletions(-)

--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -115,6 +115,12 @@ static inline void debug_stack_usage_dec
 
 static __always_inline void local_db_save(unsigned long *dr7)
 {
+	if (static_cpu_has(X86_FEATURE_HYPERVISOR) && !hw_breakpoint_active()) {
+		*dr7 = 0;
+		barrier();
+		return;
+	}
+
 	get_debugreg(*dr7, 7);
 	if (*dr7)
 		set_debugreg(0, 7);
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -97,6 +97,8 @@ int arch_install_hw_breakpoint(struct pe
 	unsigned long *dr7;
 	int i;
 
+	lockdep_assert_irqs_disabled();
+
 	for (i = 0; i < HBP_NUM; i++) {
 		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
@@ -115,6 +117,12 @@ int arch_install_hw_breakpoint(struct pe
 	dr7 = this_cpu_ptr(&cpu_dr7);
 	*dr7 |= encode_dr7(i, info->len, info->type);
 
+	/*
+	 * Ensure we first write cpu_dr7 before we set the DR7 register.
+	 * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
+	 */
+	barrier();
+
 	set_debugreg(*dr7, 7);
 	if (info->mask)
 		set_dr_addr_mask(info->mask, i);
@@ -134,9 +142,11 @@ int arch_install_hw_breakpoint(struct pe
 void arch_uninstall_hw_breakpoint(struct perf_event *bp)
 {
 	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
-	unsigned long *dr7;
+	unsigned long dr7;
 	int i;
 
+	lockdep_assert_irqs_disabled();
+
 	for (i = 0; i < HBP_NUM; i++) {
 		struct perf_event **slot = this_cpu_ptr(&bp_per_reg[i]);
 
@@ -149,12 +159,20 @@ void arch_uninstall_hw_breakpoint(struct
 	if (WARN_ONCE(i == HBP_NUM, "Can't find any breakpoint slot"))
 		return;
 
-	dr7 = this_cpu_ptr(&cpu_dr7);
-	*dr7 &= ~__encode_dr7(i, info->len, info->type);
+	dr7 = this_cpu_read(cpu_dr7);
+	dr7 &= ~__encode_dr7(i, info->len, info->type);
 
-	set_debugreg(*dr7, 7);
+	set_debugreg(dr7, 7);
 	if (info->mask)
 		set_dr_addr_mask(0, i);
+
+	/*
+	 * Ensure the write to cpu_dr7 is after we've set the DR7 register.
+	 * This ensures an NMI never see cpu_dr7 0 when DR7 is not.
+	 */
+	barrier();
+
+	this_cpu_write(cpu_dr7, dr7);
 }
 
 static int arch_bp_generic_len(int x86_len)
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3027,6 +3027,8 @@ static int nested_vmx_check_vmentry_hw(s
 
 	/*
 	 * VMExit clears RFLAGS.IF and DR7, even on a consistency check.
+	 * XXX how is this not broken? access to cpu_dr7 ought to be with
+	 * IRQs disabled.
 	 */
 	local_irq_enable();
 	if (hw_breakpoint_active())



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 5/6] x86/entry: Remove debug IDT frobbing
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
                   ` (3 preceding siblings ...)
  2020-05-28 20:19 ` [PATCH 4/6] x86/entry: Optimize local_db_save() for virt Peter Zijlstra
@ 2020-05-28 20:19 ` Peter Zijlstra
  2020-05-28 20:19 ` [RFC][PATCH 6/6] x86/entry: Remove DBn stacks Peter Zijlstra
  2020-05-28 22:42 ` [PATCH 0/6] x86/entry: disallow #DB more Lai Jiangshan
  6 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson

This is all unused now.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/debugreg.h |   19 -------------------
 arch/x86/include/asm/desc.h     |   34 +---------------------------------
 arch/x86/kernel/cpu/common.c    |   17 -----------------
 arch/x86/kernel/idt.c           |   30 ------------------------------
 arch/x86/kernel/traps.c         |    9 ---------
 5 files changed, 1 insertion(+), 108 deletions(-)

--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -94,25 +94,6 @@ extern void aout_dump_debugregs(struct u
 
 extern void hw_breakpoint_restore(void);
 
-#ifdef CONFIG_X86_64
-DECLARE_PER_CPU(int, debug_stack_usage);
-static inline void debug_stack_usage_inc(void)
-{
-	__this_cpu_inc(debug_stack_usage);
-}
-static inline void debug_stack_usage_dec(void)
-{
-	__this_cpu_dec(debug_stack_usage);
-}
-void debug_stack_set_zero(void);
-void debug_stack_reset(void);
-#else /* !X86_64 */
-static inline void debug_stack_set_zero(void) { }
-static inline void debug_stack_reset(void) { }
-static inline void debug_stack_usage_inc(void) { }
-static inline void debug_stack_usage_dec(void) { }
-#endif /* X86_64 */
-
 static __always_inline void local_db_save(unsigned long *dr7)
 {
 	if (static_cpu_has(X86_FEATURE_HYPERVISOR) && !hw_breakpoint_active()) {
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -42,8 +42,6 @@ static inline void fill_ldt(struct desc_
 
 extern struct desc_ptr idt_descr;
 extern gate_desc idt_table[];
-extern const struct desc_ptr debug_idt_descr;
-extern gate_desc debug_idt_table[];
 
 struct gdt_page {
 	struct desc_struct gdt[GDT_ENTRIES];
@@ -390,31 +388,6 @@ void alloc_intr_gate(unsigned int n, con
 
 extern unsigned long system_vectors[];
 
-#ifdef CONFIG_X86_64
-DECLARE_PER_CPU(u32, debug_idt_ctr);
-static __always_inline bool is_debug_idt_enabled(void)
-{
-	if (this_cpu_read(debug_idt_ctr))
-		return true;
-
-	return false;
-}
-
-static __always_inline void load_debug_idt(void)
-{
-	load_idt((const struct desc_ptr *)&debug_idt_descr);
-}
-#else
-static inline bool is_debug_idt_enabled(void)
-{
-	return false;
-}
-
-static inline void load_debug_idt(void)
-{
-}
-#endif
-
 /*
  * The load_current_idt() must be called with interrupts disabled
  * to avoid races. That way the IDT will always be set back to the expected
@@ -424,10 +397,7 @@ static inline void load_debug_idt(void)
  */
 static __always_inline void load_current_idt(void)
 {
-	if (is_debug_idt_enabled())
-		load_debug_idt();
-	else
-		load_idt((const struct desc_ptr *)&idt_descr);
+	load_idt((const struct desc_ptr *)&idt_descr);
 }
 
 extern void idt_setup_early_handler(void);
@@ -438,11 +408,9 @@ extern void idt_setup_apic_and_irq_gates
 #ifdef CONFIG_X86_64
 extern void idt_setup_early_pf(void);
 extern void idt_setup_ist_traps(void);
-extern void idt_setup_debugidt_traps(void);
 #else
 static inline void idt_setup_early_pf(void) { }
 static inline void idt_setup_ist_traps(void) { }
-static inline void idt_setup_debugidt_traps(void) { }
 #endif
 
 extern void idt_invalidate(void *addr);
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1672,23 +1672,6 @@ void syscall_init(void)
 	       X86_EFLAGS_IOPL|X86_EFLAGS_AC|X86_EFLAGS_NT);
 }
 
-DEFINE_PER_CPU(int, debug_stack_usage);
-DEFINE_PER_CPU(u32, debug_idt_ctr);
-
-noinstr void debug_stack_set_zero(void)
-{
-	this_cpu_inc(debug_idt_ctr);
-	load_current_idt();
-}
-
-noinstr void debug_stack_reset(void)
-{
-	if (WARN_ON(!this_cpu_read(debug_idt_ctr)))
-		return;
-	if (this_cpu_dec_return(debug_idt_ctr) == 0)
-		load_current_idt();
-}
-
 #else	/* CONFIG_X86_64 */
 
 DEFINE_PER_CPU(struct task_struct *, current_task) = &init_task;
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -158,14 +158,6 @@ static const __initconst struct idt_data
 static const __initconst struct idt_data early_pf_idts[] = {
 	INTG(X86_TRAP_PF,		asm_exc_page_fault),
 };
-
-/*
- * Override for the debug_idt. Same as the default, but with interrupt
- * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
- */
-static const __initconst struct idt_data dbg_idts[] = {
-	INTG(X86_TRAP_DB,		asm_exc_debug),
-};
 #endif
 
 /* Must be page-aligned because the real IDT is used in a fixmap. */
@@ -177,9 +169,6 @@ struct desc_ptr idt_descr __ro_after_ini
 };
 
 #ifdef CONFIG_X86_64
-/* No need to be aligned, but done to keep all IDTs defined the same way. */
-gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
-
 /*
  * The exceptions which use Interrupt stacks. They are setup after
  * cpu_init() when the TSS has been initialized.
@@ -192,15 +181,6 @@ static const __initconst struct idt_data
 	ISTG(X86_TRAP_MC,	asm_exc_machine_check,	IST_INDEX_MCE),
 #endif
 };
-
-/*
- * Override for the debug_idt. Same as the default, but with interrupt
- * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
- */
-const struct desc_ptr debug_idt_descr = {
-	.size		= IDT_ENTRIES * 16 - 1,
-	.address	= (unsigned long) debug_idt_table,
-};
 #endif
 
 static inline void idt_init_desc(gate_desc *gate, const struct idt_data *d)
@@ -292,16 +272,6 @@ void __init idt_setup_ist_traps(void)
 {
 	idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts), true);
 }
-
-/**
- * idt_setup_debugidt_traps - Initialize the debug idt table with debug traps
- */
-void __init idt_setup_debugidt_traps(void)
-{
-	memcpy(&debug_idt_table, &idt_table, IDT_ENTRIES * 16);
-
-	idt_setup_from_table(debug_idt_table, dbg_idts, ARRAY_SIZE(dbg_idts), false);
-}
 #endif
 
 /**
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -798,12 +798,6 @@ static void noinstr handle_debug(struct
 		return;
 	}
 
-	/*
-	 * Let others (NMI) know that the debug stack is in use
-	 * as we may switch to the interrupt stack.
-	 */
-	debug_stack_usage_inc();
-
 	/* It's safe to allow irq's after DR6 has been saved */
 	cond_local_irq_enable(regs);
 
@@ -831,7 +825,6 @@ static void noinstr handle_debug(struct
 
 out:
 	cond_local_irq_disable(regs);
-	debug_stack_usage_dec();
 	instrumentation_end();
 }
 
@@ -1077,6 +1070,4 @@ void __init trap_init(void)
 	cpu_init();
 
 	idt_setup_ist_traps();
-
-	idt_setup_debugidt_traps();
 }



^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC][PATCH 6/6] x86/entry: Remove DBn stacks
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
                   ` (4 preceding siblings ...)
  2020-05-28 20:19 ` [PATCH 5/6] x86/entry: Remove debug IDT frobbing Peter Zijlstra
@ 2020-05-28 20:19 ` Peter Zijlstra
  2020-05-28 22:35   ` Lai Jiangshan
  2020-05-28 22:42 ` [PATCH 0/6] x86/entry: disallow #DB more Lai Jiangshan
  6 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 20:19 UTC (permalink / raw)
  To: tglx, luto, peterz
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson,
	andrew.cooper3, daniel.thompson

Both #DB itself, as all other IST users (NMI, #MC) now clear DR7 on
entry. Combined with not allowing breakpoints on entry/noinstr/NOKPROBE
text and no single step (EFLAGS.TF) inside the #DB handler should
guarantee us no nested #DB.

XXX depends on KGDB breakpoints not being stupid

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S             |   13 -------------
 arch/x86/include/asm/cpu_entry_area.h |    6 ------
 arch/x86/kernel/asm-offsets_64.c      |    3 ---
 arch/x86/kernel/dumpstack_64.c        |    3 ---
 arch/x86/mm/cpu_entry_area.c          |    1 -
 5 files changed, 26 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -396,11 +396,6 @@ SYM_CODE_END(\asmsym)
 	idtentry \vector asm_\cfunc \cfunc has_error_code=0
 .endm
 
-/*
- * MCE and DB exceptions
- */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + (x) * 8)
-
 /**
  * idtentry_mce_db - Macro to generate entry stubs for #MC and #DB
  * @vector:		Vector number
@@ -445,16 +440,8 @@ SYM_CODE_START(\asmsym)
 
 	movq	%rsp, %rdi		/* pt_regs pointer */
 
-	.if \vector == X86_TRAP_DB
-		subq	$DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)
-	.endif
-
 	call	\cfunc
 
-	.if \vector == X86_TRAP_DB
-		addq	$DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)
-	.endif
-
 	jmp	paranoid_exit
 
 	/* Switch to the regular task stack and use the noist entry point */
--- a/arch/x86/include/asm/cpu_entry_area.h
+++ b/arch/x86/include/asm/cpu_entry_area.h
@@ -16,10 +16,6 @@
 	char	DF_stack[EXCEPTION_STKSZ];	\
 	char	NMI_stack_guard[guardsize];	\
 	char	NMI_stack[EXCEPTION_STKSZ];	\
-	char	DB2_stack_guard[guardsize];	\
-	char	DB2_stack[db2_holesize];	\
-	char	DB1_stack_guard[guardsize];	\
-	char	DB1_stack[EXCEPTION_STKSZ];	\
 	char	DB_stack_guard[guardsize];	\
 	char	DB_stack[EXCEPTION_STKSZ];	\
 	char	MCE_stack_guard[guardsize];	\
@@ -42,8 +38,6 @@ struct cea_exception_stacks {
 enum exception_stack_ordering {
 	ESTACK_DF,
 	ESTACK_NMI,
-	ESTACK_DB2,
-	ESTACK_DB1,
 	ESTACK_DB,
 	ESTACK_MCE,
 	N_EXCEPTION_STACKS
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -57,9 +57,6 @@ int main(void)
 	BLANK();
 #undef ENTRY
 
-	OFFSET(TSS_ist, tss_struct, x86_tss.ist);
-	DEFINE(DB_STACK_OFFSET, offsetof(struct cea_exception_stacks, DB_stack) -
-	       offsetof(struct cea_exception_stacks, DB1_stack));
 	BLANK();
 
 #ifdef CONFIG_STACKPROTECTOR
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -22,8 +22,6 @@
 static const char * const exception_stack_names[] = {
 		[ ESTACK_DF	]	= "#DF",
 		[ ESTACK_NMI	]	= "NMI",
-		[ ESTACK_DB2	]	= "#DB2",
-		[ ESTACK_DB1	]	= "#DB1",
 		[ ESTACK_DB	]	= "#DB",
 		[ ESTACK_MCE	]	= "#MC",
 };
@@ -79,7 +77,6 @@ static const
 struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
 	EPAGERANGE(DF),
 	EPAGERANGE(NMI),
-	EPAGERANGE(DB1),
 	EPAGERANGE(DB),
 	EPAGERANGE(MCE),
 };
--- a/arch/x86/mm/cpu_entry_area.c
+++ b/arch/x86/mm/cpu_entry_area.c
@@ -107,7 +107,6 @@ static void __init percpu_setup_exceptio
 	 */
 	cea_map_stack(DF);
 	cea_map_stack(NMI);
-	cea_map_stack(DB1);
 	cea_map_stack(DB);
 	cea_map_stack(MCE);
 }



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-28 20:19 ` [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}() Peter Zijlstra
@ 2020-05-28 20:52   ` Andrew Cooper
  2020-05-28 21:15     ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Andrew Cooper @ 2020-05-28 20:52 UTC (permalink / raw)
  To: Peter Zijlstra, tglx, luto
  Cc: linux-kernel, x86, Lai Jiangshan, sean.j.christopherson, daniel.thompson

On 28/05/2020 21:19, Peter Zijlstra wrote:
> --- a/arch/x86/include/asm/debugreg.h
> +++ b/arch/x86/include/asm/debugreg.h
> @@ -113,6 +113,31 @@ static inline void debug_stack_usage_inc
>  static inline void debug_stack_usage_dec(void) { }
>  #endif /* X86_64 */
>  
> +static __always_inline void local_db_save(unsigned long *dr7)
> +{
> +	get_debugreg(*dr7, 7);
> +	if (*dr7)
> +		set_debugreg(0, 7);

%dr7 has an architecturally stuck bit in it.

You want *dr7 != 0x400 to avoid writing 0 unconditionally.

Also, API wise, wouldn't it be nicer to write "dr7 = local_db_save()"
rather than having a void function returning a single long via pointer?

~Andrew

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-28 20:52   ` Andrew Cooper
@ 2020-05-28 21:15     ` Peter Zijlstra
  2020-05-28 21:33       ` Peter Zijlstra
  2020-05-28 21:36       ` Andrew Cooper
  0 siblings, 2 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 21:15 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: tglx, luto, linux-kernel, x86, Lai Jiangshan,
	sean.j.christopherson, daniel.thompson

On Thu, May 28, 2020 at 09:52:30PM +0100, Andrew Cooper wrote:
> On 28/05/2020 21:19, Peter Zijlstra wrote:
> > --- a/arch/x86/include/asm/debugreg.h
> > +++ b/arch/x86/include/asm/debugreg.h
> > @@ -113,6 +113,31 @@ static inline void debug_stack_usage_inc
> >  static inline void debug_stack_usage_dec(void) { }
> >  #endif /* X86_64 */
> >  
> > +static __always_inline void local_db_save(unsigned long *dr7)
> > +{
> > +	get_debugreg(*dr7, 7);
> > +	if (*dr7)
> > +		set_debugreg(0, 7);
> 
> %dr7 has an architecturally stuck bit in it.
> 
> You want *dr7 != 0x400 to avoid writing 0 unconditionally.

Do we have to have that bit set when writing it? Otherwise I might
actually prefer masking it out.

> Also, API wise, wouldn't it be nicer to write "dr7 = local_db_save()"
> rather than having a void function returning a single long via pointer?

Probably.. I started with local_irq_save() and .. well, n/m. I'll change
it ;-)

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-28 21:15     ` Peter Zijlstra
@ 2020-05-28 21:33       ` Peter Zijlstra
  2020-05-29 17:28         ` Andy Lutomirski
  2020-05-28 21:36       ` Andrew Cooper
  1 sibling, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 21:33 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: tglx, luto, linux-kernel, x86, Lai Jiangshan,
	sean.j.christopherson, daniel.thompson

On Thu, May 28, 2020 at 11:15:50PM +0200, Peter Zijlstra wrote:
> On Thu, May 28, 2020 at 09:52:30PM +0100, Andrew Cooper wrote:
> > On 28/05/2020 21:19, Peter Zijlstra wrote:
> > > --- a/arch/x86/include/asm/debugreg.h
> > > +++ b/arch/x86/include/asm/debugreg.h
> > > @@ -113,6 +113,31 @@ static inline void debug_stack_usage_inc
> > >  static inline void debug_stack_usage_dec(void) { }
> > >  #endif /* X86_64 */
> > >  
> > > +static __always_inline void local_db_save(unsigned long *dr7)
> > > +{
> > > +	get_debugreg(*dr7, 7);
> > > +	if (*dr7)
> > > +		set_debugreg(0, 7);
> > 
> > %dr7 has an architecturally stuck bit in it.
> > 
> > You want *dr7 != 0x400 to avoid writing 0 unconditionally.
> 
> Do we have to have that bit set when writing it? Otherwise I might
> actually prefer masking it out.

I'm an idiot, we write a plain 9..

> > Also, API wise, wouldn't it be nicer to write "dr7 = local_db_save()"
> > rather than having a void function returning a single long via pointer?
> 
> Probably.. I started with local_irq_save() and .. well, n/m. I'll change
> it ;-)

How's this?

---
--- a/arch/x86/include/asm/debugreg.h
+++ b/arch/x86/include/asm/debugreg.h
@@ -113,6 +113,36 @@ static inline void debug_stack_usage_inc
 static inline void debug_stack_usage_dec(void) { }
 #endif /* X86_64 */
 
+static __always_inline unsigned long local_db_save(void)
+{
+	unsigned long dr7;
+
+	get_debugreg(&dr7, 7);
+	dr7 ^= 0x400;
+	if (dr7)
+		set_debugreg(0, 7);
+	/*
+	 * Ensure the compiler doesn't lower the above statements into
+	 * the critical section; disabling breakpoints late would not
+	 * be good.
+	 */
+	barrier();
+
+	return dr7;
+}
+
+static __always_inline void local_db_restore(unsigned long dr7)
+{
+	/*
+	 * Ensure the compiler doesn't raise this statement into
+	 * the critical section; enabling breakpoints early would
+	 * not be good.
+	 */
+	barrier();
+	if (dr7)
+		set_debugreg(dr7, 7);
+}
+
 #ifdef CONFIG_CPU_SUP_AMD
 extern void set_dr_addr_mask(unsigned long mask, int dr);
 #else
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -727,15 +727,7 @@ static __always_inline void debug_enter(
 	 * Entry text is excluded for HW_BP_X and cpu_entry_area, which
 	 * includes the entry stack is excluded for everything.
 	 */
-	get_debugreg(*dr7, 7);
-	set_debugreg(0, 7);
-
-	/*
-	 * Ensure the compiler doesn't lower the above statements into
-	 * the critical section; disabling breakpoints late would not
-	 * be good.
-	 */
-	barrier();
+	*dr7 = local_db_save();
 
 	/*
 	 * The Intel SDM says:
@@ -756,13 +748,7 @@ static __always_inline void debug_enter(
 
 static __always_inline void debug_exit(unsigned long dr7)
 {
-	/*
-	 * Ensure the compiler doesn't raise this statement into
-	 * the critical section; enabling breakpoints early would
-	 * not be good.
-	 */
-	barrier();
-	set_debugreg(dr7, 7);
+	local_db_restore(dr7);
 }
 
 /*


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-28 21:15     ` Peter Zijlstra
  2020-05-28 21:33       ` Peter Zijlstra
@ 2020-05-28 21:36       ` Andrew Cooper
  1 sibling, 0 replies; 20+ messages in thread
From: Andrew Cooper @ 2020-05-28 21:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tglx, luto, linux-kernel, x86, Lai Jiangshan,
	sean.j.christopherson, daniel.thompson

On 28/05/2020 22:15, Peter Zijlstra wrote:
> On Thu, May 28, 2020 at 09:52:30PM +0100, Andrew Cooper wrote:
>> On 28/05/2020 21:19, Peter Zijlstra wrote:
>>> --- a/arch/x86/include/asm/debugreg.h
>>> +++ b/arch/x86/include/asm/debugreg.h
>>> @@ -113,6 +113,31 @@ static inline void debug_stack_usage_inc
>>>  static inline void debug_stack_usage_dec(void) { }
>>>  #endif /* X86_64 */
>>>  
>>> +static __always_inline void local_db_save(unsigned long *dr7)
>>> +{
>>> +	get_debugreg(*dr7, 7);
>>> +	if (*dr7)
>>> +		set_debugreg(0, 7);
>> %dr7 has an architecturally stuck bit in it.
>>
>> You want *dr7 != 0x400 to avoid writing 0 unconditionally.
> Do we have to have that bit set when writing it? Otherwise I might
> actually prefer masking it out.

Not currently.  I guess it depends on how likely %dr7 is to gain an
inverted polarity bit like %dr6 did.

~Andrew

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 6/6] x86/entry: Remove DBn stacks
  2020-05-28 20:19 ` [RFC][PATCH 6/6] x86/entry: Remove DBn stacks Peter Zijlstra
@ 2020-05-28 22:35   ` Lai Jiangshan
  2020-05-28 22:37     ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Lai Jiangshan @ 2020-05-28 22:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Lai Jiangshan,
	Sean Christopherson, andrew.cooper3, daniel.thompson

On Fri, May 29, 2020 at 4:26 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> Both #DB itself, as all other IST users (NMI, #MC) now clear DR7 on
> entry. Combined with not allowing breakpoints on entry/noinstr/NOKPROBE
> text and no single step (EFLAGS.TF) inside the #DB handler should
> guarantee us no nested #DB.
>
> XXX depends on KGDB breakpoints not being stupid
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/entry/entry_64.S             |   13 -------------
>  arch/x86/include/asm/cpu_entry_area.h |    6 ------
>  arch/x86/kernel/asm-offsets_64.c      |    3 ---
>  arch/x86/kernel/dumpstack_64.c        |    3 ---
>  arch/x86/mm/cpu_entry_area.c          |    1 -
>  5 files changed, 26 deletions(-)
>
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -396,11 +396,6 @@ SYM_CODE_END(\asmsym)
>         idtentry \vector asm_\cfunc \cfunc has_error_code=0
>  .endm
>
> -/*
> - * MCE and DB exceptions
> - */
> -#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + (x) * 8)
> -
>  /**
>   * idtentry_mce_db - Macro to generate entry stubs for #MC and #DB
>   * @vector:            Vector number
> @@ -445,16 +440,8 @@ SYM_CODE_START(\asmsym)

There is some comment in the code above which need to be removed:
 * If the trap is #DB then the interrupt stack entry in the IST is
 * moved to the second stack, so a potential recursion will have a
 * fresh IST.

see:
https://lore.kernel.org/lkml/20200526014221.2119-6-laijs@linux.alibaba.com/

>
>         movq    %rsp, %rdi              /* pt_regs pointer */
>
> -       .if \vector == X86_TRAP_DB
> -               subq    $DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)
> -       .endif
> -
>         call    \cfunc
>
> -       .if \vector == X86_TRAP_DB
> -               addq    $DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)
> -       .endif
> -
>         jmp     paranoid_exit
>
>         /* Switch to the regular task stack and use the noist entry point */
> --- a/arch/x86/include/asm/cpu_entry_area.h
> +++ b/arch/x86/include/asm/cpu_entry_area.h
> @@ -16,10 +16,6 @@
>         char    DF_stack[EXCEPTION_STKSZ];      \
>         char    NMI_stack_guard[guardsize];     \
>         char    NMI_stack[EXCEPTION_STKSZ];     \
> -       char    DB2_stack_guard[guardsize];     \
> -       char    DB2_stack[db2_holesize];        \
> -       char    DB1_stack_guard[guardsize];     \
> -       char    DB1_stack[EXCEPTION_STKSZ];     \
>         char    DB_stack_guard[guardsize];      \
>         char    DB_stack[EXCEPTION_STKSZ];      \
>         char    MCE_stack_guard[guardsize];     \

argument db2_holesize should be removed

> @@ -42,8 +38,6 @@ struct cea_exception_stacks {
>  enum exception_stack_ordering {
>         ESTACK_DF,
>         ESTACK_NMI,
> -       ESTACK_DB2,
> -       ESTACK_DB1,
>         ESTACK_DB,
>         ESTACK_MCE,
>         N_EXCEPTION_STACKS



There will be a compile error about N_EXCEPTION_STACKS
in arch/x86/kernel/dumpstack_64.c. It should have

- BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
+ BUILD_BUG_ON(N_EXCEPTION_STACKS != 4);

see:
https://lore.kernel.org/lkml/20200526014221.2119-8-laijs@linux.alibaba.com/





> --- a/arch/x86/kernel/asm-offsets_64.c
> +++ b/arch/x86/kernel/asm-offsets_64.c
> @@ -57,9 +57,6 @@ int main(void)
>         BLANK();
>  #undef ENTRY
>
> -       OFFSET(TSS_ist, tss_struct, x86_tss.ist);
> -       DEFINE(DB_STACK_OFFSET, offsetof(struct cea_exception_stacks, DB_stack) -
> -              offsetof(struct cea_exception_stacks, DB1_stack));
>         BLANK();
>
>  #ifdef CONFIG_STACKPROTECTOR
> --- a/arch/x86/kernel/dumpstack_64.c
> +++ b/arch/x86/kernel/dumpstack_64.c
> @@ -22,8 +22,6 @@
>  static const char * const exception_stack_names[] = {
>                 [ ESTACK_DF     ]       = "#DF",
>                 [ ESTACK_NMI    ]       = "NMI",
> -               [ ESTACK_DB2    ]       = "#DB2",
> -               [ ESTACK_DB1    ]       = "#DB1",
>                 [ ESTACK_DB     ]       = "#DB",
>                 [ ESTACK_MCE    ]       = "#MC",
>  };
> @@ -79,7 +77,6 @@ static const
>  struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
>         EPAGERANGE(DF),
>         EPAGERANGE(NMI),
> -       EPAGERANGE(DB1),
>         EPAGERANGE(DB),
>         EPAGERANGE(MCE),
>  };
> --- a/arch/x86/mm/cpu_entry_area.c
> +++ b/arch/x86/mm/cpu_entry_area.c
> @@ -107,7 +107,6 @@ static void __init percpu_setup_exceptio
>          */
>         cea_map_stack(DF);
>         cea_map_stack(NMI);
> -       cea_map_stack(DB1);
>         cea_map_stack(DB);
>         cea_map_stack(MCE);
>  }
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 6/6] x86/entry: Remove DBn stacks
  2020-05-28 22:35   ` Lai Jiangshan
@ 2020-05-28 22:37     ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 22:37 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Lai Jiangshan,
	Sean Christopherson, andrew.cooper3, daniel.thompson

On Fri, May 29, 2020 at 06:35:07AM +0800, Lai Jiangshan wrote:
> There will be a compile error about N_EXCEPTION_STACKS
> in arch/x86/kernel/dumpstack_64.c. It should have
> 
> - BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> + BUILD_BUG_ON(N_EXCEPTION_STACKS != 4);

Yeah, I actually have that, but forgot to refresh before sending :/

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/6] x86/entry: disallow #DB more
  2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
                   ` (5 preceding siblings ...)
  2020-05-28 20:19 ` [RFC][PATCH 6/6] x86/entry: Remove DBn stacks Peter Zijlstra
@ 2020-05-28 22:42 ` Lai Jiangshan
  2020-05-28 22:48   ` Peter Zijlstra
  6 siblings, 1 reply; 20+ messages in thread
From: Lai Jiangshan @ 2020-05-28 22:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Lai Jiangshan,
	Sean Christopherson, andrew.cooper3, daniel.thompson

On Fri, May 29, 2020 at 4:25 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> These patches disallow #DB during NMI/#MC and allow removing a lot of fugly code.
>

Hello

Will #DB be allowed in #DF?

Thanks
Lai

> Patch #6 should probably wait until we've got the KGDB situation sorted
> because applying that makes a bad situation worse.
>
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/6] x86/entry: disallow #DB more
  2020-05-28 22:42 ` [PATCH 0/6] x86/entry: disallow #DB more Lai Jiangshan
@ 2020-05-28 22:48   ` Peter Zijlstra
  2020-05-28 23:05     ` Lai Jiangshan
  0 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-28 22:48 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Lai Jiangshan,
	Sean Christopherson, andrew.cooper3, daniel.thompson

On Fri, May 29, 2020 at 06:42:46AM +0800, Lai Jiangshan wrote:
> On Fri, May 29, 2020 at 4:25 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > These patches disallow #DB during NMI/#MC and allow removing a lot of fugly code.
> >
> 
> Hello
> 
> Will #DB be allowed in #DF?

No, that whole thing is noinstr.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/6] x86/entry: disallow #DB more
  2020-05-28 22:48   ` Peter Zijlstra
@ 2020-05-28 23:05     ` Lai Jiangshan
  2020-05-29  8:00       ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Lai Jiangshan @ 2020-05-28 23:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Lai Jiangshan,
	Sean Christopherson, andrew.cooper3, daniel.thompson

On Fri, May 29, 2020 at 6:48 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, May 29, 2020 at 06:42:46AM +0800, Lai Jiangshan wrote:
> > On Fri, May 29, 2020 at 4:25 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > These patches disallow #DB during NMI/#MC and allow removing a lot of fugly code.
> > >
> >
> > Hello
> >
> > Will #DB be allowed in #DF?
>
> No, that whole thing is noinstr.

But it calls many functions, including die(), panic().
We don't want #DB to interfere how it die() and panic().
Since it is in fragile #DF, the #DB may mess it up and
make #DF fails to report and die.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/6] x86/entry: disallow #DB more
  2020-05-28 23:05     ` Lai Jiangshan
@ 2020-05-29  8:00       ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-29  8:00 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Thomas Gleixner, Andy Lutomirski, LKML, X86 ML, Lai Jiangshan,
	Sean Christopherson, andrew.cooper3, daniel.thompson

On Fri, May 29, 2020 at 07:05:54AM +0800, Lai Jiangshan wrote:
> On Fri, May 29, 2020 at 6:48 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, May 29, 2020 at 06:42:46AM +0800, Lai Jiangshan wrote:
> > > On Fri, May 29, 2020 at 4:25 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > These patches disallow #DB during NMI/#MC and allow removing a lot of fugly code.
> > > >
> > >
> > > Hello
> > >
> > > Will #DB be allowed in #DF?
> >
> > No, that whole thing is noinstr.
> 
> But it calls many functions, including die(), panic().
> We don't want #DB to interfere how it die() and panic().
> Since it is in fragile #DF, the #DB may mess it up and
> make #DF fails to report and die.

The only recoverable #DF is the ESPFIX shit. If we do not take that,
we're on the way to panic(), I really can't be arsed if you crash the
box before that, we're going to die anyway.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/6] x86/entry: Optimize local_db_save() for virt
  2020-05-28 20:19 ` [PATCH 4/6] x86/entry: Optimize local_db_save() for virt Peter Zijlstra
@ 2020-05-29 17:24   ` Sean Christopherson
  0 siblings, 0 replies; 20+ messages in thread
From: Sean Christopherson @ 2020-05-29 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tglx, luto, linux-kernel, x86, Lai Jiangshan, andrew.cooper3,
	daniel.thompson, Andy Lutomirski

On Thu, May 28, 2020 at 10:19:41PM +0200, Peter Zijlstra wrote:
>  static int arch_bp_generic_len(int x86_len)
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3027,6 +3027,8 @@ static int nested_vmx_check_vmentry_hw(s
>  
>  	/*
>  	 * VMExit clears RFLAGS.IF and DR7, even on a consistency check.
> +	 * XXX how is this not broken? access to cpu_dr7 ought to be with
> +	 * IRQs disabled.

Ah, it's simply broken.  This code is conditional on a module param that's
off by default, i.e. it's not run widely, and odds are intersection with
debugging is rare.

Moving local_irq_enable() below the DR7 restoration is not an issue.

Maybe also add lockdep_assert_irqs_disabled() to hw_breakpoint_restore() or
hw_breakpoint_active()?

>  	 */
>  	local_irq_enable();
>  	if (hw_breakpoint_active())
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-28 21:33       ` Peter Zijlstra
@ 2020-05-29 17:28         ` Andy Lutomirski
  2020-05-29 19:02           ` Peter Zijlstra
  0 siblings, 1 reply; 20+ messages in thread
From: Andy Lutomirski @ 2020-05-29 17:28 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Cooper, tglx, linux-kernel, x86, Lai Jiangshan,
	sean.j.christopherson, daniel.thompson



> On May 28, 2020, at 2:34 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Thu, May 28, 2020 at 11:15:50PM +0200, Peter Zijlstra wrote:
>>> On Thu, May 28, 2020 at 09:52:30PM +0100, Andrew Cooper wrote:
>>> On 28/05/2020 21:19, Peter Zijlstra wrote:
>>>> --- a/arch/x86/include/asm/debugreg.h
>>>> +++ b/arch/x86/include/asm/debugreg.h
>>>> @@ -113,6 +113,31 @@ static inline void debug_stack_usage_inc
>>>> static inline void debug_stack_usage_dec(void) { }
>>>> #endif /* X86_64 */
>>>> 
>>>> +static __always_inline void local_db_save(unsigned long *dr7)
>>>> +{
>>>> +    get_debugreg(*dr7, 7);
>>>> +    if (*dr7)
>>>> +        set_debugreg(0, 7);
>>> 
>>> %dr7 has an architecturally stuck bit in it.
>>> 
>>> You want *dr7 != 0x400 to avoid writing 0 unconditionally.
>> 
>> Do we have to have that bit set when writing it? Otherwise I might
>> actually prefer masking it out.
> 
> I'm an idiot, we write a plain 9..
> 
>>> Also, API wise, wouldn't it be nicer to write "dr7 = local_db_save()"
>>> rather than having a void function returning a single long via pointer?
>> 
>> Probably.. I started with local_irq_save() and .. well, n/m. I'll change
>> it ;-)
> 
> How's this?
> 
> ---
> --- a/arch/x86/include/asm/debugreg.h
> +++ b/arch/x86/include/asm/debugreg.h
> @@ -113,6 +113,36 @@ static inline void debug_stack_usage_inc
> static inline void debug_stack_usage_dec(void) { }
> #endif /* X86_64 */
> 
> +static __always_inline unsigned long local_db_save(void)
> +{
> +    unsigned long dr7;
> +
> +    get_debugreg(&dr7, 7);
> +    dr7 ^= 0x400;

Why xor?  This seems extra confusing.

> +    if (dr7)
> +        set_debugreg(0, 7);
> +    /*
> +     * Ensure the compiler doesn't lower the above statements into
> +     * the critical section; disabling breakpoints late would not
> +     * be good.
> +     */
> +    barrier();
> +
> +    return dr7;
> +}
> +
> +static __always_inline void local_db_restore(unsigned long dr7)
> +{
> +    /*
> +     * Ensure the compiler doesn't raise this statement into
> +     * the critical section; enabling breakpoints early would
> +     * not be good.
> +     */
> +    barrier();
> +    if (dr7)
> +        set_debugreg(dr7, 7);
> +}
> +
> #ifdef CONFIG_CPU_SUP_AMD
> extern void set_dr_addr_mask(unsigned long mask, int dr);
> #else
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -727,15 +727,7 @@ static __always_inline void debug_enter(
>     * Entry text is excluded for HW_BP_X and cpu_entry_area, which
>     * includes the entry stack is excluded for everything.
>     */
> -    get_debugreg(*dr7, 7);
> -    set_debugreg(0, 7);
> -
> -    /*
> -     * Ensure the compiler doesn't lower the above statements into
> -     * the critical section; disabling breakpoints late would not
> -     * be good.
> -     */
> -    barrier();
> +    *dr7 = local_db_save();
> 
>    /*
>     * The Intel SDM says:
> @@ -756,13 +748,7 @@ static __always_inline void debug_enter(
> 
> static __always_inline void debug_exit(unsigned long dr7)
> {
> -    /*
> -     * Ensure the compiler doesn't raise this statement into
> -     * the critical section; enabling breakpoints early would
> -     * not be good.
> -     */
> -    barrier();
> -    set_debugreg(dr7, 7);
> +    local_db_restore(dr7);
> }
> 
> /*
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}()
  2020-05-29 17:28         ` Andy Lutomirski
@ 2020-05-29 19:02           ` Peter Zijlstra
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2020-05-29 19:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Andrew Cooper, tglx, linux-kernel, x86, Lai Jiangshan,
	sean.j.christopherson, daniel.thompson

On Fri, May 29, 2020 at 10:28:33AM -0700, Andy Lutomirski wrote:
> > +static __always_inline unsigned long local_db_save(void)
> > +{
> > +    unsigned long dr7;
> > +
> > +    get_debugreg(&dr7, 7);
> > +    dr7 ^= 0x400;
> 
> Why xor?  This seems extra confusing.

I'll do the normal mask thing ..

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2020-05-29 19:05 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-28 20:19 [PATCH 0/6] x86/entry: disallow #DB more Peter Zijlstra
2020-05-28 20:19 ` [PATCH 1/6] x86/entry: Introduce local_db_{save,restore}() Peter Zijlstra
2020-05-28 20:52   ` Andrew Cooper
2020-05-28 21:15     ` Peter Zijlstra
2020-05-28 21:33       ` Peter Zijlstra
2020-05-29 17:28         ` Andy Lutomirski
2020-05-29 19:02           ` Peter Zijlstra
2020-05-28 21:36       ` Andrew Cooper
2020-05-28 20:19 ` [PATCH 2/6] x86/entry, nmi: Disable #DB Peter Zijlstra
2020-05-28 20:19 ` [PATCH 3/6] x86/entry, mce: Disallow #DB during #MC Peter Zijlstra
2020-05-28 20:19 ` [PATCH 4/6] x86/entry: Optimize local_db_save() for virt Peter Zijlstra
2020-05-29 17:24   ` Sean Christopherson
2020-05-28 20:19 ` [PATCH 5/6] x86/entry: Remove debug IDT frobbing Peter Zijlstra
2020-05-28 20:19 ` [RFC][PATCH 6/6] x86/entry: Remove DBn stacks Peter Zijlstra
2020-05-28 22:35   ` Lai Jiangshan
2020-05-28 22:37     ` Peter Zijlstra
2020-05-28 22:42 ` [PATCH 0/6] x86/entry: disallow #DB more Lai Jiangshan
2020-05-28 22:48   ` Peter Zijlstra
2020-05-28 23:05     ` Lai Jiangshan
2020-05-29  8:00       ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).