linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data
@ 2020-05-25 14:50 Lai Jiangshan
  2020-05-25 14:50 ` [RFC PATCH 1/5] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-25 14:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, x86

Hello

The patchset is based on (tag: entry-v9-the-rest, tglx-devel/x86/entry).
It complements commit 3ea11ac991d
("x86/hw_breakpoint: Prevent data breakpoints on cpu_entry_area").

After reading the code, we can see that more data needs to be protected
against hw_breakpoint; otherwise a data breakpoint on it may cause a
dangerous, recursive, or otherwise unwanted #DB.


Lai Jiangshan (5):
  x86/hw_breakpoint: add within_area() to check data breakpoints
  x86/hw_breakpoint: Prevent data breakpoints on direct GDT
  x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw
  x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20200505134058.272448010@linutronix.de
Link: https://lore.kernel.org/lkml/20200521200513.656533920@linutronix.de


 arch/x86/kernel/hw_breakpoint.c | 71 +++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 8 deletions(-)

-- 
2.20.1



* [RFC PATCH 1/5] x86/hw_breakpoint: add within_area() to check data breakpoints
  2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
@ 2020-05-25 14:50 ` Lai Jiangshan
  2020-05-25 14:50 ` [RFC PATCH 2/5] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-25 14:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

within_area() is added to check whether a data breakpoint overlaps
with the cpu_entry_area. The next patches will also use it to check
whether a data breakpoint overlaps with the GDT, IDT, or TSS in
places other than the cpu_entry_area.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 9ddf441ccaa8..c149c7b29ac3 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -227,14 +227,23 @@ int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw)
 	return (va >= TASK_SIZE_MAX) || ((va + len - 1) >= TASK_SIZE_MAX);
 }
 
+/*
+ * Checks whether the range [addr, end] overlaps the area [base, base + size).
+ */
+static inline bool within_area(unsigned long addr, unsigned long end,
+			       unsigned long base, unsigned long size)
+{
+	return end >= base && addr < (base + size);
+}
+
 /*
  * Checks whether the range from addr to end, inclusive, overlaps the CPU
  * entry area range.
  */
 static inline bool within_cpu_entry_area(unsigned long addr, unsigned long end)
 {
-	return end >= CPU_ENTRY_AREA_BASE &&
-	       addr < (CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_TOTAL_SIZE);
+	return within_area(addr, end, CPU_ENTRY_AREA_BASE,
+			   CPU_ENTRY_AREA_TOTAL_SIZE);
 }
 
 static int arch_build_bp_info(struct perf_event *bp,
-- 
2.20.1
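
For illustration, a minimal standalone sketch of the overlap semantics
above, with hypothetical addresses (note the asymmetry: [addr, end] is
inclusive while [base, base + size) is half-open):

	#include <stdbool.h>
	#include <stdio.h>

	/* Same logic as the kernel helper above. */
	static bool within_area(unsigned long addr, unsigned long end,
				unsigned long base, unsigned long size)
	{
		return end >= base && addr < (base + size);
	}

	int main(void)
	{
		/* Hypothetical area: [0x1000, 0x2000) */
		printf("%d\n", within_area(0xff8, 0xfff, 0x1000, 0x1000));   /* 0: ends below base */
		printf("%d\n", within_area(0xff8, 0x1000, 0x1000, 0x1000));  /* 1: last byte hits base */
		printf("%d\n", within_area(0x2000, 0x2007, 0x1000, 0x1000)); /* 0: starts at base + size */
		return 0;
	}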



* [RFC PATCH 2/5] x86/hw_breakpoint: Prevent data breakpoints on direct GDT
  2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
  2020-05-25 14:50 ` [RFC PATCH 1/5] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
@ 2020-05-25 14:50 ` Lai Jiangshan
  2020-05-25 14:51 ` [RFC PATCH 3/5] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-25 14:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

A data breakpoint on the GDT can be fatal and must be avoided.
The GDT in the CPU entry area is already protected, but the direct
GDT is not. Protect it as well, even though it is used only rarely
and for short periods of time.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index c149c7b29ac3..f859095c1b6c 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -32,6 +32,7 @@
 #include <asm/processor.h>
 #include <asm/debugreg.h>
 #include <asm/user.h>
+#include <asm/desc.h>
 
 /* Per cpu debug control register value */
 DEFINE_PER_CPU(unsigned long, cpu_dr7);
@@ -237,13 +238,26 @@ static inline bool within_area(unsigned long addr, unsigned long end,
 }
 
 /*
- * Checks whether the range from addr to end, inclusive, overlaps the CPU
- * entry area range.
+ * Checks whether the range from addr to end, inclusive, overlaps the fixed
+ * mapped CPU entry area range or other ranges used for CPU entry.
  */
-static inline bool within_cpu_entry_area(unsigned long addr, unsigned long end)
+static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 {
-	return within_area(addr, end, CPU_ENTRY_AREA_BASE,
-			   CPU_ENTRY_AREA_TOTAL_SIZE);
+	int cpu;
+
+	/* CPU entry area is always used for CPU entry */
+	if (within_area(addr, end, CPU_ENTRY_AREA_BASE,
+			CPU_ENTRY_AREA_TOTAL_SIZE))
+		return true;
+
+	for_each_possible_cpu(cpu) {
+		/* The original rw GDT is being used after load_direct_gdt() */
+		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
+				GDT_SIZE))
+			return true;
+	}
+
+	return false;
 }
 
 static int arch_build_bp_info(struct perf_event *bp,
@@ -257,12 +271,12 @@ static int arch_build_bp_info(struct perf_event *bp,
 		return -EINVAL;
 
 	/*
-	 * Prevent any breakpoint of any type that overlaps the
-	 * cpu_entry_area.  This protects the IST stacks and also
+	 * Prevent any breakpoint of any type that overlaps the CPU
+	 * entry area and data.  This protects the IST stacks and also
 	 * reduces the chance that we ever find out what happens if
 	 * there's a data breakpoint on the GDT, IDT, or TSS.
 	 */
-	if (within_cpu_entry_area(attr->bp_addr, bp_end))
+	if (within_cpu_entry(attr->bp_addr, bp_end))
 		return -EINVAL;
 
 	hw->address = attr->bp_addr;
-- 
2.20.1
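
For illustration, roughly how such a breakpoint request reaches
arch_build_bp_info() and is refused after this patch; a userspace
sketch (needs root, and the GDT address below is purely hypothetical;
a real one would have to come from the running kernel):

	#include <linux/perf_event.h>
	#include <linux/hw_breakpoint.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_BREAKPOINT;
		attr.bp_type = HW_BREAKPOINT_W;
		attr.bp_addr = 0xffff88813fc00000UL;	/* hypothetical GDT address */
		attr.bp_len = HW_BREAKPOINT_LEN_8;

		/* pid = -1, cpu = 0: CPU-wide breakpoint, needs root */
		if (syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0) < 0)
			perror("perf_event_open");	/* expect EINVAL here */
		return 0;
	}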



* [RFC PATCH 3/5] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw
  2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
  2020-05-25 14:50 ` [RFC PATCH 1/5] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
  2020-05-25 14:50 ` [RFC PATCH 2/5] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
@ 2020-05-25 14:51 ` Lai Jiangshan
  2020-05-25 14:51 ` [RFC PATCH 4/5] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-25 14:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

cpu_tss_rw is not directly referenced by hardware, but it is
also used in CPU entry code, especially when #DB shifts its
stacks. If a data breakpoint is set on
cpu_tss_rw.x86_tss.ist[IST_INDEX_DB], it will cause a recursive
#DB (and soon a #DF, because the #DB is raised after the
IST-shifting access has completed).

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index f859095c1b6c..7d3966b9aa12 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -255,6 +255,19 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
 				GDT_SIZE))
 			return true;
+
+		/*
+		 * cpu_tss_rw is not directly referenced by hardware, but
+		 * it is also used in CPU entry code, especially when #DB
+		 * shifts its stacks. If a data breakpoint is set on
+		 * cpu_tss_rw.x86_tss.ist[IST_INDEX_DB], it will cause a
+		 * recursive #DB (and soon a #DF, because the #DB is raised
+		 * after the IST-shifting access has completed).
+		 */
+		if (within_area(addr, end,
+				(unsigned long)&per_cpu(cpu_tss_rw, cpu),
+				sizeof(struct tss_struct)))
+			return true;
 	}
 
 	return false;
-- 
2.20.1
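
A sketch of a hypothetical debug aid (not part of the patch) that
prints the per-CPU ranges which within_cpu_entry() rejects so far:

	/* Hypothetical helper; assumes the same headers as the file above. */
	static void dump_protected_ranges(void)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			unsigned long gdt = (unsigned long)get_cpu_gdt_rw(cpu);
			unsigned long tss = (unsigned long)&per_cpu(cpu_tss_rw, cpu);

			pr_info("cpu%d: GDT [0x%lx, 0x%lx) TSS [0x%lx, 0x%lx)\n",
				cpu, gdt, gdt + GDT_SIZE,
				tss, tss + sizeof(struct tss_struct));
		}
	}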



* [RFC PATCH 4/5] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
                   ` (2 preceding siblings ...)
  2020-05-25 14:51 ` [RFC PATCH 3/5] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
@ 2020-05-25 14:51 ` Lai Jiangshan
  2020-05-25 14:51 ` [RFC PATCH 5/5] x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table Lai Jiangshan
  2020-05-25 15:25 ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Peter Zijlstra
  5 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-25 14:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

The percpu user_pcid_flush_mask is used for CPU entry
(#DB/#NMI/#DF/#MCE -> paranoid_exit() -> RESTORE_CR3).
If a data breakpoint is set on it, it will cause an unwanted #DB.

Some other percpu data is also used in CPU entry, but it is either
in the already-protected cpu_tss_rw or is safe to trigger #DB on
(the espfix related data).

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 7d3966b9aa12..9579bd6fb589 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -33,6 +33,7 @@
 #include <asm/debugreg.h>
 #include <asm/user.h>
 #include <asm/desc.h>
+#include <asm/tlbflush.h>
 
 /* Per cpu debug control register value */
 DEFINE_PER_CPU(unsigned long, cpu_dr7);
@@ -251,6 +252,8 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 		return true;
 
 	for_each_possible_cpu(cpu) {
+		unsigned short *mask_ptr;
+
 		/* The original rw GDT is being used after load_direct_gdt() */
 		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
 				GDT_SIZE))
@@ -268,6 +271,17 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 				(unsigned long)&per_cpu(cpu_tss_rw, cpu),
 				sizeof(struct tss_struct)))
 			return true;
+
+		/*
+		 * cpu_tlbstate.user_pcid_flush_mask is used for CPU entry
+		 * (#DB/#NMI/#DF/#MCE -> paranoid_exit() -> RESTORE_CR3).
+		 * If a data breakpoint is set on it, it will cause an
+		 * unwanted #DB.
+		 */
+		mask_ptr = &per_cpu(cpu_tlbstate.user_pcid_flush_mask, cpu);
+		if (within_area(addr, end, (unsigned long)mask_ptr,
+				sizeof(*mask_ptr)))
+			return true;
 	}
 
 	return false;
-- 
2.20.1



* [RFC PATCH 5/5] x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table
  2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
                   ` (3 preceding siblings ...)
  2020-05-25 14:51 ` [RFC PATCH 4/5] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
@ 2020-05-25 14:51 ` Lai Jiangshan
  2020-05-25 15:25 ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Peter Zijlstra
  5 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-25 14:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

A data breakpoint on the IDT can be fatal and must be avoided.
The IDT in the CPU entry area is already protected, but the debug
IDT is not. Protect it as well, even though it is used only rarely
and for short periods of time.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
Please drop this patch when Peter's work to remove debug_idt_table
is merged.

 arch/x86/kernel/hw_breakpoint.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 9579bd6fb589..83d8b1fcbc76 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -284,6 +284,11 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 			return true;
 	}
 
+	/* debug_idt_table is in use after load_debug_idt() */
+	if (within_area(addr, end, (unsigned long)debug_idt_table,
+			sizeof(debug_idt_table[0]) * IDT_ENTRIES))
+		return true;
+
 	return false;
 }
 
-- 
2.20.1
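
For scale: on x86_64 a gate descriptor is 16 bytes and IDT_ENTRIES is
256, so the span blocked here is exactly one 4K page. A compile-time
sketch of that assumption (hypothetical, not part of the patch):

	/* Hypothetical sanity check: the protected debug IDT spans one page. */
	BUILD_BUG_ON(sizeof(gate_desc) * IDT_ENTRIES != PAGE_SIZE);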



* Re: [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data
  2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
                   ` (4 preceding siblings ...)
  2020-05-25 14:51 ` [RFC PATCH 5/5] x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table Lai Jiangshan
@ 2020-05-25 15:25 ` Peter Zijlstra
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
  2020-05-26  1:48   ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
  5 siblings, 2 replies; 25+ messages in thread
From: Peter Zijlstra @ 2020-05-25 15:25 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, Andy Lutomirski, Thomas Gleixner, x86

On Mon, May 25, 2020 at 02:50:57PM +0000, Lai Jiangshan wrote:
> Hello
> 
> The patchset is based on (tag: entry-v9-the-rest, tglx-devel/x86/entry).
> It complements commit 3ea11ac991d
> ("x86/hw_breakpoint: Prevent data breakpoints on cpu_entry_area").
> 
> After reading the code, we can see that more data needs to be protected
> against hw_breakpoint; otherwise a data breakpoint on it may cause a
> dangerous, recursive, or otherwise unwanted #DB.
> 
> 
> Lai Jiangshan (5):
>   x86/hw_breakpoint: add within_area() to check data breakpoints
>   x86/hw_breakpoint: Prevent data breakpoints on direct GDT
>   x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw

I think we can actually get rid of that #DB IST stack frobbing, also see
patches linked below.

>   x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask

Should we disallow the full structure just to be sure?

>   x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table

That's going away, see:

https://lkml.kernel.org/r/20200522204738.645043059@infradead.org

But yes, nice!



* [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and
  2020-05-25 15:25 ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Peter Zijlstra
@ 2020-05-26  1:42   ` Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 1/7] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
                       ` (6 more replies)
  2020-05-26  1:48   ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
  1 sibling, 7 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, x86

Hello

The patchset is based on (tag: entry-v9-the-rest, tglx-devel/x86/entry).
It complements commit 3ea11ac991d
("x86/hw_breakpoint: Prevent data breakpoints on cpu_entry_area").

After reading the code, we can see that more data needs to be protected
against hw_breakpoint; otherwise a data breakpoint on it may cause a
dangerous, recursive, or otherwise unwanted #DB.

This patchset also removes the IST-shifting (patches 5-7), because
tglx's work includes debug_enter(), which disables nested #DB.
For now, patches 5-7 depend only on tglx's work; they don't depend on
Peter's patchset[3], but patch 6 should be discarded when they are
merged with Peter's work.

Actually, I hope Peter incorporates this V2 patchset into his patchset,
which will then be incorporated into tglx's work, because this V2
patchset doesn't protect debug_idt_table and patch 6 conflicts with
Peter's work.

Changes since V1:
  Protect the full cpu_tlbstate structure to be sure, as suggested
	by Peter.
  Drop the last patch of V1 because debug_idt_table is removed
	in Peter's patchset[3].
  Remove the IST-shifting.

Lai Jiangshan (7):
  x86/hw_breakpoint: add within_area() to check data breakpoints
  x86/hw_breakpoint: Prevent data breakpoints on direct GDT
  x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw
  x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  x86/entry: don't shift stack on #DB
  x86/entry: is_debug_stack() don't check of DB1 stack
  x86/entry: remove DB1 stack and DB2 hole from cpu entry area

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lkml.kernel.org/r/20200505134058.272448010@linutronix.de
Link: https://lore.kernel.org/lkml/20200521200513.656533920@linutronix.de
Link: https://lkml.kernel.org/r/20200522204738.645043059@infradead.org

 arch/x86/entry/entry_64.S             | 17 --------
 arch/x86/include/asm/cpu_entry_area.h | 12 ++---
 arch/x86/kernel/asm-offsets_64.c      |  5 ---
 arch/x86/kernel/dumpstack_64.c        | 10 ++---
 arch/x86/kernel/hw_breakpoint.c       | 63 +++++++++++++++++++++++----
 arch/x86/kernel/nmi.c                 |  7 +--
 arch/x86/mm/cpu_entry_area.c          |  4 +-
 7 files changed, 63 insertions(+), 55 deletions(-)

-- 
2.20.1



* [RFC PATCH V2 1/7] x86/hw_breakpoint: add within_area() to check data breakpoints
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  2020-05-30  9:57       ` [tip: x86/entry] x86/hw_breakpoint: Add " tip-bot2 for Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 2/7] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
                       ` (5 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

within_area() is added to check whether a data breakpoint overlaps
with the cpu_entry_area. The next patches will also use it to check
whether a data breakpoint overlaps with the GDT, IDT, or TSS in
places other than the cpu_entry_area.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 9ddf441ccaa8..c149c7b29ac3 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -227,14 +227,23 @@ int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw)
 	return (va >= TASK_SIZE_MAX) || ((va + len - 1) >= TASK_SIZE_MAX);
 }
 
+/*
+ * Checks whether the range [addr, end] overlaps the area [base, base + size).
+ */
+static inline bool within_area(unsigned long addr, unsigned long end,
+			       unsigned long base, unsigned long size)
+{
+	return end >= base && addr < (base + size);
+}
+
 /*
  * Checks whether the range from addr to end, inclusive, overlaps the CPU
  * entry area range.
  */
 static inline bool within_cpu_entry_area(unsigned long addr, unsigned long end)
 {
-	return end >= CPU_ENTRY_AREA_BASE &&
-	       addr < (CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_TOTAL_SIZE);
+	return within_area(addr, end, CPU_ENTRY_AREA_BASE,
+			   CPU_ENTRY_AREA_TOTAL_SIZE);
 }
 
 static int arch_build_bp_info(struct perf_event *bp,
-- 
2.20.1



* [RFC PATCH V2 2/7] x86/hw_breakpoint: Prevent data breakpoints on direct GDT
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 1/7] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 3/7] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
                       ` (4 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

A data breakpoint on the GDT can be fatal and must be avoided.
The GDT in the CPU entry area is already protected, but the direct
GDT is not. Protect it as well, even though it is used only rarely
and for short periods of time.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index c149c7b29ac3..f859095c1b6c 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -32,6 +32,7 @@
 #include <asm/processor.h>
 #include <asm/debugreg.h>
 #include <asm/user.h>
+#include <asm/desc.h>
 
 /* Per cpu debug control register value */
 DEFINE_PER_CPU(unsigned long, cpu_dr7);
@@ -237,13 +238,26 @@ static inline bool within_area(unsigned long addr, unsigned long end,
 }
 
 /*
- * Checks whether the range from addr to end, inclusive, overlaps the CPU
- * entry area range.
+ * Checks whether the range from addr to end, inclusive, overlaps the fixed
+ * mapped CPU entry area range or other ranges used for CPU entry.
  */
-static inline bool within_cpu_entry_area(unsigned long addr, unsigned long end)
+static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 {
-	return within_area(addr, end, CPU_ENTRY_AREA_BASE,
-			   CPU_ENTRY_AREA_TOTAL_SIZE);
+	int cpu;
+
+	/* CPU entry area is always used for CPU entry */
+	if (within_area(addr, end, CPU_ENTRY_AREA_BASE,
+			CPU_ENTRY_AREA_TOTAL_SIZE))
+		return true;
+
+	for_each_possible_cpu(cpu) {
+		/* The original rw GDT is being used after load_direct_gdt() */
+		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
+				GDT_SIZE))
+			return true;
+	}
+
+	return false;
 }
 
 static int arch_build_bp_info(struct perf_event *bp,
@@ -257,12 +271,12 @@ static int arch_build_bp_info(struct perf_event *bp,
 		return -EINVAL;
 
 	/*
-	 * Prevent any breakpoint of any type that overlaps the
-	 * cpu_entry_area.  This protects the IST stacks and also
+	 * Prevent any breakpoint of any type that overlaps the CPU
+	 * entry area and data.  This protects the IST stacks and also
 	 * reduces the chance that we ever find out what happens if
 	 * there's a data breakpoint on the GDT, IDT, or TSS.
 	 */
-	if (within_cpu_entry_area(attr->bp_addr, bp_end))
+	if (within_cpu_entry(attr->bp_addr, bp_end))
 		return -EINVAL;
 
 	hw->address = attr->bp_addr;
-- 
2.20.1



* [RFC PATCH V2 3/7] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 1/7] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 2/7] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
                       ` (3 subsequent siblings)
  6 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

cpu_tss_rw is not directly referenced by hardware, but it is
also used in CPU entry code, especially when #DB shifts its
stacks. If a data breakpoint is set on
cpu_tss_rw.x86_tss.ist[IST_INDEX_DB], it will cause a recursive
#DB (and soon a #DF, because the #DB is raised after the
IST-shifting access has completed).

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index f859095c1b6c..7d3966b9aa12 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -255,6 +255,19 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
 				GDT_SIZE))
 			return true;
+
+		/*
+		 * cpu_tss_rw is not directly referenced by hardware, but
+		 * it is also used in CPU entry code, especially when #DB
+		 * shifts its stacks. If a data breakpoint is set on
+		 * cpu_tss_rw.x86_tss.ist[IST_INDEX_DB], it will cause a
+		 * recursive #DB (and soon a #DF, because the #DB is raised
+		 * after the IST-shifting access has completed).
+		 */
+		if (within_area(addr, end,
+				(unsigned long)&per_cpu(cpu_tss_rw, cpu),
+				sizeof(struct tss_struct)))
+			return true;
 	}
 
 	return false;
-- 
2.20.1



* [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
                       ` (2 preceding siblings ...)
  2020-05-26  1:42     ` [RFC PATCH V2 3/7] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  2020-05-26  4:17       ` Andy Lutomirski
  2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 5/7] x86/entry: don't shift stack on #DB Lai Jiangshan
                       ` (2 subsequent siblings)
  6 siblings, 2 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre

The percpu user_pcid_flush_mask is used for CPU entry.
If a data breakpoint is set on it, it will cause an unwanted #DB.
Protect the full cpu_tlbstate structure to be sure.

Some other percpu data is also used in CPU entry, but it is either
in the already-protected cpu_tss_rw or is safe to trigger #DB on
(espfix_waddr, espfix_stack).

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/hw_breakpoint.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 7d3966b9aa12..67ef8e24af6a 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -33,6 +33,7 @@
 #include <asm/debugreg.h>
 #include <asm/user.h>
 #include <asm/desc.h>
+#include <asm/tlbflush.h>
 
 /* Per cpu debug control register value */
 DEFINE_PER_CPU(unsigned long, cpu_dr7);
@@ -268,6 +269,16 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 				(unsigned long)&per_cpu(cpu_tss_rw, cpu),
 				sizeof(struct tss_struct)))
 			return true;
+
+		/*
+		 * cpu_tlbstate.user_pcid_flush_mask is used for CPU entry.
+		 * If a data breakpoint is set on it, it will cause an unwanted #DB.
+		 * Protect the full cpu_tlbstate structure to be sure.
+		 */
+		if (within_area(addr, end,
+				(unsigned long)&per_cpu(cpu_tlbstate, cpu),
+				sizeof(struct tlb_state)))
+			return true;
 	}
 
 	return false;
-- 
2.20.1



* [RFC PATCH V2 5/7] x86/entry: don't shift stack on #DB
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
                       ` (3 preceding siblings ...)
  2020-05-26  1:42     ` [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  2020-05-26  9:10       ` Peter Zijlstra
  2020-05-26  1:42     ` [RFC PATCH V2 6/7] x86/entry: is_debug_stack() don't check of DB1 stack Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 7/7] x86/entry: remove DB1 stack and DB2 hole from cpu entry area Lai Jiangshan
  6 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin, Andrew Cooper,
	Juergen Gross, Brian Gerst

debug_enter() disables #DB, so there should be no recursive #DB.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/entry/entry_64.S        | 17 -----------------
 arch/x86/kernel/asm-offsets_64.c |  1 -
 2 files changed, 18 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 265ff97b3961..8ecaeee53653 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -396,11 +396,6 @@ SYM_CODE_END(\asmsym)
 	idtentry \vector asm_\cfunc \cfunc has_error_code=0
 .endm
 
-/*
- * MCE and DB exceptions
- */
-#define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss_rw) + (TSS_ist + (x) * 8)
-
 /**
  * idtentry_mce_db - Macro to generate entry stubs for #MC and #DB
  * @vector:		Vector number
@@ -416,10 +411,6 @@ SYM_CODE_END(\asmsym)
  * If hits in kernel mode then it needs to go through the paranoid
  * entry as the exception can hit any random state. No preemption
  * check on exit to keep the paranoid path simple.
- *
- * If the trap is #DB then the interrupt stack entry in the IST is
- * moved to the second stack, so a potential recursion will have a
- * fresh IST.
  */
 .macro idtentry_mce_db vector asmsym cfunc
 SYM_CODE_START(\asmsym)
@@ -445,16 +436,8 @@ SYM_CODE_START(\asmsym)
 
 	movq	%rsp, %rdi		/* pt_regs pointer */
 
-	.if \vector == X86_TRAP_DB
-		subq	$DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)
-	.endif
-
 	call	\cfunc
 
-	.if \vector == X86_TRAP_DB
-		addq	$DB_STACK_OFFSET, CPU_TSS_IST(IST_INDEX_DB)
-	.endif
-
 	jmp	paranoid_exit
 
 	/* Switch to the regular task stack and use the noist entry point */
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index c2a47016f243..472378330169 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -57,7 +57,6 @@ int main(void)
 	BLANK();
 #undef ENTRY
 
-	OFFSET(TSS_ist, tss_struct, x86_tss.ist);
 	DEFINE(DB_STACK_OFFSET, offsetof(struct cea_exception_stacks, DB_stack) -
 	       offsetof(struct cea_exception_stacks, DB1_stack));
 	BLANK();
-- 
2.20.1
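
For context, the IST-shifting removed above can be expressed as a
C-like sketch (illustrative only, not real kernel code; exc_debug()
stands in for \cfunc, and DB_STACK_OFFSET is the asm-offsets constant):

	/* Lower the #DB IST entry before calling the handler so a nested
	 * #DB would get a fresh stack, then restore it afterwards. */
	this_cpu_sub(cpu_tss_rw.x86_tss.ist[IST_INDEX_DB], DB_STACK_OFFSET);
	exc_debug(regs);
	this_cpu_add(cpu_tss_rw.x86_tss.ist[IST_INDEX_DB], DB_STACK_OFFSET);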



* [RFC PATCH V2 6/7] x86/entry: is_debug_stack() don't check of DB1 stack
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
                       ` (4 preceding siblings ...)
  2020-05-26  1:42     ` [RFC PATCH V2 5/7] x86/entry: don't shift stack on #DB Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  2020-05-26  1:42     ` [RFC PATCH V2 7/7] x86/entry: remove DB1 stack and DB2 hole from cpu entry area Lai Jiangshan
  6 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Alexandre Chartre, Changbin Du, Martin Molnar

The IST-shift code has been removed from the entry code, so #DB
can no longer run on the DB1 stack. Remove the check for the DB1
stack from is_debug_stack().

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/kernel/nmi.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index 1c58454ac5fb..2f463f5880c6 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -500,15 +500,10 @@ static noinstr bool is_debug_stack(unsigned long addr)
 {
 	struct cea_exception_stacks *cs = __this_cpu_read(cea_exception_stacks);
 	unsigned long top = CEA_ESTACK_TOP(cs, DB);
-	unsigned long bot = CEA_ESTACK_BOT(cs, DB1);
+	unsigned long bot = CEA_ESTACK_BOT(cs, DB);
 
 	if (__this_cpu_read(debug_stack_usage))
 		return true;
-	/*
-	 * Note, this covers the guard page between DB and DB1 as well to
-	 * avoid two checks. But by all means @addr can never point into
-	 * the guard page.
-	 */
 	return addr >= bot && addr < top;
 }
 #endif
-- 
2.20.1



* [RFC PATCH V2 7/7] x86/entry: remove DB1 stack and DB2 hole from cpu entry area
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
                       ` (5 preceding siblings ...)
  2020-05-26  1:42     ` [RFC PATCH V2 6/7] x86/entry: is_debug_stack() don't check of DB1 stack Lai Jiangshan
@ 2020-05-26  1:42     ` Lai Jiangshan
  6 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:42 UTC (permalink / raw)
  To: linux-kernel
  Cc: Lai Jiangshan, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
	x86, Ingo Molnar, Borislav Petkov, H. Peter Anvin, Dave Hansen,
	Nishad Kamdar, Brian Gerst, Juergen Gross, Andrew Cooper,
	Josh Poimboeuf, Miroslav Benes

The IST-shift code has been removed from the entry code, so #DB
always runs on the DB stack. Remove the DB1 stack and the DB2 hole.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
---
 arch/x86/include/asm/cpu_entry_area.h | 12 +++---------
 arch/x86/kernel/asm-offsets_64.c      |  4 ----
 arch/x86/kernel/dumpstack_64.c        | 10 +++-------
 arch/x86/mm/cpu_entry_area.c          |  4 +---
 4 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
index 02c0078d3787..8902fdb7de13 100644
--- a/arch/x86/include/asm/cpu_entry_area.h
+++ b/arch/x86/include/asm/cpu_entry_area.h
@@ -11,15 +11,11 @@
 #ifdef CONFIG_X86_64
 
 /* Macro to enforce the same ordering and stack sizes */
-#define ESTACKS_MEMBERS(guardsize, db2_holesize)\
+#define ESTACKS_MEMBERS(guardsize)		\
 	char	DF_stack_guard[guardsize];	\
 	char	DF_stack[EXCEPTION_STKSZ];	\
 	char	NMI_stack_guard[guardsize];	\
 	char	NMI_stack[EXCEPTION_STKSZ];	\
-	char	DB2_stack_guard[guardsize];	\
-	char	DB2_stack[db2_holesize];	\
-	char	DB1_stack_guard[guardsize];	\
-	char	DB1_stack[EXCEPTION_STKSZ];	\
 	char	DB_stack_guard[guardsize];	\
 	char	DB_stack[EXCEPTION_STKSZ];	\
 	char	MCE_stack_guard[guardsize];	\
@@ -28,12 +24,12 @@
 
 /* The exception stacks' physical storage. No guard pages required */
 struct exception_stacks {
-	ESTACKS_MEMBERS(0, 0)
+	ESTACKS_MEMBERS(0)
 };
 
 /* The effective cpu entry area mapping with guard pages. */
 struct cea_exception_stacks {
-	ESTACKS_MEMBERS(PAGE_SIZE, EXCEPTION_STKSZ)
+	ESTACKS_MEMBERS(PAGE_SIZE)
 };
 
 /*
@@ -42,8 +38,6 @@ struct cea_exception_stacks {
 enum exception_stack_ordering {
 	ESTACK_DF,
 	ESTACK_NMI,
-	ESTACK_DB2,
-	ESTACK_DB1,
 	ESTACK_DB,
 	ESTACK_MCE,
 	N_EXCEPTION_STACKS
diff --git a/arch/x86/kernel/asm-offsets_64.c b/arch/x86/kernel/asm-offsets_64.c
index 472378330169..4b4974d91d90 100644
--- a/arch/x86/kernel/asm-offsets_64.c
+++ b/arch/x86/kernel/asm-offsets_64.c
@@ -57,10 +57,6 @@ int main(void)
 	BLANK();
 #undef ENTRY
 
-	DEFINE(DB_STACK_OFFSET, offsetof(struct cea_exception_stacks, DB_stack) -
-	       offsetof(struct cea_exception_stacks, DB1_stack));
-	BLANK();
-
 #ifdef CONFIG_STACKPROTECTOR
 	DEFINE(stack_canary_offset, offsetof(struct fixed_percpu_data, stack_canary));
 	BLANK();
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index 460ae7f66818..6b7051fa3669 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -22,15 +22,13 @@
 static const char * const exception_stack_names[] = {
 		[ ESTACK_DF	]	= "#DF",
 		[ ESTACK_NMI	]	= "NMI",
-		[ ESTACK_DB2	]	= "#DB2",
-		[ ESTACK_DB1	]	= "#DB1",
 		[ ESTACK_DB	]	= "#DB",
 		[ ESTACK_MCE	]	= "#MC",
 };
 
 const char *stack_type_name(enum stack_type type)
 {
-	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
+	BUILD_BUG_ON(N_EXCEPTION_STACKS != 4);
 
 	if (type == STACK_TYPE_IRQ)
 		return "IRQ";
@@ -72,14 +70,12 @@ struct estack_pages {
 /*
  * Array of exception stack page descriptors. If the stack is larger than
  * PAGE_SIZE, all pages covering a particular stack will have the same
- * info. The guard pages including the not mapped DB2 stack are zeroed
- * out.
+ * info. The guard pages are zeroed out.
  */
 static const
 struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
 	EPAGERANGE(DF),
 	EPAGERANGE(NMI),
-	EPAGERANGE(DB1),
 	EPAGERANGE(DB),
 	EPAGERANGE(MCE),
 };
@@ -91,7 +87,7 @@ static bool in_exception_stack(unsigned long *stack, struct stack_info *info)
 	struct pt_regs *regs;
 	unsigned int k;
 
-	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
+	BUILD_BUG_ON(N_EXCEPTION_STACKS != 4);
 
 	begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
 	/*
diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
index 5199d8a1daf1..686af163be20 100644
--- a/arch/x86/mm/cpu_entry_area.c
+++ b/arch/x86/mm/cpu_entry_area.c
@@ -102,12 +102,10 @@ static void __init percpu_setup_exception_stacks(unsigned int cpu)
 
 	/*
 	 * The exceptions stack mappings in the per cpu area are protected
-	 * by guard pages so each stack must be mapped separately. DB2 is
-	 * not mapped; it just exists to catch triple nesting of #DB.
+	 * by guard pages so each stack must be mapped separately.
 	 */
 	cea_map_stack(DF);
 	cea_map_stack(NMI);
-	cea_map_stack(DB1);
 	cea_map_stack(DB);
 	cea_map_stack(MCE);
 }
-- 
2.20.1



* Re: [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data
  2020-05-25 15:25 ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Peter Zijlstra
  2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
@ 2020-05-26  1:48   ` Lai Jiangshan
  1 sibling, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  1:48 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Lai Jiangshan, LKML, Andy Lutomirski, Thomas Gleixner, x86

On Mon, May 25, 2020 at 11:27 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, May 25, 2020 at 02:50:57PM +0000, Lai Jiangshan wrote:
> > Hello
> >
> > The patchset is based on (tag: entry-v9-the-rest, tglx-devel/x86/entry).
> > It complements commit 3ea11ac991d
> > ("x86/hw_breakpoint: Prevent data breakpoints on cpu_entry_area").
> >
> > After reading the code, we can see that more data needs to be protected
> > against hw_breakpoint; otherwise a data breakpoint on it may cause a
> > dangerous, recursive, or otherwise unwanted #DB.
> >
> >
> > Lai Jiangshan (5):
> >   x86/hw_breakpoint: add within_area() to check data breakpoints
> >   x86/hw_breakpoint: Prevent data breakpoints on direct GDT
> >   x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw
>
> I think we can actually get rid of that #DB IST stack frobbing, also see
> patches linked below.

Hi, Peter

I reviewed that patchset before. It is everything I want, but it still
doesn't remove the IST-shifting. I remove that in V2.

>
> >   x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
>
> Should we disallow the full structure just to be sure?

Sure, just did it as you suggested, thanks!

>
> >   x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table
>
> That's going away, see:

Yes, so I added a note to the patch saying "Please drop this patch
when Peter's work to remove debug_idt_table is merged."

I drop the patch directly in V2.

Thank you.
Lai


>
> https://lkml.kernel.org/r/20200522204738.645043059@infradead.org
>
> But yes, nice!
>


* Re: [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-26  1:42     ` [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
@ 2020-05-26  4:17       ` Andy Lutomirski
  2020-05-26  4:31         ` Lai Jiangshan
  2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
  1 sibling, 1 reply; 25+ messages in thread
From: Andy Lutomirski @ 2020-05-26  4:17 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: LKML, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, X86 ML,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Alexandre Chartre

On Mon, May 25, 2020 at 6:42 PM Lai Jiangshan <laijs@linux.alibaba.com> wrote:
>
> The percpu user_pcid_flush_mask is used for CPU entry.
> If a data breakpoint is set on it, it will cause an unwanted #DB.
> Protect the full cpu_tlbstate structure to be sure.
>
> Some other percpu data is also used in CPU entry, but it is either
> in the already-protected cpu_tss_rw or is safe to trigger #DB on
> (espfix_waddr, espfix_stack).

How hard would it be to rework this to have DECLARE_PERCPU_NODEBUG()
and DEFINE_PERCPU_NODEBUG() or similar?
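
A rough sketch of what such an interface might look like; entirely
hypothetical (the names and the "..nodebug" section are invented here,
modeled on the existing DEFINE_PER_CPU_SECTION() machinery):

	/* Hypothetical; not an existing kernel API. */
	#define DECLARE_PERCPU_NODEBUG(type, name)			\
		DECLARE_PER_CPU_SECTION(type, name, "..nodebug")
	#define DEFINE_PERCPU_NODEBUG(type, name)			\
		DEFINE_PER_CPU_SECTION(type, name, "..nodebug")

	/* Linker-provided bounds of the new section (also hypothetical): */
	extern char __per_cpu_nodebug_start[], __per_cpu_nodebug_end[];

	/* within_cpu_entry() could then reject the whole section per CPU: */
	unsigned long start = per_cpu_offset(cpu) +
			      (unsigned long)__per_cpu_nodebug_start;

	if (within_area(addr, end, start,
			__per_cpu_nodebug_end - __per_cpu_nodebug_start))
		return true;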


* Re: [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-26  4:17       ` Andy Lutomirski
@ 2020-05-26  4:31         ` Lai Jiangshan
  2020-05-26  4:38           ` Andy Lutomirski
  0 siblings, 1 reply; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  4:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Lai Jiangshan, LKML, Peter Zijlstra, Thomas Gleixner, X86 ML,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Alexandre Chartre,
	Steven Rostedt, Masami Hiramatsu

On Tue, May 26, 2020 at 12:21 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Mon, May 25, 2020 at 6:42 PM Lai Jiangshan <laijs@linux.alibaba.com> wrote:
> >
> > > The percpu user_pcid_flush_mask is used for CPU entry.
> > > If a data breakpoint is set on it, it will cause an unwanted #DB.
> > > Protect the full cpu_tlbstate structure to be sure.
> > >
> > > Some other percpu data is also used in CPU entry, but it is either
> > > in the already-protected cpu_tss_rw or is safe to trigger #DB on
> > > (espfix_waddr, espfix_stack).
>
> How hard would it be to rework this to have DECLARE_PERCPU_NODEBUG()
> and DEFINE_PERCPU_NODEBUG() or similar?


I don't know, but it is an excellent idea. Although the patchset
protects only 2 or 3 portions of percpu data, there is a lot of
percpu data used in the tracing and kprobe code. That data needs
to be protected too.

Adds CC:
Steven Rostedt <rostedt@goodmis.org>
Masami Hiramatsu <mhiramat@kernel.org>


* Re: [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-26  4:31         ` Lai Jiangshan
@ 2020-05-26  4:38           ` Andy Lutomirski
  2020-05-26  5:48             ` Lai Jiangshan
  0 siblings, 1 reply; 25+ messages in thread
From: Andy Lutomirski @ 2020-05-26  4:38 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Andy Lutomirski, Lai Jiangshan, LKML, Peter Zijlstra,
	Thomas Gleixner, X86 ML, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Alexandre Chartre, Steven Rostedt,
	Masami Hiramatsu

On Mon, May 25, 2020 at 9:31 PM Lai Jiangshan
<jiangshanlai+lkml@gmail.com> wrote:
>
> On Tue, May 26, 2020 at 12:21 PM Andy Lutomirski <luto@kernel.org> wrote:
> >
> > On Mon, May 25, 2020 at 6:42 PM Lai Jiangshan <laijs@linux.alibaba.com> wrote:
> > >
> > > The percpu user_pcid_flush_mask is used for CPU entry.
> > > If a data breakpoint is set on it, it will cause an unwanted #DB.
> > > Protect the full cpu_tlbstate structure to be sure.
> > >
> > > Some other percpu data is also used in CPU entry, but it is either
> > > in the already-protected cpu_tss_rw or is safe to trigger #DB on
> > > (espfix_waddr, espfix_stack).
> >
> > How hard would it be to rework this to have DECLARE_PERCPU_NODEBUG()
> > and DEFINE_PERCPU_NODEBUG() or similar?
>
>
> I don't know, but it is an excellent idea. Although the patchset
> protects only 2 or 3 portions of percpu data, but there is many
> percpu data used in tracing or kprobe code. They are needed to be
> protected too.
>
> Adds CC:
> Steven Rostedt <rostedt@goodmis.org>
> Masami Hiramatsu <mhiramat@kernel.org>

PeterZ is moving things in the direction of more aggressively
disabling hardware breakpoints in the nasty paths where we won't
survive a hardware breakpoint.  Does the tracing code have portions
that won't survive a limited amount of recursion?

I'm hoping that we can keep the number of no-breakpoint-here percpu
variables low.  Maybe we could recruit objtool to help make sure we
got all of them, but that would be a much larger project.

Would we currently survive a breakpoint on the thread stack?  I don't
see any extremely obvious reason that we wouldn't.  Blocking such a
breakpoint would be annoying.


* Re: [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-26  4:38           ` Andy Lutomirski
@ 2020-05-26  5:48             ` Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: Lai Jiangshan @ 2020-05-26  5:48 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Lai Jiangshan, LKML, Peter Zijlstra, Thomas Gleixner, X86 ML,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Alexandre Chartre,
	Steven Rostedt, Masami Hiramatsu

On Tue, May 26, 2020 at 12:39 PM Andy Lutomirski <luto@kernel.org> wrote:
>
> On Mon, May 25, 2020 at 9:31 PM Lai Jiangshan
> <jiangshanlai+lkml@gmail.com> wrote:
> >
> > On Tue, May 26, 2020 at 12:21 PM Andy Lutomirski <luto@kernel.org> wrote:
> > >
> > > On Mon, May 25, 2020 at 6:42 PM Lai Jiangshan <laijs@linux.alibaba.com> wrote:
> > > >
> > > > The percpu user_pcid_flush_mask is used for CPU entry.
> > > > If a data breakpoint is set on it, it will cause an unwanted #DB.
> > > > Protect the full cpu_tlbstate structure to be sure.
> > > >
> > > > Some other percpu data is also used in CPU entry, but it is either
> > > > in the already-protected cpu_tss_rw or is safe to trigger #DB on
> > > > (espfix_waddr, espfix_stack).
> > >
> > > How hard would it be to rework this to have DECLARE_PERCPU_NODEBUG()
> > > and DEFINE_PERCPU_NODEBUG() or similar?
> >
> >
> > I don't know, but it is an excellent idea. Although the patchset
> > protects only 2 or 3 portions of percpu data, but there is many
> > percpu data used in tracing or kprobe code. They are needed to be
> > protected too.
> >
> > Adds CC:
> > Steven Rostedt <rostedt@goodmis.org>
> > Masami Hiramatsu <mhiramat@kernel.org>
>
> PeterZ is moving things in the direction of more aggressively
> disabling hardware breakpoints in the nasty paths where we won't
> survive a hardware breakpoint.  Does the tracing code have portions
> that won't survive a limited amount of recursion?

Agreed. After "aggressively disabling hardware breakpoints in the nasty
paths", only percpu data used by the entry code needs to be protected;
even the non-instrumentable percpu data used by nmi_enter() doesn't need
to be marked protected, because #DB is disabled there.

Only percpu data used by entry code in ranges where #DB is not disabled
needs to be protected, and there are only a small number of such
portions. I don't think we need DECLARE_PERCPU_NODEBUG() or the like
for merely 2 or 3 portions of data. This patchset is sufficient.
(espfix_waddr and espfix_stack are not counted here; they need review
from more people than just me.)

>
> I'm hoping that we can keep the number of no-breakpoint-here percpu
> variables low.  Maybe we could recruit objtool to help make sure we
> got all of them, but that would be a much larger project.
>
> Would we currently survive a breakpoint on the thread stack?  I don't
> see any extremely obvious reason that we wouldn't.  Blocking such a
> breakpoint would be annoying.


* Re: [RFC PATCH V2 5/7] x86/entry: don't shift stack on #DB
  2020-05-26  1:42     ` [RFC PATCH V2 5/7] x86/entry: don't shift stack on #DB Lai Jiangshan
@ 2020-05-26  9:10       ` Peter Zijlstra
  0 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2020-05-26  9:10 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, Andy Lutomirski, Thomas Gleixner, x86, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrew Cooper, Juergen Gross,
	Brian Gerst

On Tue, May 26, 2020 at 01:42:19AM +0000, Lai Jiangshan wrote:
> debug_enter() disables #DB, so there should be no recursive #DB.

should being the operative word.

I have similar patches but was still debating what to do about kgdb. I
think we'll just have to mark it BROKEN for now.


* [tip: x86/entry] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask
  2020-05-26  1:42     ` [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
  2020-05-26  4:17       ` Andy Lutomirski
@ 2020-05-30  9:57       ` tip-bot2 for Lai Jiangshan
  1 sibling, 0 replies; 25+ messages in thread
From: tip-bot2 for Lai Jiangshan @ 2020-05-30  9:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Lai Jiangshan, Peter Zijlstra (Intel), Thomas Gleixner, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     87aa9b64e0ccb779db3970e2e9af1cc5930c2c3d
Gitweb:        https://git.kernel.org/tip/87aa9b64e0ccb779db3970e2e9af1cc5930c2c3d
Author:        Lai Jiangshan <laijs@linux.alibaba.com>
AuthorDate:    Fri, 29 May 2020 23:27:32 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 30 May 2020 10:00:06 +02:00

x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask

The per-CPU user_pcid_flush_mask is used in the low level entry code. A
data breakpoint can cause #DB recursion. 

Protect the full cpu_tlbstate structure for simplicity.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200526014221.2119-5-laijs@linux.alibaba.com
Link: https://lkml.kernel.org/r/20200529213320.955117574@infradead.org

---
 arch/x86/kernel/hw_breakpoint.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index f311bbf..fc1743a 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -33,6 +33,7 @@
 #include <asm/debugreg.h>
 #include <asm/user.h>
 #include <asm/desc.h>
+#include <asm/tlbflush.h>
 
 /* Per cpu debug control register value */
 DEFINE_PER_CPU(unsigned long, cpu_dr7);
@@ -264,6 +265,16 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 				(unsigned long)&per_cpu(cpu_tss_rw, cpu),
 				sizeof(struct tss_struct)))
 			return true;
+
+		/*
+		 * cpu_tlbstate.user_pcid_flush_mask is used for CPU entry.
+		 * If a data breakpoint is set on it, it will cause an unwanted #DB.
+		 * Protect the full cpu_tlbstate structure to be sure.
+		 */
+		if (within_area(addr, end,
+				(unsigned long)&per_cpu(cpu_tlbstate, cpu),
+				sizeof(struct tlb_state)))
+			return true;
 	}
 
 	return false;


* [tip: x86/entry] x86/hw_breakpoint: Prevent data breakpoints on direct GDT
  2020-05-26  1:42     ` [RFC PATCH V2 2/7] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
@ 2020-05-30  9:57       ` tip-bot2 for Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Lai Jiangshan @ 2020-05-30  9:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Lai Jiangshan, Peter Zijlstra (Intel), Thomas Gleixner, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     92a6521bf846dd08768bc4de447b79e8bd2cdb2f
Gitweb:        https://git.kernel.org/tip/92a6521bf846dd08768bc4de447b79e8bd2cdb2f
Author:        Lai Jiangshan <laijs@linux.alibaba.com>
AuthorDate:    Fri, 29 May 2020 23:27:30 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 30 May 2020 10:00:06 +02:00

x86/hw_breakpoint: Prevent data breakpoints on direct GDT

A data breakpoint on the GDT can be fatal and must be avoided.  The GDT in
the CPU entry area is already protected, but not the direct GDT.

Add the necessary protection.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200526014221.2119-3-laijs@linux.alibaba.com
Link: https://lkml.kernel.org/r/20200529213320.840953950@infradead.org

---
 arch/x86/kernel/hw_breakpoint.c | 30 ++++++++++++++++++++++--------
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index c149c7b..f859095 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -32,6 +32,7 @@
 #include <asm/processor.h>
 #include <asm/debugreg.h>
 #include <asm/user.h>
+#include <asm/desc.h>
 
 /* Per cpu debug control register value */
 DEFINE_PER_CPU(unsigned long, cpu_dr7);
@@ -237,13 +238,26 @@ static inline bool within_area(unsigned long addr, unsigned long end,
 }
 
 /*
- * Checks whether the range from addr to end, inclusive, overlaps the CPU
- * entry area range.
+ * Checks whether the range from addr to end, inclusive, overlaps the fixed
+ * mapped CPU entry area range or other ranges used for CPU entry.
  */
-static inline bool within_cpu_entry_area(unsigned long addr, unsigned long end)
+static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 {
-	return within_area(addr, end, CPU_ENTRY_AREA_BASE,
-			   CPU_ENTRY_AREA_TOTAL_SIZE);
+	int cpu;
+
+	/* CPU entry area is always used for CPU entry */
+	if (within_area(addr, end, CPU_ENTRY_AREA_BASE,
+			CPU_ENTRY_AREA_TOTAL_SIZE))
+		return true;
+
+	for_each_possible_cpu(cpu) {
+		/* The original rw GDT is being used after load_direct_gdt() */
+		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
+				GDT_SIZE))
+			return true;
+	}
+
+	return false;
 }
 
 static int arch_build_bp_info(struct perf_event *bp,
@@ -257,12 +271,12 @@ static int arch_build_bp_info(struct perf_event *bp,
 		return -EINVAL;
 
 	/*
-	 * Prevent any breakpoint of any type that overlaps the
-	 * cpu_entry_area.  This protects the IST stacks and also
+	 * Prevent any breakpoint of any type that overlaps the CPU
+	 * entry area and data.  This protects the IST stacks and also
 	 * reduces the chance that we ever find out what happens if
 	 * there's a data breakpoint on the GDT, IDT, or TSS.
 	 */
-	if (within_cpu_entry_area(attr->bp_addr, bp_end))
+	if (within_cpu_entry(attr->bp_addr, bp_end))
 		return -EINVAL;
 
 	hw->address = attr->bp_addr;

^ permalink raw reply related	[flat|nested] 25+ messages in thread
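
The per-CPU loop in the hunk above generalizes to any per-CPU object: the
breakpoint must not overlap the copy on *any* possible CPU, because a
breakpoint can be installed on every CPU. A minimal user-space sketch of
that pattern (NR_MOCK_CPUS, mock_gdt and overlaps_any_gdt are made-up
names for this sketch; the real code iterates with for_each_possible_cpu()
over get_cpu_gdt_rw(cpu)):

#include <assert.h>
#include <stdbool.h>

#define NR_MOCK_CPUS	4
#define MOCK_GDT_SIZE	128

/* One mock GDT per CPU, standing in for get_cpu_gdt_rw(cpu). */
static char mock_gdt[NR_MOCK_CPUS][MOCK_GDT_SIZE];

static bool within_area(unsigned long addr, unsigned long end,
			unsigned long base, unsigned long size)
{
	return end >= base && addr < (base + size);
}

/* Reject if the range overlaps the GDT of any CPU. */
static bool overlaps_any_gdt(unsigned long addr, unsigned long end)
{
	for (int cpu = 0; cpu < NR_MOCK_CPUS; cpu++) {
		if (within_area(addr, end,
				(unsigned long)mock_gdt[cpu], MOCK_GDT_SIZE))
			return true;
	}
	return false;
}

int main(void)
{
	unsigned long g2 = (unsigned long)mock_gdt[2];
	unsigned long after = (unsigned long)mock_gdt + sizeof(mock_gdt);

	assert(overlaps_any_gdt(g2 + 8, g2 + 15));	/* inside CPU 2's GDT */
	assert(!overlaps_any_gdt(after, after + 7));	/* just past all GDTs */
	return 0;
}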

* [tip: x86/entry] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw
  2020-05-26  1:42     ` [RFC PATCH V2 3/7] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
@ 2020-05-30  9:57       ` tip-bot2 for Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Lai Jiangshan @ 2020-05-30  9:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Lai Jiangshan, Peter Zijlstra (Intel), Thomas Gleixner, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     9a06f99a03a0f961047bbae4ee80ce9265757bfe
Gitweb:        https://git.kernel.org/tip/9a06f99a03a0f961047bbae4ee80ce9265757bfe
Author:        Lai Jiangshan <laijs@linux.alibaba.com>
AuthorDate:    Fri, 29 May 2020 23:27:31 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 30 May 2020 10:00:06 +02:00

x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw

cpu_tss_rw is not directly referenced by hardware, but it is accessed
in CPU entry code, especially when #DB shifts its stacks.

If a data breakpoint were set on cpu_tss_rw.x86_tss.ist[IST_INDEX_DB],
it would cause a recursive #DB ending up in a double fault.

Add it to the list of protected items.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200526014221.2119-4-laijs@linux.alibaba.com
Link: https://lkml.kernel.org/r/20200529213320.897976479@infradead.org

---
 arch/x86/kernel/hw_breakpoint.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index f859095..f311bbf 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -255,6 +255,15 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
 		if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
 				GDT_SIZE))
 			return true;
+
+		/*
+		 * cpu_tss_rw is not directly referenced by hardware, but
+		 * it is also used in CPU entry code.
+		 */
+		if (within_area(addr, end,
+				(unsigned long)&per_cpu(cpu_tss_rw, cpu),
+				sizeof(struct tss_struct)))
+			return true;
 	}
 
 	return false;

^ permalink raw reply related	[flat|nested] 25+ messages in thread
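
How the rejected range is derived follows from arch_build_bp_info(): the
last byte a breakpoint covers is bp_addr + bp_len - 1, and that inclusive
range is fed to the overlap check. A user-space sketch of that step
(mock_tss and mock_validate_bp are made-up names; the kernel returns
-EINVAL and checks the real per-CPU cpu_tss_rw instances):

#include <assert.h>
#include <stdbool.h>

static bool within_area(unsigned long addr, unsigned long end,
			unsigned long base, unsigned long size)
{
	return end >= base && addr < (base + size);
}

/* Mock protected range standing in for per-CPU cpu_tss_rw. */
static char mock_tss[104];

/* Sketch of the validation step: the last byte covered by the
 * breakpoint is bp_addr + bp_len - 1, and the breakpoint is rejected
 * if that inclusive range touches the protected structure. */
static int mock_validate_bp(unsigned long bp_addr, unsigned long bp_len)
{
	unsigned long bp_end = bp_addr + bp_len - 1;

	if (within_area(bp_addr, bp_end,
			(unsigned long)mock_tss, sizeof(mock_tss)))
		return -1;	/* -EINVAL in the kernel */
	return 0;
}

int main(void)
{
	unsigned long base = (unsigned long)mock_tss;

	/* An 8-byte breakpoint whose last byte lands on the first byte
	 * of the structure still overlaps and must be refused. */
	assert(mock_validate_bp(base - 7, 8) == -1);
	assert(mock_validate_bp(base - 8, 8) == 0);
	return 0;
}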

* [tip: x86/entry] x86/hw_breakpoint: Add within_area() to check data breakpoints
  2020-05-26  1:42     ` [RFC PATCH V2 1/7] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
@ 2020-05-30  9:57       ` tip-bot2 for Lai Jiangshan
  0 siblings, 0 replies; 25+ messages in thread
From: tip-bot2 for Lai Jiangshan @ 2020-05-30  9:57 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Lai Jiangshan, Peter Zijlstra (Intel), Thomas Gleixner, x86, LKML

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     83f7a80367e99cca0023be95538635abacec07ae
Gitweb:        https://git.kernel.org/tip/83f7a80367e99cca0023be95538635abacec07ae
Author:        Lai Jiangshan <laijs@linux.alibaba.com>
AuthorDate:    Fri, 29 May 2020 23:27:29 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sat, 30 May 2020 10:00:05 +02:00

x86/hw_breakpoint: Add within_area() to check data breakpoints

Add a within_area() helper to check whether data breakpoints overlap
with cpu_entry_area.

It will be used to completely prevent data breakpoints on GDT, IDT, or TSS.

Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200526014221.2119-2-laijs@linux.alibaba.com
Link: https://lkml.kernel.org/r/20200529213320.784524504@infradead.org

---
 arch/x86/kernel/hw_breakpoint.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 9ddf441..c149c7b 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -228,13 +228,22 @@ int arch_check_bp_in_kernelspace(struct arch_hw_breakpoint *hw)
 }
 
 /*
+ * Checks whether the range [addr, end] overlaps the area [base, base + size).
+ */
+static inline bool within_area(unsigned long addr, unsigned long end,
+			       unsigned long base, unsigned long size)
+{
+	return end >= base && addr < (base + size);
+}
+
+/*
  * Checks whether the range from addr to end, inclusive, overlaps the CPU
  * entry area range.
  */
 static inline bool within_cpu_entry_area(unsigned long addr, unsigned long end)
 {
-	return end >= CPU_ENTRY_AREA_BASE &&
-	       addr < (CPU_ENTRY_AREA_BASE + CPU_ENTRY_AREA_TOTAL_SIZE);
+	return within_area(addr, end, CPU_ENTRY_AREA_BASE,
+			   CPU_ENTRY_AREA_TOTAL_SIZE);
 }
 
 static int arch_build_bp_info(struct perf_event *bp,

^ permalink raw reply related	[flat|nested] 25+ messages in thread
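
The asymmetry in within_area() is deliberate: the breakpoint range
[addr, end] is closed (end is the last byte covered) while the area
[base, base + size) is half-open. A quick user-space check of the edge
cases (a sketch with arbitrary example addresses, not kernel code):

#include <assert.h>
#include <stdbool.h>

static bool within_area(unsigned long addr, unsigned long end,
			unsigned long base, unsigned long size)
{
	return end >= base && addr < (base + size);
}

int main(void)
{
	unsigned long base = 0x1000, size = 0x100;

	/* end is inclusive: a range whose last byte is the area's
	 * first byte overlaps. */
	assert(within_area(0x0ff8, 0x1000, base, size));

	/* One byte short of the area: no overlap. */
	assert(!within_area(0x0ff8, 0x0fff, base, size));

	/* base + size is exclusive: a range starting exactly at the
	 * end of the area does not overlap. */
	assert(!within_area(0x1100, 0x1107, base, size));

	/* A range starting on the last byte of the area overlaps. */
	assert(within_area(0x10ff, 0x1107, base, size));
	return 0;
}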

end of thread, other threads:[~2020-05-30  9:57 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-25 14:50 [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
2020-05-25 14:50 ` [RFC PATCH 1/5] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
2020-05-25 14:50 ` [RFC PATCH 2/5] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
2020-05-25 14:51 ` [RFC PATCH 3/5] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
2020-05-25 14:51 ` [RFC PATCH 4/5] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
2020-05-25 14:51 ` [RFC PATCH 5/5] x86/hw_breakpoint: Prevent data breakpoints on debug_idt_table Lai Jiangshan
2020-05-25 15:25 ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Peter Zijlstra
2020-05-26  1:42   ` [RFC PATCH V2 0/7] x86/DB: protects more cpu entry data and Lai Jiangshan
2020-05-26  1:42     ` [RFC PATCH V2 1/7] x86/hw_breakpoint: add within_area() to check data breakpoints Lai Jiangshan
2020-05-30  9:57       ` [tip: x86/entry] x86/hw_breakpoint: Add " tip-bot2 for Lai Jiangshan
2020-05-26  1:42     ` [RFC PATCH V2 2/7] x86/hw_breakpoint: Prevent data breakpoints on direct GDT Lai Jiangshan
2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
2020-05-26  1:42     ` [RFC PATCH V2 3/7] x86/hw_breakpoint: Prevent data breakpoints on per_cpu cpu_tss_rw Lai Jiangshan
2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
2020-05-26  1:42     ` [RFC PATCH V2 4/7] x86/hw_breakpoint: Prevent data breakpoints on user_pcid_flush_mask Lai Jiangshan
2020-05-26  4:17       ` Andy Lutomirski
2020-05-26  4:31         ` Lai Jiangshan
2020-05-26  4:38           ` Andy Lutomirski
2020-05-26  5:48             ` Lai Jiangshan
2020-05-30  9:57       ` [tip: x86/entry] " tip-bot2 for Lai Jiangshan
2020-05-26  1:42     ` [RFC PATCH V2 5/7] x86/entry: don't shift stack on #DB Lai Jiangshan
2020-05-26  9:10       ` Peter Zijlstra
2020-05-26  1:42     ` [RFC PATCH V2 6/7] x86/entry: is_debug_stack() don't check of DB1 stack Lai Jiangshan
2020-05-26  1:42     ` [RFC PATCH V2 7/7] x86/entry: remove DB1 stack and DB2 hole from cpu entry area Lai Jiangshan
2020-05-26  1:48   ` [RFC PATCH 0/5] x86/hw_breakpoint: protects more cpu entry data Lai Jiangshan
