linux-kernel.vger.kernel.org archive mirror
* [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
@ 2020-05-11 20:41 Will Deacon
  2020-05-11 20:41 ` [PATCH v5 01/18] sparc32: mm: Fix argument checking in __srmmu_get_nocache() Will Deacon
                   ` (18 more replies)
  0 siblings, 19 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Hi folks,

(CC list trimmed since v4, as this is largely just a rebase)

This is version five of the READ_ONCE() codegen improvement series that
I've previously posted here:

RFC: https://lore.kernel.org/lkml/20200110165636.28035-1-will@kernel.org
v2:  https://lore.kernel.org/lkml/20200123153341.19947-1-will@kernel.org
v3:  https://lore.kernel.org/lkml/20200415165218.20251-1-will@kernel.org
v4:  https://lore.kernel.org/lkml/20200421151537.19241-1-will@kernel.org

The main change since v4 is that this is now based on top of the KCSAN
changes queued in -tip (locking/kcsan) and therefore contains the patches
necessary to avoid breaking sparc32 as well as some cleanups to
consolidate {READ,WRITE}_ONCE() and data_race().

Other changes include:

  * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
    __builtin_types_compatible_p()

  * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
    points at something the same size as 'unsigned long'
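
As a rough illustration of the two points above (a sketch only, not the
code from the series): the macro and function names below are made up for
this example, while __builtin_types_compatible_p() and static_assert are
the real GCC/Clang builtin and C11 facility.

```c
#include <assert.h>	/* static_assert (C11) */

/*
 * 'char', 'signed char' and 'unsigned char' are three distinct types, so
 * a type-matching macro has to list all three separately. The builtin
 * ignores top-level qualifiers such as 'const' and 'volatile'.
 */
#define is_plain_char(x) \
	__builtin_types_compatible_p(__typeof__(x), char)
#define is_signed_char(x) \
	__builtin_types_compatible_p(__typeof__(x), signed char)
#define is_unsigned_char(x) \
	__builtin_types_compatible_p(__typeof__(x), unsigned char)

/* Hypothetical helper: reject pointers to objects that are not long-sized. */
#define assert_long_sized(ptr) \
	static_assert(sizeof(*(ptr)) == sizeof(unsigned long), \
		      "object must be the same size as unsigned long")

/* Returns 1 if plain 'char' matches neither of the explicitly signed types. */
static int char_is_distinct(void)
{
	char c = 0;

	return is_plain_char(c) && !is_signed_char(c) && !is_unsigned_char(c);
}

/* Volatile read of a long-sized object, checked at compile time. */
static unsigned long read_long_sized(const unsigned long *p)
{
	assert_long_sized(p);
	return *(const volatile unsigned long *)p;
}
```

Passing a pointer to, say, a 'short' through assert_long_sized() would
fail the build rather than silently read the wrong number of bytes.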

I'm happy for all of this to go via -tip, or I can take it via arm64.

Please let me know.

Cheers,

Will

Cc: Marco Elver <elver@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de> 
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>

--->8

Will Deacon (18):
  sparc32: mm: Fix argument checking in __srmmu_get_nocache()
  sparc32: mm: Restructure sparc32 MMU page-table layout
  sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
  sparc32: mm: Reduce allocation size for PMD and PTE tables
  compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
  netfilter: Avoid assigning 'const' pointer to non-const pointer
  net: tls: Avoid assigning 'const' pointer to non-const pointer
  fault_inject: Don't rely on "return value" from WRITE_ONCE()
  arm64: csum: Disable KASAN for do_csum()
  READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
  READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
  READ_ONCE: Drop pointer qualifiers when reading from scalar types
  locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
  arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release
    macros
  gcov: Remove old GCC 3.4 support
  kcsan: Rework data_race() so that it can be used by READ_ONCE()
  READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  linux/compiler.h: Remove redundant '#else'

 Documentation/process/changes.rst   |   2 +-
 arch/arm/crypto/Kconfig             |  12 +-
 arch/arm64/include/asm/barrier.h    |  16 +-
 arch/arm64/lib/csum.c               |  20 +-
 arch/sparc/include/asm/page_32.h    |  12 +-
 arch/sparc/include/asm/pgalloc_32.h |  11 +-
 arch/sparc/include/asm/pgtable_32.h |  40 +-
 arch/sparc/include/asm/pgtsrmmu.h   |  36 +-
 arch/sparc/include/asm/viking.h     |   5 +-
 arch/sparc/kernel/head_32.S         |   8 +-
 arch/sparc/mm/hypersparc.S          |   3 +-
 arch/sparc/mm/srmmu.c               |  95 ++---
 arch/sparc/mm/viking.S              |   5 +-
 crypto/Kconfig                      |   1 -
 drivers/xen/time.c                  |   2 +-
 include/asm-generic/barrier.h       |  16 +-
 include/linux/compiler-gcc.h        |   5 +-
 include/linux/compiler.h            | 207 +++++-----
 include/linux/compiler_types.h      |  26 ++
 init/Kconfig                        |   1 -
 kernel/gcov/Kconfig                 |  24 --
 kernel/gcov/Makefile                |   3 +-
 kernel/gcov/gcc_3_4.c               | 573 ----------------------------
 lib/fault-inject.c                  |   4 +-
 net/netfilter/core.c                |   2 +-
 net/tls/tls_main.c                  |   2 +-
 scripts/gcc-plugins/Kconfig         |   2 +-
 27 files changed, 257 insertions(+), 876 deletions(-)
 delete mode 100644 kernel/gcov/gcc_3_4.c

-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 01/18] sparc32: mm: Fix argument checking in __srmmu_get_nocache()
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:37   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 02/18] sparc32: mm: Restructure sparc32 MMU page-table layout Will Deacon
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

The 'size' argument to __srmmu_get_nocache() is a number of bytes, not
a shift value, so fix up the sanity checking to treat it as such.
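
To see why the old comparison was too lenient, here is a minimal sketch.
The shift value of 8 and the function names are assumptions made for
illustration; only the shape of the two checks follows the diff below.

```c
#include <assert.h>

/* Assumed value for illustration: one bitmap bit covers 256 bytes. */
#define NOCACHE_BITMAP_SHIFT	8

/* Old checks: compare a byte count against the shift itself. */
static int old_check_passes(int size)
{
	assert(size >= 0);
	return !(size < NOCACHE_BITMAP_SHIFT) &&
	       !(size & (NOCACHE_BITMAP_SHIFT - 1));
}

/* Fixed checks: convert the shift into a byte count first. */
static int new_check_passes(int size)
{
	int minsz = 1 << NOCACHE_BITMAP_SHIFT;

	assert(size >= 0);
	return !(size < minsz) && !(size & (minsz - 1));
}
```

With these values, a 64-byte request slips through the old checks
although it is smaller than the 256-byte granule, and a 264-byte request
is wrongly treated as aligned. The fixed checks flag both.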

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/sparc/mm/srmmu.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index b7c94de70cca..cb9ded8a68b7 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -175,18 +175,18 @@ pte_t *pte_offset_kernel(pmd_t *dir, unsigned long address)
  */
 static void *__srmmu_get_nocache(int size, int align)
 {
-	int offset;
+	int offset, minsz = 1 << SRMMU_NOCACHE_BITMAP_SHIFT;
 	unsigned long addr;
 
-	if (size < SRMMU_NOCACHE_BITMAP_SHIFT) {
+	if (size < minsz) {
 		printk(KERN_ERR "Size 0x%x too small for nocache request\n",
 		       size);
-		size = SRMMU_NOCACHE_BITMAP_SHIFT;
+		size = minsz;
 	}
-	if (size & (SRMMU_NOCACHE_BITMAP_SHIFT - 1)) {
-		printk(KERN_ERR "Size 0x%x unaligned int nocache request\n",
+	if (size & (minsz - 1)) {
+		printk(KERN_ERR "Size 0x%x unaligned in nocache request\n",
 		       size);
-		size += SRMMU_NOCACHE_BITMAP_SHIFT - 1;
+		size += minsz - 1;
 	}
 	BUG_ON(align > SRMMU_NOCACHE_ALIGN_MAX);
 
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 02/18] sparc32: mm: Restructure sparc32 MMU page-table layout
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
  2020-05-11 20:41 ` [PATCH v5 01/18] sparc32: mm: Fix argument checking in __srmmu_get_nocache() Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:37   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 03/18] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page * Will Deacon
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

The "SRMMU" supports 4k pages using a fixed three-level walk with a
256-entry PGD and 64-entry PMD/PTE levels. In order to fill a page
with a 'pgtable_t', the SRMMU code allocates four native PTE tables
into a single PTE allocation and similarly for the PMD level, leading
to an array of 16 physical pointers in a 'pmd_t'.
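
The native three-level walk described above can be sketched as follows.
The shift and table-size values match the definitions this patch adds to
pgtable_32.h; PAGE_SHIFT of 12 follows from the 4k page size, and the
struct and function names are hypothetical.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

#define PAGE_SHIFT	12	/* 4k pages */
#define PMD_SHIFT	18	/* PAGE_SHIFT plus 6 bits of PTE index */
#define PGDIR_SHIFT	24	/* PMD_SHIFT plus 6 bits of PMD index */

#define PTRS_PER_PTE	64
#define PTRS_PER_PMD	64
#define PTRS_PER_PGD	256

/* The three levels together cover exactly the 32-bit address space. */
static_assert(PTRS_PER_PGD * PTRS_PER_PMD * PTRS_PER_PTE * 4096ULL ==
	      1ULL << 32, "walk must cover 4GiB");

/* Hypothetical container for the per-level indices of one address. */
struct srmmu_walk {
	unsigned int pgd;	/* index into the 256-entry PGD */
	unsigned int pmd;	/* index into a 64-entry PMD table */
	unsigned int pte;	/* index into a 64-entry PTE table */
	unsigned int offset;	/* byte offset within the 4k page */
};

static struct srmmu_walk split_vaddr(uint32_t va)
{
	struct srmmu_walk w;

	w.pgd    = (va >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
	w.pmd    = (va >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
	w.pte    = (va >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
	w.offset = va & ((1U << PAGE_SHIFT) - 1);
	return w;
}
```

For example, 0xF0123456 splits into PGD index 240, PMD index 4, PTE
index 35 and page offset 0x456.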

This breaks the generic code, which assumes READ_ONCE(*pmd) will be
word-sized.
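
A small sketch of the size problem, using uint32_t as a stand-in for the
32-bit 'unsigned long' on sparc32. The two struct layouts mirror the old
and new 'pmd_t' definitions in page_32.h; the type and function names
here are illustrative only.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Layout before this patch: 16 hardware pointers per software pmd_t. */
typedef struct { uint32_t pmdv[16]; } old_pmd_t;

/* Layout after this patch: one hardware pointer per pmd_t. */
typedef struct { uint32_t pmd; } new_pmd_t;

/* The new layout is exactly one machine word on a 32-bit target. */
static_assert(sizeof(new_pmd_t) == sizeof(uint32_t),
	      "new pmd_t must be word-sized");

static size_t old_pmd_size(void) { return sizeof(old_pmd_t); }
static size_t new_pmd_size(void) { return sizeof(new_pmd_t); }

/* A single 32-bit load can only cover an object of at most 4 bytes. */
static int fits_in_one_load(size_t sz)
{
	return sz <= sizeof(uint32_t);
}
```

The old 64-byte structure cannot be read with one load, which is what a
word-sized READ_ONCE(*pmd) relies on.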

In a manner similar to ef22d8abd876 ("m68k: mm: Restructure Motorola
MMU page-table layout"), this patch implements the native page-table
setup directly. This significantly increases the page-table memory
overhead, but that will be addressed in a subsequent patch.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/sparc/include/asm/page_32.h    | 10 ++---
 arch/sparc/include/asm/pgalloc_32.h |  5 ++-
 arch/sparc/include/asm/pgtable_32.h | 29 +++++++-------
 arch/sparc/include/asm/pgtsrmmu.h   | 36 ++---------------
 arch/sparc/include/asm/viking.h     |  5 ++-
 arch/sparc/kernel/head_32.S         |  8 ++--
 arch/sparc/mm/hypersparc.S          |  3 +-
 arch/sparc/mm/srmmu.c               | 60 ++++++++++-------------------
 arch/sparc/mm/viking.S              |  5 ++-
 9 files changed, 58 insertions(+), 103 deletions(-)

diff --git a/arch/sparc/include/asm/page_32.h b/arch/sparc/include/asm/page_32.h
index 478260002836..da01c8c45412 100644
--- a/arch/sparc/include/asm/page_32.h
+++ b/arch/sparc/include/asm/page_32.h
@@ -54,7 +54,7 @@ extern struct sparc_phys_banks sp_banks[SPARC_PHYS_BANKS+1];
  */
 typedef struct { unsigned long pte; } pte_t;
 typedef struct { unsigned long iopte; } iopte_t;
-typedef struct { unsigned long pmdv[16]; } pmd_t;
+typedef struct { unsigned long pmd; } pmd_t;
 typedef struct { unsigned long pgd; } pgd_t;
 typedef struct { unsigned long ctxd; } ctxd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;
@@ -62,7 +62,7 @@ typedef struct { unsigned long iopgprot; } iopgprot_t;
 
 #define pte_val(x)	((x).pte)
 #define iopte_val(x)	((x).iopte)
-#define pmd_val(x)      ((x).pmdv[0])
+#define pmd_val(x)      ((x).pmd)
 #define pgd_val(x)	((x).pgd)
 #define ctxd_val(x)	((x).ctxd)
 #define pgprot_val(x)	((x).pgprot)
@@ -82,7 +82,7 @@ typedef struct { unsigned long iopgprot; } iopgprot_t;
  */
 typedef unsigned long pte_t;
 typedef unsigned long iopte_t;
-typedef struct { unsigned long pmdv[16]; } pmd_t;
+typedef unsigned long pmd_t;
 typedef unsigned long pgd_t;
 typedef unsigned long ctxd_t;
 typedef unsigned long pgprot_t;
@@ -90,14 +90,14 @@ typedef unsigned long iopgprot_t;
 
 #define pte_val(x)	(x)
 #define iopte_val(x)	(x)
-#define pmd_val(x)      ((x).pmdv[0])
+#define pmd_val(x)      (x)
 #define pgd_val(x)	(x)
 #define ctxd_val(x)	(x)
 #define pgprot_val(x)	(x)
 #define iopgprot_val(x)	(x)
 
 #define __pte(x)	(x)
-#define __pmd(x)	((pmd_t) { { (x) }, })
+#define __pmd(x)	(x)
 #define __iopte(x)	(x)
 #define __pgd(x)	(x)
 #define __ctxd(x)	(x)
diff --git a/arch/sparc/include/asm/pgalloc_32.h b/arch/sparc/include/asm/pgalloc_32.h
index eae0c92ec422..99c032424946 100644
--- a/arch/sparc/include/asm/pgalloc_32.h
+++ b/arch/sparc/include/asm/pgalloc_32.h
@@ -60,13 +60,14 @@ pgtable_t pte_alloc_one(struct mm_struct *mm);
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
 {
-	return srmmu_get_nocache(PTE_SIZE, PTE_SIZE);
+	return srmmu_get_nocache(SRMMU_PTE_TABLE_SIZE,
+				 SRMMU_PTE_TABLE_SIZE);
 }
 
 
 static inline void free_pte_fast(pte_t *pte)
 {
-	srmmu_free_nocache(pte, PTE_SIZE);
+	srmmu_free_nocache(pte, SRMMU_PTE_TABLE_SIZE);
 }
 
 #define pte_free_kernel(mm, pte)	free_pte_fast(pte)
diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
index 0de659ae0ba4..3367e2ba89e0 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -11,6 +11,16 @@
 
 #include <linux/const.h>
 
+#define PMD_SHIFT		18
+#define PMD_SIZE        	(1UL << PMD_SHIFT)
+#define PMD_MASK        	(~(PMD_SIZE-1))
+#define PMD_ALIGN(__addr) 	(((__addr) + ~PMD_MASK) & PMD_MASK)
+
+#define PGDIR_SHIFT     	24
+#define PGDIR_SIZE      	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK      	(~(PGDIR_SIZE-1))
+#define PGDIR_ALIGN(__addr) 	(((__addr) + ~PGDIR_MASK) & PGDIR_MASK)
+
 #ifndef __ASSEMBLY__
 #include <asm-generic/pgtable-nopud.h>
 
@@ -34,17 +44,10 @@ unsigned long __init bootmem_init(unsigned long *pages_avail);
 #define pmd_ERROR(e)   __builtin_trap()
 #define pgd_ERROR(e)   __builtin_trap()
 
-#define PMD_SHIFT		22
-#define PMD_SIZE        	(1UL << PMD_SHIFT)
-#define PMD_MASK        	(~(PMD_SIZE-1))
-#define PMD_ALIGN(__addr) 	(((__addr) + ~PMD_MASK) & PMD_MASK)
-#define PGDIR_SHIFT     	SRMMU_PGDIR_SHIFT
-#define PGDIR_SIZE      	SRMMU_PGDIR_SIZE
-#define PGDIR_MASK      	SRMMU_PGDIR_MASK
-#define PTRS_PER_PTE    	1024
-#define PTRS_PER_PMD    	SRMMU_PTRS_PER_PMD
-#define PTRS_PER_PGD    	SRMMU_PTRS_PER_PGD
-#define USER_PTRS_PER_PGD	PAGE_OFFSET / SRMMU_PGDIR_SIZE
+#define PTRS_PER_PTE    	64
+#define PTRS_PER_PMD    	64
+#define PTRS_PER_PGD    	256
+#define USER_PTRS_PER_PGD	PAGE_OFFSET / PGDIR_SIZE
 #define FIRST_USER_ADDRESS	0UL
 #define PTE_SIZE		(PTRS_PER_PTE*4)
 
@@ -179,9 +182,7 @@ static inline int pmd_none(pmd_t pmd)
 
 static inline void pmd_clear(pmd_t *pmdp)
 {
-	int i;
-	for (i = 0; i < PTRS_PER_PTE/SRMMU_REAL_PTRS_PER_PTE; i++)
-		set_pte((pte_t *)&pmdp->pmdv[i], __pte(0));
+	set_pte((pte_t *)&pmd_val(*pmdp), __pte(0));
 }
 
 static inline int pud_none(pud_t pud)
diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h
index 32a508897501..58ea8e8c6ee7 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -17,39 +17,9 @@
 /* Number of contexts is implementation-dependent; 64k is the most we support */
 #define SRMMU_MAX_CONTEXTS	65536
 
-/* PMD_SHIFT determines the size of the area a second-level page table entry can map */
-#define SRMMU_REAL_PMD_SHIFT		18
-#define SRMMU_REAL_PMD_SIZE		(1UL << SRMMU_REAL_PMD_SHIFT)
-#define SRMMU_REAL_PMD_MASK		(~(SRMMU_REAL_PMD_SIZE-1))
-#define SRMMU_REAL_PMD_ALIGN(__addr)	(((__addr)+SRMMU_REAL_PMD_SIZE-1)&SRMMU_REAL_PMD_MASK)
-
-/* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define SRMMU_PGDIR_SHIFT       24
-#define SRMMU_PGDIR_SIZE        (1UL << SRMMU_PGDIR_SHIFT)
-#define SRMMU_PGDIR_MASK        (~(SRMMU_PGDIR_SIZE-1))
-#define SRMMU_PGDIR_ALIGN(addr) (((addr)+SRMMU_PGDIR_SIZE-1)&SRMMU_PGDIR_MASK)
-
-#define SRMMU_REAL_PTRS_PER_PTE	64
-#define SRMMU_REAL_PTRS_PER_PMD	64
-#define SRMMU_PTRS_PER_PGD	256
-
-#define SRMMU_REAL_PTE_TABLE_SIZE	(SRMMU_REAL_PTRS_PER_PTE*4)
-#define SRMMU_PMD_TABLE_SIZE		(SRMMU_REAL_PTRS_PER_PMD*4)
-#define SRMMU_PGD_TABLE_SIZE		(SRMMU_PTRS_PER_PGD*4)
-
-/*
- * To support pagetables in highmem, Linux introduces APIs which
- * return struct page* and generally manipulate page tables when
- * they are not mapped into kernel space. Our hardware page tables
- * are smaller than pages. We lump hardware tabes into big, page sized
- * software tables.
- *
- * PMD_SHIFT determines the size of the area a second-level page table entry
- * can map, and our pmd_t is 16 times larger than normal.  The values which
- * were once defined here are now generic for 4c and srmmu, so they're
- * found in pgtable.h.
- */
-#define SRMMU_PTRS_PER_PMD	4
+#define SRMMU_PTE_TABLE_SIZE		(PAGE_SIZE)
+#define SRMMU_PMD_TABLE_SIZE		(PAGE_SIZE)
+#define SRMMU_PGD_TABLE_SIZE		(PTRS_PER_PGD*4)
 
 /* Definition of the values in the ET field of PTD's and PTE's */
 #define SRMMU_ET_MASK         0x3
diff --git a/arch/sparc/include/asm/viking.h b/arch/sparc/include/asm/viking.h
index 0bbefd184221..08ffc605035f 100644
--- a/arch/sparc/include/asm/viking.h
+++ b/arch/sparc/include/asm/viking.h
@@ -10,6 +10,7 @@
 
 #include <asm/asi.h>
 #include <asm/mxcc.h>
+#include <asm/pgtable.h>
 #include <asm/pgtsrmmu.h>
 
 /* Bits in the SRMMU control register for GNU/Viking modules.
@@ -227,7 +228,7 @@ static inline unsigned long viking_hwprobe(unsigned long vaddr)
 			     : "=r" (val)
 			     : "r" (vaddr | 0x200), "i" (ASI_M_FLUSH_PROBE));
 	if ((val & SRMMU_ET_MASK) == SRMMU_ET_PTE) {
-		vaddr &= ~SRMMU_PGDIR_MASK;
+		vaddr &= ~PGDIR_MASK;
 		vaddr >>= PAGE_SHIFT;
 		return val | (vaddr << 8);
 	}
@@ -237,7 +238,7 @@ static inline unsigned long viking_hwprobe(unsigned long vaddr)
 			     : "=r" (val)
 			     : "r" (vaddr | 0x100), "i" (ASI_M_FLUSH_PROBE));
 	if ((val & SRMMU_ET_MASK) == SRMMU_ET_PTE) {
-		vaddr &= ~SRMMU_REAL_PMD_MASK;
+		vaddr &= ~PMD_MASK;
 		vaddr >>= PAGE_SHIFT;
 		return val | (vaddr << 8);
 	}
diff --git a/arch/sparc/kernel/head_32.S b/arch/sparc/kernel/head_32.S
index e55f2c075165..be30c8d4cc73 100644
--- a/arch/sparc/kernel/head_32.S
+++ b/arch/sparc/kernel/head_32.S
@@ -24,7 +24,7 @@
 #include <asm/winmacro.h>
 #include <asm/thread_info.h>	/* TI_UWINMASK */
 #include <asm/errno.h>
-#include <asm/pgtsrmmu.h>	/* SRMMU_PGDIR_SHIFT */
+#include <asm/pgtable.h>	/* PGDIR_SHIFT */
 #include <asm/export.h>
 
 	.data
@@ -273,7 +273,7 @@ not_a_sun4:
 		lda	[%o1] ASI_M_BYPASS, %o2		! This is the 0x0 16MB pgd
 
 		/* Calculate to KERNBASE entry. */
-		add	%o1, KERNBASE >> (SRMMU_PGDIR_SHIFT - 2), %o3
+		add	%o1, KERNBASE >> (PGDIR_SHIFT - 2), %o3
 
 		/* Poke the entry into the calculated address. */
 		sta	%o2, [%o3] ASI_M_BYPASS
@@ -317,7 +317,7 @@ srmmu_not_viking:
 		sll	%g1, 0x8, %g1			! make phys addr for l1 tbl
 
 		lda	[%g1] ASI_M_BYPASS, %g2		! get level1 entry for 0x0
-		add	%g1, KERNBASE >> (SRMMU_PGDIR_SHIFT - 2), %g3
+		add	%g1, KERNBASE >> (PGDIR_SHIFT - 2), %g3
 		sta	%g2, [%g3] ASI_M_BYPASS		! place at KERNBASE entry
 		b	go_to_highmem
 		 nop					! wheee....
@@ -341,7 +341,7 @@ leon_remap:
 		sll	%g1, 0x8, %g1			! make phys addr for l1 tbl
 
 		lda	[%g1] ASI_M_BYPASS, %g2		! get level1 entry for 0x0
-		add	%g1, KERNBASE >> (SRMMU_PGDIR_SHIFT - 2), %g3
+		add	%g1, KERNBASE >> (PGDIR_SHIFT - 2), %g3
 		sta	%g2, [%g3] ASI_M_BYPASS		! place at KERNBASE entry
 		b	go_to_highmem
 		 nop					! wheee....
diff --git a/arch/sparc/mm/hypersparc.S b/arch/sparc/mm/hypersparc.S
index 66885a8dc50a..6c2521e85a42 100644
--- a/arch/sparc/mm/hypersparc.S
+++ b/arch/sparc/mm/hypersparc.S
@@ -10,6 +10,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/asi.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 #include <asm/pgtsrmmu.h>
 #include <linux/init.h>
 
@@ -293,7 +294,7 @@ hypersparc_flush_tlb_range:
 	cmp	%o3, -1
 	be	hypersparc_flush_tlb_range_out
 #endif
-	 sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
+	 sethi	%hi(~((1 << PGDIR_SHIFT) - 1)), %o4
 	sta	%o3, [%g1] ASI_M_MMUREGS
 	and	%o1, %o4, %o1
 	add	%o1, 0x200, %o1
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index cb9ded8a68b7..50da4bcdd6fa 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -136,26 +136,14 @@ static void msi_set_sync(void)
 
 void pmd_set(pmd_t *pmdp, pte_t *ptep)
 {
-	unsigned long ptp;	/* Physical address, shifted right by 4 */
-	int i;
-
-	ptp = __nocache_pa(ptep) >> 4;
-	for (i = 0; i < PTRS_PER_PTE/SRMMU_REAL_PTRS_PER_PTE; i++) {
-		set_pte((pte_t *)&pmdp->pmdv[i], __pte(SRMMU_ET_PTD | ptp));
-		ptp += (SRMMU_REAL_PTRS_PER_PTE * sizeof(pte_t) >> 4);
-	}
+	unsigned long ptp = __nocache_pa(ptep) >> 4;
+	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
 }
 
 void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, struct page *ptep)
 {
-	unsigned long ptp;	/* Physical address, shifted right by 4 */
-	int i;
-
-	ptp = page_to_pfn(ptep) << (PAGE_SHIFT-4);	/* watch for overflow */
-	for (i = 0; i < PTRS_PER_PTE/SRMMU_REAL_PTRS_PER_PTE; i++) {
-		set_pte((pte_t *)&pmdp->pmdv[i], __pte(SRMMU_ET_PTD | ptp));
-		ptp += (SRMMU_REAL_PTRS_PER_PTE * sizeof(pte_t) >> 4);
-	}
+	unsigned long ptp = page_to_pfn(ptep) << (PAGE_SHIFT-4); /* watch for overflow */
+	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
 }
 
 /* Find an entry in the third-level page table.. */
@@ -163,7 +151,7 @@ pte_t *pte_offset_kernel(pmd_t *dir, unsigned long address)
 {
 	void *pte;
 
-	pte = __nocache_va((dir->pmdv[0] & SRMMU_PTD_PMASK) << 4);
+	pte = __nocache_va((pmd_val(*dir) & SRMMU_PTD_PMASK) << 4);
 	return (pte_t *) pte +
 	    ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
 }
@@ -400,7 +388,7 @@ void pte_free(struct mm_struct *mm, pgtable_t pte)
 	p = page_to_pfn(pte) << PAGE_SHIFT;	/* Physical address */
 
 	/* free non cached virtual address*/
-	srmmu_free_nocache(__nocache_va(p), PTE_SIZE);
+	srmmu_free_nocache(__nocache_va(p), SRMMU_PTE_TABLE_SIZE);
 }
 
 /* context handling - a dynamically sized pool is used */
@@ -822,13 +810,13 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 		what = 0;
 		addr = start - PAGE_SIZE;
 
-		if (!(start & ~(SRMMU_REAL_PMD_MASK))) {
-			if (srmmu_probe(addr + SRMMU_REAL_PMD_SIZE) == probed)
+		if (!(start & ~(PMD_MASK))) {
+			if (srmmu_probe(addr + PMD_SIZE) == probed)
 				what = 1;
 		}
 
-		if (!(start & ~(SRMMU_PGDIR_MASK))) {
-			if (srmmu_probe(addr + SRMMU_PGDIR_SIZE) == probed)
+		if (!(start & ~(PGDIR_MASK))) {
+			if (srmmu_probe(addr + PGDIR_SIZE) == probed)
 				what = 2;
 		}
 
@@ -837,7 +825,7 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 		pudp = pud_offset(p4dp, start);
 		if (what == 2) {
 			*(pgd_t *)__nocache_fix(pgdp) = __pgd(probed);
-			start += SRMMU_PGDIR_SIZE;
+			start += PGDIR_SIZE;
 			continue;
 		}
 		if (pud_none(*(pud_t *)__nocache_fix(pudp))) {
@@ -849,6 +837,11 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 			pud_set(__nocache_fix(pudp), pmdp);
 		}
 		pmdp = pmd_offset(__nocache_fix(pgdp), start);
+		if (what == 1) {
+			*(pmd_t *)__nocache_fix(pmdp) = __pmd(probed);
+			start += PMD_SIZE;
+			continue;
+		}
 		if (srmmu_pmd_none(*(pmd_t *)__nocache_fix(pmdp))) {
 			ptep = __srmmu_get_nocache(PTE_SIZE, PTE_SIZE);
 			if (ptep == NULL)
@@ -856,19 +849,6 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 			memset(__nocache_fix(ptep), 0, PTE_SIZE);
 			pmd_set(__nocache_fix(pmdp), ptep);
 		}
-		if (what == 1) {
-			/* We bend the rule where all 16 PTPs in a pmd_t point
-			 * inside the same PTE page, and we leak a perfectly
-			 * good hardware PTE piece. Alternatives seem worse.
-			 */
-			unsigned int x;	/* Index of HW PMD in soft cluster */
-			unsigned long *val;
-			x = (start >> PMD_SHIFT) & 15;
-			val = &pmdp->pmdv[x];
-			*(unsigned long *)__nocache_fix(val) = probed;
-			start += SRMMU_REAL_PMD_SIZE;
-			continue;
-		}
 		ptep = pte_offset_kernel(__nocache_fix(pmdp), start);
 		*(pte_t *)__nocache_fix(ptep) = __pte(probed);
 		start += PAGE_SIZE;
@@ -890,9 +870,9 @@ static void __init do_large_mapping(unsigned long vaddr, unsigned long phys_base
 /* Map sp_bank entry SP_ENTRY, starting at virtual address VBASE. */
 static unsigned long __init map_spbank(unsigned long vbase, int sp_entry)
 {
-	unsigned long pstart = (sp_banks[sp_entry].base_addr & SRMMU_PGDIR_MASK);
-	unsigned long vstart = (vbase & SRMMU_PGDIR_MASK);
-	unsigned long vend = SRMMU_PGDIR_ALIGN(vbase + sp_banks[sp_entry].num_bytes);
+	unsigned long pstart = (sp_banks[sp_entry].base_addr & PGDIR_MASK);
+	unsigned long vstart = (vbase & PGDIR_MASK);
+	unsigned long vend = PGDIR_ALIGN(vbase + sp_banks[sp_entry].num_bytes);
 	/* Map "low" memory only */
 	const unsigned long min_vaddr = PAGE_OFFSET;
 	const unsigned long max_vaddr = PAGE_OFFSET + SRMMU_MAXMEM;
@@ -905,7 +885,7 @@ static unsigned long __init map_spbank(unsigned long vbase, int sp_entry)
 
 	while (vstart < vend) {
 		do_large_mapping(vstart, pstart);
-		vstart += SRMMU_PGDIR_SIZE; pstart += SRMMU_PGDIR_SIZE;
+		vstart += PGDIR_SIZE; pstart += PGDIR_SIZE;
 	}
 	return vstart;
 }
diff --git a/arch/sparc/mm/viking.S b/arch/sparc/mm/viking.S
index adaef6e7b8cf..48f062de7a7f 100644
--- a/arch/sparc/mm/viking.S
+++ b/arch/sparc/mm/viking.S
@@ -13,6 +13,7 @@
 #include <asm/asi.h>
 #include <asm/mxcc.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 #include <asm/pgtsrmmu.h>
 #include <asm/viking.h>
 
@@ -157,7 +158,7 @@ viking_flush_tlb_range:
 	cmp	%o3, -1
 	be	2f
 #endif
-	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
+	sethi	%hi(~((1 << PGDIR_SHIFT) - 1)), %o4
 	sta	%o3, [%g1] ASI_M_MMUREGS
 	and	%o1, %o4, %o1
 	add	%o1, 0x200, %o1
@@ -243,7 +244,7 @@ sun4dsmp_flush_tlb_range:
 	ld	[%o0 + VMA_VM_MM], %o0
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
-	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
+	sethi	%hi(~((1 << PGDIR_SHIFT) - 1)), %o4
 	sta	%o3, [%g1] ASI_M_MMUREGS
 	and	%o1, %o4, %o1
 	add	%o1, 0x200, %o1
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 03/18] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
  2020-05-11 20:41 ` [PATCH v5 01/18] sparc32: mm: Fix argument checking in __srmmu_get_nocache() Will Deacon
  2020-05-11 20:41 ` [PATCH v5 02/18] sparc32: mm: Restructure sparc32 MMU page-table layout Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables Will Deacon
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Change the 'pgtable_t' type for sparc32 so that it represents the uncached
virtual address of the PTE table, rather than the underlying 'struct page'.

This allows us to free page table allocations smaller than a page.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/sparc/include/asm/page_32.h    |  2 +-
 arch/sparc/include/asm/pgalloc_32.h |  6 +++---
 arch/sparc/include/asm/pgtable_32.h | 11 +++++++++++
 arch/sparc/mm/srmmu.c               | 29 +++++++++--------------------
 4 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/sparc/include/asm/page_32.h b/arch/sparc/include/asm/page_32.h
index da01c8c45412..fff8861df107 100644
--- a/arch/sparc/include/asm/page_32.h
+++ b/arch/sparc/include/asm/page_32.h
@@ -106,7 +106,7 @@ typedef unsigned long iopgprot_t;
 
 #endif
 
-typedef struct page *pgtable_t;
+typedef pte_t *pgtable_t;
 
 #define TASK_UNMAPPED_BASE	0x50000000
 
diff --git a/arch/sparc/include/asm/pgalloc_32.h b/arch/sparc/include/asm/pgalloc_32.h
index 99c032424946..b772384871e9 100644
--- a/arch/sparc/include/asm/pgalloc_32.h
+++ b/arch/sparc/include/asm/pgalloc_32.h
@@ -50,11 +50,11 @@ static inline void free_pmd_fast(pmd_t * pmd)
 #define pmd_free(mm, pmd)		free_pmd_fast(pmd)
 #define __pmd_free_tlb(tlb, pmd, addr)	pmd_free((tlb)->mm, pmd)
 
-void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, struct page *ptep);
-#define pmd_pgtable(pmd) pmd_page(pmd)
+#define pmd_populate(mm, pmd, pte)	pmd_set(pmd, pte)
+#define pmd_pgtable(pmd)		(pgtable_t)__pmd_page(pmd)
 
 void pmd_set(pmd_t *pmdp, pte_t *ptep);
-#define pmd_populate_kernel(MM, PMD, PTE) pmd_set(PMD, PTE)
+#define pmd_populate_kernel		pmd_populate
 
 pgtable_t pte_alloc_one(struct mm_struct *mm);
 
diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
index 3367e2ba89e0..c5625b2aa331 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -135,6 +135,17 @@ static inline struct page *pmd_page(pmd_t pmd)
 	return pfn_to_page((pmd_val(pmd) & SRMMU_PTD_PMASK) >> (PAGE_SHIFT-4));
 }
 
+static inline unsigned long __pmd_page(pmd_t pmd)
+{
+	unsigned long v;
+
+	if (srmmu_device_memory(pmd_val(pmd)))
+		BUG();
+
+	v = pmd_val(pmd) & SRMMU_PTD_PMASK;
+	return (unsigned long)__nocache_va(v << 4);
+}
+
 static inline unsigned long pud_page_vaddr(pud_t pud)
 {
 	if (srmmu_device_memory(pud_val(pud))) {
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index 50da4bcdd6fa..c861c0f0df73 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -140,12 +140,6 @@ void pmd_set(pmd_t *pmdp, pte_t *ptep)
 	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
 }
 
-void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, struct page *ptep)
-{
-	unsigned long ptp = page_to_pfn(ptep) << (PAGE_SHIFT-4); /* watch for overflow */
-	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
-}
-
 /* Find an entry in the third-level page table.. */
 pte_t *pte_offset_kernel(pmd_t *dir, unsigned long address)
 {
@@ -364,31 +358,26 @@ pgd_t *get_pgd_fast(void)
  */
 pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
-	unsigned long pte;
+	pte_t *ptep;
 	struct page *page;
 
-	if ((pte = (unsigned long)pte_alloc_one_kernel(mm)) == 0)
+	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
 		return NULL;
-	page = pfn_to_page(__nocache_pa(pte) >> PAGE_SHIFT);
+	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
 	if (!pgtable_pte_page_ctor(page)) {
 		__free_page(page);
 		return NULL;
 	}
-	return page;
+	return ptep;
 }
 
-void pte_free(struct mm_struct *mm, pgtable_t pte)
+void pte_free(struct mm_struct *mm, pgtable_t ptep)
 {
-	unsigned long p;
-
-	pgtable_pte_page_dtor(pte);
-	p = (unsigned long)page_address(pte);	/* Cached address (for test) */
-	if (p == 0)
-		BUG();
-	p = page_to_pfn(pte) << PAGE_SHIFT;	/* Physical address */
+	struct page *page;
 
-	/* free non cached virtual address*/
-	srmmu_free_nocache(__nocache_va(p), SRMMU_PTE_TABLE_SIZE);
+	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
+	pgtable_pte_page_dtor(page);
+	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
 }
 
 /* context handling - a dynamically sized pool is used */
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (2 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 03/18] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page * Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-17  0:00   ` [PATCH v5 04/18] " Guenter Roeck
  2020-05-11 20:41 ` [PATCH v5 05/18] compiler/gcc: Raise minimum GCC version for kernel builds to 4.8 Will Deacon
                   ` (14 subsequent siblings)
  18 siblings, 2 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Now that the page table allocator can free page table allocations
smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
to avoid needlessly wasting memory.
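
The saving can be worked out from the constants in the diff below. This
is only an arithmetic sketch: it assumes 4-byte table entries and a 4k
PAGE_SIZE, and the macro and function names are made up for the example.

```c
#include <assert.h>	/* static_assert (C11) */

#define PAGE_SIZE	4096UL
#define PTRS_PER_PTE	64
#define PTRS_PER_PMD	64
#define PTRS_PER_PGD	256
#define ENTRY_BYTES	4UL	/* each table entry is one 32-bit word */

#define PTE_TABLE_BYTES	(PTRS_PER_PTE * ENTRY_BYTES)
#define PMD_TABLE_BYTES	(PTRS_PER_PMD * ENTRY_BYTES)
#define PGD_TABLE_BYTES	(PTRS_PER_PGD * ENTRY_BYTES)

/* Tables must pack evenly into a page for the ratio below to be exact. */
static_assert(PAGE_SIZE % PTE_TABLE_BYTES == 0, "PTE tables tile a page");

/* How many tables of the given size fit where one page was used before. */
static unsigned long tables_per_page(unsigned long table_bytes)
{
	return PAGE_SIZE / table_bytes;
}
```

Under these assumptions a PTE or PMD table needs 256 bytes rather than a
full page, so sixteen of them fit in the space one occupied before.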

Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/sparc/include/asm/pgtsrmmu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h
index 58ea8e8c6ee7..7708d015712b 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -17,8 +17,8 @@
 /* Number of contexts is implementation-dependent; 64k is the most we support */
 #define SRMMU_MAX_CONTEXTS	65536
 
-#define SRMMU_PTE_TABLE_SIZE		(PAGE_SIZE)
-#define SRMMU_PMD_TABLE_SIZE		(PAGE_SIZE)
+#define SRMMU_PTE_TABLE_SIZE		(PTRS_PER_PTE*4)
+#define SRMMU_PMD_TABLE_SIZE		(PTRS_PER_PMD*4)
 #define SRMMU_PGD_TABLE_SIZE		(PTRS_PER_PGD*4)
 
 /* Definition of the values in the ET field of PTD's and PTE's */
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 05/18] compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (3 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 06/18] netfilter: Avoid assigning 'const' pointer to non-const pointer Will Deacon
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

It is very rare to see versions of GCC prior to 4.8 being used to build
the mainline kernel. These old compilers are also known to have codegen
issues which can lead to silent miscompilation:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

Raise the minimum GCC version to 4.8 for building the kernel and remove
some tautological Kconfig dependencies as a consequence.
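
As a rough illustration of why those Kconfig checks become tautological,
here is a minimal userspace sketch of the version arithmetic. The
DEMO_* names and demo_gcc_is_supported() are hypothetical and exist only
for this example; the encoding follows the GCC_VERSION definition in
compiler-gcc.h (major * 10000 + minor * 100 + patchlevel).

```c
#include <assert.h>

/* Hypothetical helper mirroring how GCC_VERSION is composed. */
#define DEMO_GCC_VERSION(major, minor, patch)				\
	((major) * 10000 + (minor) * 100 + (patch))

/* The new floor introduced by this patch: 4.8.0 encodes to 40800. */
#define DEMO_MIN_GCC_VERSION	DEMO_GCC_VERSION(4, 8, 0)

/* Returns 1 if the given compiler version satisfies the new minimum. */
static int demo_gcc_is_supported(int major, int minor, int patch)
{
	return DEMO_GCC_VERSION(major, minor, patch) >= DEMO_MIN_GCC_VERSION;
}
```

Once every GCC that gets past the '#error' satisfies GCC_VERSION >= 40800,
a condition such as 'CC_IS_CLANG || GCC_VERSION >= 40800' is always true
and can simply be dropped.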

Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 Documentation/process/changes.rst |  2 +-
 arch/arm/crypto/Kconfig           | 12 ++++++------
 crypto/Kconfig                    |  1 -
 include/linux/compiler-gcc.h      |  5 ++---
 init/Kconfig                      |  1 -
 scripts/gcc-plugins/Kconfig       |  2 +-
 6 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
index 91c5ff8e161e..5cfb54c2aaa6 100644
--- a/Documentation/process/changes.rst
+++ b/Documentation/process/changes.rst
@@ -29,7 +29,7 @@ you probably needn't concern yourself with pcmciautils.
 ====================== ===============  ========================================
         Program        Minimal version       Command to check the version
 ====================== ===============  ========================================
-GNU C                  4.6              gcc --version
+GNU C                  4.8              gcc --version
 GNU make               3.81             make --version
 binutils               2.23             ld -v
 flex                   2.5.35           flex --version
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 2674de6ada1f..c9bf2df85cb9 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -30,7 +30,7 @@ config CRYPTO_SHA1_ARM_NEON
 
 config CRYPTO_SHA1_ARM_CE
 	tristate "SHA1 digest algorithm (ARM v8 Crypto Extensions)"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_SHA1_ARM
 	select CRYPTO_HASH
 	help
@@ -39,7 +39,7 @@ config CRYPTO_SHA1_ARM_CE
 
 config CRYPTO_SHA2_ARM_CE
 	tristate "SHA-224/256 digest algorithm (ARM v8 Crypto Extensions)"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_SHA256_ARM
 	select CRYPTO_HASH
 	help
@@ -96,7 +96,7 @@ config CRYPTO_AES_ARM_BS
 
 config CRYPTO_AES_ARM_CE
 	tristate "Accelerated AES using ARMv8 Crypto Extensions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_AES
 	select CRYPTO_SIMD
@@ -106,7 +106,7 @@ config CRYPTO_AES_ARM_CE
 
 config CRYPTO_GHASH_ARM_CE
 	tristate "PMULL-accelerated GHASH using NEON/ARMv8 Crypto Extensions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_HASH
 	select CRYPTO_CRYPTD
 	select CRYPTO_GF128MUL
@@ -118,13 +118,13 @@ config CRYPTO_GHASH_ARM_CE
 
 config CRYPTO_CRCT10DIF_ARM_CE
 	tristate "CRCT10DIF digest algorithm using PMULL instructions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	depends on CRC_T10DIF
 	select CRYPTO_HASH
 
 config CRYPTO_CRC32_ARM_CE
 	tristate "CRC32(C) digest algorithm using CRC and/or PMULL instructions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	depends on CRC32
 	select CRYPTO_HASH
 
diff --git a/crypto/Kconfig b/crypto/Kconfig
index c24a47406f8f..34a8c5bfd062 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -316,7 +316,6 @@ config CRYPTO_AEGIS128
 config CRYPTO_AEGIS128_SIMD
 	bool "Support SIMD acceleration for AEGIS-128"
 	depends on CRYPTO_AEGIS128 && ((ARM || ARM64) && KERNEL_MODE_NEON)
-	depends on !ARM || CC_IS_CLANG || GCC_VERSION >= 40800
 	default y
 
 config CRYPTO_AEGIS128_AESNI_SSE2
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index cf294faec2f8..7dd4e0349ef3 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -10,7 +10,8 @@
 		     + __GNUC_MINOR__ * 100	\
 		     + __GNUC_PATCHLEVEL__)
 
-#if GCC_VERSION < 40600
+/* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 */
+#if GCC_VERSION < 40800
 # error Sorry, your compiler is too old - please upgrade it.
 #endif
 
@@ -126,9 +127,7 @@
 #if defined(CONFIG_ARCH_USE_BUILTIN_BSWAP) && !defined(__CHECKER__)
 #define __HAVE_BUILTIN_BSWAP32__
 #define __HAVE_BUILTIN_BSWAP64__
-#if GCC_VERSION >= 40800
 #define __HAVE_BUILTIN_BSWAP16__
-#endif
 #endif /* CONFIG_ARCH_USE_BUILTIN_BSWAP && !__CHECKER__ */
 
 #if GCC_VERSION >= 70000
diff --git a/init/Kconfig b/init/Kconfig
index 9e22ee8fbd75..035d38a4f9ad 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1285,7 +1285,6 @@ config LD_DEAD_CODE_DATA_ELIMINATION
 	bool "Dead code and data elimination (EXPERIMENTAL)"
 	depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
 	depends on EXPERT
-	depends on !(FUNCTION_TRACER && CC_IS_GCC && GCC_VERSION < 40800)
 	depends on $(cc-option,-ffunction-sections -fdata-sections)
 	depends on $(ld-option,--gc-sections)
 	help
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
index 013ba3a57669..ce0b99fb5847 100644
--- a/scripts/gcc-plugins/Kconfig
+++ b/scripts/gcc-plugins/Kconfig
@@ -8,7 +8,7 @@ config HAVE_GCC_PLUGINS
 menuconfig GCC_PLUGINS
 	bool "GCC plugins"
 	depends on HAVE_GCC_PLUGINS
-	depends on CC_IS_GCC && GCC_VERSION >= 40800
+	depends on CC_IS_GCC
 	depends on $(success,$(srctree)/scripts/gcc-plugin.sh $(CC))
 	default y
 	help
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 06/18] netfilter: Avoid assigning 'const' pointer to non-const pointer
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (4 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 05/18] compiler/gcc: Raise minimum GCC version for kernel builds to 4.8 Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 07/18] net: tls: " Will Deacon
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

  | In file included from ./include/linux/export.h:43,
  |                  from ./include/linux/linkage.h:7,
  |                  from ./include/linux/kernel.h:8,
  |                  from net/netfilter/core.c:9:
  | net/netfilter/core.c: In function ‘nf_remove_net_hook’:
  | ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  |   *(volatile typeof(x) *)&(x) = (val);  \
  |                               ^
  | net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
  |    WRITE_ONCE(orig_ops[i], &dummy_ops);
  |    ^~~~~~~~~~

Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.
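
A minimal userspace sketch of the situation, assuming a stripped-down
store macro rather than the real WRITE_ONCE(). 'struct demo_ops',
DEMO_WRITE_ONCE() and demo_unhook() are made-up names for illustration:

```c
#include <assert.h>

struct demo_ops { int priority; };

/* Simplified stand-in for the volatile store at the heart of WRITE_ONCE(). */
#define DEMO_WRITE_ONCE(x, val)						\
do {									\
	*(volatile __typeof__(x) *)&(x) = (val);			\
} while (0)

static const struct demo_ops demo_dummy_ops = { .priority = 0 };

/* Replace slot 'i' with the const dummy entry. */
static void demo_unhook(struct demo_ops **slots, int i)
{
	/*
	 * Without the cast this is a plain assignment of a
	 * 'const struct demo_ops *' to a 'struct demo_ops *', which
	 * triggers -Wdiscarded-qualifiers.
	 */
	DEMO_WRITE_ONCE(slots[i], (void *)&demo_dummy_ops);
}
```

The cast only silences the diagnostic; callers still must not write
through the resulting pointer, since the dummy object remains 'const'.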

Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 net/netfilter/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 78f046ec506f..3ac7c8c1548d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -376,7 +376,7 @@ static bool nf_remove_net_hook(struct nf_hook_entries *old,
 		if (orig_ops[i] != unreg)
 			continue;
 		WRITE_ONCE(old->hooks[i].hook, accept_all);
-		WRITE_ONCE(orig_ops[i], &dummy_ops);
+		WRITE_ONCE(orig_ops[i], (void *)&dummy_ops);
 		return true;
 	}
 
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 07/18] net: tls: Avoid assigning 'const' pointer to non-const pointer
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (5 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 06/18] netfilter: Avoid assigning 'const' pointer to non-const pointer Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 08/18] fault_inject: Don't rely on "return value" from WRITE_ONCE() Will Deacon
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

tls_build_proto() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

  | net/tls/tls_main.c: In function ‘tls_build_proto’:
  | ./include/linux/compiler.h:229:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  | net/tls/tls_main.c:640:4: note: in expansion of macro ‘smp_store_release’
  |   640 |    smp_store_release(&saved_tcpv6_prot, prot);
  |       |    ^~~~~~~~~~~~~~~~~

Drop the const qualifier from the local 'prot' variable, as it isn't
needed.

Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Will Deacon <will@kernel.org>
---
 net/tls/tls_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 156efce50dbd..b33e11c27cfa 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -629,7 +629,7 @@ struct tls_context *tls_ctx_create(struct sock *sk)
 static void tls_build_proto(struct sock *sk)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
-	const struct proto *prot = READ_ONCE(sk->sk_prot);
+	struct proto *prot = READ_ONCE(sk->sk_prot);
 
 	/* Build IPv6 TLS whenever the address of tcpv6 _prot changes */
 	if (ip_ver == TLSV6 &&
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 08/18] fault_inject: Don't rely on "return value" from WRITE_ONCE()
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (6 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 07/18] net: tls: " Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 09/18] arm64: csum: Disable KASAN for do_csum() Will Deacon
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

It's a bit weird that WRITE_ONCE() evaluates to the value it stores and
it's also different to smp_store_release(), which can't be used this
way.

In preparation for preventing this in WRITE_ONCE(), change the fault
injection code to use a local variable instead.
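
For illustration, a userspace sketch of the resulting pattern. The FI_*
macros and demo_should_fail() are hypothetical simplifications (no KCSAN
hooks, and the rest of should_fail() is omitted):

```c
#include <assert.h>

#define FI_READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))

/* A statement, not an expression: 'if (!FI_WRITE_ONCE(...))' won't compile. */
#define FI_WRITE_ONCE(x, val)						\
do {									\
	*(volatile __typeof__(x) *)&(x) = (val);			\
} while (0)

/*
 * Simplified stand-in for the 'fail_nth' countdown: returns 1 when the
 * counter reaches zero on this call, 0 otherwise.
 */
static int demo_should_fail(unsigned int *counter)
{
	unsigned int fail_nth = FI_READ_ONCE(*counter);

	if (fail_nth) {
		/* Decrement locally, store once, then test the local copy. */
		fail_nth--;
		FI_WRITE_ONCE(*counter, fail_nth);
		if (!fail_nth)
			return 1;
	}
	return 0;
}
```

Testing the local copy also avoids re-reading the shared location after
the store.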

Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 lib/fault-inject.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 8186ca84910b..ce12621b4275 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -106,7 +106,9 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
 		unsigned int fail_nth = READ_ONCE(current->fail_nth);
 
 		if (fail_nth) {
-			if (!WRITE_ONCE(current->fail_nth, fail_nth - 1))
+			fail_nth--;
+			WRITE_ONCE(current->fail_nth, fail_nth);
+			if (!fail_nth)
 				goto fail;
 
 			return false;
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 09/18] arm64: csum: Disable KASAN for do_csum()
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (7 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 08/18] fault_inject: Don't rely on "return value" from WRITE_ONCE() Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 10/18] READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE() Will Deacon
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

do_csum() over-reads the source buffer and therefore abuses
READ_ONCE_NOCHECK() on a 128-bit type to avoid tripping up KASAN. In
preparation for READ_ONCE_NOCHECK() requiring an atomic access, and
therefore failing to build when fed a '__uint128_t', annotate do_csum()
explicitly with '__no_sanitize_address' and fall back to normal loads.
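
The general shape, as a hedged userspace sketch: the function is
annotated so the sanitizer does not instrument it, and the body uses an
ordinary load followed by masking, much like the first-word handling in
do_csum(). The 'demo_' names are invented for this example, the attribute
spelling is the GCC/Clang one, and the big-endian branch is an assumption
since only the little-endian line is visible in the hunk below. Unlike
do_csum(), the load here stays inside the object so the example is
well-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: compiler-level spelling of the kernel's __no_sanitize_address. */
#define demo_no_sanitize_address __attribute__((no_sanitize_address))

union demo_word {
	_Alignas(8) uint64_t word;
	unsigned char bytes[8];
};

/*
 * Load the aligned 64-bit word containing 'buff' with a plain load and
 * zero the bytes that precede 'buff' in memory. 'buff' must point into a
 * union demo_word so that the load stays in bounds.
 */
static demo_no_sanitize_address uint64_t demo_load_head(const unsigned char *buff)
{
	unsigned int offset = (uintptr_t)buff & 7;
	const uint64_t *ptr = (const uint64_t *)(buff - offset);
	unsigned int shift = offset * 8;
	uint64_t data = *ptr;

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	data = (data >> shift) << shift;	/* leading bytes are the low bits */
#else
	data = (data << shift) >> shift;	/* leading bytes are the high bits */
#endif
	return data;
}
```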

Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/lib/csum.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/lib/csum.c b/arch/arm64/lib/csum.c
index 60eccae2abad..78b87a64ca0a 100644
--- a/arch/arm64/lib/csum.c
+++ b/arch/arm64/lib/csum.c
@@ -14,7 +14,11 @@ static u64 accumulate(u64 sum, u64 data)
 	return tmp + (tmp >> 64);
 }
 
-unsigned int do_csum(const unsigned char *buff, int len)
+/*
+ * We over-read the buffer and this makes KASAN unhappy. Instead, disable
+ * instrumentation and call kasan explicitly.
+ */
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
 {
 	unsigned int offset, shift, sum;
 	const u64 *ptr;
@@ -42,7 +46,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
 	 * odd/even alignment, and means we can ignore it until the very end.
 	 */
 	shift = offset * 8;
-	data = READ_ONCE_NOCHECK(*ptr++);
+	data = *ptr++;
 #ifdef __LITTLE_ENDIAN
 	data = (data >> shift) << shift;
 #else
@@ -58,10 +62,10 @@ unsigned int do_csum(const unsigned char *buff, int len)
 	while (unlikely(len > 64)) {
 		__uint128_t tmp1, tmp2, tmp3, tmp4;
 
-		tmp1 = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
-		tmp2 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 2));
-		tmp3 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 4));
-		tmp4 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 6));
+		tmp1 = *(__uint128_t *)ptr;
+		tmp2 = *(__uint128_t *)(ptr + 2);
+		tmp3 = *(__uint128_t *)(ptr + 4);
+		tmp4 = *(__uint128_t *)(ptr + 6);
 
 		len -= 64;
 		ptr += 8;
@@ -85,7 +89,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
 		__uint128_t tmp;
 
 		sum64 = accumulate(sum64, data);
-		tmp = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
+		tmp = *(__uint128_t *)ptr;
 
 		len -= 16;
 		ptr += 2;
@@ -100,7 +104,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
 	}
 	if (len > 0) {
 		sum64 = accumulate(sum64, data);
-		data = READ_ONCE_NOCHECK(*ptr);
+		data = *ptr;
 		len -= 8;
 	}
 	/*
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 10/18] READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (8 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 09/18] arm64: csum: Disable KASAN for do_csum() Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 11/18] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses Will Deacon
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

The implementations of {READ,WRITE}_ONCE() suffer from a significant
amount of indirection and complexity due to a historic GCC bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

which was originally worked around by 230fa253df63 ("kernel: Provide
READ_ONCE and ASSIGN_ONCE").

Since GCC 4.8 is fairly vintage at this point and we emit a warning if
we detect it during the build, return {READ,WRITE}_ONCE() to their former
glory with an implementation that is easier to understand and, crucially,
more amenable to optimisation. A side effect of this simplification is
that WRITE_ONCE() no longer returns a value, but nobody seems to be
relying on that and the new behaviour is aligned with smp_store_release().
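
Stripped of the KCSAN hooks and smp_read_barrier_depends(), the
simplified accessors boil down to a single volatile access. A userspace
approximation, with SIMPLE_* and demo_* as hypothetical names:

```c
#include <assert.h>

/* Userspace approximations only: no KCSAN hooks, no dependency barrier. */
#define SIMPLE_READ_ONCE(x)	(*(volatile __typeof__(x) *)&(x))

#define SIMPLE_WRITE_ONCE(x, val)					\
do {									\
	*(volatile __typeof__(x) *)&(x) = (val);			\
} while (0)

static int demo_flag;

/* Publish a value with a single volatile store. */
static void demo_set_flag(int v)
{
	SIMPLE_WRITE_ONCE(demo_flag, v);
}

/* The volatile access forces a fresh load on every call. */
static int demo_get_flag(void)
{
	return SIMPLE_READ_ONCE(demo_flag);
}
```

Because the access is a direct cast rather than a copy through a
size-dispatching helper and an on-stack union, the compiler has far less
to see through when optimising.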

Acked-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/compiler.h | 141 +++++++++++++++------------------------
 1 file changed, 55 insertions(+), 86 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index cce2c92567b5..fe739850e7c9 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -177,28 +177,57 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
 #endif
 
-#include <uapi/linux/types.h>
+/*
+ * Prevent the compiler from merging or refetching reads or writes. The
+ * compiler is also forbidden from reordering successive instances of
+ * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
+ * particular ordering. One way to make the compiler aware of ordering is to
+ * put the two invocations of READ_ONCE or WRITE_ONCE in different C
+ * statements.
+ *
+ * These two macros will also work on aggregate data types like structs or
+ * unions.
+ *
+ * Their two major use cases are: (1) Mediating communication between
+ * process-level code and irq/NMI handlers, all running on the same CPU,
+ * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
+ * mutilate accesses that either do not require ordering or that interact
+ * with an explicit memory barrier or atomic instruction that provides the
+ * required ordering.
+ */
+#include <asm/barrier.h>
+#include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 
-#define __READ_ONCE_SIZE						\
+#define __READ_ONCE(x)	(*(volatile typeof(x) *)&(x))
+
+#define READ_ONCE(x)							\
 ({									\
-	switch (size) {							\
-	case 1: *(__u8 *)res = *(volatile __u8 *)p; break;		\
-	case 2: *(__u16 *)res = *(volatile __u16 *)p; break;		\
-	case 4: *(__u32 *)res = *(volatile __u32 *)p; break;		\
-	case 8: *(__u64 *)res = *(volatile __u64 *)p; break;		\
-	default:							\
-		barrier();						\
-		__builtin_memcpy((void *)res, (const void *)p, size);	\
-		barrier();						\
-	}								\
+	typeof(x) *__xp = &(x);						\
+	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
+	__kcsan_disable_current();					\
+	({								\
+		typeof(x) __x = __READ_ONCE(*__xp);			\
+		__kcsan_enable_current();				\
+		smp_read_barrier_depends();				\
+		__x;							\
+	});								\
 })
 
+#define WRITE_ONCE(x, val)						\
+do {									\
+	typeof(x) *__xp = &(x);						\
+	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
+	__kcsan_disable_current();					\
+	*(volatile typeof(x) *)__xp = (val);				\
+	__kcsan_enable_current();					\
+} while (0)
+
 #ifdef CONFIG_KASAN
 /*
- * We can't declare function 'inline' because __no_sanitize_address confilcts
+ * We can't declare function 'inline' because __no_sanitize_address conflicts
  * with inlining. Attempt to inline it may cause a build failure.
- * 	https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
+ *     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
  * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
  */
 # define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
@@ -225,78 +254,26 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #define __no_sanitize_or_inline __always_inline
 #endif
 
-static __no_kcsan_or_inline
-void __read_once_size(const volatile void *p, void *res, int size)
-{
-	kcsan_check_atomic_read(p, size);
-	__READ_ONCE_SIZE;
-}
-
 static __no_sanitize_or_inline
-void __read_once_size_nocheck(const volatile void *p, void *res, int size)
+unsigned long __read_once_word_nocheck(const void *addr)
 {
-	__READ_ONCE_SIZE;
-}
-
-static __no_kcsan_or_inline
-void __write_once_size(volatile void *p, void *res, int size)
-{
-	kcsan_check_atomic_write(p, size);
-
-	switch (size) {
-	case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
-	case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
-	case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
-	case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
-	default:
-		barrier();
-		__builtin_memcpy((void *)p, (const void *)res, size);
-		barrier();
-	}
+	return __READ_ONCE(*(unsigned long *)addr);
 }
 
 /*
- * Prevent the compiler from merging or refetching reads or writes. The
- * compiler is also forbidden from reordering successive instances of
- * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
- * particular ordering. One way to make the compiler aware of ordering is to
- * put the two invocations of READ_ONCE or WRITE_ONCE in different C
- * statements.
- *
- * These two macros will also work on aggregate data types like structs or
- * unions. If the size of the accessed data type exceeds the word size of
- * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
- * fall back to memcpy(). There's at least two memcpy()s: one for the
- * __builtin_memcpy() and then one for the macro doing the copy of variable
- * - '__u' allocated on the stack.
- *
- * Their two major use cases are: (1) Mediating communication between
- * process-level code and irq/NMI handlers, all running on the same CPU,
- * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
- * mutilate accesses that either do not require ordering or that interact
- * with an explicit memory barrier or atomic instruction that provides the
- * required ordering.
+ * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need to load a
+ * word from memory atomically but without telling KASAN/KCSAN. This is
+ * usually used by unwinding code when walking the stack of a running process.
  */
-#include <asm/barrier.h>
-#include <linux/kasan-checks.h>
-
-#define __READ_ONCE(x, check)						\
+#define READ_ONCE_NOCHECK(x)						\
 ({									\
-	union { typeof(x) __val; char __c[1]; } __u;			\
-	if (check)							\
-		__read_once_size(&(x), __u.__c, sizeof(x));		\
-	else								\
-		__read_once_size_nocheck(&(x), __u.__c, sizeof(x));	\
-	smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
-	__u.__val;							\
+	unsigned long __x;						\
+	compiletime_assert(sizeof(x) == sizeof(__x),			\
+		"Unsupported access size for READ_ONCE_NOCHECK().");	\
+	__x = __read_once_word_nocheck(&(x));				\
+	smp_read_barrier_depends();					\
+	__x;								\
 })
-#define READ_ONCE(x) __READ_ONCE(x, 1)
-
-/*
- * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
- * to hide memory access from KASAN.
- */
-#define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
 
 static __no_kasan_or_inline
 unsigned long read_word_at_a_time(const void *addr)
@@ -305,14 +282,6 @@ unsigned long read_word_at_a_time(const void *addr)
 	return *(unsigned long *)addr;
 }
 
-#define WRITE_ONCE(x, val) \
-({							\
-	union { typeof(x) __val; char __c[1]; } __u =	\
-		{ .__val = (__force typeof(x)) (val) }; \
-	__write_once_size(&(x), __u.__c, sizeof(x));	\
-	__u.__val;					\
-})
-
 /**
  * data_race - mark an expression as containing intentional data races
  *
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 11/18] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (9 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 10/18] READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE() Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 12/18] READ_ONCE: Drop pointer qualifiers when reading from scalar types Will Deacon
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

{READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes.
This can be surprising to callers that might incorrectly be expecting
atomicity for accesses to aggregate structures, although there are other
callers where tearing is actually permissible (e.g. if they are using
something akin to sequence locking to protect the access).

Linus sayeth:

  | We could also look at being stricter for the normal READ/WRITE_ONCE(),
  | and require that they are
  |
  | (a) regular integer types
  |
  | (b) fit in an atomic word
  |
  | We actually did (b) for a while, until we noticed that we do it on
  | loff_t's etc and relaxed the rules. But maybe we could have a
  | "non-atomic" version of READ/WRITE_ONCE() that is used for the
  | questionable cases?

The slight snag is that we also have to support 64-bit accesses on 32-bit
architectures, as these appear to be widespread and tend to work out ok
if either the architecture supports atomic 64-bit accesses (armv7) or if
the variable being accessed represents a virtual address and therefore
only requires 32-bit atomicity in practice.

Take a step in that direction by introducing a variant of
'compiletime_assert_atomic_type()' and use it to check the pointer
argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE}_ONCE() variants
which are allowed to tear and convert the one broken caller over to the
new macros.
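
The size rule can be sketched in userspace with C11 _Static_assert
standing in for compiletime_assert(). The 'demo_' names and
'struct demo_runstate' are hypothetical; demo_native_word() follows the
kernel's __native_word() (char, short, int or long sized).

```c
#include <assert.h>

/* Mirrors __native_word(): char, short, int or long sized. */
#define demo_native_word(t)						\
	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) ||	\
	 sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* The relaxed rule from this patch: also accept 'long long'. */
#define demo_rwonce_size_ok(t)						\
	(demo_native_word(t) || sizeof(t) == sizeof(long long))

/* Build-time check, using _Static_assert for the purposes of the demo. */
#define demo_assert_rwonce_type(t)					\
	_Static_assert(demo_rwonce_size_ok(t),				\
		"Unsupported access size for {READ,WRITE}_ONCE().")

/* Too large for a single access: needs the variant that may tear. */
struct demo_runstate {
	int state;
	unsigned long long time[4];
};

demo_assert_rwonce_type(int);
demo_assert_rwonce_type(long long);	/* allowed even on 32-bit targets */
demo_assert_rwonce_type(void *);
```

Passing 'struct demo_runstate' to demo_assert_rwonce_type() would fail
the build, which is the case the xen/time.c conversion below sidesteps
by switching to __READ_ONCE().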

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/xen/time.c       |  2 +-
 include/linux/compiler.h | 40 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 0968859c29d0..108edbcbc040 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -64,7 +64,7 @@ static void xen_get_runstate_snapshot_cpu_delta(
 	do {
 		state_time = get64(&state->state_entry_time);
 		rmb();	/* Hypervisor might update data. */
-		*res = READ_ONCE(*state);
+		*res = __READ_ONCE(*state);
 		rmb();	/* Hypervisor might update data. */
 	} while (get64(&state->state_entry_time) != state_time ||
 		 (state_time & XEN_RUNSTATE_UPDATE));
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index fe739850e7c9..e1b839e42563 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -199,9 +199,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 
-#define __READ_ONCE(x)	(*(volatile typeof(x) *)&(x))
+/*
+ * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
+ * atomicity or dependency ordering guarantees. Note that this may result
+ * in tears!
+ */
+#define __READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))
 
-#define READ_ONCE(x)							\
+#define __READ_ONCE_SCALAR(x)						\
 ({									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
@@ -214,15 +219,32 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	});								\
 })
 
-#define WRITE_ONCE(x, val)						\
+#define READ_ONCE(x)							\
+({									\
+	compiletime_assert_rwonce_type(x);				\
+	__READ_ONCE_SCALAR(x);						\
+})
+
+#define __WRITE_ONCE(x, val)						\
+do {									\
+	*(volatile typeof(x) *)&(x) = (val);				\
+} while (0)
+
+#define __WRITE_ONCE_SCALAR(x, val)					\
 do {									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
 	__kcsan_disable_current();					\
-	*(volatile typeof(x) *)__xp = (val);				\
+	__WRITE_ONCE(*__xp, val);					\
 	__kcsan_enable_current();					\
 } while (0)
 
+#define WRITE_ONCE(x, val)						\
+do {									\
+	compiletime_assert_rwonce_type(x);				\
+	__WRITE_ONCE_SCALAR(x, val);					\
+} while (0)
+
 #ifdef CONFIG_KASAN
 /*
  * We can't declare function 'inline' because __no_sanitize_address conflicts
@@ -366,6 +388,16 @@ static inline void *offset_to_ptr(const int *off)
 	compiletime_assert(__native_word(t),				\
 		"Need native word sized stores/loads for atomicity.")
 
+/*
+ * Yes, this permits 64-bit accesses on 32-bit architectures. These will
+ * actually be atomic in many cases (namely x86), but for others we rely on
+ * the access being split into 2x32-bit accesses for a 32-bit quantity (e.g.
+ * a virtual address) and a strong prevailing wind.
+ */
+#define compiletime_assert_rwonce_type(t)					\
+	compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long),	\
+		"Unsupported access size for {READ,WRITE}_ONCE().")
+
 /* &a[0] degrades to a pointer: a different type from an array */
 #define __must_be_array(a)	BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
 
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 12/18] READ_ONCE: Drop pointer qualifiers when reading from scalar types
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (10 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 11/18] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 13/18] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros Will Deacon
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Passing a volatile-qualified pointer to READ_ONCE() is an absolute
trainwreck for code generation: the use of 'typeof()' to define a
temporary variable inside the macro means that the final evaluation in
macro scope ends up forcing a read back from the stack. When stack
protector is enabled (the default for arm64, at least), this causes
the compiler to vomit up all sorts of junk.

Unfortunately, dropping pointer qualifiers inside the macro poses quite
a challenge, especially since the pointed-to type is permitted to be an
aggregate, and this is relied upon by mm/ code accessing things like
'pmd_t'. Based on numerous hacks and discussions on the mailing list,
this is the best I've managed to come up with.

Introduce '__unqual_scalar_typeof()' which takes an expression and, if
the expression is an optionally qualified 8, 16, 32 or 64-bit scalar
type, evaluates to the unqualified type. Other input types, including
aggregates, remain unchanged. Hopefully READ_ONCE() on volatile aggregate
pointers isn't something we do on a fast-path.
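
To make the mechanism concrete, here is a cut-down userspace sketch that
handles only 'int' and 'long' (the real macro covers all 8, 16, 32 and
64-bit scalar types). The 'demo_' names are invented for this example;
the builtins are the GCC/Clang ones the patch relies on, and the _Generic
helper is only there to observe the resulting type.

```c
#include <assert.h>

#define demo_same_type(a, b)						\
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* If 'x' has (possibly qualified) type 'type', yield an rvalue of that type. */
#define demo_pick_scalar_type(x, type, otherwise)			\
	__builtin_choose_expr(demo_same_type(x, type), (type)0, otherwise)

/* Reduced to 'int' and 'long' for brevity; anything else is left as-is. */
#define demo_unqual_scalar_typeof(x) __typeof__(			\
	demo_pick_scalar_type(x, int,					\
	demo_pick_scalar_type(x, long, x)))

/* 1 if 'p' is exactly 'int *', i.e. the pointee carries no qualifiers. */
#define demo_is_plain_int_ptr(p)	_Generic((p), int *: 1, default: 0)

static volatile int demo_counter = 5;

static int demo_read_counter(void)
{
	/* 'tmp' is a plain 'int', not a 'volatile int' temporary. */
	demo_unqual_scalar_typeof(demo_counter) tmp = demo_counter;

	return tmp;
}
```

The trick is that __builtin_types_compatible_p() ignores top-level
qualifiers, so a 'volatile int' matches 'int', and the '(type)0' rvalue
then supplies the unqualified type to typeof.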

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/compiler.h       |  6 +++---
 include/linux/compiler_types.h | 26 ++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e1b839e42563..0caced170a8a 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -204,7 +204,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  * atomicity or dependency ordering guarantees. Note that this may result
  * in tears!
  */
-#define __READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))
+#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
 
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
@@ -212,10 +212,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
 	__kcsan_disable_current();					\
 	({								\
-		typeof(x) __x = __READ_ONCE(*__xp);			\
+		__unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);	\
 		__kcsan_enable_current();				\
 		smp_read_barrier_depends();				\
-		__x;							\
+		(typeof(x))__x;						\
 	});								\
 })
 
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index e970f97a7fcb..6ed0612bc143 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -210,6 +210,32 @@ struct ftrace_likely_data {
 /* Are two types/vars the same type (ignoring qualifiers)? */
 #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
 
+/*
+ * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ *			       non-scalar types unchanged.
+ *
+ * We build this out of a couple of helper macros in a vain attempt to
+ * help you keep your lunch down while reading it.
+ */
+#define __pick_scalar_type(x, type, otherwise)					\
+	__builtin_choose_expr(__same_type(x, type), (type)0, otherwise)
+
+/*
+ * 'char' is not type-compatible with either 'signed char' or 'unsigned char',
+ * so we include the naked type here as well as the signed/unsigned variants.
+ */
+#define __pick_integer_type(x, type, otherwise)					\
+	__pick_scalar_type(x, type,						\
+		__pick_scalar_type(x, unsigned type,				\
+			__pick_scalar_type(x, signed type, otherwise)))
+
+#define __unqual_scalar_typeof(x) typeof(					\
+	__pick_integer_type(x, char,						\
+		__pick_integer_type(x, short,					\
+			__pick_integer_type(x, int,				\
+				__pick_integer_type(x, long,			\
+					__pick_integer_type(x, long long, x))))))
+
 /* Is this type a native word size -- useful for atomic operations */
 #define __native_word(t) \
 	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 13/18] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (11 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 12/18] READ_ONCE: Drop pointer qualifiers when reading from scalar types Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 14/18] arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros Will Deacon
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Passing volatile-qualified pointers to the asm-generic implementations of
the load-acquire macros results in a re-load from the stack due to the
temporary result variable inheriting the volatile semantics thanks to the
use of 'typeof()'.

Define these temporary variables using '__unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.
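
The pattern can be sketched in user space as follows. This is only an
approximation under stated assumptions: 'demo_unqual_int_typeof()' is a
cut-down, 'int'-only stand-in for '__unqual_scalar_typeof()',
'__atomic_thread_fence()' stands in for the kernel barrier, and every 'demo_'
name is hypothetical. It needs GCC or Clang for statement expressions.

```c
#include <assert.h>

/* Cut-down stand-in for '__unqual_scalar_typeof()': handles 'int' only. */
#define demo_unqual_int_typeof(x) __typeof__( \
	__builtin_choose_expr( \
		__builtin_types_compatible_p(__typeof__(x), int), (int)0, x))

/*
 * Hypothetical analogue of the generic load-acquire: the temporary is
 * declared with the unqualified type so it need not live on the stack,
 * and the result is cast back to the caller-visible type.
 */
#define demo_load_acquire(p) \
({ \
	demo_unqual_int_typeof(*(p)) p1_ = \
		*(const volatile __typeof__(*(p)) *)(p); \
	__atomic_thread_fence(__ATOMIC_ACQUIRE); \
	(__typeof__(*(p)))p1_; \
})

static volatile int demo_flag = 7;

static int demo_read_flag(void)
{
	return demo_load_acquire(&demo_flag);
}
```

The final cast mirrors the '(typeof(*p))___p1' in the patch: callers still see
an expression of the original type, only the temporary changes.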

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/asm-generic/barrier.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 85b28eb80b11..2eacaf7d62f6 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -128,10 +128,10 @@ do {									\
 #ifndef __smp_load_acquire
 #define __smp_load_acquire(p)						\
 ({									\
-	typeof(*p) ___p1 = READ_ONCE(*p);				\
+	__unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p);		\
 	compiletime_assert_atomic_type(*p);				\
 	__smp_mb();							\
-	___p1;								\
+	(typeof(*p))___p1;						\
 })
 #endif
 
@@ -183,10 +183,10 @@ do {									\
 #ifndef smp_load_acquire
 #define smp_load_acquire(p)						\
 ({									\
-	typeof(*p) ___p1 = READ_ONCE(*p);				\
+	__unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p);		\
 	compiletime_assert_atomic_type(*p);				\
 	barrier();							\
-	___p1;								\
+	(typeof(*p))___p1;						\
 })
 #endif
 
@@ -229,14 +229,14 @@ do {									\
 #ifndef smp_cond_load_relaxed
 #define smp_cond_load_relaxed(ptr, cond_expr) ({		\
 	typeof(ptr) __PTR = (ptr);				\
-	typeof(*ptr) VAL;					\
+	__unqual_scalar_typeof(*ptr) VAL;			\
 	for (;;) {						\
 		VAL = READ_ONCE(*__PTR);			\
 		if (cond_expr)					\
 			break;					\
 		cpu_relax();					\
 	}							\
-	VAL;							\
+	(typeof(*ptr))VAL;					\
 })
 #endif
 
@@ -250,10 +250,10 @@ do {									\
  */
 #ifndef smp_cond_load_acquire
 #define smp_cond_load_acquire(ptr, cond_expr) ({		\
-	typeof(*ptr) _val;					\
+	__unqual_scalar_typeof(*ptr) _val;			\
 	_val = smp_cond_load_relaxed(ptr, cond_expr);		\
 	smp_acquire__after_ctrl_dep();				\
-	_val;							\
+	(typeof(*ptr))_val;					\
 })
 #endif
 
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 14/18] arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (12 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 13/18] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 15/18] gcov: Remove old GCC 3.4 support Will Deacon
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Passing volatile-qualified pointers to the arm64 implementations of the
load-acquire/store-release macros results in a re-load from the stack
and a bunch of associated stack-protector churn due to the temporary
result variable inheriting the volatile semantics thanks to the use of
'typeof()'.

Define these temporary variables using '__unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/barrier.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 7d9cc5ec4971..fb4c27506ef4 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -76,8 +76,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx,
 #define __smp_store_release(p, v)					\
 do {									\
 	typeof(p) __p = (p);						\
-	union { typeof(*p) __val; char __c[1]; } __u =			\
-		{ .__val = (__force typeof(*p)) (v) };			\
+	union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u =	\
+		{ .__val = (__force __unqual_scalar_typeof(*p)) (v) };	\
 	compiletime_assert_atomic_type(*p);				\
 	kasan_check_write(__p, sizeof(*p));				\
 	switch (sizeof(*p)) {						\
@@ -110,7 +110,7 @@ do {									\
 
 #define __smp_load_acquire(p)						\
 ({									\
-	union { typeof(*p) __val; char __c[1]; } __u;			\
+	union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u;	\
 	typeof(p) __p = (p);						\
 	compiletime_assert_atomic_type(*p);				\
 	kasan_check_read(__p, sizeof(*p));				\
@@ -136,33 +136,33 @@ do {									\
 			: "Q" (*__p) : "memory");			\
 		break;							\
 	}								\
-	__u.__val;							\
+	(typeof(*p))__u.__val;						\
 })
 
 #define smp_cond_load_relaxed(ptr, cond_expr)				\
 ({									\
 	typeof(ptr) __PTR = (ptr);					\
-	typeof(*ptr) VAL;						\
+	__unqual_scalar_typeof(*ptr) VAL;				\
 	for (;;) {							\
 		VAL = READ_ONCE(*__PTR);				\
 		if (cond_expr)						\
 			break;						\
 		__cmpwait_relaxed(__PTR, VAL);				\
 	}								\
-	VAL;								\
+	(typeof(*ptr))VAL;						\
 })
 
 #define smp_cond_load_acquire(ptr, cond_expr)				\
 ({									\
 	typeof(ptr) __PTR = (ptr);					\
-	typeof(*ptr) VAL;						\
+	__unqual_scalar_typeof(*ptr) VAL;				\
 	for (;;) {							\
 		VAL = smp_load_acquire(__PTR);				\
 		if (cond_expr)						\
 			break;						\
 		__cmpwait_relaxed(__PTR, VAL);				\
 	}								\
-	VAL;								\
+	(typeof(*ptr))VAL;						\
 })
 
 #include <asm-generic/barrier.h>
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 15/18] gcov: Remove old GCC 3.4 support
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (13 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 14/18] arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 16/18] kcsan: Rework data_race() so that it can be used by READ_ONCE() Will Deacon
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

The kernel requires at least GCC 4.8 in order to build, and so there is
no need to cater for the pre-4.7 gcov format.

Remove the obsolete code.

Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 kernel/gcov/Kconfig   |  24 --
 kernel/gcov/Makefile  |   3 +-
 kernel/gcov/gcc_3_4.c | 573 ------------------------------------------
 3 files changed, 1 insertion(+), 599 deletions(-)
 delete mode 100644 kernel/gcov/gcc_3_4.c

diff --git a/kernel/gcov/Kconfig b/kernel/gcov/Kconfig
index 3941a9c48f83..feaad597b3f4 100644
--- a/kernel/gcov/Kconfig
+++ b/kernel/gcov/Kconfig
@@ -51,28 +51,4 @@ config GCOV_PROFILE_ALL
 	larger and run slower. Also be sure to exclude files from profiling
 	which are not linked to the kernel image to prevent linker errors.
 
-choice
-	prompt "Specify GCOV format"
-	depends on GCOV_KERNEL
-	depends on CC_IS_GCC
-	---help---
-	The gcov format is usually determined by the GCC version, and the
-	default is chosen according to your GCC version. However, there are
-	exceptions where format changes are integrated in lower-version GCCs.
-	In such a case, change this option to adjust the format used in the
-	kernel accordingly.
-
-config GCOV_FORMAT_3_4
-	bool "GCC 3.4 format"
-	depends on GCC_VERSION < 40700
-	---help---
-	Select this option to use the format defined by GCC 3.4.
-
-config GCOV_FORMAT_4_7
-	bool "GCC 4.7 format"
-	---help---
-	Select this option to use the format defined by GCC 4.7.
-
-endchoice
-
 endmenu
diff --git a/kernel/gcov/Makefile b/kernel/gcov/Makefile
index d66a74b0f100..16f8ecc7d882 100644
--- a/kernel/gcov/Makefile
+++ b/kernel/gcov/Makefile
@@ -2,6 +2,5 @@
 ccflags-y := -DSRCTREE='"$(srctree)"' -DOBJTREE='"$(objtree)"'
 
 obj-y := base.o fs.o
-obj-$(CONFIG_GCOV_FORMAT_3_4) += gcc_base.o gcc_3_4.o
-obj-$(CONFIG_GCOV_FORMAT_4_7) += gcc_base.o gcc_4_7.o
+obj-$(CONFIG_CC_IS_GCC) += gcc_base.o gcc_4_7.o
 obj-$(CONFIG_CC_IS_CLANG) += clang.o
diff --git a/kernel/gcov/gcc_3_4.c b/kernel/gcov/gcc_3_4.c
deleted file mode 100644
index acb83558e5df..000000000000
--- a/kernel/gcov/gcc_3_4.c
+++ /dev/null
@@ -1,573 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- *  This code provides functions to handle gcc's profiling data format
- *  introduced with gcc 3.4. Future versions of gcc may change the gcov
- *  format (as happened before), so all format-specific information needs
- *  to be kept modular and easily exchangeable.
- *
- *  This file is based on gcc-internal definitions. Functions and data
- *  structures are defined to be compatible with gcc counterparts.
- *  For a better understanding, refer to gcc source: gcc/gcov-io.h.
- *
- *    Copyright IBM Corp. 2009
- *    Author(s): Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
- *
- *    Uses gcc-internal data definitions.
- */
-
-#include <linux/errno.h>
-#include <linux/slab.h>
-#include <linux/string.h>
-#include <linux/seq_file.h>
-#include <linux/vmalloc.h>
-#include "gcov.h"
-
-#define GCOV_COUNTERS		5
-
-static struct gcov_info *gcov_info_head;
-
-/**
- * struct gcov_fn_info - profiling meta data per function
- * @ident: object file-unique function identifier
- * @checksum: function checksum
- * @n_ctrs: number of values per counter type belonging to this function
- *
- * This data is generated by gcc during compilation and doesn't change
- * at run-time.
- */
-struct gcov_fn_info {
-	unsigned int ident;
-	unsigned int checksum;
-	unsigned int n_ctrs[];
-};
-
-/**
- * struct gcov_ctr_info - profiling data per counter type
- * @num: number of counter values for this type
- * @values: array of counter values for this type
- * @merge: merge function for counter values of this type (unused)
- *
- * This data is generated by gcc during compilation and doesn't change
- * at run-time with the exception of the values array.
- */
-struct gcov_ctr_info {
-	unsigned int	num;
-	gcov_type	*values;
-	void		(*merge)(gcov_type *, unsigned int);
-};
-
-/**
- * struct gcov_info - profiling data per object file
- * @version: gcov version magic indicating the gcc version used for compilation
- * @next: list head for a singly-linked list
- * @stamp: time stamp
- * @filename: name of the associated gcov data file
- * @n_functions: number of instrumented functions
- * @functions: function data
- * @ctr_mask: mask specifying which counter types are active
- * @counts: counter data per counter type
- *
- * This data is generated by gcc during compilation and doesn't change
- * at run-time with the exception of the next pointer.
- */
-struct gcov_info {
-	unsigned int			version;
-	struct gcov_info		*next;
-	unsigned int			stamp;
-	const char			*filename;
-	unsigned int			n_functions;
-	const struct gcov_fn_info	*functions;
-	unsigned int			ctr_mask;
-	struct gcov_ctr_info		counts[];
-};
-
-/**
- * gcov_info_filename - return info filename
- * @info: profiling data set
- */
-const char *gcov_info_filename(struct gcov_info *info)
-{
-	return info->filename;
-}
-
-/**
- * gcov_info_version - return info version
- * @info: profiling data set
- */
-unsigned int gcov_info_version(struct gcov_info *info)
-{
-	return info->version;
-}
-
-/**
- * gcov_info_next - return next profiling data set
- * @info: profiling data set
- *
- * Returns next gcov_info following @info or first gcov_info in the chain if
- * @info is %NULL.
- */
-struct gcov_info *gcov_info_next(struct gcov_info *info)
-{
-	if (!info)
-		return gcov_info_head;
-
-	return info->next;
-}
-
-/**
- * gcov_info_link - link/add profiling data set to the list
- * @info: profiling data set
- */
-void gcov_info_link(struct gcov_info *info)
-{
-	info->next = gcov_info_head;
-	gcov_info_head = info;
-}
-
-/**
- * gcov_info_unlink - unlink/remove profiling data set from the list
- * @prev: previous profiling data set
- * @info: profiling data set
- */
-void gcov_info_unlink(struct gcov_info *prev, struct gcov_info *info)
-{
-	if (prev)
-		prev->next = info->next;
-	else
-		gcov_info_head = info->next;
-}
-
-/**
- * gcov_info_within_module - check if a profiling data set belongs to a module
- * @info: profiling data set
- * @mod: module
- *
- * Returns true if profiling data belongs module, false otherwise.
- */
-bool gcov_info_within_module(struct gcov_info *info, struct module *mod)
-{
-	return within_module((unsigned long)info, mod);
-}
-
-/* Symbolic links to be created for each profiling data file. */
-const struct gcov_link gcov_link[] = {
-	{ OBJ_TREE, "gcno" },	/* Link to .gcno file in $(objtree). */
-	{ 0, NULL},
-};
-
-/*
- * Determine whether a counter is active. Based on gcc magic. Doesn't change
- * at run-time.
- */
-static int counter_active(struct gcov_info *info, unsigned int type)
-{
-	return (1 << type) & info->ctr_mask;
-}
-
-/* Determine number of active counters. Based on gcc magic. */
-static unsigned int num_counter_active(struct gcov_info *info)
-{
-	unsigned int i;
-	unsigned int result = 0;
-
-	for (i = 0; i < GCOV_COUNTERS; i++) {
-		if (counter_active(info, i))
-			result++;
-	}
-	return result;
-}
-
-/**
- * gcov_info_reset - reset profiling data to zero
- * @info: profiling data set
- */
-void gcov_info_reset(struct gcov_info *info)
-{
-	unsigned int active = num_counter_active(info);
-	unsigned int i;
-
-	for (i = 0; i < active; i++) {
-		memset(info->counts[i].values, 0,
-		       info->counts[i].num * sizeof(gcov_type));
-	}
-}
-
-/**
- * gcov_info_is_compatible - check if profiling data can be added
- * @info1: first profiling data set
- * @info2: second profiling data set
- *
- * Returns non-zero if profiling data can be added, zero otherwise.
- */
-int gcov_info_is_compatible(struct gcov_info *info1, struct gcov_info *info2)
-{
-	return (info1->stamp == info2->stamp);
-}
-
-/**
- * gcov_info_add - add up profiling data
- * @dest: profiling data set to which data is added
- * @source: profiling data set which is added
- *
- * Adds profiling counts of @source to @dest.
- */
-void gcov_info_add(struct gcov_info *dest, struct gcov_info *source)
-{
-	unsigned int i;
-	unsigned int j;
-
-	for (i = 0; i < num_counter_active(dest); i++) {
-		for (j = 0; j < dest->counts[i].num; j++) {
-			dest->counts[i].values[j] +=
-				source->counts[i].values[j];
-		}
-	}
-}
-
-/* Get size of function info entry. Based on gcc magic. */
-static size_t get_fn_size(struct gcov_info *info)
-{
-	size_t size;
-
-	size = sizeof(struct gcov_fn_info) + num_counter_active(info) *
-	       sizeof(unsigned int);
-	if (__alignof__(struct gcov_fn_info) > sizeof(unsigned int))
-		size = ALIGN(size, __alignof__(struct gcov_fn_info));
-	return size;
-}
-
-/* Get address of function info entry. Based on gcc magic. */
-static struct gcov_fn_info *get_fn_info(struct gcov_info *info, unsigned int fn)
-{
-	return (struct gcov_fn_info *)
-		((char *) info->functions + fn * get_fn_size(info));
-}
-
-/**
- * gcov_info_dup - duplicate profiling data set
- * @info: profiling data set to duplicate
- *
- * Return newly allocated duplicate on success, %NULL on error.
- */
-struct gcov_info *gcov_info_dup(struct gcov_info *info)
-{
-	struct gcov_info *dup;
-	unsigned int i;
-	unsigned int active;
-
-	/* Duplicate gcov_info. */
-	active = num_counter_active(info);
-	dup = kzalloc(struct_size(dup, counts, active), GFP_KERNEL);
-	if (!dup)
-		return NULL;
-	dup->version		= info->version;
-	dup->stamp		= info->stamp;
-	dup->n_functions	= info->n_functions;
-	dup->ctr_mask		= info->ctr_mask;
-	/* Duplicate filename. */
-	dup->filename		= kstrdup(info->filename, GFP_KERNEL);
-	if (!dup->filename)
-		goto err_free;
-	/* Duplicate table of functions. */
-	dup->functions = kmemdup(info->functions, info->n_functions *
-				 get_fn_size(info), GFP_KERNEL);
-	if (!dup->functions)
-		goto err_free;
-	/* Duplicate counter arrays. */
-	for (i = 0; i < active ; i++) {
-		struct gcov_ctr_info *ctr = &info->counts[i];
-		size_t size = ctr->num * sizeof(gcov_type);
-
-		dup->counts[i].num = ctr->num;
-		dup->counts[i].merge = ctr->merge;
-		dup->counts[i].values = vmalloc(size);
-		if (!dup->counts[i].values)
-			goto err_free;
-		memcpy(dup->counts[i].values, ctr->values, size);
-	}
-	return dup;
-
-err_free:
-	gcov_info_free(dup);
-	return NULL;
-}
-
-/**
- * gcov_info_free - release memory for profiling data set duplicate
- * @info: profiling data set duplicate to free
- */
-void gcov_info_free(struct gcov_info *info)
-{
-	unsigned int active = num_counter_active(info);
-	unsigned int i;
-
-	for (i = 0; i < active ; i++)
-		vfree(info->counts[i].values);
-	kfree(info->functions);
-	kfree(info->filename);
-	kfree(info);
-}
-
-/**
- * struct type_info - iterator helper array
- * @ctr_type: counter type
- * @offset: index of the first value of the current function for this type
- *
- * This array is needed to convert the in-memory data format into the in-file
- * data format:
- *
- * In-memory:
- *   for each counter type
- *     for each function
- *       values
- *
- * In-file:
- *   for each function
- *     for each counter type
- *       values
- *
- * See gcc source gcc/gcov-io.h for more information on data organization.
- */
-struct type_info {
-	int ctr_type;
-	unsigned int offset;
-};
-
-/**
- * struct gcov_iterator - specifies current file position in logical records
- * @info: associated profiling data
- * @record: record type
- * @function: function number
- * @type: counter type
- * @count: index into values array
- * @num_types: number of counter types
- * @type_info: helper array to get values-array offset for current function
- */
-struct gcov_iterator {
-	struct gcov_info *info;
-
-	int record;
-	unsigned int function;
-	unsigned int type;
-	unsigned int count;
-
-	int num_types;
-	struct type_info type_info[];
-};
-
-static struct gcov_fn_info *get_func(struct gcov_iterator *iter)
-{
-	return get_fn_info(iter->info, iter->function);
-}
-
-static struct type_info *get_type(struct gcov_iterator *iter)
-{
-	return &iter->type_info[iter->type];
-}
-
-/**
- * gcov_iter_new - allocate and initialize profiling data iterator
- * @info: profiling data set to be iterated
- *
- * Return file iterator on success, %NULL otherwise.
- */
-struct gcov_iterator *gcov_iter_new(struct gcov_info *info)
-{
-	struct gcov_iterator *iter;
-
-	iter = kzalloc(struct_size(iter, type_info, num_counter_active(info)),
-		       GFP_KERNEL);
-	if (iter)
-		iter->info = info;
-
-	return iter;
-}
-
-/**
- * gcov_iter_free - release memory for iterator
- * @iter: file iterator to free
- */
-void gcov_iter_free(struct gcov_iterator *iter)
-{
-	kfree(iter);
-}
-
-/**
- * gcov_iter_get_info - return profiling data set for given file iterator
- * @iter: file iterator
- */
-struct gcov_info *gcov_iter_get_info(struct gcov_iterator *iter)
-{
-	return iter->info;
-}
-
-/**
- * gcov_iter_start - reset file iterator to starting position
- * @iter: file iterator
- */
-void gcov_iter_start(struct gcov_iterator *iter)
-{
-	int i;
-
-	iter->record = 0;
-	iter->function = 0;
-	iter->type = 0;
-	iter->count = 0;
-	iter->num_types = 0;
-	for (i = 0; i < GCOV_COUNTERS; i++) {
-		if (counter_active(iter->info, i)) {
-			iter->type_info[iter->num_types].ctr_type = i;
-			iter->type_info[iter->num_types++].offset = 0;
-		}
-	}
-}
-
-/* Mapping of logical record number to actual file content. */
-#define RECORD_FILE_MAGIC	0
-#define RECORD_GCOV_VERSION	1
-#define RECORD_TIME_STAMP	2
-#define RECORD_FUNCTION_TAG	3
-#define RECORD_FUNCTON_TAG_LEN	4
-#define RECORD_FUNCTION_IDENT	5
-#define RECORD_FUNCTION_CHECK	6
-#define RECORD_COUNT_TAG	7
-#define RECORD_COUNT_LEN	8
-#define RECORD_COUNT		9
-
-/**
- * gcov_iter_next - advance file iterator to next logical record
- * @iter: file iterator
- *
- * Return zero if new position is valid, non-zero if iterator has reached end.
- */
-int gcov_iter_next(struct gcov_iterator *iter)
-{
-	switch (iter->record) {
-	case RECORD_FILE_MAGIC:
-	case RECORD_GCOV_VERSION:
-	case RECORD_FUNCTION_TAG:
-	case RECORD_FUNCTON_TAG_LEN:
-	case RECORD_FUNCTION_IDENT:
-	case RECORD_COUNT_TAG:
-		/* Advance to next record */
-		iter->record++;
-		break;
-	case RECORD_COUNT:
-		/* Advance to next count */
-		iter->count++;
-		/* fall through */
-	case RECORD_COUNT_LEN:
-		if (iter->count < get_func(iter)->n_ctrs[iter->type]) {
-			iter->record = 9;
-			break;
-		}
-		/* Advance to next counter type */
-		get_type(iter)->offset += iter->count;
-		iter->count = 0;
-		iter->type++;
-		/* fall through */
-	case RECORD_FUNCTION_CHECK:
-		if (iter->type < iter->num_types) {
-			iter->record = 7;
-			break;
-		}
-		/* Advance to next function */
-		iter->type = 0;
-		iter->function++;
-		/* fall through */
-	case RECORD_TIME_STAMP:
-		if (iter->function < iter->info->n_functions)
-			iter->record = 3;
-		else
-			iter->record = -1;
-		break;
-	}
-	/* Check for EOF. */
-	if (iter->record == -1)
-		return -EINVAL;
-	else
-		return 0;
-}
-
-/**
- * seq_write_gcov_u32 - write 32 bit number in gcov format to seq_file
- * @seq: seq_file handle
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file.
- */
-static int seq_write_gcov_u32(struct seq_file *seq, u32 v)
-{
-	return seq_write(seq, &v, sizeof(v));
-}
-
-/**
- * seq_write_gcov_u64 - write 64 bit number in gcov format to seq_file
- * @seq: seq_file handle
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file. 64 bit numbers are stored as two 32 bit numbers, the low part
- * first.
- */
-static int seq_write_gcov_u64(struct seq_file *seq, u64 v)
-{
-	u32 data[2];
-
-	data[0] = (v & 0xffffffffUL);
-	data[1] = (v >> 32);
-	return seq_write(seq, data, sizeof(data));
-}
-
-/**
- * gcov_iter_write - write data for current pos to seq_file
- * @iter: file iterator
- * @seq: seq_file handle
- *
- * Return zero on success, non-zero otherwise.
- */
-int gcov_iter_write(struct gcov_iterator *iter, struct seq_file *seq)
-{
-	int rc = -EINVAL;
-
-	switch (iter->record) {
-	case RECORD_FILE_MAGIC:
-		rc = seq_write_gcov_u32(seq, GCOV_DATA_MAGIC);
-		break;
-	case RECORD_GCOV_VERSION:
-		rc = seq_write_gcov_u32(seq, iter->info->version);
-		break;
-	case RECORD_TIME_STAMP:
-		rc = seq_write_gcov_u32(seq, iter->info->stamp);
-		break;
-	case RECORD_FUNCTION_TAG:
-		rc = seq_write_gcov_u32(seq, GCOV_TAG_FUNCTION);
-		break;
-	case RECORD_FUNCTON_TAG_LEN:
-		rc = seq_write_gcov_u32(seq, 2);
-		break;
-	case RECORD_FUNCTION_IDENT:
-		rc = seq_write_gcov_u32(seq, get_func(iter)->ident);
-		break;
-	case RECORD_FUNCTION_CHECK:
-		rc = seq_write_gcov_u32(seq, get_func(iter)->checksum);
-		break;
-	case RECORD_COUNT_TAG:
-		rc = seq_write_gcov_u32(seq,
-			GCOV_TAG_FOR_COUNTER(get_type(iter)->ctr_type));
-		break;
-	case RECORD_COUNT_LEN:
-		rc = seq_write_gcov_u32(seq,
-				get_func(iter)->n_ctrs[iter->type] * 2);
-		break;
-	case RECORD_COUNT:
-		rc = seq_write_gcov_u64(seq,
-			iter->info->counts[iter->type].
-				values[iter->count + get_type(iter)->offset]);
-		break;
-	}
-	return rc;
-}
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 16/18] kcsan: Rework data_race() so that it can be used by READ_ONCE()
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (14 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 15/18] gcov: Remove old GCC 3.4 support Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation Will Deacon
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Rework the data_race() macro so that it:

  - Accepts expressions which evaluate to a 'const' type
  - Attempts to discard volatile qualifiers from scalar types, avoiding
    pointless stack spills
  - Uses __kcsan_{disable,enable}_current(), allowing its use from code
    that is built independently from the kernel, such as the vDSO

This will allow for its use by {READ,WRITE}_ONCE() in a subsequent patch.
At the same time, fix up some weird whitespace issues.
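
A minimal user-space sketch of the reworked shape, assuming GCC or Clang
statement expressions. The 'demo_kcsan_*' helpers are hypothetical stand-ins
for '__kcsan_{disable,enable}_current()', and plain '__typeof__' is used where
the patch uses '__unqual_scalar_typeof()', so this shows only the control flow
and the 'const' handling, not the qualifier stripping.

```c
#include <assert.h>

/* Hypothetical stand-ins for __kcsan_{disable,enable}_current(). */
static int demo_kcsan_off;
static void demo_kcsan_disable(void) { demo_kcsan_off++; }
static void demo_kcsan_enable(void)  { demo_kcsan_off--; }

/*
 * The temporary is initialised at its declaration rather than assigned
 * afterwards, so it does not matter whether its type ends up 'const'.
 */
#define demo_data_race(expr) \
({ \
	demo_kcsan_disable(); \
	({ \
		__typeof__(({ expr; })) v_ = ({ expr; }); \
		demo_kcsan_enable(); \
		v_; \
	}); \
})

static const int demo_limit = 5;

static int demo_read_limit(void)
{
	return demo_data_race(demo_limit);
}

/* The expression is evaluated while checking is disabled (depth 1). */
static int demo_depth_during_expr(void)
{
	return demo_data_race(demo_kcsan_off);
}
```

The older form declared the temporary first and assigned to it on a separate
line, which cannot work for a 'const' temporary.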

Cc: Marco Elver <elver@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/compiler.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 0caced170a8a..f2a64195ee8e 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -314,14 +314,15 @@ unsigned long read_word_at_a_time(const void *addr)
  * This macro *does not* affect normal code generation, but is a hint
  * to tooling that data races here are to be ignored.
  */
-#define data_race(expr)                                                        \
-	({                                                                     \
-		typeof(({ expr; })) __val;                                     \
-		kcsan_disable_current();                                       \
-		__val = ({ expr; });                                           \
-		kcsan_enable_current();                                        \
-		__val;                                                         \
-	})
+#define data_race(expr)							\
+({									\
+	__kcsan_disable_current();					\
+	({								\
+		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
+		__kcsan_enable_current();				\
+		__v;							\
+	});								\
+})
 #else
 
 #endif /* __KERNEL__ */
-- 
2.26.2.645.ge9eca65c58-goog



* [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (15 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 16/18] kcsan: Rework data_race() so that it can be used by READ_ONCE() Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12  8:23   ` Peter Zijlstra
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-11 20:41 ` [PATCH v5 18/18] linux/compiler.h: Remove redundant '#else' Will Deacon
  2020-05-12  8:18 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Peter Zijlstra
  18 siblings, 2 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

Rather than open-code the disabling/enabling of KCSAN across the guts of
{READ,WRITE}_ONCE(), defer to the data_race() macro instead.
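
The write side needs a small trick: data_race() yields a value, but a store
is a statement, so the patch wraps it as '({ ...; 0; })'. The sketch below
shows that shape in user space, assuming GCC or Clang. All 'demo_' names are
hypothetical, and a simple counter stands in for the KCSAN enable/disable
hooks.

```c
#include <assert.h>

static int demo_kcsan_off;	/* hypothetical KCSAN "disabled" depth */

#define demo_data_race(expr) \
({ \
	demo_kcsan_off++; \
	({ \
		__typeof__(({ expr; })) v_ = ({ expr; }); \
		demo_kcsan_off--; \
		v_; \
	}); \
})

/* Plain volatile store, playing the role of __WRITE_ONCE(). */
#define demo_raw_write_once(x, val) \
do { \
	*(volatile __typeof__(x) *)&(x) = (val); \
} while (0)

/* The trailing '0' gives the statement expression a value to return. */
#define demo_write_once(x, val) \
do { \
	__typeof__(x) *xp_ = &(x); \
	demo_data_race(({ demo_raw_write_once(*xp_, val); 0; })); \
} while (0)

static int demo_shared;

static int demo_store(int val)
{
	demo_write_once(demo_shared, val);
	return demo_shared;
}
```

The dummy '0' is discarded by the caller; it exists only so that the
temporary inside the data_race()-style macro has a type and an initialiser.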

Cc: Marco Elver <elver@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/compiler.h | 53 ++++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 29 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f2a64195ee8e..d21a823e73c6 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -199,6 +199,26 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 
+/**
+ * data_race - mark an expression as containing intentional data races
+ *
+ * This data_race() macro is useful for situations in which data races
+ * should be forgiven.  One example is diagnostic code that accesses
+ * shared variables but is not a part of the core synchronization design.
+ *
+ * This macro *does not* affect normal code generation, but is a hint
+ * to tooling that data races here are to be ignored.
+ */
+#define data_race(expr)							\
+({									\
+	__kcsan_disable_current();					\
+	({								\
+		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
+		__kcsan_enable_current();				\
+		__v;							\
+	});								\
+})
+
 /*
  * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
  * atomicity or dependency ordering guarantees. Note that this may result
@@ -209,14 +229,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
 	typeof(x) *__xp = &(x);						\
+	__unqual_scalar_typeof(x) __x = data_race(__READ_ONCE(*__xp));	\
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
-	__kcsan_disable_current();					\
-	({								\
-		__unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);	\
-		__kcsan_enable_current();				\
-		smp_read_barrier_depends();				\
-		(typeof(x))__x;						\
-	});								\
+	smp_read_barrier_depends();					\
+	(typeof(x))__x;							\
 })
 
 #define READ_ONCE(x)							\
@@ -234,9 +250,7 @@ do {									\
 do {									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
-	__kcsan_disable_current();					\
-	__WRITE_ONCE(*__xp, val);					\
-	__kcsan_enable_current();					\
+	data_race(({ __WRITE_ONCE(*__xp, val); 0; }));			\
 } while (0)
 
 #define WRITE_ONCE(x, val)						\
@@ -304,25 +318,6 @@ unsigned long read_word_at_a_time(const void *addr)
 	return *(unsigned long *)addr;
 }
 
-/**
- * data_race - mark an expression as containing intentional data races
- *
- * This data_race() macro is useful for situations in which data races
- * should be forgiven.  One example is diagnostic code that accesses
- * shared variables but is not a part of the core synchronization design.
- *
- * This macro *does not* affect normal code generation, but is a hint
- * to tooling that data races here are to be ignored.
- */
-#define data_race(expr)							\
-({									\
-	__kcsan_disable_current();					\
-	({								\
-		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
-		__kcsan_enable_current();				\
-		__v;							\
-	});								\
-})
 #else
 
 #endif /* __KERNEL__ */
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply	[flat|nested] 127+ messages in thread

* [PATCH v5 18/18] linux/compiler.h: Remove redundant '#else'
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (16 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation Will Deacon
@ 2020-05-11 20:41 ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  2020-05-12  8:18 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Peter Zijlstra
  18 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-11 20:41 UTC (permalink / raw)
  To: linux-kernel; +Cc: elver, tglx, paulmck, mingo, peterz, will

The '#else' clause after checking '#ifdef __KERNEL__' is empty in
linux/compiler.h, so remove it.

Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/compiler.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index d21a823e73c6..741c93c62ecf 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -318,8 +318,6 @@ unsigned long read_word_at_a_time(const void *addr)
 	return *(unsigned long *)addr;
 }
 
-#else
-
 #endif /* __KERNEL__ */
 
 /*
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
                   ` (17 preceding siblings ...)
  2020-05-11 20:41 ` [PATCH v5 18/18] linux/compiler.h: Remove redundant '#else' Will Deacon
@ 2020-05-12  8:18 ` Peter Zijlstra
  2020-05-12 17:53   ` Marco Elver
  18 siblings, 1 reply; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-12  8:18 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-kernel, elver, tglx, paulmck, mingo

On Mon, May 11, 2020 at 09:41:32PM +0100, Will Deacon wrote:
> Hi folks,
> 
> (trimmed CC list since v4 since this is largely just a rebase)
> 
> This is version five of the READ_ONCE() codegen improvement series that
> I've previously posted here:
> 
> RFC: https://lore.kernel.org/lkml/20200110165636.28035-1-will@kernel.org
> v2:  https://lore.kernel.org/lkml/20200123153341.19947-1-will@kernel.org
> v3:  https://lore.kernel.org/lkml/20200415165218.20251-1-will@kernel.org
> v4:  https://lore.kernel.org/lkml/20200421151537.19241-1-will@kernel.org
> 
> The main change since v4 is that this is now based on top of the KCSAN
> changes queued in -tip (locking/kcsan) and therefore contains the patches
> necessary to avoid breaking sparc32 as well as some cleanups to
> consolidate {READ,WRITE}_ONCE() and data_race().
> 
> Other changes include:
> 
>   * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
>     __builtin_types_compatible_p()
> 
>   * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
>     points at something the same size as 'unsigned long'
> 
> I'm happy for all of this to go via -tip, or I can take it via arm64.

Looks good to me; Thanks!

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-11 20:41 ` [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation Will Deacon
@ 2020-05-12  8:23   ` Peter Zijlstra
  2020-05-12  9:49     ` Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
  1 sibling, 1 reply; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-12  8:23 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-kernel, elver, tglx, paulmck, mingo

On Mon, May 11, 2020 at 09:41:49PM +0100, Will Deacon wrote:

> +	data_race(({ __WRITE_ONCE(*__xp, val); 0; }));			\

That had me blink for a little, I see how we got there, but urgh.

Anyway, it's all in *much* better shape now than it was, so no real
complaints.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-12  8:23   ` Peter Zijlstra
@ 2020-05-12  9:49     ` Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-12  9:49 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, elver, tglx, paulmck, mingo

On Tue, May 12, 2020 at 10:23:06AM +0200, Peter Zijlstra wrote:
> On Mon, May 11, 2020 at 09:41:49PM +0100, Will Deacon wrote:
> 
> > +	data_race(({ __WRITE_ONCE(*__xp, val); 0; }));			\
> 
> That had me blink for a little, I see how we got there, but urgh.

I tried for a while to see if data_race() could act differently if the
expression was type-compatible with 'void' and, while I got something that
mostly worked, it would fire unexpectedly in a few places where the
expression was most definitely not void (e.g. dereferencing a vma pointer)
so I gave up :(
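
To make the constraint concrete, here is a minimal userspace sketch
(GNU C), assuming hypothetical stand-ins for the KCSAN hooks. All of the
sketch_*() names are made up for illustration and are not kernel APIs.
It only shows why a store that yields no value has to be wrapped as
'({ ...; 0; })' before it can be handed to a data_race()-shaped macro,
which always declares a temporary of the expression's type:

```c
#include <assert.h>

/* Hypothetical userspace stand-ins for __kcsan_{disable,enable}_current(). */
static int sketch_disabled;
static inline void sketch_disable(void) { sketch_disabled++; }
static inline void sketch_enable(void)  { sketch_disabled--; }

/*
 * Same shape as data_race(): the value of 'expr' is copied into '__v',
 * so 'expr' must not have type 'void'.
 */
#define sketch_data_race(expr)						\
({									\
	sketch_disable();						\
	({								\
		__typeof__(({ expr; })) __v = ({ expr; });		\
		sketch_enable();					\
		__v;							\
	});								\
})

/* A store that is a statement and so has no value, like __WRITE_ONCE(). */
#define sketch_write_once(x, val)					\
do {									\
	*(volatile __typeof__(x) *)&(x) = (val);			\
} while (0)

static int shared;

static int store_shared(int val)
{
	/* The trailing '0' gives the statement expression type 'int'. */
	int ret = sketch_data_race(({ sketch_write_once(shared, val); 0; }));

	/* The disable/enable pair must be balanced once the macro is done. */
	assert(sketch_disabled == 0);
	return ret;
}
```

Dropping the '0;' would leave the inner statement expression with type
'void', and the declaration of '__v' would no longer compile.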

> Anyway, it's all in *much* better shape now than it was, so no real
> complaints.

Thanks.

Will

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-11 20:41 ` [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation Will Deacon
  2020-05-12  8:23   ` Peter Zijlstra
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  2020-05-20 22:17     ` Borislav Petkov
  1 sibling, 1 reply; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Marco Elver, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     cdd28ad2d8110099e43527e96d059c5639809680
Gitweb:        https://git.kernel.org/tip/cdd28ad2d8110099e43527e96d059c5639809680
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:49 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:17 +02:00

READ_ONCE: Use data_race() to avoid KCSAN instrumentation

Rather than open-code the disabling/enabling of KCSAN across the guts of
{READ,WRITE}_ONCE(), defer to the data_race() macro instead.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200511204150.27858-18-will@kernel.org

---
 include/linux/compiler.h | 54 +++++++++++++++++----------------------
 1 file changed, 24 insertions(+), 30 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index cb2e3b3..741c93c 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -199,6 +199,26 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 
+/**
+ * data_race - mark an expression as containing intentional data races
+ *
+ * This data_race() macro is useful for situations in which data races
+ * should be forgiven.  One example is diagnostic code that accesses
+ * shared variables but is not a part of the core synchronization design.
+ *
+ * This macro *does not* affect normal code generation, but is a hint
+ * to tooling that data races here are to be ignored.
+ */
+#define data_race(expr)							\
+({									\
+	__kcsan_disable_current();					\
+	({								\
+		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
+		__kcsan_enable_current();				\
+		__v;							\
+	});								\
+})
+
 /*
  * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
  * atomicity or dependency ordering guarantees. Note that this may result
@@ -209,14 +229,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
 	typeof(x) *__xp = &(x);						\
+	__unqual_scalar_typeof(x) __x = data_race(__READ_ONCE(*__xp));	\
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
-	__kcsan_disable_current();					\
-	({								\
-		__unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);	\
-		__kcsan_enable_current();				\
-		smp_read_barrier_depends();				\
-		(typeof(x))__x;						\
-	});								\
+	smp_read_barrier_depends();					\
+	(typeof(x))__x;							\
 })
 
 #define READ_ONCE(x)							\
@@ -234,9 +250,7 @@ do {									\
 do {									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
-	__kcsan_disable_current();					\
-	__WRITE_ONCE(*__xp, val);					\
-	__kcsan_enable_current();					\
+	data_race(({ __WRITE_ONCE(*__xp, val); 0; }));			\
 } while (0)
 
 #define WRITE_ONCE(x, val)						\
@@ -304,26 +318,6 @@ unsigned long read_word_at_a_time(const void *addr)
 	return *(unsigned long *)addr;
 }
 
-/**
- * data_race - mark an expression as containing intentional data races
- *
- * This data_race() macro is useful for situations in which data races
- * should be forgiven.  One example is diagnostic code that accesses
- * shared variables but is not a part of the core synchronization design.
- *
- * This macro *does not* affect normal code generation, but is a hint
- * to tooling that data races here are to be ignored.
- */
-#define data_race(expr)							\
-({									\
-	__kcsan_disable_current();					\
-	({								\
-		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
-		__kcsan_enable_current();				\
-		__v;							\
-	});								\
-})
-
 #endif /* __KERNEL__ */
 
 /*

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] gcov: Remove old GCC 3.4 support
  2020-05-11 20:41 ` [PATCH v5 15/18] gcov: Remove old GCC 3.4 support Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Masahiro Yamada, Nick Desaulniers,
	Peter Zijlstra (Intel),
	Peter Oberparleiter, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     1c1da2d6f6fc27fadd18045247541bab463781fc
Gitweb:        https://git.kernel.org/tip/1c1da2d6f6fc27fadd18045247541bab463781fc
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:47 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:16 +02:00

gcov: Remove old GCC 3.4 support

The kernel requires at least GCC 4.8 in order to build, and so there is
no need to cater for the pre-4.7 gcov format.

Remove the obsolete code.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Link: https://lkml.kernel.org/r/20200511204150.27858-16-will@kernel.org

---
 kernel/gcov/Kconfig   |  24 +--
 kernel/gcov/Makefile  |   3 +-
 kernel/gcov/gcc_3_4.c | 573 +-----------------------------------------
 3 files changed, 1 insertion(+), 599 deletions(-)
 delete mode 100644 kernel/gcov/gcc_3_4.c

diff --git a/kernel/gcov/Kconfig b/kernel/gcov/Kconfig
index 3941a9c..feaad59 100644
--- a/kernel/gcov/Kconfig
+++ b/kernel/gcov/Kconfig
@@ -51,28 +51,4 @@ config GCOV_PROFILE_ALL
 	larger and run slower. Also be sure to exclude files from profiling
 	which are not linked to the kernel image to prevent linker errors.
 
-choice
-	prompt "Specify GCOV format"
-	depends on GCOV_KERNEL
-	depends on CC_IS_GCC
-	---help---
-	The gcov format is usually determined by the GCC version, and the
-	default is chosen according to your GCC version. However, there are
-	exceptions where format changes are integrated in lower-version GCCs.
-	In such a case, change this option to adjust the format used in the
-	kernel accordingly.
-
-config GCOV_FORMAT_3_4
-	bool "GCC 3.4 format"
-	depends on GCC_VERSION < 40700
-	---help---
-	Select this option to use the format defined by GCC 3.4.
-
-config GCOV_FORMAT_4_7
-	bool "GCC 4.7 format"
-	---help---
-	Select this option to use the format defined by GCC 4.7.
-
-endchoice
-
 endmenu
diff --git a/kernel/gcov/Makefile b/kernel/gcov/Makefile
index d66a74b..16f8ecc 100644
--- a/kernel/gcov/Makefile
+++ b/kernel/gcov/Makefile
@@ -2,6 +2,5 @@
 ccflags-y := -DSRCTREE='"$(srctree)"' -DOBJTREE='"$(objtree)"'
 
 obj-y := base.o fs.o
-obj-$(CONFIG_GCOV_FORMAT_3_4) += gcc_base.o gcc_3_4.o
-obj-$(CONFIG_GCOV_FORMAT_4_7) += gcc_base.o gcc_4_7.o
+obj-$(CONFIG_CC_IS_GCC) += gcc_base.o gcc_4_7.o
 obj-$(CONFIG_CC_IS_CLANG) += clang.o
diff --git a/kernel/gcov/gcc_3_4.c b/kernel/gcov/gcc_3_4.c
deleted file mode 100644
index acb8355..0000000
--- a/kernel/gcov/gcc_3_4.c
+++ /dev/null
@@ -1,573 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- *  This code provides functions to handle gcc's profiling data format
- *  introduced with gcc 3.4. Future versions of gcc may change the gcov
- *  format (as happened before), so all format-specific information needs
- *  to be kept modular and easily exchangeable.
- *
- *  This file is based on gcc-internal definitions. Functions and data
- *  structures are defined to be compatible with gcc counterparts.
- *  For a better understanding, refer to gcc source: gcc/gcov-io.h.
- *
- *    Copyright IBM Corp. 2009
- *    Author(s): Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
- *
- *    Uses gcc-internal data definitions.
- */
-
-#include <linux/errno.h>
-#include <linux/slab.h>
-#include <linux/string.h>
-#include <linux/seq_file.h>
-#include <linux/vmalloc.h>
-#include "gcov.h"
-
-#define GCOV_COUNTERS		5
-
-static struct gcov_info *gcov_info_head;
-
-/**
- * struct gcov_fn_info - profiling meta data per function
- * @ident: object file-unique function identifier
- * @checksum: function checksum
- * @n_ctrs: number of values per counter type belonging to this function
- *
- * This data is generated by gcc during compilation and doesn't change
- * at run-time.
- */
-struct gcov_fn_info {
-	unsigned int ident;
-	unsigned int checksum;
-	unsigned int n_ctrs[];
-};
-
-/**
- * struct gcov_ctr_info - profiling data per counter type
- * @num: number of counter values for this type
- * @values: array of counter values for this type
- * @merge: merge function for counter values of this type (unused)
- *
- * This data is generated by gcc during compilation and doesn't change
- * at run-time with the exception of the values array.
- */
-struct gcov_ctr_info {
-	unsigned int	num;
-	gcov_type	*values;
-	void		(*merge)(gcov_type *, unsigned int);
-};
-
-/**
- * struct gcov_info - profiling data per object file
- * @version: gcov version magic indicating the gcc version used for compilation
- * @next: list head for a singly-linked list
- * @stamp: time stamp
- * @filename: name of the associated gcov data file
- * @n_functions: number of instrumented functions
- * @functions: function data
- * @ctr_mask: mask specifying which counter types are active
- * @counts: counter data per counter type
- *
- * This data is generated by gcc during compilation and doesn't change
- * at run-time with the exception of the next pointer.
- */
-struct gcov_info {
-	unsigned int			version;
-	struct gcov_info		*next;
-	unsigned int			stamp;
-	const char			*filename;
-	unsigned int			n_functions;
-	const struct gcov_fn_info	*functions;
-	unsigned int			ctr_mask;
-	struct gcov_ctr_info		counts[];
-};
-
-/**
- * gcov_info_filename - return info filename
- * @info: profiling data set
- */
-const char *gcov_info_filename(struct gcov_info *info)
-{
-	return info->filename;
-}
-
-/**
- * gcov_info_version - return info version
- * @info: profiling data set
- */
-unsigned int gcov_info_version(struct gcov_info *info)
-{
-	return info->version;
-}
-
-/**
- * gcov_info_next - return next profiling data set
- * @info: profiling data set
- *
- * Returns next gcov_info following @info or first gcov_info in the chain if
- * @info is %NULL.
- */
-struct gcov_info *gcov_info_next(struct gcov_info *info)
-{
-	if (!info)
-		return gcov_info_head;
-
-	return info->next;
-}
-
-/**
- * gcov_info_link - link/add profiling data set to the list
- * @info: profiling data set
- */
-void gcov_info_link(struct gcov_info *info)
-{
-	info->next = gcov_info_head;
-	gcov_info_head = info;
-}
-
-/**
- * gcov_info_unlink - unlink/remove profiling data set from the list
- * @prev: previous profiling data set
- * @info: profiling data set
- */
-void gcov_info_unlink(struct gcov_info *prev, struct gcov_info *info)
-{
-	if (prev)
-		prev->next = info->next;
-	else
-		gcov_info_head = info->next;
-}
-
-/**
- * gcov_info_within_module - check if a profiling data set belongs to a module
- * @info: profiling data set
- * @mod: module
- *
- * Returns true if profiling data belongs module, false otherwise.
- */
-bool gcov_info_within_module(struct gcov_info *info, struct module *mod)
-{
-	return within_module((unsigned long)info, mod);
-}
-
-/* Symbolic links to be created for each profiling data file. */
-const struct gcov_link gcov_link[] = {
-	{ OBJ_TREE, "gcno" },	/* Link to .gcno file in $(objtree). */
-	{ 0, NULL},
-};
-
-/*
- * Determine whether a counter is active. Based on gcc magic. Doesn't change
- * at run-time.
- */
-static int counter_active(struct gcov_info *info, unsigned int type)
-{
-	return (1 << type) & info->ctr_mask;
-}
-
-/* Determine number of active counters. Based on gcc magic. */
-static unsigned int num_counter_active(struct gcov_info *info)
-{
-	unsigned int i;
-	unsigned int result = 0;
-
-	for (i = 0; i < GCOV_COUNTERS; i++) {
-		if (counter_active(info, i))
-			result++;
-	}
-	return result;
-}
-
-/**
- * gcov_info_reset - reset profiling data to zero
- * @info: profiling data set
- */
-void gcov_info_reset(struct gcov_info *info)
-{
-	unsigned int active = num_counter_active(info);
-	unsigned int i;
-
-	for (i = 0; i < active; i++) {
-		memset(info->counts[i].values, 0,
-		       info->counts[i].num * sizeof(gcov_type));
-	}
-}
-
-/**
- * gcov_info_is_compatible - check if profiling data can be added
- * @info1: first profiling data set
- * @info2: second profiling data set
- *
- * Returns non-zero if profiling data can be added, zero otherwise.
- */
-int gcov_info_is_compatible(struct gcov_info *info1, struct gcov_info *info2)
-{
-	return (info1->stamp == info2->stamp);
-}
-
-/**
- * gcov_info_add - add up profiling data
- * @dest: profiling data set to which data is added
- * @source: profiling data set which is added
- *
- * Adds profiling counts of @source to @dest.
- */
-void gcov_info_add(struct gcov_info *dest, struct gcov_info *source)
-{
-	unsigned int i;
-	unsigned int j;
-
-	for (i = 0; i < num_counter_active(dest); i++) {
-		for (j = 0; j < dest->counts[i].num; j++) {
-			dest->counts[i].values[j] +=
-				source->counts[i].values[j];
-		}
-	}
-}
-
-/* Get size of function info entry. Based on gcc magic. */
-static size_t get_fn_size(struct gcov_info *info)
-{
-	size_t size;
-
-	size = sizeof(struct gcov_fn_info) + num_counter_active(info) *
-	       sizeof(unsigned int);
-	if (__alignof__(struct gcov_fn_info) > sizeof(unsigned int))
-		size = ALIGN(size, __alignof__(struct gcov_fn_info));
-	return size;
-}
-
-/* Get address of function info entry. Based on gcc magic. */
-static struct gcov_fn_info *get_fn_info(struct gcov_info *info, unsigned int fn)
-{
-	return (struct gcov_fn_info *)
-		((char *) info->functions + fn * get_fn_size(info));
-}
-
-/**
- * gcov_info_dup - duplicate profiling data set
- * @info: profiling data set to duplicate
- *
- * Return newly allocated duplicate on success, %NULL on error.
- */
-struct gcov_info *gcov_info_dup(struct gcov_info *info)
-{
-	struct gcov_info *dup;
-	unsigned int i;
-	unsigned int active;
-
-	/* Duplicate gcov_info. */
-	active = num_counter_active(info);
-	dup = kzalloc(struct_size(dup, counts, active), GFP_KERNEL);
-	if (!dup)
-		return NULL;
-	dup->version		= info->version;
-	dup->stamp		= info->stamp;
-	dup->n_functions	= info->n_functions;
-	dup->ctr_mask		= info->ctr_mask;
-	/* Duplicate filename. */
-	dup->filename		= kstrdup(info->filename, GFP_KERNEL);
-	if (!dup->filename)
-		goto err_free;
-	/* Duplicate table of functions. */
-	dup->functions = kmemdup(info->functions, info->n_functions *
-				 get_fn_size(info), GFP_KERNEL);
-	if (!dup->functions)
-		goto err_free;
-	/* Duplicate counter arrays. */
-	for (i = 0; i < active ; i++) {
-		struct gcov_ctr_info *ctr = &info->counts[i];
-		size_t size = ctr->num * sizeof(gcov_type);
-
-		dup->counts[i].num = ctr->num;
-		dup->counts[i].merge = ctr->merge;
-		dup->counts[i].values = vmalloc(size);
-		if (!dup->counts[i].values)
-			goto err_free;
-		memcpy(dup->counts[i].values, ctr->values, size);
-	}
-	return dup;
-
-err_free:
-	gcov_info_free(dup);
-	return NULL;
-}
-
-/**
- * gcov_info_free - release memory for profiling data set duplicate
- * @info: profiling data set duplicate to free
- */
-void gcov_info_free(struct gcov_info *info)
-{
-	unsigned int active = num_counter_active(info);
-	unsigned int i;
-
-	for (i = 0; i < active ; i++)
-		vfree(info->counts[i].values);
-	kfree(info->functions);
-	kfree(info->filename);
-	kfree(info);
-}
-
-/**
- * struct type_info - iterator helper array
- * @ctr_type: counter type
- * @offset: index of the first value of the current function for this type
- *
- * This array is needed to convert the in-memory data format into the in-file
- * data format:
- *
- * In-memory:
- *   for each counter type
- *     for each function
- *       values
- *
- * In-file:
- *   for each function
- *     for each counter type
- *       values
- *
- * See gcc source gcc/gcov-io.h for more information on data organization.
- */
-struct type_info {
-	int ctr_type;
-	unsigned int offset;
-};
-
-/**
- * struct gcov_iterator - specifies current file position in logical records
- * @info: associated profiling data
- * @record: record type
- * @function: function number
- * @type: counter type
- * @count: index into values array
- * @num_types: number of counter types
- * @type_info: helper array to get values-array offset for current function
- */
-struct gcov_iterator {
-	struct gcov_info *info;
-
-	int record;
-	unsigned int function;
-	unsigned int type;
-	unsigned int count;
-
-	int num_types;
-	struct type_info type_info[];
-};
-
-static struct gcov_fn_info *get_func(struct gcov_iterator *iter)
-{
-	return get_fn_info(iter->info, iter->function);
-}
-
-static struct type_info *get_type(struct gcov_iterator *iter)
-{
-	return &iter->type_info[iter->type];
-}
-
-/**
- * gcov_iter_new - allocate and initialize profiling data iterator
- * @info: profiling data set to be iterated
- *
- * Return file iterator on success, %NULL otherwise.
- */
-struct gcov_iterator *gcov_iter_new(struct gcov_info *info)
-{
-	struct gcov_iterator *iter;
-
-	iter = kzalloc(struct_size(iter, type_info, num_counter_active(info)),
-		       GFP_KERNEL);
-	if (iter)
-		iter->info = info;
-
-	return iter;
-}
-
-/**
- * gcov_iter_free - release memory for iterator
- * @iter: file iterator to free
- */
-void gcov_iter_free(struct gcov_iterator *iter)
-{
-	kfree(iter);
-}
-
-/**
- * gcov_iter_get_info - return profiling data set for given file iterator
- * @iter: file iterator
- */
-struct gcov_info *gcov_iter_get_info(struct gcov_iterator *iter)
-{
-	return iter->info;
-}
-
-/**
- * gcov_iter_start - reset file iterator to starting position
- * @iter: file iterator
- */
-void gcov_iter_start(struct gcov_iterator *iter)
-{
-	int i;
-
-	iter->record = 0;
-	iter->function = 0;
-	iter->type = 0;
-	iter->count = 0;
-	iter->num_types = 0;
-	for (i = 0; i < GCOV_COUNTERS; i++) {
-		if (counter_active(iter->info, i)) {
-			iter->type_info[iter->num_types].ctr_type = i;
-			iter->type_info[iter->num_types++].offset = 0;
-		}
-	}
-}
-
-/* Mapping of logical record number to actual file content. */
-#define RECORD_FILE_MAGIC	0
-#define RECORD_GCOV_VERSION	1
-#define RECORD_TIME_STAMP	2
-#define RECORD_FUNCTION_TAG	3
-#define RECORD_FUNCTON_TAG_LEN	4
-#define RECORD_FUNCTION_IDENT	5
-#define RECORD_FUNCTION_CHECK	6
-#define RECORD_COUNT_TAG	7
-#define RECORD_COUNT_LEN	8
-#define RECORD_COUNT		9
-
-/**
- * gcov_iter_next - advance file iterator to next logical record
- * @iter: file iterator
- *
- * Return zero if new position is valid, non-zero if iterator has reached end.
- */
-int gcov_iter_next(struct gcov_iterator *iter)
-{
-	switch (iter->record) {
-	case RECORD_FILE_MAGIC:
-	case RECORD_GCOV_VERSION:
-	case RECORD_FUNCTION_TAG:
-	case RECORD_FUNCTON_TAG_LEN:
-	case RECORD_FUNCTION_IDENT:
-	case RECORD_COUNT_TAG:
-		/* Advance to next record */
-		iter->record++;
-		break;
-	case RECORD_COUNT:
-		/* Advance to next count */
-		iter->count++;
-		/* fall through */
-	case RECORD_COUNT_LEN:
-		if (iter->count < get_func(iter)->n_ctrs[iter->type]) {
-			iter->record = 9;
-			break;
-		}
-		/* Advance to next counter type */
-		get_type(iter)->offset += iter->count;
-		iter->count = 0;
-		iter->type++;
-		/* fall through */
-	case RECORD_FUNCTION_CHECK:
-		if (iter->type < iter->num_types) {
-			iter->record = 7;
-			break;
-		}
-		/* Advance to next function */
-		iter->type = 0;
-		iter->function++;
-		/* fall through */
-	case RECORD_TIME_STAMP:
-		if (iter->function < iter->info->n_functions)
-			iter->record = 3;
-		else
-			iter->record = -1;
-		break;
-	}
-	/* Check for EOF. */
-	if (iter->record == -1)
-		return -EINVAL;
-	else
-		return 0;
-}
-
-/**
- * seq_write_gcov_u32 - write 32 bit number in gcov format to seq_file
- * @seq: seq_file handle
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file.
- */
-static int seq_write_gcov_u32(struct seq_file *seq, u32 v)
-{
-	return seq_write(seq, &v, sizeof(v));
-}
-
-/**
- * seq_write_gcov_u64 - write 64 bit number in gcov format to seq_file
- * @seq: seq_file handle
- * @v: value to be stored
- *
- * Number format defined by gcc: numbers are recorded in the 32 bit
- * unsigned binary form of the endianness of the machine generating the
- * file. 64 bit numbers are stored as two 32 bit numbers, the low part
- * first.
- */
-static int seq_write_gcov_u64(struct seq_file *seq, u64 v)
-{
-	u32 data[2];
-
-	data[0] = (v & 0xffffffffUL);
-	data[1] = (v >> 32);
-	return seq_write(seq, data, sizeof(data));
-}
-
-/**
- * gcov_iter_write - write data for current pos to seq_file
- * @iter: file iterator
- * @seq: seq_file handle
- *
- * Return zero on success, non-zero otherwise.
- */
-int gcov_iter_write(struct gcov_iterator *iter, struct seq_file *seq)
-{
-	int rc = -EINVAL;
-
-	switch (iter->record) {
-	case RECORD_FILE_MAGIC:
-		rc = seq_write_gcov_u32(seq, GCOV_DATA_MAGIC);
-		break;
-	case RECORD_GCOV_VERSION:
-		rc = seq_write_gcov_u32(seq, iter->info->version);
-		break;
-	case RECORD_TIME_STAMP:
-		rc = seq_write_gcov_u32(seq, iter->info->stamp);
-		break;
-	case RECORD_FUNCTION_TAG:
-		rc = seq_write_gcov_u32(seq, GCOV_TAG_FUNCTION);
-		break;
-	case RECORD_FUNCTON_TAG_LEN:
-		rc = seq_write_gcov_u32(seq, 2);
-		break;
-	case RECORD_FUNCTION_IDENT:
-		rc = seq_write_gcov_u32(seq, get_func(iter)->ident);
-		break;
-	case RECORD_FUNCTION_CHECK:
-		rc = seq_write_gcov_u32(seq, get_func(iter)->checksum);
-		break;
-	case RECORD_COUNT_TAG:
-		rc = seq_write_gcov_u32(seq,
-			GCOV_TAG_FOR_COUNTER(get_type(iter)->ctr_type));
-		break;
-	case RECORD_COUNT_LEN:
-		rc = seq_write_gcov_u32(seq,
-				get_func(iter)->n_ctrs[iter->type] * 2);
-		break;
-	case RECORD_COUNT:
-		rc = seq_write_gcov_u64(seq,
-			iter->info->counts[iter->type].
-				values[iter->count + get_type(iter)->offset]);
-		break;
-	}
-	return rc;
-}

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] kcsan: Rework data_race() so that it can be used by READ_ONCE()
  2020-05-11 20:41 ` [PATCH v5 16/18] kcsan: Rework data_race() so that it can be used by READ_ONCE() Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Marco Elver, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     88f1be32068d4323aa31236452352d6019a03ccc
Gitweb:        https://git.kernel.org/tip/88f1be32068d4323aa31236452352d6019a03ccc
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:48 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:17 +02:00

kcsan: Rework data_race() so that it can be used by READ_ONCE()

Rework the data_race() macro so that it:

  - Accepts expressions which evaluate to a 'const' type
  - Attempts to discard volatile qualifiers from scalar types, avoiding
    pointless stack spills
  - Uses __kcsan_{disable,enable}_current(), allowing its use from code
    that is built independently from the kernel, such as the vDSO

This will allow for its use by {READ,WRITE}_ONCE() in a subsequent patch.
At the same time, fix up some weird whitespace issues.
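
The first two points can be illustrated with a reduced userspace sketch
(C11 plus GNU typeof). The macro name sketch_unqual_scalar_typeof() is
hypothetical and covers only a handful of integer types; it is meant to
show the idea of stripping qualifiers via _Generic, not to reproduce the
kernel's __unqual_scalar_typeof() definition:

```c
#include <assert.h>

/*
 * Reduced sketch: only 'int', 'long' and their unsigned variants are
 * handled; anything else keeps its original (possibly qualified) type.
 * The controlling expression of _Generic undergoes lvalue conversion,
 * so 'const volatile int' selects the plain 'int' association.
 */
#define sketch_unqual_scalar_typeof(x) __typeof__(			\
	_Generic((x),							\
		 int:		(int)0,					\
		 unsigned int:	(unsigned int)0,			\
		 long:		(long)0,				\
		 unsigned long:	(unsigned long)0,			\
		 default:	(x)))

static int copy_and_bump(void)
{
	const volatile int src = 7;
	/*
	 * 'copy' is a plain 'int'. Declared with __typeof__(src) it would
	 * be 'const volatile int', and the assignments below would be
	 * rejected by the compiler.
	 */
	sketch_unqual_scalar_typeof(src) copy;

	copy = src;
	copy += 1;
	assert(copy == 8);
	return copy;
}
```

Because the temporary is neither 'const' nor 'volatile', the compiler is
free to keep it in a register, which is the "pointless stack spills"
point in the list above.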

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200511204150.27858-17-will@kernel.org

---
 include/linux/compiler.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 548294e..cb2e3b3 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -314,14 +314,15 @@ unsigned long read_word_at_a_time(const void *addr)
  * This macro *does not* affect normal code generation, but is a hint
  * to tooling that data races here are to be ignored.
  */
-#define data_race(expr)                                                        \
-	({                                                                     \
-		typeof(({ expr; })) __val;                                     \
-		kcsan_disable_current();                                       \
-		__val = ({ expr; });                                           \
-		kcsan_enable_current();                                        \
-		__val;                                                         \
-	})
+#define data_race(expr)							\
+({									\
+	__kcsan_disable_current();					\
+	({								\
+		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
+		__kcsan_enable_current();				\
+		__v;							\
+	});								\
+})
 
 #endif /* __KERNEL__ */
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
  2020-05-11 20:41 ` [PATCH v5 14/18] arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Mark Rutland, Linus Torvalds, Arnd Bergmann, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     a9e777c275422371c86a3bdc8d81f4a21ec2a63c
Gitweb:        https://git.kernel.org/tip/a9e777c275422371c86a3bdc8d81f4a21ec2a63c
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:46 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:16 +02:00

arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros

Passing volatile-qualified pointers to the arm64 implementations of the
load-acquire/store-release macros results in a re-load from the stack
and a bunch of associated stack-protector churn due to the temporary
result variable inheriting the volatile semantics thanks to the use of
'typeof()'.

Define these temporary variables using '__unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20200511204150.27858-15-will@kernel.org

---
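Illustration only, not part of the commit: a userspace sketch of the
conditional-load pattern touched here, assuming GCC or Clang. The
unqual_sketch() helper is a hypothetical, cut-down stand-in for
__unqual_scalar_typeof() that only knows 'int' and 'long', and the loop
simply spins instead of calling __cmpwait_relaxed().

```c
#include <assert.h>

/* Cut-down stand-in for __unqual_scalar_typeof(): 'int' and 'long' only. */
#define unqual_sketch(x) __typeof__(					\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(x), int), (int)0, \
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(x), long), (long)0, \
		(x))))

/*
 * VAL is declared with the unqualified type, so it is an ordinary local
 * even when 'ptr' points to volatile. The final cast restores the type
 * the caller expects.
 */
#define cond_load_relaxed_sketch(ptr, cond_expr)			\
({									\
	__typeof__(ptr) PTR_ = (ptr);					\
	unqual_sketch(*(ptr)) VAL;					\
	for (;;) {							\
		VAL = *(const volatile __typeof__(*PTR_) *)PTR_;	\
		if (cond_expr)						\
			break;						\
	}								\
	(__typeof__(*(ptr)))VAL;					\
})

void demo_cond_load(void)
{
	volatile int ready = 1;

	/* The condition already holds, so the loop exits on the first pass. */
	assert(cond_load_relaxed_sketch(&ready, VAL != 0) == 1);
}
```
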
 arch/arm64/include/asm/barrier.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 7d9cc5e..fb4c275 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -76,8 +76,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long idx,
 #define __smp_store_release(p, v)					\
 do {									\
 	typeof(p) __p = (p);						\
-	union { typeof(*p) __val; char __c[1]; } __u =			\
-		{ .__val = (__force typeof(*p)) (v) };			\
+	union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u =	\
+		{ .__val = (__force __unqual_scalar_typeof(*p)) (v) };	\
 	compiletime_assert_atomic_type(*p);				\
 	kasan_check_write(__p, sizeof(*p));				\
 	switch (sizeof(*p)) {						\
@@ -110,7 +110,7 @@ do {									\
 
 #define __smp_load_acquire(p)						\
 ({									\
-	union { typeof(*p) __val; char __c[1]; } __u;			\
+	union { __unqual_scalar_typeof(*p) __val; char __c[1]; } __u;	\
 	typeof(p) __p = (p);						\
 	compiletime_assert_atomic_type(*p);				\
 	kasan_check_read(__p, sizeof(*p));				\
@@ -136,33 +136,33 @@ do {									\
 			: "Q" (*__p) : "memory");			\
 		break;							\
 	}								\
-	__u.__val;							\
+	(typeof(*p))__u.__val;						\
 })
 
 #define smp_cond_load_relaxed(ptr, cond_expr)				\
 ({									\
 	typeof(ptr) __PTR = (ptr);					\
-	typeof(*ptr) VAL;						\
+	__unqual_scalar_typeof(*ptr) VAL;				\
 	for (;;) {							\
 		VAL = READ_ONCE(*__PTR);				\
 		if (cond_expr)						\
 			break;						\
 		__cmpwait_relaxed(__PTR, VAL);				\
 	}								\
-	VAL;								\
+	(typeof(*ptr))VAL;						\
 })
 
 #define smp_cond_load_acquire(ptr, cond_expr)				\
 ({									\
 	typeof(ptr) __PTR = (ptr);					\
-	typeof(*ptr) VAL;						\
+	__unqual_scalar_typeof(*ptr) VAL;				\
 	for (;;) {							\
 		VAL = smp_load_acquire(__PTR);				\
 		if (cond_expr)						\
 			break;						\
 		__cmpwait_relaxed(__PTR, VAL);				\
 	}								\
-	VAL;								\
+	(typeof(*ptr))VAL;						\
 })
 
 #include <asm-generic/barrier.h>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] READ_ONCE: Drop pointer qualifiers when reading from scalar types
  2020-05-11 20:41 ` [PATCH v5 12/18] READ_ONCE: Drop pointer qualifiers when reading from scalar types Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Michael Ellerman, Linus Torvalds, Will Deacon, Thomas Gleixner,
	Peter Zijlstra (Intel),
	Arnd Bergmann, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     7b364f0949ae2dd205d5e9afa4b82ee17030d928
Gitweb:        https://git.kernel.org/tip/7b364f0949ae2dd205d5e9afa4b82ee17030d928
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:44 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:14 +02:00

READ_ONCE: Drop pointer qualifiers when reading from scalar types

Passing a volatile-qualified pointer to READ_ONCE() is an absolute
trainwreck for code generation: the use of 'typeof()' to define a
temporary variable inside the macro means that the final evaluation in
macro scope ends up forcing a read back from the stack. When stack
protector is enabled (the default for arm64, at least), this causes
the compiler to vomit up all sorts of junk.

Unfortunately, dropping pointer qualifiers inside the macro poses quite
a challenge, especially since the pointed-to type is permitted to be an
aggregate, and this is relied upon by mm/ code accessing things like
'pmd_t'. Based on numerous hacks and discussions on the mailing list,
this is the best I've managed to come up with.

Introduce '__unqual_scalar_typeof()' which takes an expression and, if
the expression is an optionally qualified 8, 16, 32 or 64-bit scalar
type, evaluates to the unqualified type. Other input types, including
aggregates, remain unchanged. Hopefully READ_ONCE() on volatile aggregate
pointers isn't something we do on a fast-path.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20200511204150.27858-13-will@kernel.org

---
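Illustration only, not part of the commit: the helper macros from the diff
below, transplanted into userspace so the effect can be checked with static
assertions. This assumes GCC or Clang; the names without leading
underscores and the 'struct pair' type are made up for the sketch.

```c
#include <assert.h>

#define same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

#define pick_scalar_type(x, type, otherwise) \
	__builtin_choose_expr(same_type(x, type), (type)0, otherwise)

/* 'char' is distinct from both 'signed char' and 'unsigned char'. */
#define pick_integer_type(x, type, otherwise)				\
	pick_scalar_type(x, type,					\
		pick_scalar_type(x, unsigned type,			\
			pick_scalar_type(x, signed type, otherwise)))

#define unqual_scalar_typeof(x) __typeof__(				\
	pick_integer_type(x, char,					\
		pick_integer_type(x, short,				\
			pick_integer_type(x, int,			\
				pick_integer_type(x, long,		\
					pick_integer_type(x, long long, x))))))

struct pair { int a, b; };

void demo_unqual(void)
{
	volatile int vi = 7;
	const unsigned long cul = 9;
	const struct pair cp = { 1, 2 };

	unqual_scalar_typeof(vi) a = vi;	/* plain 'int' */
	unqual_scalar_typeof(cul) b = cul;	/* plain 'unsigned long' */
	unqual_scalar_typeof(cp) c = cp;	/* still 'const struct pair' */

	b = 10;		/* compiles only because 'const' was dropped */

	/* '&a' is 'int *', not 'volatile int *': the qualifier is gone. */
	_Static_assert(same_type(&a, int *), "volatile not dropped");
	/* Aggregates are left alone, so '&c' still points to const. */
	_Static_assert(same_type(&c, const struct pair *), "aggregate changed");

	assert(a == 7 && b == 10 && c.a == 1 && c.b == 2);
}
```

The '(type)0' cast is what strips the qualifiers: a cast yields an rvalue,
and typeof() of that rvalue is the unqualified type.
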
 include/linux/compiler.h       |  6 +++---
 include/linux/compiler_types.h | 26 ++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 733605f..548294e 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -204,7 +204,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  * atomicity or dependency ordering guarantees. Note that this may result
  * in tears!
  */
-#define __READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))
+#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
 
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
@@ -212,10 +212,10 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
 	__kcsan_disable_current();					\
 	({								\
-		typeof(x) __x = __READ_ONCE(*__xp);			\
+		__unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);	\
 		__kcsan_enable_current();				\
 		smp_read_barrier_depends();				\
-		__x;							\
+		(typeof(x))__x;						\
 	});								\
 })
 
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index e970f97..6ed0612 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -210,6 +210,32 @@ struct ftrace_likely_data {
 /* Are two types/vars the same type (ignoring qualifiers)? */
 #define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
 
+/*
+ * __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
+ *			       non-scalar types unchanged.
+ *
+ * We build this out of a couple of helper macros in a vain attempt to
+ * help you keep your lunch down while reading it.
+ */
+#define __pick_scalar_type(x, type, otherwise)					\
+	__builtin_choose_expr(__same_type(x, type), (type)0, otherwise)
+
+/*
+ * 'char' is not type-compatible with either 'signed char' or 'unsigned char',
+ * so we include the naked type here as well as the signed/unsigned variants.
+ */
+#define __pick_integer_type(x, type, otherwise)					\
+	__pick_scalar_type(x, type,						\
+		__pick_scalar_type(x, unsigned type,				\
+			__pick_scalar_type(x, signed type, otherwise)))
+
+#define __unqual_scalar_typeof(x) typeof(					\
+	__pick_integer_type(x, char,						\
+		__pick_integer_type(x, short,					\
+			__pick_integer_type(x, int,				\
+				__pick_integer_type(x, long,			\
+					__pick_integer_type(x, long long, x))))))
+
 /* Is this type a native word size -- useful for atomic operations */
 #define __native_word(t) \
 	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
  2020-05-11 20:41 ` [PATCH v5 13/18] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Linus Torvalds, Arnd Bergmann, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     0229d80867ef27b3e41585a784d5f78695a58f95
Gitweb:        https://git.kernel.org/tip/0229d80867ef27b3e41585a784d5f78695a58f95
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:45 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:15 +02:00

locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros

Passing volatile-qualified pointers to the asm-generic implementations of
the load-acquire macros results in a re-load from the stack due to the
temporary result variable inheriting the volatile semantics thanks to the
use of 'typeof()'.

Define these temporary variables using '__unqual_scalar_typeof' to drop
the volatile qualifier in the case that they are scalar types.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20200511204150.27858-14-will@kernel.org

---
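Illustration only, not part of the commit: a userspace sketch of the
load-acquire shape after this change, assuming GCC or Clang. The
unqual_sketch() helper is a hypothetical, cut-down stand-in for
__unqual_scalar_typeof(), and compiler_barrier() is a compiler-only barrier,
so this does not provide hardware acquire ordering.

```c
#include <assert.h>

/* Cut-down stand-in for __unqual_scalar_typeof(): 'int' and 'long' only. */
#define unqual_sketch(x) __typeof__(					\
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(x), int), (int)0, \
	__builtin_choose_expr(						\
		__builtin_types_compatible_p(__typeof__(x), long), (long)0, \
		(x))))

/* Compiler-only barrier, in the spirit of the kernel's barrier(). */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/*
 * The temporary has the unqualified type, so reading it back at the end
 * is not itself a volatile access. The cast gives the caller the
 * original pointed-to type.
 */
#define load_acquire_sketch(p)						\
({									\
	unqual_sketch(*(p)) p1_ =					\
		*(const volatile __typeof__(*(p)) *)(p);		\
	compiler_barrier();						\
	(__typeof__(*(p)))p1_;						\
})

void demo_load_acquire(void)
{
	volatile int flag = 3;
	long plain = 11;

	assert(load_acquire_sketch(&flag) == 3);
	assert(load_acquire_sketch(&plain) == 11);
}
```
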
 include/asm-generic/barrier.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 85b28eb..2eacaf7 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -128,10 +128,10 @@ do {									\
 #ifndef __smp_load_acquire
 #define __smp_load_acquire(p)						\
 ({									\
-	typeof(*p) ___p1 = READ_ONCE(*p);				\
+	__unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p);		\
 	compiletime_assert_atomic_type(*p);				\
 	__smp_mb();							\
-	___p1;								\
+	(typeof(*p))___p1;						\
 })
 #endif
 
@@ -183,10 +183,10 @@ do {									\
 #ifndef smp_load_acquire
 #define smp_load_acquire(p)						\
 ({									\
-	typeof(*p) ___p1 = READ_ONCE(*p);				\
+	__unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p);		\
 	compiletime_assert_atomic_type(*p);				\
 	barrier();							\
-	___p1;								\
+	(typeof(*p))___p1;						\
 })
 #endif
 
@@ -229,14 +229,14 @@ do {									\
 #ifndef smp_cond_load_relaxed
 #define smp_cond_load_relaxed(ptr, cond_expr) ({		\
 	typeof(ptr) __PTR = (ptr);				\
-	typeof(*ptr) VAL;					\
+	__unqual_scalar_typeof(*ptr) VAL;			\
 	for (;;) {						\
 		VAL = READ_ONCE(*__PTR);			\
 		if (cond_expr)					\
 			break;					\
 		cpu_relax();					\
 	}							\
-	VAL;							\
+	(typeof(*ptr))VAL;					\
 })
 #endif
 
@@ -250,10 +250,10 @@ do {									\
  */
 #ifndef smp_cond_load_acquire
 #define smp_cond_load_acquire(ptr, cond_expr) ({		\
-	typeof(*ptr) _val;					\
+	__unqual_scalar_typeof(*ptr) _val;			\
 	_val = smp_cond_load_relaxed(ptr, cond_expr);		\
 	smp_acquire__after_ctrl_dep();				\
-	_val;							\
+	(typeof(*ptr))_val;					\
 })
 #endif
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
  2020-05-11 20:41 ` [PATCH v5 10/18] READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE() Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Linus Torvalds, Will Deacon, Thomas Gleixner,
	Peter Zijlstra (Intel),
	Mark Rutland, Michael Ellerman, Arnd Bergmann,
	Christian Borntraeger, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     bbfa112b46bdbbdfc2f5bfb9c2dcbef780ff6417
Gitweb:        https://git.kernel.org/tip/bbfa112b46bdbbdfc2f5bfb9c2dcbef780ff6417
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:42 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:13 +02:00

READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()

The implementations of {READ,WRITE}_ONCE() suffer from a significant
amount of indirection and complexity due to a historic GCC bug:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

which was originally worked around by 230fa253df63 ("kernel: Provide
READ_ONCE and ASSIGN_ONCE").

Since GCC 4.8 is fairly vintage at this point and we emit a warning if
we detect it during the build, return {READ,WRITE}_ONCE() to their former
glory with an implementation that is easier to understand and, crucially,
more amenable to optimisation. A side effect of this simplification is
that WRITE_ONCE() no longer returns a value, but nobody seems to be
relying on that and the new behaviour is aligned with smp_store_release().

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lkml.kernel.org/r/20200511204150.27858-11-will@kernel.org

---
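Illustration only, not part of the commit: the core of the simplified
macros is a volatile-qualified cast, sketched here for userspace with GCC
or Clang. The KCSAN hooks and smp_read_barrier_depends() are deliberately
left out, and the *_sketch names are made up.

```c
#include <assert.h>

/* Load through a volatile lvalue so the compiler emits exactly one read. */
#define read_once_sketch(x)	(*(const volatile __typeof__(x) *)&(x))

/* Statement form: like the new WRITE_ONCE(), it yields no value. */
#define write_once_sketch(x, val)					\
do {									\
	*(volatile __typeof__(x) *)&(x) = (val);			\
} while (0)

void demo_once(void)
{
	int shared = 0;

	write_once_sketch(shared, 21);

	/* Two invocations mean two separate loads, not one merged load. */
	assert(read_once_sketch(shared) + read_once_sketch(shared) == 42);
}
```

Because write_once_sketch() expands to a plain assignment, the usual type
checking applies to 'val', unlike the old memcpy()-based fallback.
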
 include/linux/compiler.h | 141 ++++++++++++++------------------------
 1 file changed, 55 insertions(+), 86 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 9bd0f76..1b4e64d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -177,28 +177,57 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 # define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __LINE__)
 #endif
 
-#include <uapi/linux/types.h>
+/*
+ * Prevent the compiler from merging or refetching reads or writes. The
+ * compiler is also forbidden from reordering successive instances of
+ * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
+ * particular ordering. One way to make the compiler aware of ordering is to
+ * put the two invocations of READ_ONCE or WRITE_ONCE in different C
+ * statements.
+ *
+ * These two macros will also work on aggregate data types like structs or
+ * unions.
+ *
+ * Their two major use cases are: (1) Mediating communication between
+ * process-level code and irq/NMI handlers, all running on the same CPU,
+ * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
+ * mutilate accesses that either do not require ordering or that interact
+ * with an explicit memory barrier or atomic instruction that provides the
+ * required ordering.
+ */
+#include <asm/barrier.h>
+#include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 
-#define __READ_ONCE_SIZE						\
+#define __READ_ONCE(x)	(*(volatile typeof(x) *)&(x))
+
+#define READ_ONCE(x)							\
 ({									\
-	switch (size) {							\
-	case 1: *(__u8 *)res = *(volatile __u8 *)p; break;		\
-	case 2: *(__u16 *)res = *(volatile __u16 *)p; break;		\
-	case 4: *(__u32 *)res = *(volatile __u32 *)p; break;		\
-	case 8: *(__u64 *)res = *(volatile __u64 *)p; break;		\
-	default:							\
-		barrier();						\
-		__builtin_memcpy((void *)res, (const void *)p, size);	\
-		barrier();						\
-	}								\
+	typeof(x) *__xp = &(x);						\
+	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
+	__kcsan_disable_current();					\
+	({								\
+		typeof(x) __x = __READ_ONCE(*__xp);			\
+		__kcsan_enable_current();				\
+		smp_read_barrier_depends();				\
+		__x;							\
+	});								\
 })
 
+#define WRITE_ONCE(x, val)						\
+do {									\
+	typeof(x) *__xp = &(x);						\
+	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
+	__kcsan_disable_current();					\
+	*(volatile typeof(x) *)__xp = (val);				\
+	__kcsan_enable_current();					\
+} while (0)
+
 #ifdef CONFIG_KASAN
 /*
- * We can't declare function 'inline' because __no_sanitize_address confilcts
+ * We can't declare function 'inline' because __no_sanitize_address conflicts
  * with inlining. Attempt to inline it may cause a build failure.
- * 	https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
+ *     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67368
  * '__maybe_unused' allows us to avoid defined-but-not-used warnings.
  */
 # define __no_kasan_or_inline __no_sanitize_address notrace __maybe_unused
@@ -225,78 +254,26 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #define __no_sanitize_or_inline __always_inline
 #endif
 
-static __no_kcsan_or_inline
-void __read_once_size(const volatile void *p, void *res, int size)
-{
-	kcsan_check_atomic_read(p, size);
-	__READ_ONCE_SIZE;
-}
-
 static __no_sanitize_or_inline
-void __read_once_size_nocheck(const volatile void *p, void *res, int size)
+unsigned long __read_once_word_nocheck(const void *addr)
 {
-	__READ_ONCE_SIZE;
-}
-
-static __no_kcsan_or_inline
-void __write_once_size(volatile void *p, void *res, int size)
-{
-	kcsan_check_atomic_write(p, size);
-
-	switch (size) {
-	case 1: *(volatile __u8 *)p = *(__u8 *)res; break;
-	case 2: *(volatile __u16 *)p = *(__u16 *)res; break;
-	case 4: *(volatile __u32 *)p = *(__u32 *)res; break;
-	case 8: *(volatile __u64 *)p = *(__u64 *)res; break;
-	default:
-		barrier();
-		__builtin_memcpy((void *)p, (const void *)res, size);
-		barrier();
-	}
+	return __READ_ONCE(*(unsigned long *)addr);
 }
 
 /*
- * Prevent the compiler from merging or refetching reads or writes. The
- * compiler is also forbidden from reordering successive instances of
- * READ_ONCE and WRITE_ONCE, but only when the compiler is aware of some
- * particular ordering. One way to make the compiler aware of ordering is to
- * put the two invocations of READ_ONCE or WRITE_ONCE in different C
- * statements.
- *
- * These two macros will also work on aggregate data types like structs or
- * unions. If the size of the accessed data type exceeds the word size of
- * the machine (e.g., 32 bits or 64 bits) READ_ONCE() and WRITE_ONCE() will
- * fall back to memcpy(). There's at least two memcpy()s: one for the
- * __builtin_memcpy() and then one for the macro doing the copy of variable
- * - '__u' allocated on the stack.
- *
- * Their two major use cases are: (1) Mediating communication between
- * process-level code and irq/NMI handlers, all running on the same CPU,
- * and (2) Ensuring that the compiler does not fold, spindle, or otherwise
- * mutilate accesses that either do not require ordering or that interact
- * with an explicit memory barrier or atomic instruction that provides the
- * required ordering.
+ * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need to load a
+ * word from memory atomically but without telling KASAN/KCSAN. This is
+ * usually used by unwinding code when walking the stack of a running process.
  */
-#include <asm/barrier.h>
-#include <linux/kasan-checks.h>
-
-#define __READ_ONCE(x, check)						\
+#define READ_ONCE_NOCHECK(x)						\
 ({									\
-	union { typeof(x) __val; char __c[1]; } __u;			\
-	if (check)							\
-		__read_once_size(&(x), __u.__c, sizeof(x));		\
-	else								\
-		__read_once_size_nocheck(&(x), __u.__c, sizeof(x));	\
-	smp_read_barrier_depends(); /* Enforce dependency ordering from x */ \
-	__u.__val;							\
+	unsigned long __x;						\
+	compiletime_assert(sizeof(x) == sizeof(__x),			\
+		"Unsupported access size for READ_ONCE_NOCHECK().");	\
+	__x = __read_once_word_nocheck(&(x));				\
+	smp_read_barrier_depends();					\
+	__x;								\
 })
-#define READ_ONCE(x) __READ_ONCE(x, 1)
-
-/*
- * Use READ_ONCE_NOCHECK() instead of READ_ONCE() if you need
- * to hide memory access from KASAN.
- */
-#define READ_ONCE_NOCHECK(x) __READ_ONCE(x, 0)
 
 static __no_kasan_or_inline
 unsigned long read_word_at_a_time(const void *addr)
@@ -305,14 +282,6 @@ unsigned long read_word_at_a_time(const void *addr)
 	return *(unsigned long *)addr;
 }
 
-#define WRITE_ONCE(x, val) \
-({							\
-	union { typeof(x) __val; char __c[1]; } __u =	\
-		{ .__val = (__force typeof(x)) (val) }; \
-	__write_once_size(&(x), __u.__c, sizeof(x));	\
-	__u.__val;					\
-})
-
 /**
  * data_race - mark an expression as containing intentional data races
  *

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
  2020-05-11 20:41 ` [PATCH v5 11/18] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Linus Torvalds, Will Deacon, Thomas Gleixner,
	Peter Zijlstra (Intel),
	Michael Ellerman, Arnd Bergmann, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     2ab3a0a02905d9994746dc4692c010d47b2beb74
Gitweb:        https://git.kernel.org/tip/2ab3a0a02905d9994746dc4692c010d47b2beb74
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:43 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:14 +02:00

READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses

{READ,WRITE}_ONCE() cannot guarantee atomicity for arbitrary data sizes.
This can be surprising to callers that might incorrectly be expecting
atomicity for accesses to aggregate structures, although there are other
callers where tearing is actually permissible (e.g. if they are using
something akin to sequence locking to protect the access).

Linus sayeth:

  | We could also look at being stricter for the normal READ/WRITE_ONCE(),
  | and require that they are
  |
  | (a) regular integer types
  |
  | (b) fit in an atomic word
  |
  | We actually did (b) for a while, until we noticed that we do it on
  | loff_t's etc and relaxed the rules. But maybe we could have a
  | "non-atomic" version of READ/WRITE_ONCE() that is used for the
  | questionable cases?

The slight snag is that we also have to support 64-bit accesses on 32-bit
architectures, as these appear to be widespread and tend to work out ok
if either the architecture supports atomic 64-bit accesses (armv7) or if
the variable being accessed represents a virtual address and therefore
only requires 32-bit atomicity in practice.

Take a step in that direction by introducing a variant of
'compiletime_assert_atomic_type()' and use it to check the pointer
argument to {READ,WRITE}_ONCE(). Expose __{READ,WRITE}_ONCE() variants
which are allowed to tear and convert the one broken caller over to the
new macros.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20200511204150.27858-12-will@kernel.org

---
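Illustration only, not part of the commit: a userspace sketch of the size
check, assuming a C11 compiler with GNU extensions. _Static_assert stands in
for compiletime_assert(), and the *_sketch names and 'struct runstate' are
made up for the example.

```c
#include <assert.h>

#define native_word_sketch(t)						\
	(sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) ||	\
	 sizeof(t) == sizeof(int)  || sizeof(t) == sizeof(long))

/* Also permits 'long long', i.e. 64-bit accesses on 32-bit targets. */
#define assert_rwonce_type_sketch(t)					\
	_Static_assert(native_word_sketch(t) ||				\
		       sizeof(t) == sizeof(long long),			\
		       "Unsupported access size for read_once_sketch().")

/* May tear: no size check, so it can be used on aggregates. */
#define read_once_tearing_sketch(x) (*(const volatile __typeof__(x) *)&(x))

/* Size-checked: builds only for char, short, int, long or long long sizes. */
#define read_once_sketch(x)						\
({									\
	assert_rwonce_type_sketch(x);					\
	read_once_tearing_sketch(x);					\
})

struct runstate { long long state; long long time[4]; };

void demo_rwonce(void)
{
	long long stamp = 1LL << 40;
	struct runstate rs = { 2, { 1, 2, 3, 4 } };
	struct runstate copy;

	assert(read_once_sketch(stamp) == (1LL << 40));

	/* read_once_sketch(rs) would fail the static assertion at build time. */
	copy = read_once_tearing_sketch(rs);
	assert(copy.state == 2 && copy.time[3] == 4);
}
```

This mirrors the xen/time.c conversion below: a whole-structure read is
moved to the variant that is explicitly allowed to tear.
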
 drivers/xen/time.c       |  2 +-
 include/linux/compiler.h | 40 +++++++++++++++++++++++++++++++++++----
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/drivers/xen/time.c b/drivers/xen/time.c
index 0968859..108edbc 100644
--- a/drivers/xen/time.c
+++ b/drivers/xen/time.c
@@ -64,7 +64,7 @@ static void xen_get_runstate_snapshot_cpu_delta(
 	do {
 		state_time = get64(&state->state_entry_time);
 		rmb();	/* Hypervisor might update data. */
-		*res = READ_ONCE(*state);
+		*res = __READ_ONCE(*state);
 		rmb();	/* Hypervisor might update data. */
 	} while (get64(&state->state_entry_time) != state_time ||
 		 (state_time & XEN_RUNSTATE_UPDATE));
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 1b4e64d..733605f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -199,9 +199,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 #include <linux/kasan-checks.h>
 #include <linux/kcsan-checks.h>
 
-#define __READ_ONCE(x)	(*(volatile typeof(x) *)&(x))
+/*
+ * Use __READ_ONCE() instead of READ_ONCE() if you do not require any
+ * atomicity or dependency ordering guarantees. Note that this may result
+ * in tears!
+ */
+#define __READ_ONCE(x)	(*(const volatile typeof(x) *)&(x))
 
-#define READ_ONCE(x)							\
+#define __READ_ONCE_SCALAR(x)						\
 ({									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
@@ -214,15 +219,32 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 	});								\
 })
 
-#define WRITE_ONCE(x, val)						\
+#define READ_ONCE(x)							\
+({									\
+	compiletime_assert_rwonce_type(x);				\
+	__READ_ONCE_SCALAR(x);						\
+})
+
+#define __WRITE_ONCE(x, val)						\
+do {									\
+	*(volatile typeof(x) *)&(x) = (val);				\
+} while (0)
+
+#define __WRITE_ONCE_SCALAR(x, val)					\
 do {									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
 	__kcsan_disable_current();					\
-	*(volatile typeof(x) *)__xp = (val);				\
+	__WRITE_ONCE(*__xp, val);					\
 	__kcsan_enable_current();					\
 } while (0)
 
+#define WRITE_ONCE(x, val)						\
+do {									\
+	compiletime_assert_rwonce_type(x);				\
+	__WRITE_ONCE_SCALAR(x, val);					\
+} while (0)
+
 #ifdef CONFIG_KASAN
 /*
  * We can't declare function 'inline' because __no_sanitize_address conflicts
@@ -365,6 +387,16 @@ static inline void *offset_to_ptr(const int *off)
 	compiletime_assert(__native_word(t),				\
 		"Need native word sized stores/loads for atomicity.")
 
+/*
+ * Yes, this permits 64-bit accesses on 32-bit architectures. These will
+ * actually be atomic in many cases (namely x86), but for others we rely on
+ * the access being split into 2x32-bit accesses for a 32-bit quantity (e.g.
+ * a virtual address) and a strong prevailing wind.
+ */
+#define compiletime_assert_rwonce_type(t)					\
+	compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long),	\
+		"Unsupported access size for {READ,WRITE}_ONCE().")
+
 /* &a[0] degrades to a pointer: a different type from an array */
 #define __must_be_array(a)	BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] net: tls: Avoid assigning 'const' pointer to non-const pointer
  2020-05-11 20:41 ` [PATCH v5 07/18] net: tls: " Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Boris Pismenny, Aviad Yehezkel, John Fastabend, Daniel Borkmann,
	x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     268c779f206f105c12fa82499fbbf960b256750f
Gitweb:        https://git.kernel.org/tip/268c779f206f105c12fa82499fbbf960b256750f
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:39 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:12 +02:00

net: tls: Avoid assigning 'const' pointer to non-const pointer

tls_build_proto() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

  | net/tls/tls_main.c: In function ‘tls_build_proto’:
  | ./include/linux/compiler.h:229:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  | net/tls/tls_main.c:640:4: note: in expansion of macro ‘smp_store_release’
  |   640 |    smp_store_release(&saved_tcpv6_prot, prot);
  |       |    ^~~~~~~~~~~~~~~~~

Drop the const qualifier from the local 'prot' variable, as it isn't
needed.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Boris Pismenny <borisp@mellanox.com>
Cc: Aviad Yehezkel <aviadye@mellanox.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lkml.kernel.org/r/20200511204150.27858-8-will@kernel.org

---
 net/tls/tls_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 156efce..b33e11c 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -629,7 +629,7 @@ struct tls_context *tls_ctx_create(struct sock *sk)
 static void tls_build_proto(struct sock *sk)
 {
 	int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
-	const struct proto *prot = READ_ONCE(sk->sk_prot);
+	struct proto *prot = READ_ONCE(sk->sk_prot);
 
 	/* Build IPv6 TLS whenever the address of tcpv6 _prot changes */
 	if (ip_ver == TLSV6 &&

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] arm64: csum: Disable KASAN for do_csum()
  2020-05-11 20:41 ` [PATCH v5 09/18] arm64: csum: Disable KASAN for do_csum() Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Robin Murphy, Mark Rutland, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     5a7d7f5d57f61d650619b89c1b7d4adcf4fdecfe
Gitweb:        https://git.kernel.org/tip/5a7d7f5d57f61d650619b89c1b7d4adcf4fdecfe
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:41 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:13 +02:00

arm64: csum: Disable KASAN for do_csum()

do_csum() over-reads the source buffer and therefore abuses
READ_ONCE_NOCHECK() on a 128-bit type to avoid tripping up KASAN. In
preparation for READ_ONCE_NOCHECK() requiring an atomic access, and
therefore failing to build when fed a '__uint128_t', annotate do_csum()
explicitly with '__no_sanitize_address' and fall back to normal loads.
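
A minimal user-space sketch of the annotation, assuming GCC or Clang:
NO_SANITIZE_ADDRESS_SKETCH and sum_words_sketch() are made-up names, and the
attribute spelling is the compiler's own rather than the kernel macro. The
function uses plain loads and stays within bounds here; it only shows where
the attribute goes, not the over-read itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __no_sanitize_address macro. */
#define NO_SANITIZE_ADDRESS_SKETCH __attribute__((no_sanitize_address))

/*
 * Sum 'n' 64-bit words with ordinary loads. With the attribute in
 * place, AddressSanitizer does not instrument the accesses made
 * inside this function.
 */
static uint64_t NO_SANITIZE_ADDRESS_SKETCH
sum_words_sketch(const uint64_t *ptr, size_t n)
{
	uint64_t sum = 0;

	while (n--)
		sum += *ptr++;

	return sum;
}
```

The attribute covers the whole function, so no per-access wrapper is needed
and wide types such as '__uint128_t' can be loaded normally.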

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/20200511204150.27858-10-will@kernel.org

---
 arch/arm64/lib/csum.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/lib/csum.c b/arch/arm64/lib/csum.c
index 60eccae..78b87a6 100644
--- a/arch/arm64/lib/csum.c
+++ b/arch/arm64/lib/csum.c
@@ -14,7 +14,11 @@ static u64 accumulate(u64 sum, u64 data)
 	return tmp + (tmp >> 64);
 }
 
-unsigned int do_csum(const unsigned char *buff, int len)
+/*
+ * We over-read the buffer and this makes KASAN unhappy. Instead, disable
+ * instrumentation and call kasan explicitly.
+ */
+unsigned int __no_sanitize_address do_csum(const unsigned char *buff, int len)
 {
 	unsigned int offset, shift, sum;
 	const u64 *ptr;
@@ -42,7 +46,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
 	 * odd/even alignment, and means we can ignore it until the very end.
 	 */
 	shift = offset * 8;
-	data = READ_ONCE_NOCHECK(*ptr++);
+	data = *ptr++;
 #ifdef __LITTLE_ENDIAN
 	data = (data >> shift) << shift;
 #else
@@ -58,10 +62,10 @@ unsigned int do_csum(const unsigned char *buff, int len)
 	while (unlikely(len > 64)) {
 		__uint128_t tmp1, tmp2, tmp3, tmp4;
 
-		tmp1 = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
-		tmp2 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 2));
-		tmp3 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 4));
-		tmp4 = READ_ONCE_NOCHECK(*(__uint128_t *)(ptr + 6));
+		tmp1 = *(__uint128_t *)ptr;
+		tmp2 = *(__uint128_t *)(ptr + 2);
+		tmp3 = *(__uint128_t *)(ptr + 4);
+		tmp4 = *(__uint128_t *)(ptr + 6);
 
 		len -= 64;
 		ptr += 8;
@@ -85,7 +89,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
 		__uint128_t tmp;
 
 		sum64 = accumulate(sum64, data);
-		tmp = READ_ONCE_NOCHECK(*(__uint128_t *)ptr);
+		tmp = *(__uint128_t *)ptr;
 
 		len -= 16;
 		ptr += 2;
@@ -100,7 +104,7 @@ unsigned int do_csum(const unsigned char *buff, int len)
 	}
 	if (len > 0) {
 		sum64 = accumulate(sum64, data);
-		data = READ_ONCE_NOCHECK(*ptr);
+		data = *ptr;
 		len -= 8;
 	}
 	/*

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] fault_inject: Don't rely on "return value" from WRITE_ONCE()
  2020-05-11 20:41 ` [PATCH v5 08/18] fault_inject: Don't rely on "return value" from WRITE_ONCE() Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	Akinobu Mita, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     9a7cb2d8d6b959fc11a34668b1523f745ae5f714
Gitweb:        https://git.kernel.org/tip/9a7cb2d8d6b959fc11a34668b1523f745ae5f714
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:40 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:12 +02:00

fault_inject: Don't rely on "return value" from WRITE_ONCE()

It's a bit weird that WRITE_ONCE() evaluates to the value it stores and
it's also different to smp_store_release(), which can't be used this
way.

In preparation for preventing this in WRITE_ONCE(), change the fault
injection code to use a local variable instead.
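
As a rough user-space sketch of the resulting pattern (not the kernel's
definition): WRITE_ONCE_SKETCH() and consume_fail_nth_sketch() are
hypothetical names, and '__typeof__' is the GNU C extension, so this assumes
GCC or Clang.

```c
#include <stdbool.h>

/*
 * Hypothetical statement-only store macro: the do/while wrapper means
 * it cannot appear inside an 'if' condition, unlike an expression.
 */
#define WRITE_ONCE_SKETCH(x, val)				\
	do {							\
		*(volatile __typeof__(x) *)&(x) = (val);	\
	} while (0)

/*
 * Decrement the counter through a local variable and report whether
 * it has just reached zero, mirroring the shape of the fixed code.
 */
static bool consume_fail_nth_sketch(unsigned int *counter)
{
	unsigned int fail_nth = *(volatile unsigned int *)counter;

	if (!fail_nth)
		return false;

	fail_nth--;
	WRITE_ONCE_SKETCH(*counter, fail_nth);
	return fail_nth == 0;
}
```

Testing the local copy instead of the macro's result keeps the store and the
comparison separate, which is what lets the macro become a plain statement.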

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Link: https://lkml.kernel.org/r/20200511204150.27858-9-will@kernel.org

---
 lib/fault-inject.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/fault-inject.c b/lib/fault-inject.c
index 8186ca8..ce12621 100644
--- a/lib/fault-inject.c
+++ b/lib/fault-inject.c
@@ -106,7 +106,9 @@ bool should_fail(struct fault_attr *attr, ssize_t size)
 		unsigned int fail_nth = READ_ONCE(current->fail_nth);
 
 		if (fail_nth) {
-			if (!WRITE_ONCE(current->fail_nth, fail_nth - 1))
+			fail_nth--;
+			WRITE_ONCE(current->fail_nth, fail_nth);
+			if (!fail_nth)
 				goto fail;
 
 			return false;

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] linux/compiler.h: Remove redundant '#else'
  2020-05-11 20:41 ` [PATCH v5 18/18] linux/compiler.h: Remove redundant '#else' Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     8367aadcd83d2570fd4ce4af40ae7aec7c2bfcb7
Gitweb:        https://git.kernel.org/tip/8367aadcd83d2570fd4ce4af40ae7aec7c2bfcb7
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:50 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:11 +02:00

linux/compiler.h: Remove redundant '#else'

The '#else' clause after checking '#ifdef __KERNEL__' is empty in
linux/compiler.h, so remove it.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200511204150.27858-19-will@kernel.org

---
 include/linux/compiler.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index cce2c92..9bd0f76 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -331,7 +331,6 @@ unsigned long read_word_at_a_time(const void *addr)
 		kcsan_enable_current();                                        \
 		__val;                                                         \
 	})
-#else
 
 #endif /* __KERNEL__ */
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] netfilter: Avoid assigning 'const' pointer to non-const pointer
  2020-05-11 20:41 ` [PATCH v5 06/18] netfilter: Avoid assigning 'const' pointer to non-const pointer Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Nick Desaulniers,
	Peter Zijlstra (Intel),
	Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
	David S. Miller, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     f64554152014597a40403ea1a291c80785a2dfe9
Gitweb:        https://git.kernel.org/tip/f64554152014597a40403ea1a291c80785a2dfe9
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:38 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:11 +02:00

netfilter: Avoid assigning 'const' pointer to non-const pointer

nf_remove_net_hook() uses WRITE_ONCE() to assign a 'const' pointer to a
'non-const' pointer. Cleanups to the implementation of WRITE_ONCE() mean
that this will give rise to a compiler warning, just like a plain old
assignment would do:

  | In file included from ./include/linux/export.h:43,
  |                  from ./include/linux/linkage.h:7,
  |                  from ./include/linux/kernel.h:8,
  |                  from net/netfilter/core.c:9:
  | net/netfilter/core.c: In function ‘nf_remove_net_hook’:
  | ./include/linux/compiler.h:216:30: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  |   *(volatile typeof(x) *)&(x) = (val);  \
  |                               ^
  | net/netfilter/core.c:379:3: note: in expansion of macro ‘WRITE_ONCE’
  |    WRITE_ONCE(orig_ops[i], &dummy_ops);
  |    ^~~~~~~~~~

Follow the pattern used elsewhere in this file and add a cast to 'void *'
to squash the warning.
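
For illustration only, a stand-alone sketch of the same situation: 'struct
ops_sketch', 'dummy_sketch', WRITE_ONCE_SKETCH() and park_slot_sketch() are
hypothetical names, and '__typeof__' assumes GCC or Clang.

```c
/* Hypothetical stand-in for the hook ops structure. */
struct ops_sketch {
	int priority;
};

static const struct ops_sketch dummy_sketch = { .priority = -1 };

/* Hypothetical store macro that behaves like a plain assignment. */
#define WRITE_ONCE_SKETCH(x, val)				\
	do {							\
		*(volatile __typeof__(x) *)&(x) = (val);	\
	} while (0)

static void park_slot_sketch(struct ops_sketch **slot)
{
	/*
	 * Storing '&dummy_sketch' directly would discard 'const' and
	 * trigger -Wdiscarded-qualifiers. The 'void *' cast drops the
	 * qualifier explicitly; nothing may write through the result.
	 */
	WRITE_ONCE_SKETCH(*slot, (void *)&dummy_sketch);
}
```

The cast only silences the diagnostic. The pointed-to object is still
read-only, so the slot must be treated as a placeholder.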

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20200511204150.27858-7-will@kernel.org

---
 net/netfilter/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 78f046e..3ac7c8c 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -376,7 +376,7 @@ static bool nf_remove_net_hook(struct nf_hook_entries *old,
 		if (orig_ops[i] != unreg)
 			continue;
 		WRITE_ONCE(old->hooks[i].hook, accept_all);
-		WRITE_ONCE(orig_ops[i], &dummy_ops);
+		WRITE_ONCE(orig_ops[i], (void *)&dummy_ops);
 		return true;
 	}
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-11 20:41 ` [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  2020-05-17  0:00   ` [PATCH v5 04/18] " Guenter Roeck
  1 sibling, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	David S. Miller, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     2443600dc98fdc91661b2e24184f279d1198f8cc
Gitweb:        https://git.kernel.org/tip/2443600dc98fdc91661b2e24184f279d1198f8cc
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:36 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:09 +02:00

sparc32: mm: Reduce allocation size for PMD and PTE tables

Now that the page table allocator can free page table allocations
smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
to avoid needlessly wasting memory.
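
A quick arithmetic sketch of the saving, assuming 4k pages, 64-entry tables
and 4-byte entries as described elsewhere in this series. The macro and
function names below are made up for the example.

```c
#define PAGE_SIZE_SKETCH	4096u	/* assumed: 4k pages */
#define ENTRY_BYTES_SKETCH	4u	/* assumed: one 32-bit word per entry */

/* Bytes needed for one table holding 'entries' descriptors. */
static unsigned int table_bytes_sketch(unsigned int entries)
{
	return entries * ENTRY_BYTES_SKETCH;
}

/* How many such tables fit into a single page. */
static unsigned int tables_per_page_sketch(unsigned int entries)
{
	return PAGE_SIZE_SKETCH / table_bytes_sketch(entries);
}
```

Under these assumptions a 64-entry table takes 256 bytes, so a page-sized
allocation per table leaves most of the page unused.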

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20200511204150.27858-5-will@kernel.org

---
 arch/sparc/include/asm/pgtsrmmu.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h
index 58ea8e8..7708d01 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -17,8 +17,8 @@
 /* Number of contexts is implementation-dependent; 64k is the most we support */
 #define SRMMU_MAX_CONTEXTS	65536
 
-#define SRMMU_PTE_TABLE_SIZE		(PAGE_SIZE)
-#define SRMMU_PMD_TABLE_SIZE		(PAGE_SIZE)
+#define SRMMU_PTE_TABLE_SIZE		(PTRS_PER_PTE*4)
+#define SRMMU_PMD_TABLE_SIZE		(PTRS_PER_PMD*4)
 #define SRMMU_PGD_TABLE_SIZE		(PTRS_PER_PGD*4)
 
 /* Definition of the values in the ET field of PTD's and PTE's */

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
  2020-05-11 20:41 ` [PATCH v5 03/18] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page * Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	David S. Miller, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     c95be5b549d6af16e1f9b9307f745ef78a01d11c
Gitweb:        https://git.kernel.org/tip/c95be5b549d6af16e1f9b9307f745ef78a01d11c
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:35 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:09 +02:00

sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *

Change the 'pgtable_t' type for sparc32 so that it represents the uncached
virtual address of the PTE table, rather than the underlying 'struct page'.

This allows page table allocations smaller than a page to be freed.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20200511204150.27858-4-will@kernel.org

---
 arch/sparc/include/asm/page_32.h    |  2 +-
 arch/sparc/include/asm/pgalloc_32.h |  6 +++---
 arch/sparc/include/asm/pgtable_32.h | 11 +++++++++++-
 arch/sparc/mm/srmmu.c               | 29 ++++++++--------------------
 4 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/arch/sparc/include/asm/page_32.h b/arch/sparc/include/asm/page_32.h
index da01c8c..fff8861 100644
--- a/arch/sparc/include/asm/page_32.h
+++ b/arch/sparc/include/asm/page_32.h
@@ -106,7 +106,7 @@ typedef unsigned long iopgprot_t;
 
 #endif
 
-typedef struct page *pgtable_t;
+typedef pte_t *pgtable_t;
 
 #define TASK_UNMAPPED_BASE	0x50000000
 
diff --git a/arch/sparc/include/asm/pgalloc_32.h b/arch/sparc/include/asm/pgalloc_32.h
index 99c0324..b772384 100644
--- a/arch/sparc/include/asm/pgalloc_32.h
+++ b/arch/sparc/include/asm/pgalloc_32.h
@@ -50,11 +50,11 @@ static inline void free_pmd_fast(pmd_t * pmd)
 #define pmd_free(mm, pmd)		free_pmd_fast(pmd)
 #define __pmd_free_tlb(tlb, pmd, addr)	pmd_free((tlb)->mm, pmd)
 
-void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, struct page *ptep);
-#define pmd_pgtable(pmd) pmd_page(pmd)
+#define pmd_populate(mm, pmd, pte)	pmd_set(pmd, pte)
+#define pmd_pgtable(pmd)		(pgtable_t)__pmd_page(pmd)
 
 void pmd_set(pmd_t *pmdp, pte_t *ptep);
-#define pmd_populate_kernel(MM, PMD, PTE) pmd_set(PMD, PTE)
+#define pmd_populate_kernel		pmd_populate
 
 pgtable_t pte_alloc_one(struct mm_struct *mm);
 
diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
index 3367e2b..c5625b2 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -135,6 +135,17 @@ static inline struct page *pmd_page(pmd_t pmd)
 	return pfn_to_page((pmd_val(pmd) & SRMMU_PTD_PMASK) >> (PAGE_SHIFT-4));
 }
 
+static inline unsigned long __pmd_page(pmd_t pmd)
+{
+	unsigned long v;
+
+	if (srmmu_device_memory(pmd_val(pmd)))
+		BUG();
+
+	v = pmd_val(pmd) & SRMMU_PTD_PMASK;
+	return (unsigned long)__nocache_va(v << 4);
+}
+
 static inline unsigned long pud_page_vaddr(pud_t pud)
 {
 	if (srmmu_device_memory(pud_val(pud))) {
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index 50da4bc..c861c0f 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -140,12 +140,6 @@ void pmd_set(pmd_t *pmdp, pte_t *ptep)
 	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
 }
 
-void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, struct page *ptep)
-{
-	unsigned long ptp = page_to_pfn(ptep) << (PAGE_SHIFT-4); /* watch for overflow */
-	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
-}
-
 /* Find an entry in the third-level page table.. */
 pte_t *pte_offset_kernel(pmd_t *dir, unsigned long address)
 {
@@ -364,31 +358,26 @@ pgd_t *get_pgd_fast(void)
  */
 pgtable_t pte_alloc_one(struct mm_struct *mm)
 {
-	unsigned long pte;
+	pte_t *ptep;
 	struct page *page;
 
-	if ((pte = (unsigned long)pte_alloc_one_kernel(mm)) == 0)
+	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
 		return NULL;
-	page = pfn_to_page(__nocache_pa(pte) >> PAGE_SHIFT);
+	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
 	if (!pgtable_pte_page_ctor(page)) {
 		__free_page(page);
 		return NULL;
 	}
-	return page;
+	return ptep;
 }
 
-void pte_free(struct mm_struct *mm, pgtable_t pte)
+void pte_free(struct mm_struct *mm, pgtable_t ptep)
 {
-	unsigned long p;
-
-	pgtable_pte_page_dtor(pte);
-	p = (unsigned long)page_address(pte);	/* Cached address (for test) */
-	if (p == 0)
-		BUG();
-	p = page_to_pfn(pte) << PAGE_SHIFT;	/* Physical address */
+	struct page *page;
 
-	/* free non cached virtual address*/
-	srmmu_free_nocache(__nocache_va(p), SRMMU_PTE_TABLE_SIZE);
+	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
+	pgtable_pte_page_dtor(page);
+	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
 }
 
 /* context handling - a dynamically sized pool is used */

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
  2020-05-11 20:41 ` [PATCH v5 05/18] compiler/gcc: Raise minimum GCC version for kernel builds to 4.8 Will Deacon
@ 2020-05-12 14:36   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Masahiro Yamada, Nick Desaulniers,
	Peter Zijlstra (Intel),
	Arnd Bergmann, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     62e13ab29e79d93a65fab5874e9c25ed4b3cec61
Gitweb:        https://git.kernel.org/tip/62e13ab29e79d93a65fab5874e9c25ed4b3cec61
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:37 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:10 +02:00

compiler/gcc: Raise minimum GCC version for kernel builds to 4.8

It is very rare to see versions of GCC prior to 4.8 being used to build
the mainline kernel. These old compilers are also known to have codegen
issues which can lead to silent miscompilation:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

Raise the minimum GCC version to 4.8 for building the kernel and remove
some tautological Kconfig dependencies as a consequence.
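
The version encoding behind the check can be sketched as follows. The
formula matches the GCC_VERSION definition visible in the diff context
below; the function names are hypothetical and take the version as plain
arguments so the example does not depend on the compiler building it.

```c
/* Encode major.minor.patchlevel the way GCC_VERSION does. */
static int gcc_version_sketch(int major, int minor, int patchlevel)
{
	return major * 10000 + minor * 100 + patchlevel;
}

/* Non-zero if the given version is below the new 4.8 minimum. */
static int gcc_too_old_sketch(int major, int minor, int patchlevel)
{
	return gcc_version_sketch(major, minor, patchlevel) < 40800;
}
```

Once the floor is 40800, any Kconfig test of the form 'GCC_VERSION >= 40800'
is always true for GCC, which is why those dependencies can be dropped.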

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20200511204150.27858-6-will@kernel.org

---
 Documentation/process/changes.rst |  2 +-
 arch/arm/crypto/Kconfig           | 12 ++++++------
 crypto/Kconfig                    |  1 -
 include/linux/compiler-gcc.h      |  5 ++---
 init/Kconfig                      |  1 -
 scripts/gcc-plugins/Kconfig       |  2 +-
 6 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
index 91c5ff8..5cfb54c 100644
--- a/Documentation/process/changes.rst
+++ b/Documentation/process/changes.rst
@@ -29,7 +29,7 @@ you probably needn't concern yourself with pcmciautils.
 ====================== ===============  ========================================
         Program        Minimal version       Command to check the version
 ====================== ===============  ========================================
-GNU C                  4.6              gcc --version
+GNU C                  4.8              gcc --version
 GNU make               3.81             make --version
 binutils               2.23             ld -v
 flex                   2.5.35           flex --version
diff --git a/arch/arm/crypto/Kconfig b/arch/arm/crypto/Kconfig
index 2674de6..c9bf2df 100644
--- a/arch/arm/crypto/Kconfig
+++ b/arch/arm/crypto/Kconfig
@@ -30,7 +30,7 @@ config CRYPTO_SHA1_ARM_NEON
 
 config CRYPTO_SHA1_ARM_CE
 	tristate "SHA1 digest algorithm (ARM v8 Crypto Extensions)"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_SHA1_ARM
 	select CRYPTO_HASH
 	help
@@ -39,7 +39,7 @@ config CRYPTO_SHA1_ARM_CE
 
 config CRYPTO_SHA2_ARM_CE
 	tristate "SHA-224/256 digest algorithm (ARM v8 Crypto Extensions)"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_SHA256_ARM
 	select CRYPTO_HASH
 	help
@@ -96,7 +96,7 @@ config CRYPTO_AES_ARM_BS
 
 config CRYPTO_AES_ARM_CE
 	tristate "Accelerated AES using ARMv8 Crypto Extensions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_SKCIPHER
 	select CRYPTO_LIB_AES
 	select CRYPTO_SIMD
@@ -106,7 +106,7 @@ config CRYPTO_AES_ARM_CE
 
 config CRYPTO_GHASH_ARM_CE
 	tristate "PMULL-accelerated GHASH using NEON/ARMv8 Crypto Extensions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	select CRYPTO_HASH
 	select CRYPTO_CRYPTD
 	select CRYPTO_GF128MUL
@@ -118,13 +118,13 @@ config CRYPTO_GHASH_ARM_CE
 
 config CRYPTO_CRCT10DIF_ARM_CE
 	tristate "CRCT10DIF digest algorithm using PMULL instructions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	depends on CRC_T10DIF
 	select CRYPTO_HASH
 
 config CRYPTO_CRC32_ARM_CE
 	tristate "CRC32(C) digest algorithm using CRC and/or PMULL instructions"
-	depends on KERNEL_MODE_NEON && (CC_IS_CLANG || GCC_VERSION >= 40800)
+	depends on KERNEL_MODE_NEON
 	depends on CRC32
 	select CRYPTO_HASH
 
diff --git a/crypto/Kconfig b/crypto/Kconfig
index c24a474..34a8c5b 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -316,7 +316,6 @@ config CRYPTO_AEGIS128
 config CRYPTO_AEGIS128_SIMD
 	bool "Support SIMD acceleration for AEGIS-128"
 	depends on CRYPTO_AEGIS128 && ((ARM || ARM64) && KERNEL_MODE_NEON)
-	depends on !ARM || CC_IS_CLANG || GCC_VERSION >= 40800
 	default y
 
 config CRYPTO_AEGIS128_AESNI_SSE2
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index cf294fa..7dd4e03 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -10,7 +10,8 @@
 		     + __GNUC_MINOR__ * 100	\
 		     + __GNUC_PATCHLEVEL__)
 
-#if GCC_VERSION < 40600
+/* https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145 */
+#if GCC_VERSION < 40800
 # error Sorry, your compiler is too old - please upgrade it.
 #endif
 
@@ -126,9 +127,7 @@
 #if defined(CONFIG_ARCH_USE_BUILTIN_BSWAP) && !defined(__CHECKER__)
 #define __HAVE_BUILTIN_BSWAP32__
 #define __HAVE_BUILTIN_BSWAP64__
-#if GCC_VERSION >= 40800
 #define __HAVE_BUILTIN_BSWAP16__
-#endif
 #endif /* CONFIG_ARCH_USE_BUILTIN_BSWAP && !__CHECKER__ */
 
 #if GCC_VERSION >= 70000
diff --git a/init/Kconfig b/init/Kconfig
index 9e22ee8..035d38a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1285,7 +1285,6 @@ config LD_DEAD_CODE_DATA_ELIMINATION
 	bool "Dead code and data elimination (EXPERIMENTAL)"
 	depends on HAVE_LD_DEAD_CODE_DATA_ELIMINATION
 	depends on EXPERT
-	depends on !(FUNCTION_TRACER && CC_IS_GCC && GCC_VERSION < 40800)
 	depends on $(cc-option,-ffunction-sections -fdata-sections)
 	depends on $(ld-option,--gc-sections)
 	help
diff --git a/scripts/gcc-plugins/Kconfig b/scripts/gcc-plugins/Kconfig
index 013ba3a..ce0b99f 100644
--- a/scripts/gcc-plugins/Kconfig
+++ b/scripts/gcc-plugins/Kconfig
@@ -8,7 +8,7 @@ config HAVE_GCC_PLUGINS
 menuconfig GCC_PLUGINS
 	bool "GCC plugins"
 	depends on HAVE_GCC_PLUGINS
-	depends on CC_IS_GCC && GCC_VERSION >= 40800
+	depends on CC_IS_GCC
 	depends on $(success,$(srctree)/scripts/gcc-plugin.sh $(CC))
 	default y
 	help

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] sparc32: mm: Fix argument checking in __srmmu_get_nocache()
  2020-05-11 20:41 ` [PATCH v5 01/18] sparc32: mm: Fix argument checking in __srmmu_get_nocache() Will Deacon
@ 2020-05-12 14:37   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	David S. Miller, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     f790d0205fd5bd646bbc219211903a2aa164da97
Gitweb:        https://git.kernel.org/tip/f790d0205fd5bd646bbc219211903a2aa164da97
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:33 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:08 +02:00

sparc32: mm: Fix argument checking in __srmmu_get_nocache()

The 'size' argument to __srmmu_get_nocache() is a number of bytes not
a shift value, so fix up the sanity checking to treat it properly.
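
To make the difference concrete, here is a small sketch comparing the old
and new checks. The shift value of 8 is an assumption chosen for the
example, and all names ending in '_sketch' or '_SKETCH' are hypothetical.

```c
/* Assumed value for illustration; the real one lives in the sparc headers. */
#define NOCACHE_BITMAP_SHIFT_SKETCH	8

/* Old minimum-size check: compares a byte count against the shift itself. */
static int clamp_size_old_sketch(int size)
{
	if (size < NOCACHE_BITMAP_SHIFT_SKETCH)
		size = NOCACHE_BITMAP_SHIFT_SKETCH;
	return size;
}

/* Fixed minimum-size check: compares against the minimum size in bytes. */
static int clamp_size_new_sketch(int size)
{
	int minsz = 1 << NOCACHE_BITMAP_SHIFT_SKETCH;

	if (size < minsz)
		size = minsz;
	return size;
}

/* Old alignment test: masks with (shift - 1), which is 7 here. */
static int unaligned_old_sketch(int size)
{
	return (size & (NOCACHE_BITMAP_SHIFT_SKETCH - 1)) != 0;
}

/* Fixed alignment test: masks with (minsz - 1), which is 255 here. */
static int unaligned_new_sketch(int size)
{
	int minsz = 1 << NOCACHE_BITMAP_SHIFT_SKETCH;

	return (size & (minsz - 1)) != 0;
}
```

With a shift of 8, a 16-byte request passes the old minimum check untouched
but is raised to 256 bytes by the new one, and a 264-byte request looks
aligned to the old test but not to the new one.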

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20200511204150.27858-2-will@kernel.org

---
 arch/sparc/mm/srmmu.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index b7c94de..cb9ded8 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -175,18 +175,18 @@ pte_t *pte_offset_kernel(pmd_t *dir, unsigned long address)
  */
 static void *__srmmu_get_nocache(int size, int align)
 {
-	int offset;
+	int offset, minsz = 1 << SRMMU_NOCACHE_BITMAP_SHIFT;
 	unsigned long addr;
 
-	if (size < SRMMU_NOCACHE_BITMAP_SHIFT) {
+	if (size < minsz) {
 		printk(KERN_ERR "Size 0x%x too small for nocache request\n",
 		       size);
-		size = SRMMU_NOCACHE_BITMAP_SHIFT;
+		size = minsz;
 	}
-	if (size & (SRMMU_NOCACHE_BITMAP_SHIFT - 1)) {
-		printk(KERN_ERR "Size 0x%x unaligned int nocache request\n",
+	if (size & (minsz - 1)) {
+		printk(KERN_ERR "Size 0x%x unaligned in nocache request\n",
 		       size);
-		size += SRMMU_NOCACHE_BITMAP_SHIFT - 1;
+		size += minsz - 1;
 	}
 	BUG_ON(align > SRMMU_NOCACHE_ALIGN_MAX);
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] sparc32: mm: Restructure sparc32 MMU page-table layout
  2020-05-11 20:41 ` [PATCH v5 02/18] sparc32: mm: Restructure sparc32 MMU page-table layout Will Deacon
@ 2020-05-12 14:37   ` tip-bot2 for Will Deacon
  0 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Will Deacon @ 2020-05-12 14:37 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Will Deacon, Thomas Gleixner, Peter Zijlstra (Intel),
	David S. Miller, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     3408974d0533e1e9bd6345610b335c7c52195a49
Gitweb:        https://git.kernel.org/tip/3408974d0533e1e9bd6345610b335c7c52195a49
Author:        Will Deacon <will@kernel.org>
AuthorDate:    Mon, 11 May 2020 21:41:34 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Tue, 12 May 2020 11:04:09 +02:00

sparc32: mm: Restructure sparc32 MMU page-table layout

The "SRMMU" supports 4k pages using a fixed three-level walk with a
256-entry PGD and 64-entry PMD/PTE levels. In order to fill a page
with a 'pgtable_t', the SRMMU code allocates four native PTE tables
into a single PTE allocation and similarly for the PMD level, leading
to an array of 16 physical pointers in a 'pmd_t'.

This breaks the generic code which assumes READ_ONCE(*pmd) will be
word sized.

In a manner similar to ef22d8abd876 ("m68k: mm: Restructure Motorola MMU
page-table layout"), implement the native page-table setup directly. This
significantly increases the page-table memory overhead, but that will be
addressed in a subsequent patch.
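
The native three-level split can be sketched as below. The shifts of 18 and
24 are the PMD_SHIFT and PGDIR_SHIFT values introduced by this patch, and 12
follows from 4k pages; the helper names are hypothetical, and 'uint32_t'
stands in for a 32-bit virtual address.

```c
#include <stdint.h>

#define PAGE_SHIFT_SKETCH	12	/* 4k pages */
#define PMD_SHIFT_SKETCH	18	/* 64 PTEs * 4k = 256k per PMD entry */
#define PGDIR_SHIFT_SKETCH	24	/* 64 PMDs * 256k = 16M per PGD entry */

/* Top 8 bits select one of 256 PGD entries. */
static unsigned int pgd_index_sketch(uint32_t va)
{
	return va >> PGDIR_SHIFT_SKETCH;
}

/* Next 6 bits select one of 64 PMD entries. */
static unsigned int pmd_index_sketch(uint32_t va)
{
	return (va >> PMD_SHIFT_SKETCH) & 63;
}

/* Next 6 bits select one of 64 PTEs. */
static unsigned int pte_index_sketch(uint32_t va)
{
	return (va >> PAGE_SHIFT_SKETCH) & 63;
}

/* Low 12 bits are the offset within the page. */
static unsigned int page_offset_sketch(uint32_t va)
{
	return va & ((1u << PAGE_SHIFT_SKETCH) - 1);
}
```

The index widths add up to 8 + 6 + 6 + 12 = 32 bits, so each 'pmd_t' holds a
single word-sized descriptor.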

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Link: https://lkml.kernel.org/r/20200511204150.27858-3-will@kernel.org

---
 arch/sparc/include/asm/page_32.h    | 10 ++---
 arch/sparc/include/asm/pgalloc_32.h |  5 +-
 arch/sparc/include/asm/pgtable_32.h | 29 +++++++-------
 arch/sparc/include/asm/pgtsrmmu.h   | 36 +----------------
 arch/sparc/include/asm/viking.h     |  5 +-
 arch/sparc/kernel/head_32.S         |  8 ++--
 arch/sparc/mm/hypersparc.S          |  3 +-
 arch/sparc/mm/srmmu.c               | 60 +++++++++-------------------
 arch/sparc/mm/viking.S              |  5 +-
 9 files changed, 58 insertions(+), 103 deletions(-)

diff --git a/arch/sparc/include/asm/page_32.h b/arch/sparc/include/asm/page_32.h
index 4782600..da01c8c 100644
--- a/arch/sparc/include/asm/page_32.h
+++ b/arch/sparc/include/asm/page_32.h
@@ -54,7 +54,7 @@ extern struct sparc_phys_banks sp_banks[SPARC_PHYS_BANKS+1];
  */
 typedef struct { unsigned long pte; } pte_t;
 typedef struct { unsigned long iopte; } iopte_t;
-typedef struct { unsigned long pmdv[16]; } pmd_t;
+typedef struct { unsigned long pmd; } pmd_t;
 typedef struct { unsigned long pgd; } pgd_t;
 typedef struct { unsigned long ctxd; } ctxd_t;
 typedef struct { unsigned long pgprot; } pgprot_t;
@@ -62,7 +62,7 @@ typedef struct { unsigned long iopgprot; } iopgprot_t;
 
 #define pte_val(x)	((x).pte)
 #define iopte_val(x)	((x).iopte)
-#define pmd_val(x)      ((x).pmdv[0])
+#define pmd_val(x)      ((x).pmd)
 #define pgd_val(x)	((x).pgd)
 #define ctxd_val(x)	((x).ctxd)
 #define pgprot_val(x)	((x).pgprot)
@@ -82,7 +82,7 @@ typedef struct { unsigned long iopgprot; } iopgprot_t;
  */
 typedef unsigned long pte_t;
 typedef unsigned long iopte_t;
-typedef struct { unsigned long pmdv[16]; } pmd_t;
+typedef unsigned long pmd_t;
 typedef unsigned long pgd_t;
 typedef unsigned long ctxd_t;
 typedef unsigned long pgprot_t;
@@ -90,14 +90,14 @@ typedef unsigned long iopgprot_t;
 
 #define pte_val(x)	(x)
 #define iopte_val(x)	(x)
-#define pmd_val(x)      ((x).pmdv[0])
+#define pmd_val(x)      (x)
 #define pgd_val(x)	(x)
 #define ctxd_val(x)	(x)
 #define pgprot_val(x)	(x)
 #define iopgprot_val(x)	(x)
 
 #define __pte(x)	(x)
-#define __pmd(x)	((pmd_t) { { (x) }, })
+#define __pmd(x)	(x)
 #define __iopte(x)	(x)
 #define __pgd(x)	(x)
 #define __ctxd(x)	(x)
diff --git a/arch/sparc/include/asm/pgalloc_32.h b/arch/sparc/include/asm/pgalloc_32.h
index eae0c92..99c0324 100644
--- a/arch/sparc/include/asm/pgalloc_32.h
+++ b/arch/sparc/include/asm/pgalloc_32.h
@@ -60,13 +60,14 @@ pgtable_t pte_alloc_one(struct mm_struct *mm);
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
 {
-	return srmmu_get_nocache(PTE_SIZE, PTE_SIZE);
+	return srmmu_get_nocache(SRMMU_PTE_TABLE_SIZE,
+				 SRMMU_PTE_TABLE_SIZE);
 }
 
 
 static inline void free_pte_fast(pte_t *pte)
 {
-	srmmu_free_nocache(pte, PTE_SIZE);
+	srmmu_free_nocache(pte, SRMMU_PTE_TABLE_SIZE);
 }
 
 #define pte_free_kernel(mm, pte)	free_pte_fast(pte)
diff --git a/arch/sparc/include/asm/pgtable_32.h b/arch/sparc/include/asm/pgtable_32.h
index 0de659a..3367e2b 100644
--- a/arch/sparc/include/asm/pgtable_32.h
+++ b/arch/sparc/include/asm/pgtable_32.h
@@ -11,6 +11,16 @@
 
 #include <linux/const.h>
 
+#define PMD_SHIFT		18
+#define PMD_SIZE        	(1UL << PMD_SHIFT)
+#define PMD_MASK        	(~(PMD_SIZE-1))
+#define PMD_ALIGN(__addr) 	(((__addr) + ~PMD_MASK) & PMD_MASK)
+
+#define PGDIR_SHIFT     	24
+#define PGDIR_SIZE      	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK      	(~(PGDIR_SIZE-1))
+#define PGDIR_ALIGN(__addr) 	(((__addr) + ~PGDIR_MASK) & PGDIR_MASK)
+
 #ifndef __ASSEMBLY__
 #include <asm-generic/pgtable-nopud.h>
 
@@ -34,17 +44,10 @@ unsigned long __init bootmem_init(unsigned long *pages_avail);
 #define pmd_ERROR(e)   __builtin_trap()
 #define pgd_ERROR(e)   __builtin_trap()
 
-#define PMD_SHIFT		22
-#define PMD_SIZE        	(1UL << PMD_SHIFT)
-#define PMD_MASK        	(~(PMD_SIZE-1))
-#define PMD_ALIGN(__addr) 	(((__addr) + ~PMD_MASK) & PMD_MASK)
-#define PGDIR_SHIFT     	SRMMU_PGDIR_SHIFT
-#define PGDIR_SIZE      	SRMMU_PGDIR_SIZE
-#define PGDIR_MASK      	SRMMU_PGDIR_MASK
-#define PTRS_PER_PTE    	1024
-#define PTRS_PER_PMD    	SRMMU_PTRS_PER_PMD
-#define PTRS_PER_PGD    	SRMMU_PTRS_PER_PGD
-#define USER_PTRS_PER_PGD	PAGE_OFFSET / SRMMU_PGDIR_SIZE
+#define PTRS_PER_PTE    	64
+#define PTRS_PER_PMD    	64
+#define PTRS_PER_PGD    	256
+#define USER_PTRS_PER_PGD	PAGE_OFFSET / PGDIR_SIZE
 #define FIRST_USER_ADDRESS	0UL
 #define PTE_SIZE		(PTRS_PER_PTE*4)
 
@@ -179,9 +182,7 @@ static inline int pmd_none(pmd_t pmd)
 
 static inline void pmd_clear(pmd_t *pmdp)
 {
-	int i;
-	for (i = 0; i < PTRS_PER_PTE/SRMMU_REAL_PTRS_PER_PTE; i++)
-		set_pte((pte_t *)&pmdp->pmdv[i], __pte(0));
+	set_pte((pte_t *)&pmd_val(*pmdp), __pte(0));
 }
 
 static inline int pud_none(pud_t pud)
diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h
index 32a5088..58ea8e8 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -17,39 +17,9 @@
 /* Number of contexts is implementation-dependent; 64k is the most we support */
 #define SRMMU_MAX_CONTEXTS	65536
 
-/* PMD_SHIFT determines the size of the area a second-level page table entry can map */
-#define SRMMU_REAL_PMD_SHIFT		18
-#define SRMMU_REAL_PMD_SIZE		(1UL << SRMMU_REAL_PMD_SHIFT)
-#define SRMMU_REAL_PMD_MASK		(~(SRMMU_REAL_PMD_SIZE-1))
-#define SRMMU_REAL_PMD_ALIGN(__addr)	(((__addr)+SRMMU_REAL_PMD_SIZE-1)&SRMMU_REAL_PMD_MASK)
-
-/* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define SRMMU_PGDIR_SHIFT       24
-#define SRMMU_PGDIR_SIZE        (1UL << SRMMU_PGDIR_SHIFT)
-#define SRMMU_PGDIR_MASK        (~(SRMMU_PGDIR_SIZE-1))
-#define SRMMU_PGDIR_ALIGN(addr) (((addr)+SRMMU_PGDIR_SIZE-1)&SRMMU_PGDIR_MASK)
-
-#define SRMMU_REAL_PTRS_PER_PTE	64
-#define SRMMU_REAL_PTRS_PER_PMD	64
-#define SRMMU_PTRS_PER_PGD	256
-
-#define SRMMU_REAL_PTE_TABLE_SIZE	(SRMMU_REAL_PTRS_PER_PTE*4)
-#define SRMMU_PMD_TABLE_SIZE		(SRMMU_REAL_PTRS_PER_PMD*4)
-#define SRMMU_PGD_TABLE_SIZE		(SRMMU_PTRS_PER_PGD*4)
-
-/*
- * To support pagetables in highmem, Linux introduces APIs which
- * return struct page* and generally manipulate page tables when
- * they are not mapped into kernel space. Our hardware page tables
- * are smaller than pages. We lump hardware tabes into big, page sized
- * software tables.
- *
- * PMD_SHIFT determines the size of the area a second-level page table entry
- * can map, and our pmd_t is 16 times larger than normal.  The values which
- * were once defined here are now generic for 4c and srmmu, so they're
- * found in pgtable.h.
- */
-#define SRMMU_PTRS_PER_PMD	4
+#define SRMMU_PTE_TABLE_SIZE		(PAGE_SIZE)
+#define SRMMU_PMD_TABLE_SIZE		(PAGE_SIZE)
+#define SRMMU_PGD_TABLE_SIZE		(PTRS_PER_PGD*4)
 
 /* Definition of the values in the ET field of PTD's and PTE's */
 #define SRMMU_ET_MASK         0x3
diff --git a/arch/sparc/include/asm/viking.h b/arch/sparc/include/asm/viking.h
index 0bbefd1..08ffc60 100644
--- a/arch/sparc/include/asm/viking.h
+++ b/arch/sparc/include/asm/viking.h
@@ -10,6 +10,7 @@
 
 #include <asm/asi.h>
 #include <asm/mxcc.h>
+#include <asm/pgtable.h>
 #include <asm/pgtsrmmu.h>
 
 /* Bits in the SRMMU control register for GNU/Viking modules.
@@ -227,7 +228,7 @@ static inline unsigned long viking_hwprobe(unsigned long vaddr)
 			     : "=r" (val)
 			     : "r" (vaddr | 0x200), "i" (ASI_M_FLUSH_PROBE));
 	if ((val & SRMMU_ET_MASK) == SRMMU_ET_PTE) {
-		vaddr &= ~SRMMU_PGDIR_MASK;
+		vaddr &= ~PGDIR_MASK;
 		vaddr >>= PAGE_SHIFT;
 		return val | (vaddr << 8);
 	}
@@ -237,7 +238,7 @@ static inline unsigned long viking_hwprobe(unsigned long vaddr)
 			     : "=r" (val)
 			     : "r" (vaddr | 0x100), "i" (ASI_M_FLUSH_PROBE));
 	if ((val & SRMMU_ET_MASK) == SRMMU_ET_PTE) {
-		vaddr &= ~SRMMU_REAL_PMD_MASK;
+		vaddr &= ~PMD_MASK;
 		vaddr >>= PAGE_SHIFT;
 		return val | (vaddr << 8);
 	}
diff --git a/arch/sparc/kernel/head_32.S b/arch/sparc/kernel/head_32.S
index e55f2c0..be30c8d 100644
--- a/arch/sparc/kernel/head_32.S
+++ b/arch/sparc/kernel/head_32.S
@@ -24,7 +24,7 @@
 #include <asm/winmacro.h>
 #include <asm/thread_info.h>	/* TI_UWINMASK */
 #include <asm/errno.h>
-#include <asm/pgtsrmmu.h>	/* SRMMU_PGDIR_SHIFT */
+#include <asm/pgtable.h>	/* PGDIR_SHIFT */
 #include <asm/export.h>
 
 	.data
@@ -273,7 +273,7 @@ not_a_sun4:
 		lda	[%o1] ASI_M_BYPASS, %o2		! This is the 0x0 16MB pgd
 
 		/* Calculate to KERNBASE entry. */
-		add	%o1, KERNBASE >> (SRMMU_PGDIR_SHIFT - 2), %o3
+		add	%o1, KERNBASE >> (PGDIR_SHIFT - 2), %o3
 
 		/* Poke the entry into the calculated address. */
 		sta	%o2, [%o3] ASI_M_BYPASS
@@ -317,7 +317,7 @@ srmmu_not_viking:
 		sll	%g1, 0x8, %g1			! make phys addr for l1 tbl
 
 		lda	[%g1] ASI_M_BYPASS, %g2		! get level1 entry for 0x0
-		add	%g1, KERNBASE >> (SRMMU_PGDIR_SHIFT - 2), %g3
+		add	%g1, KERNBASE >> (PGDIR_SHIFT - 2), %g3
 		sta	%g2, [%g3] ASI_M_BYPASS		! place at KERNBASE entry
 		b	go_to_highmem
 		 nop					! wheee....
@@ -341,7 +341,7 @@ leon_remap:
 		sll	%g1, 0x8, %g1			! make phys addr for l1 tbl
 
 		lda	[%g1] ASI_M_BYPASS, %g2		! get level1 entry for 0x0
-		add	%g1, KERNBASE >> (SRMMU_PGDIR_SHIFT - 2), %g3
+		add	%g1, KERNBASE >> (PGDIR_SHIFT - 2), %g3
 		sta	%g2, [%g3] ASI_M_BYPASS		! place at KERNBASE entry
 		b	go_to_highmem
 		 nop					! wheee....
diff --git a/arch/sparc/mm/hypersparc.S b/arch/sparc/mm/hypersparc.S
index 66885a8..6c2521e 100644
--- a/arch/sparc/mm/hypersparc.S
+++ b/arch/sparc/mm/hypersparc.S
@@ -10,6 +10,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/asi.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 #include <asm/pgtsrmmu.h>
 #include <linux/init.h>
 
@@ -293,7 +294,7 @@ hypersparc_flush_tlb_range:
 	cmp	%o3, -1
 	be	hypersparc_flush_tlb_range_out
 #endif
-	 sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
+	 sethi	%hi(~((1 << PGDIR_SHIFT) - 1)), %o4
 	sta	%o3, [%g1] ASI_M_MMUREGS
 	and	%o1, %o4, %o1
 	add	%o1, 0x200, %o1
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index cb9ded8..50da4bc 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -136,26 +136,14 @@ static void msi_set_sync(void)
 
 void pmd_set(pmd_t *pmdp, pte_t *ptep)
 {
-	unsigned long ptp;	/* Physical address, shifted right by 4 */
-	int i;
-
-	ptp = __nocache_pa(ptep) >> 4;
-	for (i = 0; i < PTRS_PER_PTE/SRMMU_REAL_PTRS_PER_PTE; i++) {
-		set_pte((pte_t *)&pmdp->pmdv[i], __pte(SRMMU_ET_PTD | ptp));
-		ptp += (SRMMU_REAL_PTRS_PER_PTE * sizeof(pte_t) >> 4);
-	}
+	unsigned long ptp = __nocache_pa(ptep) >> 4;
+	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
 }
 
 void pmd_populate(struct mm_struct *mm, pmd_t *pmdp, struct page *ptep)
 {
-	unsigned long ptp;	/* Physical address, shifted right by 4 */
-	int i;
-
-	ptp = page_to_pfn(ptep) << (PAGE_SHIFT-4);	/* watch for overflow */
-	for (i = 0; i < PTRS_PER_PTE/SRMMU_REAL_PTRS_PER_PTE; i++) {
-		set_pte((pte_t *)&pmdp->pmdv[i], __pte(SRMMU_ET_PTD | ptp));
-		ptp += (SRMMU_REAL_PTRS_PER_PTE * sizeof(pte_t) >> 4);
-	}
+	unsigned long ptp = page_to_pfn(ptep) << (PAGE_SHIFT-4); /* watch for overflow */
+	set_pte((pte_t *)&pmd_val(*pmdp), __pte(SRMMU_ET_PTD | ptp));
 }
 
 /* Find an entry in the third-level page table.. */
@@ -163,7 +151,7 @@ pte_t *pte_offset_kernel(pmd_t *dir, unsigned long address)
 {
 	void *pte;
 
-	pte = __nocache_va((dir->pmdv[0] & SRMMU_PTD_PMASK) << 4);
+	pte = __nocache_va((pmd_val(*dir) & SRMMU_PTD_PMASK) << 4);
 	return (pte_t *) pte +
 	    ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1));
 }
@@ -400,7 +388,7 @@ void pte_free(struct mm_struct *mm, pgtable_t pte)
 	p = page_to_pfn(pte) << PAGE_SHIFT;	/* Physical address */
 
 	/* free non cached virtual address*/
-	srmmu_free_nocache(__nocache_va(p), PTE_SIZE);
+	srmmu_free_nocache(__nocache_va(p), SRMMU_PTE_TABLE_SIZE);
 }
 
 /* context handling - a dynamically sized pool is used */
@@ -822,13 +810,13 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 		what = 0;
 		addr = start - PAGE_SIZE;
 
-		if (!(start & ~(SRMMU_REAL_PMD_MASK))) {
-			if (srmmu_probe(addr + SRMMU_REAL_PMD_SIZE) == probed)
+		if (!(start & ~(PMD_MASK))) {
+			if (srmmu_probe(addr + PMD_SIZE) == probed)
 				what = 1;
 		}
 
-		if (!(start & ~(SRMMU_PGDIR_MASK))) {
-			if (srmmu_probe(addr + SRMMU_PGDIR_SIZE) == probed)
+		if (!(start & ~(PGDIR_MASK))) {
+			if (srmmu_probe(addr + PGDIR_SIZE) == probed)
 				what = 2;
 		}
 
@@ -837,7 +825,7 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 		pudp = pud_offset(p4dp, start);
 		if (what == 2) {
 			*(pgd_t *)__nocache_fix(pgdp) = __pgd(probed);
-			start += SRMMU_PGDIR_SIZE;
+			start += PGDIR_SIZE;
 			continue;
 		}
 		if (pud_none(*(pud_t *)__nocache_fix(pudp))) {
@@ -849,6 +837,11 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 			pud_set(__nocache_fix(pudp), pmdp);
 		}
 		pmdp = pmd_offset(__nocache_fix(pgdp), start);
+		if (what == 1) {
+			*(pmd_t *)__nocache_fix(pmdp) = __pmd(probed);
+			start += PMD_SIZE;
+			continue;
+		}
 		if (srmmu_pmd_none(*(pmd_t *)__nocache_fix(pmdp))) {
 			ptep = __srmmu_get_nocache(PTE_SIZE, PTE_SIZE);
 			if (ptep == NULL)
@@ -856,19 +849,6 @@ static void __init srmmu_inherit_prom_mappings(unsigned long start,
 			memset(__nocache_fix(ptep), 0, PTE_SIZE);
 			pmd_set(__nocache_fix(pmdp), ptep);
 		}
-		if (what == 1) {
-			/* We bend the rule where all 16 PTPs in a pmd_t point
-			 * inside the same PTE page, and we leak a perfectly
-			 * good hardware PTE piece. Alternatives seem worse.
-			 */
-			unsigned int x;	/* Index of HW PMD in soft cluster */
-			unsigned long *val;
-			x = (start >> PMD_SHIFT) & 15;
-			val = &pmdp->pmdv[x];
-			*(unsigned long *)__nocache_fix(val) = probed;
-			start += SRMMU_REAL_PMD_SIZE;
-			continue;
-		}
 		ptep = pte_offset_kernel(__nocache_fix(pmdp), start);
 		*(pte_t *)__nocache_fix(ptep) = __pte(probed);
 		start += PAGE_SIZE;
@@ -890,9 +870,9 @@ static void __init do_large_mapping(unsigned long vaddr, unsigned long phys_base
 /* Map sp_bank entry SP_ENTRY, starting at virtual address VBASE. */
 static unsigned long __init map_spbank(unsigned long vbase, int sp_entry)
 {
-	unsigned long pstart = (sp_banks[sp_entry].base_addr & SRMMU_PGDIR_MASK);
-	unsigned long vstart = (vbase & SRMMU_PGDIR_MASK);
-	unsigned long vend = SRMMU_PGDIR_ALIGN(vbase + sp_banks[sp_entry].num_bytes);
+	unsigned long pstart = (sp_banks[sp_entry].base_addr & PGDIR_MASK);
+	unsigned long vstart = (vbase & PGDIR_MASK);
+	unsigned long vend = PGDIR_ALIGN(vbase + sp_banks[sp_entry].num_bytes);
 	/* Map "low" memory only */
 	const unsigned long min_vaddr = PAGE_OFFSET;
 	const unsigned long max_vaddr = PAGE_OFFSET + SRMMU_MAXMEM;
@@ -905,7 +885,7 @@ static unsigned long __init map_spbank(unsigned long vbase, int sp_entry)
 
 	while (vstart < vend) {
 		do_large_mapping(vstart, pstart);
-		vstart += SRMMU_PGDIR_SIZE; pstart += SRMMU_PGDIR_SIZE;
+		vstart += PGDIR_SIZE; pstart += PGDIR_SIZE;
 	}
 	return vstart;
 }
diff --git a/arch/sparc/mm/viking.S b/arch/sparc/mm/viking.S
index adaef6e..48f062d 100644
--- a/arch/sparc/mm/viking.S
+++ b/arch/sparc/mm/viking.S
@@ -13,6 +13,7 @@
 #include <asm/asi.h>
 #include <asm/mxcc.h>
 #include <asm/page.h>
+#include <asm/pgtable.h>
 #include <asm/pgtsrmmu.h>
 #include <asm/viking.h>
 
@@ -157,7 +158,7 @@ viking_flush_tlb_range:
 	cmp	%o3, -1
 	be	2f
 #endif
-	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
+	sethi	%hi(~((1 << PGDIR_SHIFT) - 1)), %o4
 	sta	%o3, [%g1] ASI_M_MMUREGS
 	and	%o1, %o4, %o1
 	add	%o1, 0x200, %o1
@@ -243,7 +244,7 @@ sun4dsmp_flush_tlb_range:
 	ld	[%o0 + VMA_VM_MM], %o0
 	ld	[%o0 + AOFF_mm_context], %o3
 	lda	[%g1] ASI_M_MMUREGS, %g5
-	sethi	%hi(~((1 << SRMMU_PGDIR_SHIFT) - 1)), %o4
+	sethi	%hi(~((1 << PGDIR_SHIFT) - 1)), %o4
 	sta	%o3, [%g1] ASI_M_MMUREGS
 	and	%o1, %o4, %o1
 	add	%o1, 0x200, %o1
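
As a rough illustration of the layout this patch moves to (not part of the patch itself), the sketch below splits a 32-bit virtual address using the new PMD_SHIFT, PGDIR_SHIFT and PTRS_PER_* values from the diff. PAGE_SHIFT of 12 (4 KiB pages) is an assumption here, and the demo_* helper names are hypothetical.

```c
#include <assert.h>

/* Values taken from the diff above; PAGE_SHIFT is an assumption. */
#define PAGE_SHIFT	12
#define PMD_SHIFT	18
#define PGDIR_SHIFT	24
#define PTRS_PER_PTE	64
#define PTRS_PER_PMD	64
#define PTRS_PER_PGD	256

/* Each level must exactly cover one entry of the level above it. */
static_assert((PTRS_PER_PTE << PAGE_SHIFT) == (1 << PMD_SHIFT),
	      "a PTE table covers one PMD entry");
static_assert((PTRS_PER_PMD << PMD_SHIFT) == (1 << PGDIR_SHIFT),
	      "a PMD table covers one PGD entry");
static_assert(((unsigned long long)PTRS_PER_PGD << PGDIR_SHIFT) == (1ULL << 32),
	      "the PGD covers the whole 32-bit address space");

/* Hypothetical helpers: index into each table level for a given address. */
static unsigned int demo_pgd_index(unsigned long va)
{
	return (va >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
}

static unsigned int demo_pmd_index(unsigned long va)
{
	return (va >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

static unsigned int demo_pte_index(unsigned long va)
{
	return (va >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
}

static unsigned int demo_page_offset(unsigned long va)
{
	return va & ((1UL << PAGE_SHIFT) - 1);
}
```

Under these assumptions, the address 0xf0123456 decomposes into PGD index 240, PMD index 4, PTE index 35 and page offset 0x456.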


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12  8:18 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Peter Zijlstra
@ 2020-05-12 17:53   ` Marco Elver
  2020-05-12 18:55     ` Marco Elver
  2020-05-12 19:07     ` Peter Zijlstra
  0 siblings, 2 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-12 17:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On Tue, 12 May 2020 at 10:18, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, May 11, 2020 at 09:41:32PM +0100, Will Deacon wrote:
> > Hi folks,
> >
> > (trimmed CC list since v4 since this is largely just a rebase)
> >
> > This is version five of the READ_ONCE() codegen improvement series that
> > I've previously posted here:
> >
> > RFC: https://lore.kernel.org/lkml/20200110165636.28035-1-will@kernel.org
> > v2:  https://lore.kernel.org/lkml/20200123153341.19947-1-will@kernel.org
> > v3:  https://lore.kernel.org/lkml/20200415165218.20251-1-will@kernel.org
> > v4:  https://lore.kernel.org/lkml/20200421151537.19241-1-will@kernel.org
> >
> > The main change since v4 is that this is now based on top of the KCSAN
> > changes queued in -tip (locking/kcsan) and therefore contains the patches
> > necessary to avoid breaking sparc32 as well as some cleanups to
> > consolidate {READ,WRITE}_ONCE() and data_race().
> >
> > Other changes include:
> >
> >   * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
> >     __builtin_types_compatible_p()
> >
> >   * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
> >     points at something the same size as 'unsigned long'
> >
> > I'm happy for all of this to go via -tip, or I can take it via arm64.
>
> Looks good to me; Thanks!
>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

I just ran a bunch of KCSAN tests. While this series alone would have
passed the tests, there appears to be a problem with
__READ_ONCE/__WRITE_ONCE. I think they should already be using
'data_race()', as otherwise we will get lots of false positives in
future.

I noticed this when testing -tip/locking/kcsan, which breaks
unfortunately, because I see a bunch of spurious data races with
arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
From what I see, the intent was to not double-instrument,
unfortunately they are still double-instrumented because
__READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
actually). I don't think we can use __no_sanitize_or_inline for the
arch_ functions, because we really want them to be __always_inline
(also to avoid calls to these functions in uaccess regions, which
objtool would notice).

I think the easiest way to resolve this is to wrap the accesses in
__*_ONCE with data_race().
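
The double-instrumentation problem described above can be modelled in userspace. This is only a sketch: the demo_* names are hypothetical stand-ins, the counter takes the place of a call into the KCSAN runtime, and nothing here is the kernel's real definition of __READ_ONCE() or data_race().

```c
#include <assert.h>

static int demo_checks;		/* number of simulated sanitizer checks */
static int demo_disabled;	/* non-zero while checks are suppressed */

/* Models the check that precedes an instrumented access. */
static void demo_instrument(void)
{
	if (!demo_disabled)
		demo_checks++;
}

/* Models __READ_ONCE() as it stands: the load is still instrumented. */
static int demo_read_once_plain(const int *p)
{
	demo_instrument();
	return *(const volatile int *)p;
}

/* Models the proposal: the same load with checks suppressed around it. */
static int demo_read_once_data_race(const int *p)
{
	int v;

	demo_disabled++;
	v = demo_read_once_plain(p);
	demo_disabled--;
	return v;
}

/* Instrumented wrapper: one explicit check, then the arch_ helper. */
static int demo_atomic_read_plain(const int *v)
{
	demo_instrument();			/* intended check */
	return demo_read_once_plain(v);		/* second, unintended check */
}

static int demo_atomic_read_data_race(const int *v)
{
	demo_instrument();			/* intended check */
	return demo_read_once_data_race(v);	/* hidden from the checker */
}
```

In this model the first wrapper records two checks per read and the second records one, which is the behaviour the wrapping is meant to restore.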

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12 17:53   ` Marco Elver
@ 2020-05-12 18:55     ` Marco Elver
  2020-05-12 19:07     ` Peter Zijlstra
  1 sibling, 0 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-12 18:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On Tue, 12 May 2020 at 19:53, Marco Elver <elver@google.com> wrote:
>
> On Tue, 12 May 2020 at 10:18, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, May 11, 2020 at 09:41:32PM +0100, Will Deacon wrote:
> > > Hi folks,
> > >
> > > (trimmed CC list since v4 since this is largely just a rebase)
> > >
> > > This is version five of the READ_ONCE() codegen improvement series that
> > > I've previously posted here:
> > >
> > > RFC: https://lore.kernel.org/lkml/20200110165636.28035-1-will@kernel.org
> > > v2:  https://lore.kernel.org/lkml/20200123153341.19947-1-will@kernel.org
> > > v3:  https://lore.kernel.org/lkml/20200415165218.20251-1-will@kernel.org
> > > v4:  https://lore.kernel.org/lkml/20200421151537.19241-1-will@kernel.org
> > >
> > > The main change since v4 is that this is now based on top of the KCSAN
> > > changes queued in -tip (locking/kcsan) and therefore contains the patches
> > > necessary to avoid breaking sparc32 as well as some cleanups to
> > > consolidate {READ,WRITE}_ONCE() and data_race().
> > >
> > > Other changes include:
> > >
> > >   * Treat 'char' as distinct from 'signed char' and 'unsigned char' for
> > >     __builtin_types_compatible_p()
> > >
> > >   * Add a compile-time assertion that the argument to READ_ONCE_NOCHECK()
> > >     points at something the same size as 'unsigned long'
> > >
> > > I'm happy for all of this to go via -tip, or I can take it via arm64.
> >
> > Looks good to me; Thanks!
> >
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>
> I just ran a bunch of KCSAN tests. While this series alone would have
> passed the tests, there appears to be a problem with
> __READ_ONCE/__WRITE_ONCE. I think they should already be using
> 'data_race()', as otherwise we will get lots of false positives in
> future.
>
> I noticed this when testing -tip/locking/kcsan, which breaks
> unfortunately, because I see a bunch of spurious data races with
> arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> From what I see, the intent was to not double-instrument,
> unfortunately they are still double-instrumented because
> __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> actually). I don't think we can use __no_sanitize_or_inline for the
> arch_ functions, because we really want them to be __always_inline
> (also to avoid calls to these functions in uaccess regions, which
> objtool would notice).
>
> I think the easiest way to resolve this is to wrap the accesses in
> __*_ONCE with data_race().

I just sent https://lkml.kernel.org/r/20200512183839.2373-1-elver@google.com
-- note that using __*_ONCE in arch_atomic_{read,set} will once again
double-instrument with this. Overall there are 2 options:
1. provide __READ_ONCE/__WRITE_ONCE wrapped purely in data_race(), or
2. make __READ_ONCE/__WRITE_ONCE perform an atomic check so we may
still catch races with plain accesses.
The patch I sent does (2). It is inevitable that these will be used in
places that we did not expect, purely to get around the type check,
which is why I thought it might be the more conservative approach.
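
The practical difference between the two options can be sketched with a toy model. This is a simplification under stated assumptions, not KCSAN's actual logic: the enum, its values and demo_race_reported() are all hypothetical, and the model only asks whether a pair of concurrent, conflicting accesses would be reported.

```c
#include <assert.h>

/* Hypothetical classification of an access, for illustration only. */
enum demo_access {
	DEMO_PLAIN,	/* ordinary C load/store, instrumented as plain */
	DEMO_MARKED,	/* checked as "atomic", as in option (2) */
	DEMO_HIDDEN,	/* wrapped purely in data_race(), as in option (1) */
};

/*
 * Simplified model: would a detector report two concurrent, conflicting
 * accesses to the same location?
 */
static int demo_race_reported(enum demo_access a, enum demo_access b)
{
	/* An access hidden from the detector can never be matched. */
	if (a == DEMO_HIDDEN || b == DEMO_HIDDEN)
		return 0;
	/* Two marked accesses are treated as intentional concurrency. */
	if (a == DEMO_MARKED && b == DEMO_MARKED)
		return 0;
	/* At least one side is plain and both sides are visible. */
	return 1;
}
```

In this model, option (1) hides a race against a plain access entirely, while option (2) still reports it, which is the conservative property argued for above.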

Thoughts?

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12 17:53   ` Marco Elver
  2020-05-12 18:55     ` Marco Elver
@ 2020-05-12 19:07     ` Peter Zijlstra
  2020-05-12 20:31       ` Marco Elver
  2020-05-12 21:14       ` Will Deacon
  1 sibling, 2 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-12 19:07 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> I just ran a bunch of KCSAN tests. While this series alone would have
> passed the tests, there appears to be a problem with
> __READ_ONCE/__WRITE_ONCE. I think they should already be using
> 'data_race()', as otherwise we will get lots of false positives in
> future.
> 
> I noticed this when testing -tip/locking/kcsan, which breaks
> unfortunately, because I see a bunch of spurious data races with
> arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> From what I see, the intent was to not double-instrument,
> unfortunately they are still double-instrumented because
> __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> actually). I don't think we can use __no_sanitize_or_inline for the
> arch_ functions, because we really want them to be __always_inline
> (also to avoid calls to these functions in uaccess regions, which
> objtool would notice).
> 
> I think the easiest way to resolve this is to wrap the accesses in
> __*_ONCE with data_race().

But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
call out to _ANYTHING_.

Sadly, because the compilers are 'broken' that whole __no_sanitize thing
didn't work, but I'll be moving a whole bunch of code into .c files with
all the sanitizers killed dead. And we'll be validating it'll not be
calling out to anything.

data_race() will include active calls to kcsan_{dis,en}able_current(),
and this must not happen.


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12 19:07     ` Peter Zijlstra
@ 2020-05-12 20:31       ` Marco Elver
  2020-05-13 11:10         ` Peter Zijlstra
  2020-05-12 21:14       ` Will Deacon
  1 sibling, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-12 20:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On Tue, 12 May 2020 at 21:08, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> > I just ran a bunch of KCSAN tests. While this series alone would have
> > passed the tests, there appears to be a problem with
> > __READ_ONCE/__WRITE_ONCE. I think they should already be using
> > 'data_race()', as otherwise we will get lots of false positives in
> > future.
> >
> > I noticed this when testing -tip/locking/kcsan, which breaks
> > unfortunately, because I see a bunch of spurious data races with
> > arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> > instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> > From what I see, the intent was to not double-instrument,
> > unfortunately they are still double-instrumented because
> > __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> > actually). I don't think we can use __no_sanitize_or_inline for the
> > arch_ functions, because we really want them to be __always_inline
> > (also to avoid calls to these functions in uaccess regions, which
> > objtool would notice).
> >
> > I think the easiest way to resolve this is to wrap the accesses in
> > __*_ONCE with data_race().
>
> But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
> call out to _ANYTHING_.
>
> Sadly, because the compilers are 'broken' that whole __no_sanitize thing
> didn't work, but I'll be moving a whole bunch of code into .c files with
> all the sanitizers killed dead. And we'll be validating it'll not be
> calling out to anything.
>
> data_race() will include active calls to kcsan_{dis,en}able_current(),
> and this must not happen.

Only if instrumentation is enabled for the compilation unit. If you
have KCSAN_SANITIZE_foo.o := n, no calls are emitted, not even to
kcsan_{dis,en}able_current(). Does that help?

By default, right now __READ_ONCE() will still generate a call due to
instrumentation (call to __tsan_readX).

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12 19:07     ` Peter Zijlstra
  2020-05-12 20:31       ` Marco Elver
@ 2020-05-12 21:14       ` Will Deacon
  2020-05-12 22:00         ` Marco Elver
  1 sibling, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-12 21:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Marco Elver, LKML, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On Tue, May 12, 2020 at 09:07:55PM +0200, Peter Zijlstra wrote:
> On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> > I just ran a bunch of KCSAN tests. While this series alone would have
> > passed the tests, there appears to be a problem with
> > __READ_ONCE/__WRITE_ONCE. I think they should already be using
> > 'data_race()', as otherwise we will get lots of false positives in
> > future.
> > 
> > I noticed this when testing -tip/locking/kcsan, which breaks
> > unfortunately, because I see a bunch of spurious data races with
> > arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> > instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> > From what I see, the intent was to not double-instrument,
> > unfortunately they are still double-instrumented because
> > __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> > actually). I don't think we can use __no_sanitize_or_inline for the
> > arch_ functions, because we really want them to be __always_inline
> > (also to avoid calls to these functions in uaccess regions, which
> > objtool would notice).
> > 
> > I think the easiest way to resolve this is to wrap the accesses in
> > __*_ONCE with data_race().
> 
> But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
> call out to _ANYTHING_.
> 
> Sadly, because the compilers are 'broken' that whole __no_sanitize thing
> didn't work, but I'll be moving a whole bunch of code into .c files with
> all the sanitizers killed dead. And we'll be validating it'll not be
> calling out to anything.

Hmm, I may have just run into this problem too. I'm using clang 11.0.1,
but even if I do something like:

unsigned long __no_sanitize_or_inline foo(unsigned long *p)
{
	return READ_ONCE_NOCHECK(*p);
}

then I /still/ get calls to __tsan_func_{entry,exit} emitted by the
compiler. Marco -- how do you turn this thing off?!

I'm also not particularly fond of treating __{READ,WRITE}_ONCE() as "atomic",
since they're allowed to tear and I think callers should probably either be
using data_race() explicitly or disabling instrumentation (assuming that's
possible).

Will


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12 21:14       ` Will Deacon
@ 2020-05-12 22:00         ` Marco Elver
  0 siblings, 0 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-12 22:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney, Ingo Molnar

On Tue, 12 May 2020 at 23:15, Will Deacon <will@kernel.org> wrote:
>
> On Tue, May 12, 2020 at 09:07:55PM +0200, Peter Zijlstra wrote:
> > On Tue, May 12, 2020 at 07:53:00PM +0200, Marco Elver wrote:
> > > I just ran a bunch of KCSAN tests. While this series alone would have
> > > passed the tests, there appears to be a problem with
> > > __READ_ONCE/__WRITE_ONCE. I think they should already be using
> > > 'data_race()', as otherwise we will get lots of false positives in
> > > future.
> > >
> > > I noticed this when testing -tip/locking/kcsan, which breaks
> > > unfortunately, because I see a bunch of spurious data races with
> > > arch_atomic_{read,set} because "locking/atomics: Flip fallbacks and
> > > instrumentation" changed them to use __READ_ONCE()/__WRITE_ONCE().
> > > From what I see, the intent was to not double-instrument,
> > > unfortunately they are still double-instrumented because
> > > __READ_ONCE/__WRITE_ONCE doesn't hide the access from KCSAN (nor KASAN
> > > actually). I don't think we can use __no_sanitize_or_inline for the
> > > arch_ functions, because we really want them to be __always_inline
> > > (also to avoid calls to these functions in uaccess regions, which
> > > objtool would notice).
> > >
> > > I think the easiest way to resolve this is to wrap the accesses in
> > > __*_ONCE with data_race().
> >
> > But we can't... because I need arch_atomic_*() and __READ_ONCE() to not
> > call out to _ANYTHING_.
> >
> > Sadly, because the compilers are 'broken' that whole __no_sanitize thing
> > didn't work, but I'll be moving a whole bunch of code into .c files with
> > all the sanitizers killed dead. And we'll be validating it'll not be
> > calling out to anything.
>
> Hmm, I may have just run into this problem too. I'm using clang 11.0.1,
> but even if I do something like:
>
> unsigned long __no_sanitize_or_inline foo(unsigned long *p)
> {
>         return READ_ONCE_NOCHECK(*p);
> }
>
> then I /still/ get calls to __tsan_func_{entry,exit} emitted by the
> compiler. Marco -- how do you turn this thing off?!

For Clang we have an option ("-mllvm -tsan-instrument-func-entry-exit=0");
for GCC, I don't think we have an equivalent option.

I had hoped we could keep these compiler changes optional for now, to
not require a very recent compiler. I'll send a patch to enable the
option, but keep it optional for now. Or do you think we require the
compiler to support this? Because then we'll only support Clang.

> I'm also not particularly fond of treating __{READ,WRITE}_ONCE() as "atomic",
> since they're allowed to tear and I think callers should probably either be
> using data_race() explicitly or disabling instrumentation (assuming that's
> possible).

That point is fair enough. But how do we fix arch_atomic_{read,set} then?

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-12 20:31       ` Marco Elver
@ 2020-05-13 11:10         ` Peter Zijlstra
  2020-05-13 11:14           ` Peter Zijlstra
  2020-05-13 11:48           ` Marco Elver
  0 siblings, 2 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-13 11:10 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, dvyukov

On Tue, May 12, 2020 at 10:31:44PM +0200, Marco Elver wrote:
> On Tue, 12 May 2020 at 21:08, Peter Zijlstra <peterz@infradead.org> wrote:

> > data_race() will include active calls to kcsan_{dis,en}able_current(),
> > and this must not happen.
> 
> Only if instrumentation is enabled for the compilation unit. If you
> have KCSAN_SANITIZE_foo.o := n, no calls are emitted, not even to
> kcsan_{dis,en}able_current(). Does that help?
> 
> By default, right now __READ_ONCE() will still generate a call due to
> instrumentation (call to __tsan_readX).

Ah, so looking at:

#define data_race(expr)							\
({									\
	__kcsan_disable_current();					\
	({								\
		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
		__kcsan_enable_current();				\
		__v;							\
	});								\
})

had me confused, but then you've got this squirreled away in another
header:

#ifdef __SANITIZE_THREAD__
/*
 * Only calls into the runtime when the particular compilation unit has KCSAN
 * instrumentation enabled. May be used in header files.
 */
#define kcsan_check_access __kcsan_check_access

/*
 * Only use these to disable KCSAN for accesses in the current compilation unit;
 * calls into libraries may still perform KCSAN checks.
 */
#define __kcsan_disable_current kcsan_disable_current
#define __kcsan_enable_current kcsan_enable_current_nowarn
#else
static inline void kcsan_check_access(const volatile void *ptr, size_t size,
				      int type) { }
static inline void __kcsan_enable_current(void)  { }
static inline void __kcsan_disable_current(void) { }
#endif

And I suppose KCSAN_SANITIZE := n results in __SANITIZE_THREAD__ not
being defined.
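
A minimal userspace sketch of that gating, assuming a made-up DEMO_SANITIZE macro in place of __SANITIZE_THREAD__: when the macro is absent, the enable/disable helpers collapse to empty inlines and no call into the (pretend) runtime is made. All demo_* names are hypothetical.

```c
#include <assert.h>

int demo_runtime_calls;	/* counts calls into the pretend runtime */

/* Stand-ins for the out-of-line runtime entry points. */
void demo_runtime_disable(void) { demo_runtime_calls++; }
void demo_runtime_enable(void)  { demo_runtime_calls++; }

#ifdef DEMO_SANITIZE	/* hypothetical; plays the role of __SANITIZE_THREAD__ */
#define demo_disable_current demo_runtime_disable
#define demo_enable_current  demo_runtime_enable
#else
/* Uninstrumented unit: the helpers compile to nothing. */
static inline void demo_disable_current(void) { }
static inline void demo_enable_current(void)  { }
#endif

/* Shaped like data_race(): suppress checks, do the access, re-enable. */
static int demo_data_race_read(const int *p)
{
	int v;

	demo_disable_current();
	v = *(const volatile int *)p;
	demo_enable_current();
	return v;
}
```

Built without -DDEMO_SANITIZE, a read through demo_data_race_read() leaves demo_runtime_calls at zero; built with it, each read makes two runtime calls.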

I really hate the function attribute situation, that is some
ill-considered trainwreck.

Looking at this more, I found you already have:

arch/x86/kernel/Makefile:KCSAN_SANITIZE := n
arch/x86/kernel/Makefile:KCOV_INSTRUMENT                := n
arch/x86/mm/Makefile:KCSAN_SANITIZE := n

So how about I complete that and kill everything for all arch/x86/ that
has DEFINE_IDTENTRY*() in.

That avoids me having to do a lot of work to split up the tricky bits.
You didn't think it was important, so why should I bother.

So then I end up with something like the below, and I've validated that
does not generate instrumentation... HOWEVER, I now need ~10g of memory
and many seconds to compile each file in arch/x86/kernel/.

That is, when I do 'make arch/x86/kernel/ -j8', it is slow enough that I
can run top and grab:

31249 root      20   0 6128580   4.1g  13092 R 100.0  13.1   0:16.29 cc1
31278 root      20   0 6259456   4.4g  12932 R 100.0  13.9   0:16.27 cc1
31286 root      20   0 7243160   4.9g  13028 R 100.0  15.5   0:16.26 cc1
31289 root      20   0 5933824   4.0g  12936 R 100.0  12.8   0:16.26 cc1
31331 root      20   0 4250924   2.9g  13016 R 100.0   9.3   0:09.54 cc1
31346 root      20   0 1939552   1.3g  13028 R 100.0   4.1   0:07.01 cc1
31238 root      20   0 6293524   4.1g  13008 R 100.0  13.0   0:16.29 cc1
31259 root      20   0 6817076   4.7g  12956 R 100.0  14.9   0:16.27 cc1

and it then triggers OOMs, while previously I could build kernels with
-j80 on that machine:

31289 root      20   0   10.8g   6.2g    884 R 100.0  19.7   1:01.56 cc1
31249 root      20   0   10.2g   6.1g    484 R 100.0  19.3   1:00.10 cc1
31331 root      20   0   10.3g   7.2g    496 R 100.0  23.1   0:53.95 cc1

Only 3 left, because the others OOM'ed.

This is gcc-8.3, the situation with gcc-10 seems marginally better, but
still atrocious.

---
diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
index b7a5790d8d63..ff959f0209e7 100644
--- a/arch/x86/entry/Makefile
+++ b/arch/x86/entry/Makefile
@@ -6,6 +6,7 @@
 KASAN_SANITIZE := n
 UBSAN_SANITIZE := n
 KCOV_INSTRUMENT := n
+KCSAN_INSTRUMENT := n
 
 CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
 CFLAGS_REMOVE_syscall_32.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d6d61c4455fa..f2a46a87026e 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -22,15 +22,18 @@ CFLAGS_REMOVE_early_printk.o = -pg
 CFLAGS_REMOVE_head64.o = -pg
 endif
 
-KASAN_SANITIZE_head$(BITS).o				:= n
-KASAN_SANITIZE_dumpstack.o				:= n
-KASAN_SANITIZE_dumpstack_$(BITS).o			:= n
-KASAN_SANITIZE_stacktrace.o				:= n
-KASAN_SANITIZE_paravirt.o				:= n
-
-# With some compiler versions the generated code results in boot hangs, caused
-# by several compilation units. To be safe, disable all instrumentation.
-KCSAN_SANITIZE := n
+#
+# You cannot instrument entry code, that results in definite problems.
+# In particular, anything with DEFINE_IDTENTRY*() in must not have
+# instrumentation on.
+#
+# If only function attributes and inlining would work properly, without
+# that untangling this is a giant trainwreck, don't attempt.
+#
+KASAN_SANITIZE := n
+UBSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+KCSAN_INSTRUMENT := n
 
 OBJECT_FILES_NON_STANDARD_test_nx.o			:= y
 OBJECT_FILES_NON_STANDARD_paravirt_patch.o		:= y
@@ -39,11 +42,6 @@ ifdef CONFIG_FRAME_POINTER
 OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o		:= y
 endif
 
-# If instrumentation of this dir is enabled, boot hangs during first second.
-# Probably could be more selective here, but note that files related to irqs,
-# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
-# non-deterministic coverage.
-KCOV_INSTRUMENT		:= n
 
 CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
 
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index f7fd0e868c9c..f8d7e7432847 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -1,15 +1,17 @@
 # SPDX-License-Identifier: GPL-2.0
-# Kernel does not boot with instrumentation of tlb.c and mem_encrypt*.c
-KCOV_INSTRUMENT_tlb.o			:= n
-KCOV_INSTRUMENT_mem_encrypt.o		:= n
-KCOV_INSTRUMENT_mem_encrypt_identity.o	:= n
 
-KASAN_SANITIZE_mem_encrypt.o		:= n
-KASAN_SANITIZE_mem_encrypt_identity.o	:= n
-
-# Disable KCSAN entirely, because otherwise we get warnings that some functions
-# reference __initdata sections.
-KCSAN_SANITIZE := n
+#
+# You cannot instrument entry code, that results in definite problems.
+# In particular, anything with DEFINE_IDTENTRY*() in must not have
+# instrumentation on.
+#
+# If only function attributes and inlining would work properly, without
+# that untangling this is a giant trainwreck, don't attempt.
+#
+KASAN_SANITIZE := n
+UBSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+KCSAN_INSTRUMENT := n
 
 ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_mem_encrypt.o		= -pg
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3bb962959d8b..48f85d1d2db6 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -241,7 +241,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  * atomicity or dependency ordering guarantees. Note that this may result
  * in tears!
  */
-#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define __READ_ONCE(x)	data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
 
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
@@ -260,7 +260,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 
 #define __WRITE_ONCE(x, val)						\
 do {									\
-	*(volatile typeof(x) *)&(x) = (val);				\
+	data_race(*(volatile typeof(x) *)&(x) = (val));			\
 } while (0)
 
 #define __WRITE_ONCE_SCALAR(x, val)					\




* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 11:10         ` Peter Zijlstra
@ 2020-05-13 11:14           ` Peter Zijlstra
  2020-05-13 11:48           ` Marco Elver
  1 sibling, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-13 11:14 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, dvyukov

On Wed, May 13, 2020 at 01:10:57PM +0200, Peter Zijlstra wrote:

> So then I end up with something like the below, and I've validated that
> does not generate instrumentation... HOWEVER, I now need ~10g of memory
> and many seconds to compile each file in arch/x86/kernel/.

> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 3bb962959d8b..48f85d1d2db6 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -241,7 +241,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>   * atomicity or dependency ordering guarantees. Note that this may result
>   * in tears!
>   */
> -#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
> +#define __READ_ONCE(x)	data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
>  
>  #define __READ_ONCE_SCALAR(x)						\
>  ({									\
> @@ -260,7 +260,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>  
>  #define __WRITE_ONCE(x, val)						\
>  do {									\
> -	*(volatile typeof(x) *)&(x) = (val);				\
> +	data_race(*(volatile typeof(x) *)&(x) = (val));			\
>  } while (0)
>  
>  #define __WRITE_ONCE_SCALAR(x, val)					\

The above is responsible for that, the below variant is _MUCH_ better
again. It really doesn't like nested data_race(), as in _REALLY_ doesn't
like.
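
One plausible reason for the blow-up (an assumption, not something
measured here): data_race() names its argument twice, once inside the
typeof and once in the initializer, so every extra level of nesting
doubles the text the compiler has to parse. The sketch below uses a
simplified, hypothetical DEMO_DATA_RACE() with the same shape (no KCSAN
calls, no __unqual_scalar_typeof()) and counts the copies by
stringifying the expansion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for data_race(): 'expr' appears twice. */
#define DEMO_DATA_RACE(expr)					\
({ __typeof__(({ expr; })) v_ = ({ expr; }); v_; })

/* Two-step stringification so the argument is expanded first. */
#define DEMO_STR(x)	#x
#define DEMO_XSTR(x)	DEMO_STR(x)

static int marker = 7;

/* Count occurrences of the identifier "marker" in expanded text. */
static size_t count_marker(const char *s)
{
	size_t n = 0;

	for (s = strstr(s, "marker"); s; s = strstr(s + 1, "marker"))
		n++;
	return n;
}

size_t copies_depth1(void)
{
	return count_marker(DEMO_XSTR(DEMO_DATA_RACE(marker)));
}

size_t copies_depth2(void)
{
	return count_marker(DEMO_XSTR(DEMO_DATA_RACE(DEMO_DATA_RACE(marker))));
}

size_t copies_depth3(void)
{
	return count_marker(DEMO_XSTR(
		DEMO_DATA_RACE(DEMO_DATA_RACE(DEMO_DATA_RACE(marker)))));
}

/* The value is unchanged by nesting; only the source text grows. */
int nested_value(void)
{
	return DEMO_DATA_RACE(DEMO_DATA_RACE(marker));
}

void demo_selfcheck(void)
{
	assert(copies_depth1() == 2);
	assert(copies_depth2() == 4);
	assert(copies_depth3() == 8);
	assert(nested_value() == 7);
}
```

The copies go 2, 4, 8 with depth. In the real macros the argument is
itself a long expression, so removing one level of nesting, as the
variant below does, halves what cc1 has to chew through per access.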

---

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3bb962959d8b..2ea532b19e75 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -241,12 +241,12 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  * atomicity or dependency ordering guarantees. Note that this may result
  * in tears!
  */
-#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define __READ_ONCE(x)	data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
 
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
 	typeof(x) *__xp = &(x);						\
-	__unqual_scalar_typeof(x) __x = data_race(__READ_ONCE(*__xp));	\
+	__unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);		\
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
 	smp_read_barrier_depends();					\
 	(typeof(x))__x;							\
@@ -260,14 +260,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 
 #define __WRITE_ONCE(x, val)						\
 do {									\
-	*(volatile typeof(x) *)&(x) = (val);				\
+	data_race(*(volatile typeof(x) *)&(x) = (val));			\
 } while (0)
 
 #define __WRITE_ONCE_SCALAR(x, val)					\
 do {									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
-	data_race(({ __WRITE_ONCE(*__xp, val); 0; }));			\
+	__WRITE_ONCE(*__xp, val);					\
 } while (0)
 
 #define WRITE_ONCE(x, val)						\


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 11:10         ` Peter Zijlstra
  2020-05-13 11:14           ` Peter Zijlstra
@ 2020-05-13 11:48           ` Marco Elver
  2020-05-13 12:32             ` Peter Zijlstra
  1 sibling, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-13 11:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, 13 May 2020 at 13:11, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, May 12, 2020 at 10:31:44PM +0200, Marco Elver wrote:
> > On Tue, 12 May 2020 at 21:08, Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > data_race() will include active calls to kcsan_{dis,en}able_current(),
> > > and this must not happen.
> >
> > Only if instrumentation is enabled for the compilation unit. If you
> > have KCSAN_SANITIZE_foo.c := n, no calls are emitted not even to
> > kcsan_{dis,en}able_current(). Does that help?
> >
> > By default, right now __READ_ONCE() will still generate a call due to
> > instrumentation (call to __tsan_readX).
>
> Ah, so looking at:
>
> #define data_race(expr)                                                 \
> ({                                                                      \
>         __kcsan_disable_current();                                      \
>         ({                                                              \
>                 __unqual_scalar_typeof(({ expr; })) __v = ({ expr; });  \
>                 __kcsan_enable_current();                               \
>                 __v;                                                    \
>         });                                                             \
> })
>
> had me confused, but then you've got this squirreled away in another
> header:
>
> #ifdef __SANITIZE_THREAD__
> /*
>  * Only calls into the runtime when the particular compilation unit has KCSAN
>  * instrumentation enabled. May be used in header files.
>  */
> #define kcsan_check_access __kcsan_check_access
>
> /*
>  * Only use these to disable KCSAN for accesses in the current compilation unit;
>  * calls into libraries may still perform KCSAN checks.
>  */
> #define __kcsan_disable_current kcsan_disable_current
> #define __kcsan_enable_current kcsan_enable_current_nowarn
> #else
> static inline void kcsan_check_access(const volatile void *ptr, size_t size,
>                                       int type) { }
> static inline void __kcsan_enable_current(void)  { }
> static inline void __kcsan_disable_current(void) { }
> #endif
>
> And I suppose KCSAN_SANITIZE := n, results in __SANITIZE_THREAD__ not
> being defined.
>
> I really hate the function attribute situation, that is some ill
> considered trainwreck.
>
> Looking at this more, I found you already have:
>
> arch/x86/kernel/Makefile:KCSAN_SANITIZE := n
> arch/x86/kernel/Makefile:KCOV_INSTRUMENT                := n
> arch/x86/mm/Makefile:KCSAN_SANITIZE := n
>
> So how about I complete that and kill everything for all arch/x86/ that
> has DEFINE_IDTENTRY*() in.
>
> That avoids me having to do a lot of work to split up the tricky bits.
> You didn't think it was important, so why should I bother.
>
> So then I end up with something like the below, and I've validated that
> does not generate instrumentation... HOWEVER, I now need ~10g of memory
> and many seconds to compile each file in arch/x86/kernel/.
>
> That is, when I do 'make arch/x86/kernel/ -j8', it is slow enough that I
> can run top and grab:
>
> 31249 root      20   0 6128580   4.1g  13092 R 100.0  13.1   0:16.29 cc1
> 31278 root      20   0 6259456   4.4g  12932 R 100.0  13.9   0:16.27 cc1
> 31286 root      20   0 7243160   4.9g  13028 R 100.0  15.5   0:16.26 cc1
> 31289 root      20   0 5933824   4.0g  12936 R 100.0  12.8   0:16.26 cc1
> 31331 root      20   0 4250924   2.9g  13016 R 100.0   9.3   0:09.54 cc1
> 31346 root      20   0 1939552   1.3g  13028 R 100.0   4.1   0:07.01 cc1
> 31238 root      20   0 6293524   4.1g  13008 R 100.0  13.0   0:16.29 cc1
> 31259 root      20   0 6817076   4.7g  12956 R 100.0  14.9   0:16.27 cc1
>
> and it then triggers OOMs, while previously I could build kernels with
> -j80 on that machine:
>
> 31289 root      20   0   10.8g   6.2g    884 R 100.0  19.7   1:01.56 cc1
> 31249 root      20   0   10.2g   6.1g    484 R 100.0  19.3   1:00.10 cc1
> 31331 root      20   0   10.3g   7.2g    496 R 100.0  23.1   0:53.95 cc1
>
> Only 3 left, because the others OOM'ed.
>
> This is gcc-8.3, the situation with gcc-10 seems marginally better, but
> still atrocious.
>
> ---
> diff --git a/arch/x86/entry/Makefile b/arch/x86/entry/Makefile
> index b7a5790d8d63..ff959f0209e7 100644
> --- a/arch/x86/entry/Makefile
> +++ b/arch/x86/entry/Makefile
> @@ -6,6 +6,7 @@
>  KASAN_SANITIZE := n
>  UBSAN_SANITIZE := n
>  KCOV_INSTRUMENT := n
> +KCSAN_INSTRUMENT := n
>
>  CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
>  CFLAGS_REMOVE_syscall_32.o = $(CC_FLAGS_FTRACE) -fstack-protector -fstack-protector-strong
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index d6d61c4455fa..f2a46a87026e 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -22,15 +22,18 @@ CFLAGS_REMOVE_early_printk.o = -pg
>  CFLAGS_REMOVE_head64.o = -pg
>  endif
>
> -KASAN_SANITIZE_head$(BITS).o                           := n
> -KASAN_SANITIZE_dumpstack.o                             := n
> -KASAN_SANITIZE_dumpstack_$(BITS).o                     := n
> -KASAN_SANITIZE_stacktrace.o                            := n
> -KASAN_SANITIZE_paravirt.o                              := n
> -
> -# With some compiler versions the generated code results in boot hangs, caused
> -# by several compilation units. To be safe, disable all instrumentation.
> -KCSAN_SANITIZE := n
> +#
> +# You cannot instrument entry code, that results in definite problems.
> +# In particular, anything with DEFINE_IDTENTRY*() in must not have
> +# instrumentation on.
> +#
> +# If only function attributes and inlining would work properly, without
> +# that untangling this is a giant trainwreck, don't attempt.
> +#
> +KASAN_SANITIZE := n
> +UBSAN_SANITIZE := n
> +KCOV_INSTRUMENT := n
> +KCSAN_INSTRUMENT := n
>
>  OBJECT_FILES_NON_STANDARD_test_nx.o                    := y
>  OBJECT_FILES_NON_STANDARD_paravirt_patch.o             := y
> @@ -39,11 +42,6 @@ ifdef CONFIG_FRAME_POINTER
>  OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o             := y
>  endif
>
> -# If instrumentation of this dir is enabled, boot hangs during first second.
> -# Probably could be more selective here, but note that files related to irqs,
> -# boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
> -# non-deterministic coverage.
> -KCOV_INSTRUMENT                := n
>
>  CFLAGS_irq.o := -I $(srctree)/$(src)/../include/asm/trace
>
> diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
> index f7fd0e868c9c..f8d7e7432847 100644
> --- a/arch/x86/mm/Makefile
> +++ b/arch/x86/mm/Makefile
> @@ -1,15 +1,17 @@
>  # SPDX-License-Identifier: GPL-2.0
> -# Kernel does not boot with instrumentation of tlb.c and mem_encrypt*.c
> -KCOV_INSTRUMENT_tlb.o                  := n
> -KCOV_INSTRUMENT_mem_encrypt.o          := n
> -KCOV_INSTRUMENT_mem_encrypt_identity.o := n
>
> -KASAN_SANITIZE_mem_encrypt.o           := n
> -KASAN_SANITIZE_mem_encrypt_identity.o  := n
> -
> -# Disable KCSAN entirely, because otherwise we get warnings that some functions
> -# reference __initdata sections.
> -KCSAN_SANITIZE := n
> +#
> +# You cannot instrument entry code, that results in definite problems.
> +# In particular, anything with DEFINE_IDTENTRY*() in must not have
> +# instrumentation on.
> +#
> +# If only function attributes and inlining would work properly, without
> +# that untangling this is a giant trainwreck, don't attempt.
> +#
> +KASAN_SANITIZE := n
> +UBSAN_SANITIZE := n
> +KCOV_INSTRUMENT := n
> +KCSAN_INSTRUMENT := n
>
>  ifdef CONFIG_FUNCTION_TRACER
>  CFLAGS_REMOVE_mem_encrypt.o            = -pg
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 3bb962959d8b..48f85d1d2db6 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -241,7 +241,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>   * atomicity or dependency ordering guarantees. Note that this may result
>   * in tears!
>   */
> -#define __READ_ONCE(x) (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> +#define __READ_ONCE(x) data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
>
>  #define __READ_ONCE_SCALAR(x)                                          \
>  ({                                                                     \
> @@ -260,7 +260,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
>
>  #define __WRITE_ONCE(x, val)                                           \
>  do {                                                                   \
> -       *(volatile typeof(x) *)&(x) = (val);                            \
> +       data_race(*(volatile typeof(x) *)&(x) = (val));                 \
>  } while (0)
>
>  #define __WRITE_ONCE_SCALAR(x, val)                                    \
>

Disabling most instrumentation for arch/x86 is reasonable. Also fine
with the __READ_ONCE/__WRITE_ONCE changes (your improved
compiler-friendlier version).

We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
(as Will suggested) *and* avoid double-instrumentation in arch_atomic.
If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
now.

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 11:48           ` Marco Elver
@ 2020-05-13 12:32             ` Peter Zijlstra
  2020-05-13 12:40               ` Will Deacon
  0 siblings, 1 reply; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-13 12:32 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:

> Disabling most instrumentation for arch/x86 is reasonable. Also fine
> with the __READ_ONCE/__WRITE_ONCE changes (your improved
> compiler-friendlier version).
> 
> We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> now.

Right, if/when people want sanitize crud enabled for x86 I need
something that:

 - can mark a function 'no_sanitize' and all code that gets inlined into
   that function must automagically also not get sanitized. i.e. make
   inline work like macros (again).

And optionally:

 - can mark a function explicitly 'sanitize', and only when an explicit
   sanitize and no_sanitize mix in inlining give the current
   incompatible attribute splat.

That way we can have the noinstr function attribute imply no_sanitize
and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
helper instead of __always_inline for __##func().


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 12:32             ` Peter Zijlstra
@ 2020-05-13 12:40               ` Will Deacon
  2020-05-13 13:15                 ` Marco Elver
  2020-05-13 13:21                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen David Laight
  0 siblings, 2 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-13 12:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Marco Elver, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> 
> > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > compiler-friendlier version).
> > 
> > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > now.

I agree that Peter's patch is the right thing to do for now. I was hoping we
could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
__no_sanitize_or_inline doesn't seem to do anything.

> Right, if/when people want sanitize crud enabled for x86 I need
> something that:
> 
>  - can mark a function 'no_sanitize' and all code that gets inlined into
>    that function must automagically also not get sanitized. ie. make
>    inline work like macros (again).
> 
> And optionally:
> 
>  - can mark a function explicitly 'sanitize', and only when an explicit
>    sanitize and no_sanitize mix in inlining give the current
>    incompatible attribute splat.
> 
> That way we can have the noinstr function attribute imply no_sanitize
> and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> helper instead of __always_inline for __##func().

Sounds like a good plan to me, assuming the compiler folks are onboard.
In the meantime, can we kill __no_sanitize_or_inline and put it back to
the old __no_kasan_or_inline, which I think simplifies compiler.h and
doesn't mislead people into using the function annotation to avoid KCSAN?

READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
appreciate that's a noisier change.

Will


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 12:40               ` Will Deacon
@ 2020-05-13 13:15                 ` Marco Elver
  2020-05-13 13:24                   ` Peter Zijlstra
  2020-05-13 16:50                   ` Will Deacon
  2020-05-13 13:21                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen David Laight
  1 sibling, 2 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-13 13:15 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, 13 May 2020 at 14:40, Will Deacon <will@kernel.org> wrote:
>
> On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> >
> > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > compiler-friendlier version).
> > >
> > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > now.
>
> I agree that Peter's patch is the right thing to do for now. I was hoping we
> could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
> __no_sanitize_or_inline doesn't seem to do anything.
>
> > Right, if/when people want sanitize crud enabled for x86 I need
> > something that:
> >
> >  - can mark a function 'no_sanitize' and all code that gets inlined into
> >    that function must automagically also not get sanitized. ie. make
> >    inline work like macros (again).
> >
> > And optionally:
> >
> >  - can mark a function explicitly 'sanitize', and only when an explicit
> >    sanitize and no_sanitize mix in inlining give the current
> >    incompatible attribute splat.
> >
> > That way we can have the noinstr function attribute imply no_sanitize
> > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > helper instead of __always_inline for __##func().
>
> Sounds like a good plan to me, assuming the compiler folks are onboard.
> In the meantime, can we kill __no_sanitize_or_inline and put it back to
> the old __no_kasan_or_inline, which I think simplifies compiler.h and
> doesn't mislead people into using the function annotation to avoid KCSAN?
>
> READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> appreciate that's a noisier change.

So far so good, except: both __no_sanitize_or_inline and
__no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses; they
just don't avoid explicit kcsan_check calls, like those in
READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
because macros won't be redefined just for __no_sanitize
functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
access is unchecked.

This will have the expected result:
__no_sanitize_or_inline void foo(void) { x++; } // no data races reported

This will not work as expected:
__no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data races are reported

All this could be fixed if GCC devs would finally take my patch to
make -fsanitize=thread distinguish volatile [1], but then we have to
wait ~years for the new compilers to reach us. So please don't hold
your breath for this one any time soon.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
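
A rough user-space sketch of the distinction, with hypothetical demo_*
names throughout: the attribute can only stop the compiler from adding
its own instrumentation, while a check that a macro writes out
explicitly is ordinary code and is still called. DEMO_NO_TSAN uses
no_sanitize_thread where the compiler reports support for it and is
empty otherwise; neither the counter nor the macro is the kernel's
implementation.

```c
#include <assert.h>

/* Use the attribute only if the compiler says it has it. */
#if defined(__has_attribute)
# if __has_attribute(no_sanitize_thread)
#  define DEMO_NO_TSAN __attribute__((no_sanitize_thread))
# endif
#endif
#ifndef DEMO_NO_TSAN
# define DEMO_NO_TSAN
#endif

/* Hypothetical stand-in for an explicit kcsan_check_*() call. */
int demo_explicit_checks;

void demo_check_atomic_read(const volatile void *p, unsigned long size)
{
	(void)p;
	(void)size;
	demo_explicit_checks++;
}

/* Marked read: explicit check first, then a volatile load. */
#define DEMO_READ_ONCE(x)					\
({								\
	demo_check_atomic_read(&(x), sizeof(x));		\
	*(const volatile __typeof__(x) *)&(x);			\
})

static int x_shared;

/* Plain access: nothing explicit for the attribute to miss. */
DEMO_NO_TSAN void demo_plain(void)
{
	x_shared++;
}

/* The macro's explicit call survives the attribute. */
DEMO_NO_TSAN int demo_once(void)
{
	return DEMO_READ_ONCE(x_shared);
}

void demo_selfcheck(void)
{
	demo_explicit_checks = 0;
	x_shared = 0;

	demo_plain();
	assert(demo_explicit_checks == 0);

	assert(demo_once() == 1);
	assert(demo_explicit_checks == 1);
}
```

The attribute changes what the compiler emits on its own; it cannot
reach into a macro expansion and delete a function call that the source
asked for.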

Thanks,
-- Marco


* RE: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 12:40               ` Will Deacon
  2020-05-13 13:15                 ` Marco Elver
@ 2020-05-13 13:21                 ` David Laight
  2020-05-13 16:32                   ` Thomas Gleixner
  1 sibling, 1 reply; 127+ messages in thread
From: David Laight @ 2020-05-13 13:21 UTC (permalink / raw)
  To: 'Will Deacon', Peter Zijlstra
  Cc: Marco Elver, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

From: Will Deacon
> Sent: 13 May 2020 13:40
> On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> >
> > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > compiler-friendlier version).
> > >
> > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > now.
> 
> I agree that Peter's patch is the right thing to do for now. I was hoping we
> could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
> __no_sanitize_or_inline doesn't seem to do anything.

Could something be done that put the addresses of the instructions
into a separate segment and have KASAN check that table before
reporting an actual error?

	David




* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 13:15                 ` Marco Elver
@ 2020-05-13 13:24                   ` Peter Zijlstra
  2020-05-13 13:58                     ` Marco Elver
  2020-05-13 16:50                   ` Will Deacon
  1 sibling, 1 reply; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-13 13:24 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> So far so good, except: both __no_sanitize_or_inline and
> __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> just doesn't avoid explicit kcsan_check calls, like those in
> READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> just because macros won't be redefined just for __no_sanitize
> functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> access is unchecked.
> 
> This will have the expected result:
> __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> 
> This will not work as expected:
> __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> races are reported
> 
> All this could be fixed if GCC devs would finally take my patch to
> make -fsanitize=thread distinguish volatile [1], but then we have to
> wait ~years for the new compilers to reach us. So please don't hold
> your breath for this one any time soon.
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html

Right, but that does not address the much larger issue of the attribute
vs inline trainwreck :/

Also, could not this compiler instrumentation live as a kernel specific
GCC-plugin instead of being part of GCC proper? Because in that case,
we'd have much better control over it.


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 13:24                   ` Peter Zijlstra
@ 2020-05-13 13:58                     ` Marco Elver
  2020-05-14 11:21                       ` Peter Zijlstra
  2020-05-14 14:13                       ` Peter Zijlstra
  0 siblings, 2 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-13 13:58 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > So far so good, except: both __no_sanitize_or_inline and
> > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > just doesn't avoid explicit kcsan_check calls, like those in
> > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > just because macros won't be redefined just for __no_sanitize
> > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > access is unchecked.
> >
> > This will have the expected result:
> > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> >
> > This will not work as expected:
> > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > races are reported
> >
> > All this could be fixed if GCC devs would finally take my patch to
> > make -fsanitize=thread distinguish volatile [1], but then we have to
> > wait ~years for the new compilers to reach us. So please don't hold
> > your breath for this one any time soon.
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
>
> Right, but that does not address the much larger issue of the attribute
> vs inline trainwreck :/

Could you check if Clang is equally broken for you? I think GCC and
Clang have differing behaviour on this. No idea what it takes to fix
GCC though.

> Also, could not this compiler instrumentation live as a kernel specific
> GCC-plugin instead of being part of GCC proper? Because in that case,
> we'd have much better control over it.

I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
see a way to affect TSAN instrumentation. FWIW Clang already has
distinguish-volatile support (unreleased Clang 11).

Thanks,
-- Marco


* RE: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 13:21                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen David Laight
@ 2020-05-13 16:32                   ` Thomas Gleixner
  0 siblings, 0 replies; 127+ messages in thread
From: Thomas Gleixner @ 2020-05-13 16:32 UTC (permalink / raw)
  To: David Laight, 'Will Deacon', Peter Zijlstra
  Cc: Marco Elver, LKML, Paul E. McKenney, Ingo Molnar, Dmitry Vyukov

David Laight <David.Laight@ACULAB.COM> writes:
> From: Will Deacon
>> Sent: 13 May 2020 13:40
>> On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
>> > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
>> >
>> > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
>> > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
>> > > compiler-friendlier version).
>> > >
>> > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
>> > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
>> > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
>> > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
>> > > now.
>> 
>> I agree that Peter's patch is the right thing to do for now. I was hoping we
>> could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
>> __no_sanitize_or_inline doesn't seem to do anything.
>
> Could something be done that put the addresses of the instructions
> into a separate segment and have KASAN check that table before
> reporting an actual error?

That still does not solve the problem that the compiler adds calls into
k*san which we need to prevent in the first place.

Thanks,

        tglx


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 13:15                 ` Marco Elver
  2020-05-13 13:24                   ` Peter Zijlstra
@ 2020-05-13 16:50                   ` Will Deacon
  2020-05-13 17:32                     ` Marco Elver
  1 sibling, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-13 16:50 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 14:40, Will Deacon <will@kernel.org> wrote:
> >
> > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > >
> > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > compiler-friendlier version).
> > > >
> > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > now.
> >
> > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
> > __no_sanitize_or_inline doesn't seem to do anything.
> >
> > > Right, if/when people want sanitize crud enabled for x86 I need
> > > something that:
> > >
> > >  - can mark a function 'no_sanitize' and all code that gets inlined into
> > >    that function must automagically also not get sanitized. ie. make
> > >    inline work like macros (again).
> > >
> > > And optionally:
> > >
> > >  - can mark a function explicitly 'sanitize', and only when an explicit
> > >    sanitize and no_sanitize mix in inlining give the current
> > >    incompatible attribute splat.
> > >
> > > That way we can have the noinstr function attribute imply no_sanitize
> > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > helper instead of __always_inline for __##func().
> >
> > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > doesn't mislead people into using the function annotation to avoid KCSAN?
> >
> > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > appreciate that's a noisier change.
> 
> So far so good, except: both __no_sanitize_or_inline and
> __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> just doesn't avoid explicit kcsan_check calls, like those in
> READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> just because macros won't be redefined just for __no_sanitize
> functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> access is unchecked.
> 
> This will have the expected result:
> __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> 
> This will not work as expected:
> __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> races are reported

But the problem is that *this* does not work as expected:

unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
{
	return READ_ONCE_NOCHECK(*ptr);
}

which I think means that the function annotation is practically useless.

Will


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 16:50                   ` Will Deacon
@ 2020-05-13 17:32                     ` Marco Elver
  2020-05-13 17:47                       ` Will Deacon
  0 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-13 17:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, 13 May 2020 at 18:50, Will Deacon <will@kernel.org> wrote:
>
> On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 14:40, Will Deacon <will@kernel.org> wrote:
> > >
> > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > >
> > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > compiler-friendlier version).
> > > > >
> > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > now.
> > >
> > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
> > > __no_sanitize_or_inline doesn't seem to do anything.
> > >
> > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > something that:
> > > >
> > > >  - can mark a function 'no_sanitize' and all code that gets inlined into
> > > >    that function must automagically also not get sanitized. ie. make
> > > >    inline work like macros (again).
> > > >
> > > > And optionally:
> > > >
> > > >  - can mark a function explicitly 'sanitize', and only when an explicit
> > > >    sanitize and no_sanitize mix in inlining give the current
> > > >    incompatible attribute splat.
> > > >
> > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > helper instead of __always_inline for __##func().
> > >
> > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > >
> > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > appreciate that's a noisier change.
> >
> > So far so good, except: both __no_sanitize_or_inline and
> > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > just doesn't avoid explicit kcsan_check calls, like those in
> > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > just because macros won't be redefined just for __no_sanitize
> > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > access is unchecked.
> >
> > This will have the expected result:
> > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> >
> > This will not work as expected:
> > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > races are reported
>
> But the problem is that *this* does not work as expected:
>
> unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> {
>         return READ_ONCE_NOCHECK(*ptr);
> }
>
> which I think means that the function annotation is practically useless.

Let me understand the problem better:

- We do not want __tsan_func_entry/exit (looking at the disassembly,
these aren't always generated).
- We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
- We do *not* want the call to __read_once_word_nocheck if we have
__no_sanitize_or_inline. AFAIK that's the main problem -- this applies
to both KASAN and KCSAN.

From what I gather, we want to just compile the function as if the
sanitizer was never enabled. One reason for why this doesn't quite
work is because of the preprocessor.

Note that the sanitizers won't complain about these accesses, which
unfortunately is all these attributes ever were documented to do. So
the attributes aren't completely useless. Why doesn't
K[AC]SAN_SANITIZE := n work?
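
The preprocessor point can be sketched in user-space C. This is only an
illustration under stated assumptions: sketch_check_access() is a
hypothetical stand-in for an explicit kcsan_check call, and
SKETCH_READ_ONCE() and SKETCH_NO_SANITIZE are made-up names, not the
kernel's definitions. The attribute spelling assumes a GCC or Clang that
accepts no_sanitize("thread").

```c
#include <assert.h>

/* Hypothetical stand-in for an explicit kcsan_check call: counts calls. */
static int sketch_check_calls;

static void sketch_check_access(const volatile void *p)
{
	(void)p;
	sketch_check_calls++;
}

/* Illustrative macro: explicit check call, then a volatile load. */
#define SKETCH_READ_ONCE(x) \
	(sketch_check_access(&(x)), *(const volatile __typeof__(x) *)&(x))

/* Use the function attribute only where the compiler advertises it. */
#if defined(__has_attribute)
# if __has_attribute(no_sanitize)
#  define SKETCH_NO_SANITIZE __attribute__((no_sanitize("thread")))
# endif
#endif
#ifndef SKETCH_NO_SANITIZE
# define SKETCH_NO_SANITIZE
#endif

static int sketch_x;

/* Plain access: nothing here calls the check hook explicitly. */
static SKETCH_NO_SANITIZE int sketch_plain(void)
{
	return sketch_x + 1;
}

/*
 * Marked access: the macro has already been expanded into an explicit
 * call by the time the attribute is considered, so the call remains.
 */
static SKETCH_NO_SANITIZE int sketch_marked(void)
{
	return SKETCH_READ_ONCE(sketch_x);
}
```

Calling sketch_plain() leaves the counter untouched, while sketch_marked()
bumps it once: the attribute can only affect what the compiler inserts by
itself, not a call that the macro wrote into the source.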

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 17:32                     ` Marco Elver
@ 2020-05-13 17:47                       ` Will Deacon
  2020-05-13 18:54                         ` Marco Elver
  0 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-13 17:47 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 18:50, Will Deacon <will@kernel.org> wrote:
> >
> > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > On Wed, 13 May 2020 at 14:40, Will Deacon <will@kernel.org> wrote:
> > > >
> > > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > > >
> > > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > > compiler-friendlier version).
> > > > > >
> > > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > > now.
> > > >
> > > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > > could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
> > > > __no_sanitize_or_inline doesn't seem to do anything.
> > > >
> > > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > > something that:
> > > > >
> > > > >  - can mark a function 'no_sanitize' and all code that gets inlined into
> > > > >    that function must automagically also not get sanitized. ie. make
> > > > >    inline work like macros (again).
> > > > >
> > > > > And optionally:
> > > > >
> > > > >  - can mark a function explicitly 'sanitize', and only when an explicit
> > > > >    sanitize and no_sanitize mix in inlining give the current
> > > > >    incompatible attribute splat.
> > > > >
> > > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > > helper instead of __always_inline for __##func().
> > > >
> > > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > > >
> > > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > > appreciate that's a noisier change.
> > >
> > > So far so good, except: both __no_sanitize_or_inline and
> > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > just doesn't avoid explicit kcsan_check calls, like those in
> > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > just because macros won't be redefined just for __no_sanitize
> > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > access is unchecked.
> > >
> > > This will have the expected result:
> > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > >
> > > This will not work as expected:
> > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > > races are reported
> >
> > But the problem is that *this* does not work as expected:
> >
> > unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> > {
> >         return READ_ONCE_NOCHECK(*ptr);
> > }
> >
> > which I think means that the function annotation is practically useless.
> 
> Let me understand the problem better:
> 
> - We do not want __tsan_func_entry/exit (looking at the disassembly,
> these aren't always generated).
> - We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
> - We do *not* want the call to __read_once_word_nocheck if we have
> __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> to both KASAN and KCSAN.

Sorry, I should've been more explicit. The code above, with KASAN enabled,
compiles to:

ffffffff810a2d50 <foo>:
ffffffff810a2d50:       48 8b 07                mov    (%rdi),%rax
ffffffff810a2d53:       c3                      retq

but with KCSAN enabled, compiles to:

ffffffff8109ecd0 <foo>:
ffffffff8109ecd0:       53                      push   %rbx
ffffffff8109ecd1:       48 89 fb                mov    %rdi,%rbx
ffffffff8109ecd4:       48 8b 7c 24 08          mov    0x8(%rsp),%rdi
ffffffff8109ecd9:       e8 52 9c 1a 00          callq  ffffffff81248930 <__tsan_func_entry>
ffffffff8109ecde:       48 89 df                mov    %rbx,%rdi
ffffffff8109ece1:       e8 1a 00 00 00          callq  ffffffff8109ed00 <__read_once_word_nocheck>
ffffffff8109ece6:       48 89 c3                mov    %rax,%rbx
ffffffff8109ece9:       e8 52 9c 1a 00          callq  ffffffff81248940 <__tsan_func_exit>
ffffffff8109ecee:       48 89 d8                mov    %rbx,%rax
ffffffff8109ecf1:       5b                      pop    %rbx
ffffffff8109ecf2:       c3                      retq

Is that expected? There don't appear to be any more annotations to throw
at it.

> From what I gather, we want to just compile the function as if the
> sanitizer was never enabled. One reason for why this doesn't quite
> work is because of the preprocessor.
> 
> Note that the sanitizers won't complain about these accesses, which
> unfortunately is all these attributes ever were documented to do. So
> the attributes aren't completely useless. Why doesn't
> K[AC]SAN_SANITIZE := n work?

I just don't get the point in having a function annotation if you then have to
pass flags at the per-object level. That also then necessitates either weird
refactoring and grouping of code into "noinstrument.c" type files, or blanket
disabling of instrumentation for things like arch/x86/.

Will


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 17:47                       ` Will Deacon
@ 2020-05-13 18:54                         ` Marco Elver
  2020-05-13 21:25                           ` Will Deacon
  2020-05-22 16:08                           ` [tip: locking/kcsan] kcsan: Remove 'noinline' from __no_kcsan_or_inline tip-bot2 for Marco Elver
  0 siblings, 2 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-13 18:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, 13 May 2020 at 19:47, Will Deacon <will@kernel.org> wrote:
>
> On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 18:50, Will Deacon <will@kernel.org> wrote:
> > >
> > > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > > On Wed, 13 May 2020 at 14:40, Will Deacon <will@kernel.org> wrote:
> > > > >
> > > > > On Wed, May 13, 2020 at 02:32:43PM +0200, Peter Zijlstra wrote:
> > > > > > On Wed, May 13, 2020 at 01:48:41PM +0200, Marco Elver wrote:
> > > > > >
> > > > > > > Disabling most instrumentation for arch/x86 is reasonable. Also fine
> > > > > > > with the __READ_ONCE/__WRITE_ONCE changes (your improved
> > > > > > > compiler-friendlier version).
> > > > > > >
> > > > > > > We likely can't have both: still instrument __READ_ONCE/__WRITE_ONCE
> > > > > > > (as Will suggested) *and* avoid double-instrumentation in arch_atomic.
> > > > > > > If most use-cases of __READ_ONCE/__WRITE_ONCE are likely to use
> > > > > > > data_race() or KCSAN_SANITIZE := n anyway, I'd say it's reasonable for
> > > > > > > now.
> > > > >
> > > > > I agree that Peter's patch is the right thing to do for now. I was hoping we
> > > > > could instrument __{READ,WRITE}_ONCE(), but that was before I realised that
> > > > > __no_sanitize_or_inline doesn't seem to do anything.
> > > > >
> > > > > > Right, if/when people want sanitize crud enabled for x86 I need
> > > > > > something that:
> > > > > >
> > > > > >  - can mark a function 'no_sanitize' and all code that gets inlined into
> > > > > >    that function must automagically also not get sanitized. ie. make
> > > > > >    inline work like macros (again).
> > > > > >
> > > > > > And optionally:
> > > > > >
> > > > > >  - can mark a function explicitly 'sanitize', and only when an explicit
> > > > > >    sanitize and no_sanitize mix in inlining give the current
> > > > > >    incompatible attribute splat.
> > > > > >
> > > > > > That way we can have the noinstr function attribute imply no_sanitize
> > > > > > and frob the DEFINE_IDTENTRY*() macros to use (a new) sanitize_or_inline
> > > > > > helper instead of __always_inline for __##func().
> > > > >
> > > > > Sounds like a good plan to me, assuming the compiler folks are onboard.
> > > > > In the meantime, can we kill __no_sanitize_or_inline and put it back to
> > > > > the old __no_kasan_or_inline, which I think simplifies compiler.h and
> > > > > doesn't mislead people into using the function annotation to avoid KCSAN?
> > > > >
> > > > > READ_ONCE_NOCHECK should also probably be READ_ONCE_NOKASAN, but I
> > > > > appreciate that's a noisier change.
> > > >
> > > > So far so good, except: both __no_sanitize_or_inline and
> > > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > > just doesn't avoid explicit kcsan_check calls, like those in
> > > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > > just because macros won't be redefined just for __no_sanitize
> > > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > > access is unchecked.
> > > >
> > > > This will have the expected result:
> > > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > > >
> > > > This will not work as expected:
> > > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > > > races are reported
> > >
> > > But the problem is that *this* does not work as expected:
> > >
> > > unsigned long __no_sanitize_or_inline foo(unsigned long *ptr)
> > > {
> > >         return READ_ONCE_NOCHECK(*ptr);
> > > }
> > >
> > > which I think means that the function annotation is practically useless.
> >
> > Let me understand the problem better:
> >
> > - We do not want __tsan_func_entry/exit (looking at the disassembly,
> > these aren't always generated).
> > - We do not want kcsan_disable/enable calls (with the new __READ_ONCE version).
> > - We do *not* want the call to __read_once_word_nocheck if we have
> > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > to both KASAN and KCSAN.
>
> Sorry, I should've been more explicit. The code above, with KASAN enabled,
> compiles to:
>
> ffffffff810a2d50 <foo>:
> ffffffff810a2d50:       48 8b 07                mov    (%rdi),%rax
> ffffffff810a2d53:       c3                      retq
>
> but with KCSAN enabled, compiles to:
>
> ffffffff8109ecd0 <foo>:
> ffffffff8109ecd0:       53                      push   %rbx
> ffffffff8109ecd1:       48 89 fb                mov    %rdi,%rbx
> ffffffff8109ecd4:       48 8b 7c 24 08          mov    0x8(%rsp),%rdi
> ffffffff8109ecd9:       e8 52 9c 1a 00          callq  ffffffff81248930 <__tsan_func_entry>
> ffffffff8109ecde:       48 89 df                mov    %rbx,%rdi
> ffffffff8109ece1:       e8 1a 00 00 00          callq  ffffffff8109ed00 <__read_once_word_nocheck>
> ffffffff8109ece6:       48 89 c3                mov    %rax,%rbx
> ffffffff8109ece9:       e8 52 9c 1a 00          callq  ffffffff81248940 <__tsan_func_exit>
> ffffffff8109ecee:       48 89 d8                mov    %rbx,%rax
> ffffffff8109ecf1:       5b                      pop    %rbx
> ffffffff8109ecf2:       c3                      retq
>
> Is that expected? There don't appear to be any more annotations to throw
> at it.

Right, so this is expected. We can definitely make
__tsan_func_entry/exit disappear with Clang, with GCC it's going to be
a while if we want to fix it.

If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
the call to __read_once_word_nocheck above! But...

For KCSAN we force 'noinline' because older compilers still inline and
then instrument small functions even if we just have the no_sanitize
attribute (without inline mentioned). The same is actually true for
KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
should also have a 'noinline' I think. I just tested
__no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
to inline a small function and then instrument the accesses.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600
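
To make the 'noinline' trade-off concrete, here is a rough user-space
sketch. The names sketch_read_word_nocheck(), SKETCH_READ_ONCE_NOCHECK()
and SKETCH_FORCE_NOINLINE are hypothetical and only loosely modelled on
the helper visible in the disassembly; the size assertion mirrors the
compile-time check mentioned in the cover letter. The sanitizer-disabling
attribute that would sit next to 'noinline' is left out to keep the
sketch portable.

```c
#include <assert.h>

/* Set to 0 to let the compiler inline the helper into its callers. */
#define SKETCH_FORCE_NOINLINE 1

#if SKETCH_FORCE_NOINLINE
# define SKETCH_NOCHECK_ATTR __attribute__((noinline))
#else
# define SKETCH_NOCHECK_ATTR inline
#endif

/* Word-sized load through a volatile pointer, kept in its own function. */
static SKETCH_NOCHECK_ATTR unsigned long
sketch_read_word_nocheck(const void *addr)
{
	return *(const volatile unsigned long *)addr;
}

/* Reject anything that is not the size of 'unsigned long' at build time. */
#define SKETCH_READ_ONCE_NOCHECK(x) __extension__ ({			\
	_Static_assert(sizeof(x) == sizeof(unsigned long),		\
		       "argument must be word-sized");			\
	(__typeof__(x))sketch_read_word_nocheck(&(x));			\
})

static unsigned long sketch_foo(unsigned long *ptr)
{
	return SKETCH_READ_ONCE_NOCHECK(*ptr);
}
```

With SKETCH_FORCE_NOINLINE set, sketch_foo() is expected to contain a
call to the helper, much like the KCSAN disassembly quoted above. With it
cleared, the call can disappear, but whether the inlined load then gets
instrumented is up to the compiler, which is the GCC 9 behaviour
described here.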

The good news is that Clang does the right thing when removing
'noinline' from __no_kcsan_or_inline:
1. doesn't inline into functions that are instrumented, and
2. your above example doesn't do the call to __read_once_word_nocheck.

The obvious solution to this is: restrict which compiler we want to support?

> > From what I gather, we want to just compile the function as if the
> > sanitizer was never enabled. One reason for why this doesn't quite
> > work is because of the preprocessor.
> >
> > Note that the sanitizers won't complain about these accesses, which
> > unfortunately is all these attributes ever were documented to do. So
> > the attributes aren't completely useless. Why doesn't
> > K[AC]SAN_SANITIZE := n work?
>
> I just don't get the point in having a function annotation if you then have to
> pass flags at the per-object level. That also then necessitates either weird
> refactoring and grouping of code into "noinstrument.c" type files, or blanket
> disabling of instrumentation for things like arch/x86/

If you want a solution now, here is one way to get us closer to where
we want to be:

1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
2. Patch to make __tsan_func_entry/exit disappear with Clang.
3. Remove 'noinline' from __no_kcsan_or_inline.
4. Patch to warn users that KCSAN may have problems with GCC and
should use Clang >= 7.

But this is probably only half a solution.

If you *also* want to fix __READ_ONCE etc not adding calls to
__no_sanitize functions:

5. Remove any mention of data_race and kcsan_check calls from
__{READ,WRITE}_ONCE, {READ,WRITE}_ONCE.  [Won't need #1 above.]
6. I'll send a patch to make KCSAN distinguish volatile accesses, and
we will require Clang 11.

That is *if* you insist on __no_sanitize to behave like you suggest.
Note that, at that point, I really don't know how to salvage GCC,
mainly because of fixing __no_sanitize with GCC looking hopeless.

Let me know what you prefer.

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 18:54                         ` Marco Elver
@ 2020-05-13 21:25                           ` Will Deacon
  2020-05-14  7:31                             ` Marco Elver
  2020-05-22 16:08                           ` [tip: locking/kcsan] kcsan: Remove 'noinline' from __no_kcsan_or_inline tip-bot2 for Marco Elver
  1 sibling, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-13 21:25 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 08:54:03PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 19:47, Will Deacon <will@kernel.org> wrote:
> > On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > > - We do *not* want the call to __read_once_word_nocheck if we have
> > > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > > to both KASAN and KCSAN.
> >
> > Sorry, I should've been more explicit. The code above, with KASAN enabled,
> > compiles to:
> >
> > ffffffff810a2d50 <foo>:
> > ffffffff810a2d50:       48 8b 07                mov    (%rdi),%rax
> > ffffffff810a2d53:       c3                      retq
> >
> > but with KCSAN enabled, compiles to:
> >
> > ffffffff8109ecd0 <foo>:
> > ffffffff8109ecd0:       53                      push   %rbx
> > ffffffff8109ecd1:       48 89 fb                mov    %rdi,%rbx
> > ffffffff8109ecd4:       48 8b 7c 24 08          mov    0x8(%rsp),%rdi
> > ffffffff8109ecd9:       e8 52 9c 1a 00          callq  ffffffff81248930 <__tsan_func_entry>
> > ffffffff8109ecde:       48 89 df                mov    %rbx,%rdi
> > ffffffff8109ece1:       e8 1a 00 00 00          callq  ffffffff8109ed00 <__read_once_word_nocheck>
> > ffffffff8109ece6:       48 89 c3                mov    %rax,%rbx
> > ffffffff8109ece9:       e8 52 9c 1a 00          callq  ffffffff81248940 <__tsan_func_exit>
> > ffffffff8109ecee:       48 89 d8                mov    %rbx,%rax
> > ffffffff8109ecf1:       5b                      pop    %rbx
> > ffffffff8109ecf2:       c3                      retq
> >
> > Is that expected? There don't appear to be any more annotations to throw
> > at it.
> 
> Right, so this is expected.

Fair enough, I just found it weird since it's different to the usual
"disable instrumentation/trace" function annotations.

> We can definitely make __tsan_func_entry/exit disappear with Clang, with
> GCC it's going to be a while if we want to fix it.
> 
> If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
> the call to __read_once_word_nocheck above! But...
> 
> For KCSAN we force 'noinline' because older compilers still inline and
> then instrument small functions even if we just have the no_sanitize
> attribute (without inline mentioned). The same is actually true for
> KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
> but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
> should also have a 'noinline' I think. I just tested
> __no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
> to inline a small function and then instrument the accesses.
> 
> [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600
> 
> The good news is that Clang does the right thing when removing
> 'noinline' from __no_kcsan_or_inline:
> 1. doesn't inline into functions that are instrumented, and
> 2. your above example doesn't do the call to __read_once_word_nocheck.
> 
> The obvious solution to this is: restrict which compiler we want to support?

I would be in favour of that, but I defer to the x86 folks since this
affects them much more than it does me. On the arm64 side, we've got patches
queued for 5.8 that require GCC 10.0.1 or later, and that thing is only a
week old. I think it's reasonable to require a recent toolchain for optional
features like this that inherently rely on compiler support.

> > > From what I gather, we want to just compile the function as if the
> > > sanitizer was never enabled. One reason for why this doesn't quite
> > > work is because of the preprocessor.
> > >
> > > Note that the sanitizers won't complain about these accesses, which
> > > unfortunately is all these attributes ever were documented to do. So
> > > the attributes aren't completely useless. Why doesn't
> > > K[AC]SAN_SANITIZE := n work?
> >
> > I just don't get the point in having a function annotation if you then have to
> > pass flags at the per-object level. That also then necessitates either weird
> > refactoring and grouping of code into "noinstrument.c" type files, or blanket
> > disabling of instrumentation for things like arch/x86/
> 
> If you want a solution now, here is one way to get us closer to where
> we want to be:
> 
> 1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
> 2. Patch to make __tsan_func_entry/exit disappear with Clang.
> 3. Remove 'noinline' from __no_kcsan_or_inline.
> 4. Patch to warn users that KCSAN may have problems with GCC and
> should use Clang >= 7.
> 
> But this is probably only half a solution.

At this point, I think that if READ_ONCE_NOCHECK() works as expected, and
calling __{READ,WRITE}_ONCE from functions tagged with __no_sanitize doesn't
result in instrumentation, then we're good.

Will


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 21:25                           ` Will Deacon
@ 2020-05-14  7:31                             ` Marco Elver
  2020-05-14 11:05                               ` Will Deacon
  0 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-14  7:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, 13 May 2020 at 23:25, Will Deacon <will@kernel.org> wrote:
>
> On Wed, May 13, 2020 at 08:54:03PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 19:47, Will Deacon <will@kernel.org> wrote:
> > > On Wed, May 13, 2020 at 07:32:58PM +0200, Marco Elver wrote:
> > > > - We do *not* want the call to __read_once_word_nocheck if we have
> > > > __no_sanitize_or_inline. AFAIK that's the main problem -- this applies
> > > > to both KASAN and KCSAN.
> > >
> > > Sorry, I should've been more explicit. The code above, with KASAN enabled,
> > > compiles to:
> > >
> > > ffffffff810a2d50 <foo>:
> > > ffffffff810a2d50:       48 8b 07                mov    (%rdi),%rax
> > > ffffffff810a2d53:       c3                      retq
> > >
> > > but with KCSAN enabled, compiles to:
> > >
> > > ffffffff8109ecd0 <foo>:
> > > ffffffff8109ecd0:       53                      push   %rbx
> > > ffffffff8109ecd1:       48 89 fb                mov    %rdi,%rbx
> > > ffffffff8109ecd4:       48 8b 7c 24 08          mov    0x8(%rsp),%rdi
> > > ffffffff8109ecd9:       e8 52 9c 1a 00          callq  ffffffff81248930 <__tsan_func_entry>
> > > ffffffff8109ecde:       48 89 df                mov    %rbx,%rdi
> > > ffffffff8109ece1:       e8 1a 00 00 00          callq  ffffffff8109ed00 <__read_once_word_nocheck>
> > > ffffffff8109ece6:       48 89 c3                mov    %rax,%rbx
> > > ffffffff8109ece9:       e8 52 9c 1a 00          callq  ffffffff81248940 <__tsan_func_exit>
> > > ffffffff8109ecee:       48 89 d8                mov    %rbx,%rax
> > > ffffffff8109ecf1:       5b                      pop    %rbx
> > > ffffffff8109ecf2:       c3                      retq
> > >
> > > Is that expected? There don't appear to be any more annotations to throw
> > > at it.
> >
> > Right, so this is expected.
>
> Fair enough, I just found it weird since it's different to the usual
> "disable instrumentation/trace" function annotations.
>
> > We can definitely make __tsan_func_entry/exit disappear with Clang, with
> > GCC it's going to be a while if we want to fix it.
> >
> > If we remove 'noinline' from __no_kcsan_or_inline, we no longer get
> > the call to __read_once_word_nocheck above! But...
> >
> > For KCSAN we force 'noinline' because older compilers still inline and
> > then instrument small functions even if we just have the no_sanitize
> > attribute (without inline mentioned). The same is actually true for
> > KASAN, so KASAN's READ_ONCE_NOCHECK might be broken in a few places,
> > but nobody seems to have noticed [1]. KASAN's __no_kasan_or_inline
> > should also have a 'noinline' I think. I just tested
> > __no_kcsan_or_inline without 'noinline', and yes, GCC 9 still decided
> > to inline a small function and then instrument the accesses.
> >
> > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59600
> >
> > The good news is that Clang does the right thing when removing
> > 'noinline' from __no_kcsan_or_inline:
> > 1. doesn't inline into functions that are instrumented, and
> > 2. your above example doesn't do the call to __read_once_word_nocheck.
> >
> > The obvious solution to this is: restrict which compiler we want to support?
>
> I would be in favour of that, but I defer to the x86 folks since this
> affects them much more than it does me. On the arm64 side, we've got patches
> queued for 5.8 that require GCC 10.0.1 or later, and that thing is only a
> week old. I think it's reasonable to require a recent toolchain for optional
> features like this that inherently rely on compiler support.
>
> > > > From what I gather, we want to just compile the function as if the
> > > > sanitizer was never enabled. One reason for why this doesn't quite
> > > > work is because of the preprocessor.
> > > >
> > > > Note that the sanitizers won't complain about these accesses, which
> > > > unfortunately is all these attributes ever were documented to do. So
> > > > the attributes aren't completely useless. Why doesn't
> > > > K[AC]SAN_SANITIZE := n work?
> > >
> > > I just don't get the point in having a function annotation if you then have to
> > > pass flags at the per-object level. That also then necessitates either weird
> > > refactoring and grouping of code into "noinstrument.c" type files, or blanket
> > > disabling of instrumentation for things like arch/x86/
> >
> > If you want a solution now, here is one way to get us closer to where
> > we want to be:
> >
> > 1. Peter's patch to add data_race around __READ_ONCE/__WRITE_ONCE.
> > 2. Patch to make __tsan_func_entry/exit disappear with Clang.
> > 3. Remove 'noinline' from __no_kcsan_or_inline.
> > 4. Patch to warn users that KCSAN may have problems with GCC and
> > should use Clang >= 7.
> >
> > But this is probably only half a solution.
>
> At this point, I think that if READ_ONCE_NOCHECK() works as expected, and
> calling __{READ,WRITE}_ONCE from functions tagged with __no_sanitize doesn't
> result in instrumentation, then we're good.

Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
Clang 11 though.

Because without the data_race() around __*_ONCE,
arch_atomic_{read,set} will be broken for KCSAN, but we can't have
data_race() because it would still add
kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
the compilation unit is instrumented). We can't make arch_atomic functions
__no_sanitize_or_inline, because even in code that we want to
sanitize, they should remain __always_inline (so they work properly in
__no_sanitize functions). Therefore, Clang 11 with support for
distinguishing volatiles will be the compiler that will satisfy all
the constraints.
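
As a rough userspace illustration of the macro problem described above
(a sketch only: every demo_* name is a hypothetical stand-in, not the
kernel's definition), a data_race()-style wrapper inside a READ_ONCE()-style
macro is expanded by the preprocessor no matter which attributes the
enclosing function carries:

```c
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical stand-in for a "do not instrument" function attribute. */
#if __has_attribute(no_sanitize)
#define demo_no_sanitize __attribute__((no_sanitize("thread")))
#else
#define demo_no_sanitize
#endif

/* Counts how often the stand-in enable/disable hooks are called. */
int demo_toggle_calls;

static void demo_kcsan_disable_current(void) { demo_toggle_calls++; }
static void demo_kcsan_enable_current(void)  { demo_toggle_calls++; }

/* Assumed shape of a data_race() wrapper: explicit calls around the access. */
#define demo_data_race(expr)				\
({							\
	demo_kcsan_disable_current();			\
	__typeof__(expr) v_ = (expr);			\
	demo_kcsan_enable_current();			\
	v_;						\
})

/* A volatile load wrapped in the data_race()-style macro. */
#define demo_READ_ONCE(x) \
	demo_data_race(*(const volatile __typeof__(x) *)&(x))

/*
 * The attribute only affects compiler-inserted instrumentation. The two
 * explicit hook calls come from macro expansion, so they are still emitted.
 */
demo_no_sanitize int demo_read(const int *p)
{
	return demo_READ_ONCE(*p);
}
```

Calling demo_read() once bumps demo_toggle_calls by two even though the
function is tagged with the attribute, which is the behaviour the paragraph
above objects to.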

If this is what we want, let me prepare a series on top of
-tip/locking/kcsan with all the things I think we need.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14  7:31                             ` Marco Elver
@ 2020-05-14 11:05                               ` Will Deacon
  2020-05-14 13:35                                 ` Marco Elver
  2020-06-03 18:52                                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Borislav Petkov
  0 siblings, 2 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-14 11:05 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

Hi Marco,

On Thu, May 14, 2020 at 09:31:49AM +0200, Marco Elver wrote:
> Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
> Clang 11 though.
> 
> Because without the data_race() around __*_ONCE,
> arch_atomic_{read,set} will be broken for KCSAN, but we can't have
> data_race() because it would still add
> kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
> the compilation unit is instrumented). We can't make arch_atomic functions
> __no_sanitize_or_inline, because even in code that we want to
> sanitize, they should remain __always_inline (so they work properly in
> __no_sanitize functions). Therefore, Clang 11 with support for
> distinguishing volatiles will be the compiler that will satisfy all
> the constraints.
> 
> If this is what we want, let me prepare a series on top of
> -tip/locking/kcsan with all the things I think we need.

Stepping back a second, the locking/kcsan branch is at least functional at
the moment by virtue of KCSAN_SANITIZE := n being used liberally in
arch/x86/. However, I still think we want to do better than that because (a)
it would be good to get more x86 coverage and (b) enabling this for arm64,
where objtool is not yet available, will be fragile if we have to whitelist
object files. There's also a fair bit of arm64 low-level code spread around
drivers/, so it feels like we'd end up with a really bad case of whack-a-mole.

Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
and the behaviour for __always_inline is:

  * An __always_inline function inlined into a __no_sanitize function is
    not instrumented
  * An __always_inline function inlined into an instrumented function is
    instrumented
  * You can't mark a function as both __always_inline and __no_sanitize, because
    __no_sanitize functions are never inlined

GCC, on the other hand, may still inline __no_sanitize functions and then
subsequently instrument them.
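
The three cases above can be sketched in userspace roughly as follows. This
is only an illustration: the sketch_* macros are hypothetical stand-ins for
the kernel's __always_inline and __no_sanitize, and whether the inlined load
is instrumented depends on building with -fsanitize=thread and on the
compiler version.

```c
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Hypothetical stand-ins for the kernel attribute macros. */
#define sketch_always_inline static inline __attribute__((__always_inline__))

#if __has_attribute(no_sanitize)
#define sketch_no_sanitize __attribute__((no_sanitize("thread")))
#else
#define sketch_no_sanitize
#endif

int shared_counter;

/*
 * Has no instrumentation policy of its own: the load is expected to take
 * on the policy of whichever function it ends up inlined into.
 */
sketch_always_inline int read_counter(const int *p)
{
	return *(const volatile int *)p;
}

/* Instrumented caller: the inlined load would be instrumented. */
int read_instrumented(void)
{
	return read_counter(&shared_counter);
}

/*
 * Non-instrumented caller: with the Clang behaviour described above the
 * inlined load would not be instrumented.
 */
sketch_no_sanitize int read_uninstrumented(void)
{
	return read_counter(&shared_counter);
}
```

Both callers return the same value; the only intended difference is in what
the sanitizer emits around the load, which has to be checked in the
disassembly rather than at run time.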

So if we're willing to make KCSAN depend on Clang >= 7, then we could:

  - Remove the data_race() from __{READ,WRITE}_ONCE()
  - Wrap arch_atomic*() in data_race() when called from the instrumented
    atomic wrappers

At which point, I *think* everything works as expected. READ_ONCE_NOCHECK()
won't generate any surprises, and Peter can happily use arch_atomic()
from non-instrumented code.
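
A minimal userspace sketch of that arrangement, assuming a data_race()
implemented as explicit disable/enable calls around the access. All sketch_*
names are hypothetical and do not mirror the real kernel definitions:

```c
/* Counts calls to the stand-in enable/disable hooks. */
int sketch_toggle_calls;
/* Mimics a per-task "checks disabled" nesting depth. */
int sketch_disable_depth;

static void sketch_kcsan_disable_current(void)
{
	sketch_disable_depth++;
	sketch_toggle_calls++;
}

static void sketch_kcsan_enable_current(void)
{
	sketch_disable_depth--;
	sketch_toggle_calls++;
}

/* Assumed shape of data_race(): explicit calls around the expression. */
#define sketch_data_race(expr)				\
({							\
	sketch_kcsan_disable_current();			\
	__typeof__(expr) v_ = (expr);			\
	sketch_kcsan_enable_current();			\
	v_;						\
})

/* Plain volatile load with no data_race() inside the macro itself. */
#define sketch_READ_ONCE_raw(x) (*(const volatile __typeof__(x) *)&(x))

typedef struct { int counter; } sketch_atomic_t;

/* Safe to call from non-instrumented code: contains no explicit hooks. */
static inline __attribute__((__always_inline__))
int sketch_arch_atomic_read(const sketch_atomic_t *v)
{
	return sketch_READ_ONCE_raw(v->counter);
}

/* Instrumented wrapper: the only place where data_race() is applied. */
int sketch_atomic_read(const sketch_atomic_t *v)
{
	return sketch_data_race(sketch_arch_atomic_read(v));
}
```

Here sketch_arch_atomic_read() never touches the hooks, while
sketch_atomic_read() calls each of them exactly once per read.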

Thoughts? I don't see the need to support buggy compilers when enabling
a new debug feature.

Will

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 13:58                     ` Marco Elver
@ 2020-05-14 11:21                       ` Peter Zijlstra
  2020-05-14 11:24                         ` Peter Zijlstra
                                           ` (3 more replies)
  2020-05-14 14:13                       ` Peter Zijlstra
  1 sibling, 4 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 11:21 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

[-- Attachment #1: Type: text/plain, Size: 3929 bytes --]

On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > So far so good, except: both __no_sanitize_or_inline and
> > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > just doesn't avoid explicit kcsan_check calls, like those in
> > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > just because macros won't be redefined just for __no_sanitize
> > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > access is unchecked.
> > >
> > > This will have the expected result:
> > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > >
> > > This will not work as expected:
> > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > > races are reported
> > >
> > > All this could be fixed if GCC devs would finally take my patch to
> > > make -fsanitize=thread distinguish volatile [1], but then we have to
> > > wait ~years for the new compilers to reach us. So please don't hold
> > > your breath for this one any time soon.
> > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
> >
> > Right, but that does not address the much larger issue of the attribute
> > vs inline trainwreck :/
> 
> Could you check if Clang is equally broken for you? I think GCC and
> Clang have differing behaviour on this. No idea what it takes to fix
> GCC though.

So I have some good and some maybe not so good news.

Given the patch below (on top of tglx's entry-v5-the-rest tag), I did
find that I could actually build alternative.o for gcc-{8,9,10} and
indeed clang-10. Any earlier gcc (I tried 5, 6 and 7) does not build:

../arch/x86/include/asm/ptrace.h:126:28: error: inlining failed in call to always_inline ‘user_mode’: function attribute mismatch

I dumped the poke_int3_handler output using objdump; see the attached
files.

It looks like clang-10 doesn't want to turn UBSAN off :/ The GCC files
look OK, no funny calls in those.

(the config has KASAN/UBSAN on, it looks like KCSAN and KASAN are
mutually exclusive)

---

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 77c83833d91e..06d8db612efc 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -990,7 +990,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
 	return 0;
 }
 
-int noinstr poke_int3_handler(struct pt_regs *regs)
+int noinstr __no_kcsan __no_sanitize_address __no_sanitize_undefined poke_int3_handler(struct pt_regs *regs)
 {
 	struct bp_patching_desc *desc;
 	struct text_poke_loc *tp;
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 2cb42d8bdedc..5e83aada6553 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -15,6 +15,13 @@
 /* all clang versions usable with the kernel support KASAN ABI version 5 */
 #define KASAN_ABI_VERSION 5
 
+#if __has_feature(undefined_sanitizer)
+#define __no_sanitize_undefined \
+               __attribute__((no_sanitize("undefined")))
+#else
+#define __no_sanitize_undefined
+#endif
+
 #if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
 /* Emulate GCC's __SANITIZE_ADDRESS__ flag */
 #define __SANITIZE_ADDRESS__
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 7dd4e0349ef3..8196a121a78e 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -138,6 +138,12 @@
 #define KASAN_ABI_VERSION 3
 #endif
 
+#if __has_attribute(__no_sanitize_undefined__)
+#define __no_sanitize_undefined __attribute__((no_sanitize_undefined))
+#else
+#define __no_sanitize_undefined
+#endif
+
 #if __has_attribute(__no_sanitize_address__)
 #define __no_sanitize_address __attribute__((no_sanitize_address))
 #else




[-- Attachment #2: poke_int3_handler-gcc8.asm --]
[-- Type: text/plain, Size: 5096 bytes --]

0000 0000000000000000 <poke_int3_handler>:
0000    0:	f6 87 88 00 00 00 03 	testb  $0x3,0x88(%rdi)
0007    7:	75 5d                	jne    66 <poke_int3_handler+0x66>
0009    9:	48 8b 35 00 00 00 00 	mov    0x0(%rip),%rsi        # 10 <poke_int3_handler+0x10>
000c 			c: R_X86_64_PC32	.bss+0x105c
0010   10:	48 85 f6             	test   %rsi,%rsi
0013   13:	74 51                	je     66 <poke_int3_handler+0x66>
0015   15:	8b 46 0c             	mov    0xc(%rsi),%eax
0018   18:	48 8d 56 0c          	lea    0xc(%rsi),%rdx
001c   1c:	85 c0                	test   %eax,%eax
001e   1e:	74 46                	je     66 <poke_int3_handler+0x66>
0020   20:	8d 48 01             	lea    0x1(%rax),%ecx
0023   23:	f0 0f b1 4e 0c       	lock cmpxchg %ecx,0xc(%rsi)
0028   28:	75 2d                	jne    57 <poke_int3_handler+0x57>
002a   2a:	48 8b 97 80 00 00 00 	mov    0x80(%rdi),%rdx
0031   31:	48 63 46 08          	movslq 0x8(%rsi),%rax
0035   35:	48 8b 0e             	mov    (%rsi),%rcx
0038   38:	4c 8d 42 ff          	lea    -0x1(%rdx),%r8
003c   3c:	83 f8 01             	cmp    $0x1,%eax
003f   3f:	7f 28                	jg     69 <poke_int3_handler+0x69>
0041   41:	4c 63 09             	movslq (%rcx),%r9
0044   44:	31 c0                	xor    %eax,%eax
0046   46:	49 81 c1 00 00 00 00 	add    $0x0,%r9
0049 			49: R_X86_64_32S	_stext
004d   4d:	4d 39 c8             	cmp    %r9,%r8
0050   50:	74 48                	je     9a <poke_int3_handler+0x9a>
0052   52:	f0 ff 4e 0c          	lock decl 0xc(%rsi)
0056   56:	c3                   	retq   
0057   57:	85 c0                	test   %eax,%eax
0059   59:	74 0b                	je     66 <poke_int3_handler+0x66>
005b   5b:	8d 48 01             	lea    0x1(%rax),%ecx
005e   5e:	f0 0f b1 0a          	lock cmpxchg %ecx,(%rdx)
0062   62:	74 c6                	je     2a <poke_int3_handler+0x2a>
0064   64:	eb f1                	jmp    57 <poke_int3_handler+0x57>
0066   66:	31 c0                	xor    %eax,%eax
0068   68:	c3                   	retq   
0069   69:	49 89 cb             	mov    %rcx,%r11
006c   6c:	49 89 c2             	mov    %rax,%r10
006f   6f:	49 d1 ea             	shr    %r10
0072   72:	4c 89 d1             	mov    %r10,%rcx
0075   75:	48 c1 e1 04          	shl    $0x4,%rcx
0079   79:	4c 01 d9             	add    %r11,%rcx
007c   7c:	4c 63 09             	movslq (%rcx),%r9
007f   7f:	49 81 c1 00 00 00 00 	add    $0x0,%r9
0082 			82: R_X86_64_32S	_stext
0086   86:	4d 39 c8             	cmp    %r9,%r8
0089   89:	0f 82 95 00 00 00    	jb     124 <poke_int3_handler+0x124>
008f   8f:	0f 87 84 00 00 00    	ja     119 <poke_int3_handler+0x119>
0095   95:	48 85 c9             	test   %rcx,%rcx
0098   98:	74 29                	je     c3 <poke_int3_handler+0xc3>
009a   9a:	0f b6 41 08          	movzbl 0x8(%rcx),%eax
009e   9e:	44 8d 48 34          	lea    0x34(%rax),%r9d
00a2   a2:	41 80 f9 1f          	cmp    $0x1f,%r9b
00a6   a6:	76 02                	jbe    aa <poke_int3_handler+0xaa>
00a8   a8:	0f 0b                	ud2    
00aa   aa:	45 0f b6 c9          	movzbl %r9b,%r9d
00ae   ae:	4d 0f be 89 00 00 00 	movsbq 0x0(%r9),%r9
00b5   b5:	00 
00b2 			b2: R_X86_64_32S	.rodata+0x620
00b6   b6:	4d 01 c8             	add    %r9,%r8
00b9   b9:	3c e8                	cmp    $0xe8,%al
00bb   bb:	74 2a                	je     e7 <poke_int3_handler+0xe7>
00bd   bd:	77 08                	ja     c7 <poke_int3_handler+0xc7>
00bf   bf:	3c cc                	cmp    $0xcc,%al
00c1   c1:	75 e5                	jne    a8 <poke_int3_handler+0xa8>
00c3   c3:	31 c0                	xor    %eax,%eax
00c5   c5:	eb 8b                	jmp    52 <poke_int3_handler+0x52>
00c7   c7:	3c e9                	cmp    $0xe9,%al
00c9   c9:	74 04                	je     cf <poke_int3_handler+0xcf>
00cb   cb:	3c eb                	cmp    $0xeb,%al
00cd   cd:	75 d9                	jne    a8 <poke_int3_handler+0xa8>
00cf   cf:	4c 63 49 04          	movslq 0x4(%rcx),%r9
00d3   d3:	b8 01 00 00 00       	mov    $0x1,%eax
00d8   d8:	4d 01 c8             	add    %r9,%r8
00db   db:	4c 89 87 80 00 00 00 	mov    %r8,0x80(%rdi)
00e2   e2:	e9 6b ff ff ff       	jmpq   52 <poke_int3_handler+0x52>
00e7   e7:	4c 63 49 04          	movslq 0x4(%rcx),%r9
00eb   eb:	48 8b 87 98 00 00 00 	mov    0x98(%rdi),%rax
00f2   f2:	48 83 c2 04          	add    $0x4,%rdx
00f6   f6:	48 8d 48 f8          	lea    -0x8(%rax),%rcx
00fa   fa:	4d 01 c8             	add    %r9,%r8
00fd   fd:	48 89 8f 98 00 00 00 	mov    %rcx,0x98(%rdi)
0104  104:	48 89 50 f8          	mov    %rdx,-0x8(%rax)
0108  108:	b8 01 00 00 00       	mov    $0x1,%eax
010d  10d:	4c 89 87 80 00 00 00 	mov    %r8,0x80(%rdi)
0114  114:	e9 39 ff ff ff       	jmpq   52 <poke_int3_handler+0x52>
0119  119:	4c 8d 50 ff          	lea    -0x1(%rax),%r10
011d  11d:	4c 8d 59 10          	lea    0x10(%rcx),%r11
0121  121:	49 d1 ea             	shr    %r10
0124  124:	4c 89 d0             	mov    %r10,%rax
0127  127:	4d 85 d2             	test   %r10,%r10
012a  12a:	0f 85 3c ff ff ff    	jne    6c <poke_int3_handler+0x6c>
0130  130:	e9 1d ff ff ff       	jmpq   52 <poke_int3_handler+0x52>

[-- Attachment #3: poke_int3_handler-gcc9.asm --]
[-- Type: text/plain, Size: 4737 bytes --]

0000 0000000000000000 <poke_int3_handler>:
0000    0:	f6 87 88 00 00 00 03 	testb  $0x3,0x88(%rdi)
0007    7:	75 4d                	jne    56 <poke_int3_handler+0x56>
0009    9:	48 8b 15 00 00 00 00 	mov    0x0(%rip),%rdx        # 10 <poke_int3_handler+0x10>
000c 			c: R_X86_64_PC32	.bss+0x105c
0010   10:	48 85 d2             	test   %rdx,%rdx
0013   13:	74 41                	je     56 <poke_int3_handler+0x56>
0015   15:	8b 42 0c             	mov    0xc(%rdx),%eax
0018   18:	48 8d 4a 0c          	lea    0xc(%rdx),%rcx
001c   1c:	85 c0                	test   %eax,%eax
001e   1e:	74 36                	je     56 <poke_int3_handler+0x56>
0020   20:	8d 70 01             	lea    0x1(%rax),%esi
0023   23:	f0 0f b1 31          	lock cmpxchg %esi,(%rcx)
0027   27:	75 f3                	jne    1c <poke_int3_handler+0x1c>
0029   29:	4c 8b 8f 80 00 00 00 	mov    0x80(%rdi),%r9
0030   30:	48 63 42 08          	movslq 0x8(%rdx),%rax
0034   34:	48 8b 32             	mov    (%rdx),%rsi
0037   37:	49 8d 49 ff          	lea    -0x1(%r9),%rcx
003b   3b:	83 f8 01             	cmp    $0x1,%eax
003e   3e:	7f 19                	jg     59 <poke_int3_handler+0x59>
0040   40:	4c 63 06             	movslq (%rsi),%r8
0043   43:	31 c0                	xor    %eax,%eax
0045   45:	49 81 c0 00 00 00 00 	add    $0x0,%r8
0048 			48: R_X86_64_32S	_stext
004c   4c:	4c 39 c1             	cmp    %r8,%rcx
004f   4f:	74 39                	je     8a <poke_int3_handler+0x8a>
0051   51:	f0 ff 4a 0c          	lock decl 0xc(%rdx)
0055   55:	c3                   	retq   
0056   56:	31 c0                	xor    %eax,%eax
0058   58:	c3                   	retq   
0059   59:	49 89 f3             	mov    %rsi,%r11
005c   5c:	49 89 c2             	mov    %rax,%r10
005f   5f:	49 d1 ea             	shr    %r10
0062   62:	4c 89 d6             	mov    %r10,%rsi
0065   65:	48 c1 e6 04          	shl    $0x4,%rsi
0069   69:	4c 01 de             	add    %r11,%rsi
006c   6c:	4c 63 06             	movslq (%rsi),%r8
006f   6f:	49 81 c0 00 00 00 00 	add    $0x0,%r8
0072 			72: R_X86_64_32S	_stext
0076   76:	4c 39 c1             	cmp    %r8,%rcx
0079   79:	0f 82 a2 00 00 00    	jb     121 <poke_int3_handler+0x121>
007f   7f:	0f 87 83 00 00 00    	ja     108 <poke_int3_handler+0x108>
0085   85:	48 85 f6             	test   %rsi,%rsi
0088   88:	74 45                	je     cf <poke_int3_handler+0xcf>
008a   8a:	0f b6 46 08          	movzbl 0x8(%rsi),%eax
008e   8e:	44 8d 40 34          	lea    0x34(%rax),%r8d
0092   92:	41 80 f8 1f          	cmp    $0x1f,%r8b
0096   96:	76 02                	jbe    9a <poke_int3_handler+0x9a>
0098   98:	0f 0b                	ud2    
009a   9a:	45 0f b6 c0          	movzbl %r8b,%r8d
009e   9e:	4d 0f be 80 00 00 00 	movsbq 0x0(%r8),%r8
00a5   a5:	00 
00a2 			a2: R_X86_64_32S	.rodata+0x620
00a6   a6:	4c 01 c1             	add    %r8,%rcx
00a9   a9:	3c e8                	cmp    $0xe8,%al
00ab   ab:	74 29                	je     d6 <poke_int3_handler+0xd6>
00ad   ad:	76 1c                	jbe    cb <poke_int3_handler+0xcb>
00af   af:	83 e0 fd             	and    $0xfffffffd,%eax
00b2   b2:	3c e9                	cmp    $0xe9,%al
00b4   b4:	75 e2                	jne    98 <poke_int3_handler+0x98>
00b6   b6:	48 63 46 04          	movslq 0x4(%rsi),%rax
00ba   ba:	48 01 c1             	add    %rax,%rcx
00bd   bd:	b8 01 00 00 00       	mov    $0x1,%eax
00c2   c2:	48 89 8f 80 00 00 00 	mov    %rcx,0x80(%rdi)
00c9   c9:	eb 86                	jmp    51 <poke_int3_handler+0x51>
00cb   cb:	3c cc                	cmp    $0xcc,%al
00cd   cd:	75 c9                	jne    98 <poke_int3_handler+0x98>
00cf   cf:	31 c0                	xor    %eax,%eax
00d1   d1:	e9 7b ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
00d6   d6:	48 63 46 04          	movslq 0x4(%rsi),%rax
00da   da:	49 83 c1 04          	add    $0x4,%r9
00de   de:	48 01 c1             	add    %rax,%rcx
00e1   e1:	48 8b 87 98 00 00 00 	mov    0x98(%rdi),%rax
00e8   e8:	48 8d 70 f8          	lea    -0x8(%rax),%rsi
00ec   ec:	48 89 b7 98 00 00 00 	mov    %rsi,0x98(%rdi)
00f3   f3:	4c 89 48 f8          	mov    %r9,-0x8(%rax)
00f7   f7:	b8 01 00 00 00       	mov    $0x1,%eax
00fc   fc:	48 89 8f 80 00 00 00 	mov    %rcx,0x80(%rdi)
0103  103:	e9 49 ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
0108  108:	48 83 e8 01          	sub    $0x1,%rax
010c  10c:	4c 8d 5e 10          	lea    0x10(%rsi),%r11
0110  110:	48 d1 e8             	shr    %rax
0113  113:	48 85 c0             	test   %rax,%rax
0116  116:	0f 85 40 ff ff ff    	jne    5c <poke_int3_handler+0x5c>
011c  11c:	e9 30 ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
0121  121:	4c 89 d0             	mov    %r10,%rax
0124  124:	eb ed                	jmp    113 <poke_int3_handler+0x113>

[-- Attachment #4: poke_int3_handler-gcc10.asm --]
[-- Type: text/plain, Size: 4737 bytes --]

0000 0000000000000000 <poke_int3_handler>:
0000    0:	f6 87 88 00 00 00 03 	testb  $0x3,0x88(%rdi)
0007    7:	75 4d                	jne    56 <poke_int3_handler+0x56>
0009    9:	48 8b 15 00 00 00 00 	mov    0x0(%rip),%rdx        # 10 <poke_int3_handler+0x10>
000c 			c: R_X86_64_PC32	.bss+0x105c
0010   10:	48 85 d2             	test   %rdx,%rdx
0013   13:	74 41                	je     56 <poke_int3_handler+0x56>
0015   15:	8b 42 0c             	mov    0xc(%rdx),%eax
0018   18:	48 8d 4a 0c          	lea    0xc(%rdx),%rcx
001c   1c:	85 c0                	test   %eax,%eax
001e   1e:	74 36                	je     56 <poke_int3_handler+0x56>
0020   20:	8d 70 01             	lea    0x1(%rax),%esi
0023   23:	f0 0f b1 31          	lock cmpxchg %esi,(%rcx)
0027   27:	75 f3                	jne    1c <poke_int3_handler+0x1c>
0029   29:	4c 8b 8f 80 00 00 00 	mov    0x80(%rdi),%r9
0030   30:	48 63 42 08          	movslq 0x8(%rdx),%rax
0034   34:	48 8b 32             	mov    (%rdx),%rsi
0037   37:	49 8d 49 ff          	lea    -0x1(%r9),%rcx
003b   3b:	83 f8 01             	cmp    $0x1,%eax
003e   3e:	7f 19                	jg     59 <poke_int3_handler+0x59>
0040   40:	4c 63 06             	movslq (%rsi),%r8
0043   43:	31 c0                	xor    %eax,%eax
0045   45:	49 81 c0 00 00 00 00 	add    $0x0,%r8
0048 			48: R_X86_64_32S	_stext
004c   4c:	4c 39 c1             	cmp    %r8,%rcx
004f   4f:	74 39                	je     8a <poke_int3_handler+0x8a>
0051   51:	f0 ff 4a 0c          	lock decl 0xc(%rdx)
0055   55:	c3                   	retq   
0056   56:	31 c0                	xor    %eax,%eax
0058   58:	c3                   	retq   
0059   59:	49 89 f3             	mov    %rsi,%r11
005c   5c:	49 89 c2             	mov    %rax,%r10
005f   5f:	49 d1 ea             	shr    %r10
0062   62:	4c 89 d6             	mov    %r10,%rsi
0065   65:	48 c1 e6 04          	shl    $0x4,%rsi
0069   69:	4c 01 de             	add    %r11,%rsi
006c   6c:	4c 63 06             	movslq (%rsi),%r8
006f   6f:	49 81 c0 00 00 00 00 	add    $0x0,%r8
0072 			72: R_X86_64_32S	_stext
0076   76:	4c 39 c1             	cmp    %r8,%rcx
0079   79:	0f 82 a2 00 00 00    	jb     121 <poke_int3_handler+0x121>
007f   7f:	0f 87 83 00 00 00    	ja     108 <poke_int3_handler+0x108>
0085   85:	48 85 f6             	test   %rsi,%rsi
0088   88:	74 45                	je     cf <poke_int3_handler+0xcf>
008a   8a:	0f b6 46 08          	movzbl 0x8(%rsi),%eax
008e   8e:	44 8d 40 34          	lea    0x34(%rax),%r8d
0092   92:	41 80 f8 1f          	cmp    $0x1f,%r8b
0096   96:	76 02                	jbe    9a <poke_int3_handler+0x9a>
0098   98:	0f 0b                	ud2    
009a   9a:	45 0f b6 c0          	movzbl %r8b,%r8d
009e   9e:	4d 0f be 80 00 00 00 	movsbq 0x0(%r8),%r8
00a5   a5:	00 
00a2 			a2: R_X86_64_32S	.rodata+0x620
00a6   a6:	4c 01 c1             	add    %r8,%rcx
00a9   a9:	3c e8                	cmp    $0xe8,%al
00ab   ab:	74 29                	je     d6 <poke_int3_handler+0xd6>
00ad   ad:	76 1c                	jbe    cb <poke_int3_handler+0xcb>
00af   af:	83 e0 fd             	and    $0xfffffffd,%eax
00b2   b2:	3c e9                	cmp    $0xe9,%al
00b4   b4:	75 e2                	jne    98 <poke_int3_handler+0x98>
00b6   b6:	48 63 46 04          	movslq 0x4(%rsi),%rax
00ba   ba:	48 01 c1             	add    %rax,%rcx
00bd   bd:	b8 01 00 00 00       	mov    $0x1,%eax
00c2   c2:	48 89 8f 80 00 00 00 	mov    %rcx,0x80(%rdi)
00c9   c9:	eb 86                	jmp    51 <poke_int3_handler+0x51>
00cb   cb:	3c cc                	cmp    $0xcc,%al
00cd   cd:	75 c9                	jne    98 <poke_int3_handler+0x98>
00cf   cf:	31 c0                	xor    %eax,%eax
00d1   d1:	e9 7b ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
00d6   d6:	48 63 46 04          	movslq 0x4(%rsi),%rax
00da   da:	49 83 c1 04          	add    $0x4,%r9
00de   de:	48 01 c1             	add    %rax,%rcx
00e1   e1:	48 8b 87 98 00 00 00 	mov    0x98(%rdi),%rax
00e8   e8:	48 8d 70 f8          	lea    -0x8(%rax),%rsi
00ec   ec:	48 89 b7 98 00 00 00 	mov    %rsi,0x98(%rdi)
00f3   f3:	4c 89 48 f8          	mov    %r9,-0x8(%rax)
00f7   f7:	b8 01 00 00 00       	mov    $0x1,%eax
00fc   fc:	48 89 8f 80 00 00 00 	mov    %rcx,0x80(%rdi)
0103  103:	e9 49 ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
0108  108:	48 83 e8 01          	sub    $0x1,%rax
010c  10c:	4c 8d 5e 10          	lea    0x10(%rsi),%r11
0110  110:	48 d1 e8             	shr    %rax
0113  113:	48 85 c0             	test   %rax,%rax
0116  116:	0f 85 40 ff ff ff    	jne    5c <poke_int3_handler+0x5c>
011c  11c:	e9 30 ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
0121  121:	4c 89 d0             	mov    %r10,%rax
0124  124:	eb ed                	jmp    113 <poke_int3_handler+0x113>

[-- Attachment #5: poke_int3_handler-clang10.asm --]
[-- Type: text/plain, Size: 7799 bytes --]

0000 0000000000000000 <poke_int3_handler>:
0000    0:	55                   	push   %rbp
0001    1:	48 89 e5             	mov    %rsp,%rbp
0004    4:	41 57                	push   %r15
0006    6:	41 56                	push   %r14
0008    8:	41 55                	push   %r13
000a    a:	41 54                	push   %r12
000c    c:	53                   	push   %rbx
000d    d:	45 31 ff             	xor    %r15d,%r15d
0010   10:	f6 87 88 00 00 00 03 	testb  $0x3,0x88(%rdi)
0017   17:	0f 85 3a 01 00 00    	jne    157 <poke_int3_handler+0x157>
001d   1d:	4c 8b 2d 00 00 00 00 	mov    0x0(%rip),%r13        # 24 <poke_int3_handler+0x24>
0020 			20: R_X86_64_PC32	.bss+0x4
0024   24:	4d 85 ed             	test   %r13,%r13
0027   27:	0f 84 2a 01 00 00    	je     157 <poke_int3_handler+0x157>
002d   2d:	45 8b 65 0c          	mov    0xc(%r13),%r12d
0031   31:	45 85 e4             	test   %r12d,%r12d
0034   34:	0f 84 1d 01 00 00    	je     157 <poke_int3_handler+0x157>
003a   3a:	49 89 fe             	mov    %rdi,%r14
003d   3d:	0f 1f 00             	nopl   (%rax)
0040   40:	41 8d 4c 24 01       	lea    0x1(%r12),%ecx
0045   45:	31 db                	xor    %ebx,%ebx
0047   47:	44 89 e0             	mov    %r12d,%eax
004a   4a:	f0 41 0f b1 4d 0c    	lock cmpxchg %ecx,0xc(%r13)
0050   50:	0f 94 c3             	sete   %bl
0053   53:	41 89 c4             	mov    %eax,%r12d
0056   56:	80 fb 01             	cmp    $0x1,%bl
0059   59:	77 0e                	ja     69 <poke_int3_handler+0x69>
005b   5b:	84 db                	test   %bl,%bl
005d   5d:	75 2c                	jne    8b <poke_int3_handler+0x8b>
005f   5f:	45 85 e4             	test   %r12d,%r12d
0062   62:	75 dc                	jne    40 <poke_int3_handler+0x40>
0064   64:	e9 ee 00 00 00       	jmpq   157 <poke_int3_handler+0x157>
0069   69:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
006c 			6c: R_X86_64_32S	.data+0x160
0070   70:	48 89 de             	mov    %rbx,%rsi
0073   73:	e8 00 00 00 00       	callq  78 <poke_int3_handler+0x78>
0074 			74: R_X86_64_PLT32	__ubsan_handle_load_invalid_value-0x4
0078   78:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
007b 			7b: R_X86_64_32S	.data+0x180
007f   7f:	48 89 de             	mov    %rbx,%rsi
0082   82:	e8 00 00 00 00       	callq  87 <poke_int3_handler+0x87>
0083 			83: R_X86_64_PLT32	__ubsan_handle_load_invalid_value-0x4
0087   87:	84 db                	test   %bl,%bl
0089   89:	74 d4                	je     5f <poke_int3_handler+0x5f>
008b   8b:	49 8b 86 80 00 00 00 	mov    0x80(%r14),%rax
0092   92:	48 83 c0 ff          	add    $0xffffffffffffffff,%rax
0096   96:	41 83 7d 08 02       	cmpl   $0x2,0x8(%r13)
009b   9b:	0f 8d c4 00 00 00    	jge    165 <poke_int3_handler+0x165>
00a1   a1:	49 8b 4d 00          	mov    0x0(%r13),%rcx
00a5   a5:	48 63 11             	movslq (%rcx),%rdx
00a8   a8:	48 8d 92 00 00 00 00 	lea    0x0(%rdx),%rdx
00ab 			ab: R_X86_64_32S	_stext
00af   af:	45 31 ff             	xor    %r15d,%r15d
00b2   b2:	48 39 c2             	cmp    %rax,%rdx
00b5   b5:	0f 85 97 00 00 00    	jne    152 <poke_int3_handler+0x152>
00bb   bb:	8a 59 08             	mov    0x8(%rcx),%bl
00be   be:	31 d2                	xor    %edx,%edx
00c0   c0:	80 fb e8             	cmp    $0xe8,%bl
00c3   c3:	7f 0c                	jg     d1 <poke_int3_handler+0xd1>
00c5   c5:	80 fb cc             	cmp    $0xcc,%bl
00c8   c8:	74 1f                	je     e9 <poke_int3_handler+0xe9>
00ca   ca:	80 fb e8             	cmp    $0xe8,%bl
00cd   cd:	74 13                	je     e2 <poke_int3_handler+0xe2>
00cf   cf:	eb 1d                	jmp    ee <poke_int3_handler+0xee>
00d1   d1:	80 fb e9             	cmp    $0xe9,%bl
00d4   d4:	74 0c                	je     e2 <poke_int3_handler+0xe2>
00d6   d6:	80 fb eb             	cmp    $0xeb,%bl
00d9   d9:	75 13                	jne    ee <poke_int3_handler+0xee>
00db   db:	ba 02 00 00 00       	mov    $0x2,%edx
00e0   e0:	eb 0c                	jmp    ee <poke_int3_handler+0xee>
00e2   e2:	ba 05 00 00 00       	mov    $0x5,%edx
00e7   e7:	eb 05                	jmp    ee <poke_int3_handler+0xee>
00e9   e9:	ba 01 00 00 00       	mov    $0x1,%edx
00ee   ee:	48 01 d0             	add    %rdx,%rax
00f1   f1:	8a 51 08             	mov    0x8(%rcx),%dl
00f4   f4:	80 fa e8             	cmp    $0xe8,%dl
00f7   f7:	7f 3b                	jg     134 <poke_int3_handler+0x134>
00f9   f9:	45 31 ff             	xor    %r15d,%r15d
00fc   fc:	80 fa cc             	cmp    $0xcc,%dl
00ff   ff:	74 51                	je     152 <poke_int3_handler+0x152>
0101  101:	80 fa e8             	cmp    $0xe8,%dl
0104  104:	0f 85 b1 00 00 00    	jne    1bb <poke_int3_handler+0x1bb>
010a  10a:	48 63 49 04          	movslq 0x4(%rcx),%rcx
010e  10e:	48 01 c1             	add    %rax,%rcx
0111  111:	49 8b 86 80 00 00 00 	mov    0x80(%r14),%rax
0118  118:	49 8b 96 98 00 00 00 	mov    0x98(%r14),%rdx
011f  11f:	48 83 c0 04          	add    $0x4,%rax
0123  123:	48 8d 72 f8          	lea    -0x8(%rdx),%rsi
0127  127:	49 89 b6 98 00 00 00 	mov    %rsi,0x98(%r14)
012e  12e:	48 89 42 f8          	mov    %rax,-0x8(%rdx)
0132  132:	eb 11                	jmp    145 <poke_int3_handler+0x145>
0134  134:	80 fa e9             	cmp    $0xe9,%dl
0137  137:	74 05                	je     13e <poke_int3_handler+0x13e>
0139  139:	80 fa eb             	cmp    $0xeb,%dl
013c  13c:	75 7d                	jne    1bb <poke_int3_handler+0x1bb>
013e  13e:	48 63 49 04          	movslq 0x4(%rcx),%rcx
0142  142:	48 01 c1             	add    %rax,%rcx
0145  145:	49 89 8e 80 00 00 00 	mov    %rcx,0x80(%r14)
014c  14c:	41 bf 01 00 00 00    	mov    $0x1,%r15d
0152  152:	f0 41 ff 4d 0c       	lock decl 0xc(%r13)
0157  157:	44 89 f8             	mov    %r15d,%eax
015a  15a:	5b                   	pop    %rbx
015b  15b:	41 5c                	pop    %r12
015d  15d:	41 5d                	pop    %r13
015f  15f:	41 5e                	pop    %r14
0161  161:	41 5f                	pop    %r15
0163  163:	5d                   	pop    %rbp
0164  164:	c3                   	retq   
0165  165:	49 63 55 08          	movslq 0x8(%r13),%rdx
0169  169:	45 31 ff             	xor    %r15d,%r15d
016c  16c:	48 85 d2             	test   %rdx,%rdx
016f  16f:	74 e1                	je     152 <poke_int3_handler+0x152>
0171  171:	49 8b 4d 00          	mov    0x0(%r13),%rcx
0175  175:	eb 05                	jmp    17c <poke_int3_handler+0x17c>
0177  177:	48 d1 ea             	shr    %rdx
017a  17a:	74 d6                	je     152 <poke_int3_handler+0x152>
017c  17c:	48 89 d6             	mov    %rdx,%rsi
017f  17f:	48 83 e6 fe          	and    $0xfffffffffffffffe,%rsi
0183  183:	48 63 3c f1          	movslq (%rcx,%rsi,8),%rdi
0187  187:	48 8d bf 00 00 00 00 	lea    0x0(%rdi),%rdi
018a 			18a: R_X86_64_32S	_stext
018e  18e:	48 39 c7             	cmp    %rax,%rdi
0191  191:	77 e4                	ja     177 <poke_int3_handler+0x177>
0193  193:	48 8d 0c f1          	lea    (%rcx,%rsi,8),%rcx
0197  197:	48 63 31             	movslq (%rcx),%rsi
019a  19a:	48 8d b6 00 00 00 00 	lea    0x0(%rsi),%rsi
019d 			19d: R_X86_64_32S	_stext
01a1  1a1:	48 39 c6             	cmp    %rax,%rsi
01a4  1a4:	73 0a                	jae    1b0 <poke_int3_handler+0x1b0>
01a6  1a6:	48 83 c1 10          	add    $0x10,%rcx
01aa  1aa:	48 83 c2 ff          	add    $0xffffffffffffffff,%rdx
01ae  1ae:	eb c7                	jmp    177 <poke_int3_handler+0x177>
01b0  1b0:	48 85 c9             	test   %rcx,%rcx
01b3  1b3:	0f 85 02 ff ff ff    	jne    bb <poke_int3_handler+0xbb>
01b9  1b9:	eb 97                	jmp    152 <poke_int3_handler+0x152>
01bb  1bb:	0f 0b                	ud2    
01bd  1bd:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
01c0 			1c0: R_X86_64_32S	.data+0xc0
01c4  1c4:	e8 00 00 00 00       	callq  1c9 <int3_selftest>
01c5 			1c5: R_X86_64_PLT32	__ubsan_handle_builtin_unreachable-0x4


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 11:21                       ` Peter Zijlstra
@ 2020-05-14 11:24                         ` Peter Zijlstra
  2020-05-14 11:35                         ` Peter Zijlstra
                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 11:24 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > > So far so good, except: both __no_sanitize_or_inline and
> > > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > > just doesn't avoid explicit kcsan_check calls, like those in
> > > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > > just because macros won't be redefined just for __no_sanitize
> > > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > > access is unchecked.
> > > >
> > > > This will have the expected result:
> > > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > > >
> > > > This will not work as expected:
> > > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > > > races are reported
> > > >
> > > > All this could be fixed if GCC devs would finally take my patch to
> > > > make -fsanitize=thread distinguish volatile [1], but then we have to
> > > > wait ~years for the new compilers to reach us. So please don't hold
> > > > your breath for this one any time soon.
> > > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
> > >
> > > Right, but that does not address the much larger issue of the attribute
> > > > vs inline trainwreck :/
> > 
> > Could you check if Clang is equally broken for you? I think GCC and
> > Clang have differing behaviour on this. No idea what it takes to fix
> > GCC though.
> 
> So I have some good and some maybe not so good news.
> 
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10. Any earlier gcc (I tried 5, 6, 7) does not build:
> 
> ../arch/x86/include/asm/ptrace.h:126:28: error: inlining failed in call to always_inline ‘user_mode’: function attribute mismatch
> 
> I dumped the poke_int3_handler output using objdump, find the attached
> files.
> 
> It looks like clang-10 doesn't want to turn UBSAN off :/ The GCC files
> look OK, no funny calls in those.
> 
> (the config has KASAN/UBSAN on, it looks like KCSAN and KASAN are
> mutually exclusive)

I just swapped them and rebuilt with gcc-10 and that still looks ok.


0000 0000000000000000 <poke_int3_handler>:
0000    0:	f6 87 88 00 00 00 03 	testb  $0x3,0x88(%rdi)
0007    7:	75 4d                	jne    56 <poke_int3_handler+0x56>
0009    9:	48 8b 15 00 00 00 00 	mov    0x0(%rip),%rdx        # 10 <poke_int3_handler+0x10>
000c 			c: R_X86_64_PC32	.bss+0x101c
0010   10:	48 85 d2             	test   %rdx,%rdx
0013   13:	74 41                	je     56 <poke_int3_handler+0x56>
0015   15:	8b 42 0c             	mov    0xc(%rdx),%eax
0018   18:	48 8d 4a 0c          	lea    0xc(%rdx),%rcx
001c   1c:	85 c0                	test   %eax,%eax
001e   1e:	74 36                	je     56 <poke_int3_handler+0x56>
0020   20:	8d 70 01             	lea    0x1(%rax),%esi
0023   23:	f0 0f b1 31          	lock cmpxchg %esi,(%rcx)
0027   27:	75 f3                	jne    1c <poke_int3_handler+0x1c>
0029   29:	4c 8b 8f 80 00 00 00 	mov    0x80(%rdi),%r9
0030   30:	48 63 42 08          	movslq 0x8(%rdx),%rax
0034   34:	48 8b 32             	mov    (%rdx),%rsi
0037   37:	49 8d 49 ff          	lea    -0x1(%r9),%rcx
003b   3b:	83 f8 01             	cmp    $0x1,%eax
003e   3e:	7f 19                	jg     59 <poke_int3_handler+0x59>
0040   40:	4c 63 06             	movslq (%rsi),%r8
0043   43:	31 c0                	xor    %eax,%eax
0045   45:	49 81 c0 00 00 00 00 	add    $0x0,%r8
0048 			48: R_X86_64_32S	_stext
004c   4c:	4c 39 c1             	cmp    %r8,%rcx
004f   4f:	74 39                	je     8a <poke_int3_handler+0x8a>
0051   51:	f0 ff 4a 0c          	lock decl 0xc(%rdx)
0055   55:	c3                   	retq   
0056   56:	31 c0                	xor    %eax,%eax
0058   58:	c3                   	retq   
0059   59:	49 89 f3             	mov    %rsi,%r11
005c   5c:	49 89 c2             	mov    %rax,%r10
005f   5f:	49 d1 ea             	shr    %r10
0062   62:	4c 89 d6             	mov    %r10,%rsi
0065   65:	48 c1 e6 04          	shl    $0x4,%rsi
0069   69:	4c 01 de             	add    %r11,%rsi
006c   6c:	4c 63 06             	movslq (%rsi),%r8
006f   6f:	49 81 c0 00 00 00 00 	add    $0x0,%r8
0072 			72: R_X86_64_32S	_stext
0076   76:	4c 39 c1             	cmp    %r8,%rcx
0079   79:	0f 82 a2 00 00 00    	jb     121 <poke_int3_handler+0x121>
007f   7f:	0f 87 83 00 00 00    	ja     108 <poke_int3_handler+0x108>
0085   85:	48 85 f6             	test   %rsi,%rsi
0088   88:	74 45                	je     cf <poke_int3_handler+0xcf>
008a   8a:	0f b6 46 08          	movzbl 0x8(%rsi),%eax
008e   8e:	44 8d 40 34          	lea    0x34(%rax),%r8d
0092   92:	41 80 f8 1f          	cmp    $0x1f,%r8b
0096   96:	76 02                	jbe    9a <poke_int3_handler+0x9a>
0098   98:	0f 0b                	ud2    
009a   9a:	45 0f b6 c0          	movzbl %r8b,%r8d
009e   9e:	4d 0f be 80 00 00 00 	movsbq 0x0(%r8),%r8
00a5   a5:	00 
00a2 			a2: R_X86_64_32S	.rodata
00a6   a6:	4c 01 c1             	add    %r8,%rcx
00a9   a9:	3c e8                	cmp    $0xe8,%al
00ab   ab:	74 29                	je     d6 <poke_int3_handler+0xd6>
00ad   ad:	76 1c                	jbe    cb <poke_int3_handler+0xcb>
00af   af:	83 e0 fd             	and    $0xfffffffd,%eax
00b2   b2:	3c e9                	cmp    $0xe9,%al
00b4   b4:	75 e2                	jne    98 <poke_int3_handler+0x98>
00b6   b6:	48 63 46 04          	movslq 0x4(%rsi),%rax
00ba   ba:	48 01 c1             	add    %rax,%rcx
00bd   bd:	b8 01 00 00 00       	mov    $0x1,%eax
00c2   c2:	48 89 8f 80 00 00 00 	mov    %rcx,0x80(%rdi)
00c9   c9:	eb 86                	jmp    51 <poke_int3_handler+0x51>
00cb   cb:	3c cc                	cmp    $0xcc,%al
00cd   cd:	75 c9                	jne    98 <poke_int3_handler+0x98>
00cf   cf:	31 c0                	xor    %eax,%eax
00d1   d1:	e9 7b ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
00d6   d6:	48 63 46 04          	movslq 0x4(%rsi),%rax
00da   da:	49 83 c1 04          	add    $0x4,%r9
00de   de:	48 01 c1             	add    %rax,%rcx
00e1   e1:	48 8b 87 98 00 00 00 	mov    0x98(%rdi),%rax
00e8   e8:	48 8d 70 f8          	lea    -0x8(%rax),%rsi
00ec   ec:	48 89 b7 98 00 00 00 	mov    %rsi,0x98(%rdi)
00f3   f3:	4c 89 48 f8          	mov    %r9,-0x8(%rax)
00f7   f7:	b8 01 00 00 00       	mov    $0x1,%eax
00fc   fc:	48 89 8f 80 00 00 00 	mov    %rcx,0x80(%rdi)
0103  103:	e9 49 ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
0108  108:	48 83 e8 01          	sub    $0x1,%rax
010c  10c:	4c 8d 5e 10          	lea    0x10(%rsi),%r11
0110  110:	48 d1 e8             	shr    %rax
0113  113:	48 85 c0             	test   %rax,%rax
0116  116:	0f 85 40 ff ff ff    	jne    5c <poke_int3_handler+0x5c>
011c  11c:	e9 30 ff ff ff       	jmpq   51 <poke_int3_handler+0x51>
0121  121:	4c 89 d0             	mov    %r10,%rax
0124  124:	eb ed                	jmp    113 <poke_int3_handler+0x113>


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 11:21                       ` Peter Zijlstra
  2020-05-14 11:24                         ` Peter Zijlstra
@ 2020-05-14 11:35                         ` Peter Zijlstra
  2020-05-14 12:01                         ` Will Deacon
  2020-05-14 12:20                         ` Peter Zijlstra
  3 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 11:35 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> So I have some good and some maybe not so good news.
> 
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10. 


And, for completeness, here's the vomit from gcc-10 without the patch
for a KASAN+UBSAN build:


0000 0000000000000000 <poke_int3_handler>:
0000    0:	41 57                	push   %r15
0002    2:	41 56                	push   %r14
0004    4:	41 55                	push   %r13
0006    6:	41 54                	push   %r12
0008    8:	55                   	push   %rbp
0009    9:	53                   	push   %rbx
000a    a:	48 89 fb             	mov    %rdi,%rbx
000d    d:	48 81 c7 88 00 00 00 	add    $0x88,%rdi
0014   14:	48 83 ec 10          	sub    $0x10,%rsp
0018   18:	e8 00 00 00 00       	callq  1d <poke_int3_handler+0x1d>
0019 			19: R_X86_64_PLT32	__asan_load8_noabort-0x4
001d   1d:	f6 83 88 00 00 00 03 	testb  $0x3,0x88(%rbx)
0024   24:	0f 85 99 00 00 00    	jne    c3 <poke_int3_handler+0xc3>
002a   2a:	48 8b 2d 00 00 00 00 	mov    0x0(%rip),%rbp        # 31 <poke_int3_handler+0x31>
002d 			2d: R_X86_64_PC32	.bss+0x105c
0031   31:	48 85 ed             	test   %rbp,%rbp
0034   34:	0f 84 89 00 00 00    	je     c3 <poke_int3_handler+0xc3>
003a   3a:	4c 8d 65 0c          	lea    0xc(%rbp),%r12
003e   3e:	4c 89 e7             	mov    %r12,%rdi
0041   41:	e8 00 00 00 00       	callq  46 <poke_int3_handler+0x46>
0042 			42: R_X86_64_PLT32	__asan_load4_noabort-0x4
0046   46:	8b 45 0c             	mov    0xc(%rbp),%eax
0049   49:	85 c0                	test   %eax,%eax
004b   4b:	74 76                	je     c3 <poke_int3_handler+0xc3>
004d   4d:	8d 50 01             	lea    0x1(%rax),%edx
0050   50:	f0 41 0f b1 14 24    	lock cmpxchg %edx,(%r12)
0056   56:	75 f1                	jne    49 <poke_int3_handler+0x49>
0058   58:	48 8d bb 80 00 00 00 	lea    0x80(%rbx),%rdi
005f   5f:	e8 00 00 00 00       	callq  64 <poke_int3_handler+0x64>
0060 			60: R_X86_64_PLT32	__asan_load8_noabort-0x4
0064   64:	48 8b 83 80 00 00 00 	mov    0x80(%rbx),%rax
006b   6b:	48 8d 7d 08          	lea    0x8(%rbp),%rdi
006f   6f:	48 89 04 24          	mov    %rax,(%rsp)
0073   73:	4c 8d 60 ff          	lea    -0x1(%rax),%r12
0077   77:	e8 00 00 00 00       	callq  7c <poke_int3_handler+0x7c>
0078 			78: R_X86_64_PLT32	__asan_load4_noabort-0x4
007c   7c:	4c 63 7d 08          	movslq 0x8(%rbp),%r15
0080   80:	48 89 ef             	mov    %rbp,%rdi
0083   83:	e8 00 00 00 00       	callq  88 <poke_int3_handler+0x88>
0084 			84: R_X86_64_PLT32	__asan_load8_noabort-0x4
0088   88:	4c 8b 6d 00          	mov    0x0(%rbp),%r13
008c   8c:	41 83 ff 01          	cmp    $0x1,%r15d
0090   90:	7f 36                	jg     c8 <poke_int3_handler+0xc8>
0092   92:	4c 89 ef             	mov    %r13,%rdi
0095   95:	e8 00 00 00 00       	callq  9a <poke_int3_handler+0x9a>
0096 			96: R_X86_64_PLT32	__asan_load4_noabort-0x4
009a   9a:	49 63 55 00          	movslq 0x0(%r13),%rdx
009e   9e:	45 31 c0             	xor    %r8d,%r8d
00a1   a1:	48 81 c2 00 00 00 00 	add    $0x0,%rdx
00a4 			a4: R_X86_64_32S	_stext
00a8   a8:	49 39 d4             	cmp    %rdx,%r12
00ab   ab:	74 5d                	je     10a <poke_int3_handler+0x10a>
00ad   ad:	f0 ff 4d 0c          	lock decl 0xc(%rbp)
00b1   b1:	48 83 c4 10          	add    $0x10,%rsp
00b5   b5:	44 89 c0             	mov    %r8d,%eax
00b8   b8:	5b                   	pop    %rbx
00b9   b9:	5d                   	pop    %rbp
00ba   ba:	41 5c                	pop    %r12
00bc   bc:	41 5d                	pop    %r13
00be   be:	41 5e                	pop    %r14
00c0   c0:	41 5f                	pop    %r15
00c2   c2:	c3                   	retq   
00c3   c3:	45 31 c0             	xor    %r8d,%r8d
00c6   c6:	eb e9                	jmp    b1 <poke_int3_handler+0xb1>
00c8   c8:	4c 89 6c 24 08       	mov    %r13,0x8(%rsp)
00cd   cd:	4d 89 fe             	mov    %r15,%r14
00d0   d0:	48 8b 74 24 08       	mov    0x8(%rsp),%rsi
00d5   d5:	49 d1 ee             	shr    %r14
00d8   d8:	4c 89 f0             	mov    %r14,%rax
00db   db:	48 c1 e0 04          	shl    $0x4,%rax
00df   df:	4c 8d 2c 06          	lea    (%rsi,%rax,1),%r13
00e3   e3:	4c 89 ef             	mov    %r13,%rdi
00e6   e6:	e8 00 00 00 00       	callq  eb <poke_int3_handler+0xeb>
00e7 			e7: R_X86_64_PLT32	__asan_load4_noabort-0x4
00eb   eb:	49 63 4d 00          	movslq 0x0(%r13),%rcx
00ef   ef:	48 81 c1 00 00 00 00 	add    $0x0,%rcx
00f2 			f2: R_X86_64_32S	_stext
00f6   f6:	49 39 cc             	cmp    %rcx,%r12
00f9   f9:	0f 82 fd 00 00 00    	jb     1fc <poke_int3_handler+0x1fc>
00ff   ff:	0f 87 d9 00 00 00    	ja     1de <poke_int3_handler+0x1de>
0105  105:	4d 85 ed             	test   %r13,%r13
0108  108:	74 7b                	je     185 <poke_int3_handler+0x185>
010a  10a:	49 8d 7d 08          	lea    0x8(%r13),%rdi
010e  10e:	e8 00 00 00 00       	callq  113 <poke_int3_handler+0x113>
010f 			10f: R_X86_64_PLT32	__asan_load1_noabort-0x4
0113  113:	45 0f b6 75 08       	movzbl 0x8(%r13),%r14d
0118  118:	45 8d 7e 34          	lea    0x34(%r14),%r15d
011c  11c:	41 80 ff 1f          	cmp    $0x1f,%r15b
0120  120:	76 0e                	jbe    130 <poke_int3_handler+0x130>
0122  122:	0f 0b                	ud2    
0124  124:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
0127 			127: R_X86_64_32S	.data+0xc0
012b  12b:	e8 00 00 00 00       	callq  130 <poke_int3_handler+0x130>
012c 			12c: R_X86_64_PLT32	__ubsan_handle_builtin_unreachable-0x4
0130  130:	45 0f b6 ff          	movzbl %r15b,%r15d
0134  134:	49 8d bf 00 00 00 00 	lea    0x0(%r15),%rdi
0137 			137: R_X86_64_32S	.rodata+0x620
013b  13b:	e8 00 00 00 00       	callq  140 <poke_int3_handler+0x140>
013c 			13c: R_X86_64_PLT32	__asan_load1_noabort-0x4
0140  140:	49 0f be 97 00 00 00 	movsbq 0x0(%r15),%rdx
0147  147:	00 
0144 			144: R_X86_64_32S	.rodata+0x620
0148  148:	49 01 d4             	add    %rdx,%r12
014b  14b:	41 80 fe e8          	cmp    $0xe8,%r14b
014f  14f:	74 3c                	je     18d <poke_int3_handler+0x18d>
0151  151:	76 2c                	jbe    17f <poke_int3_handler+0x17f>
0153  153:	41 83 e6 fd          	and    $0xfffffffd,%r14d
0157  157:	41 80 fe e9          	cmp    $0xe9,%r14b
015b  15b:	75 c5                	jne    122 <poke_int3_handler+0x122>
015d  15d:	49 8d 7d 04          	lea    0x4(%r13),%rdi
0161  161:	e8 00 00 00 00       	callq  166 <poke_int3_handler+0x166>
0162 			162: R_X86_64_PLT32	__asan_load4_noabort-0x4
0166  166:	49 63 45 04          	movslq 0x4(%r13),%rax
016a  16a:	41 b8 01 00 00 00    	mov    $0x1,%r8d
0170  170:	49 01 c4             	add    %rax,%r12
0173  173:	4c 89 a3 80 00 00 00 	mov    %r12,0x80(%rbx)
017a  17a:	e9 2e ff ff ff       	jmpq   ad <poke_int3_handler+0xad>
017f  17f:	41 80 fe cc          	cmp    $0xcc,%r14b
0183  183:	75 9d                	jne    122 <poke_int3_handler+0x122>
0185  185:	45 31 c0             	xor    %r8d,%r8d
0188  188:	e9 20 ff ff ff       	jmpq   ad <poke_int3_handler+0xad>
018d  18d:	49 8d 7d 04          	lea    0x4(%r13),%rdi
0191  191:	e8 00 00 00 00       	callq  196 <poke_int3_handler+0x196>
0192 			192: R_X86_64_PLT32	__asan_load4_noabort-0x4
0196  196:	49 63 45 04          	movslq 0x4(%r13),%rax
019a  19a:	48 8d bb 98 00 00 00 	lea    0x98(%rbx),%rdi
01a1  1a1:	49 01 c4             	add    %rax,%r12
01a4  1a4:	e8 00 00 00 00       	callq  1a9 <poke_int3_handler+0x1a9>
01a5 			1a5: R_X86_64_PLT32	__asan_load8_noabort-0x4
01a9  1a9:	4c 8b b3 98 00 00 00 	mov    0x98(%rbx),%r14
01b0  1b0:	49 8d 7e f8          	lea    -0x8(%r14),%rdi
01b4  1b4:	48 89 bb 98 00 00 00 	mov    %rdi,0x98(%rbx)
01bb  1bb:	e8 00 00 00 00       	callq  1c0 <poke_int3_handler+0x1c0>
01bc 			1bc: R_X86_64_PLT32	__asan_store8_noabort-0x4
01c0  1c0:	4c 8b 2c 24          	mov    (%rsp),%r13
01c4  1c4:	41 b8 01 00 00 00    	mov    $0x1,%r8d
01ca  1ca:	49 83 c5 04          	add    $0x4,%r13
01ce  1ce:	4d 89 6e f8          	mov    %r13,-0x8(%r14)
01d2  1d2:	4c 89 a3 80 00 00 00 	mov    %r12,0x80(%rbx)
01d9  1d9:	e9 cf fe ff ff       	jmpq   ad <poke_int3_handler+0xad>
01de  1de:	49 8d 45 10          	lea    0x10(%r13),%rax
01e2  1e2:	4d 8d 77 ff          	lea    -0x1(%r15),%r14
01e6  1e6:	48 89 44 24 08       	mov    %rax,0x8(%rsp)
01eb  1eb:	49 d1 ee             	shr    %r14
01ee  1ee:	4d 89 f7             	mov    %r14,%r15
01f1  1f1:	4d 85 ff             	test   %r15,%r15
01f4  1f4:	0f 85 d3 fe ff ff    	jne    cd <poke_int3_handler+0xcd>
01fa  1fa:	eb 89                	jmp    185 <poke_int3_handler+0x185>
01fc  1fc:	4d 89 f7             	mov    %r14,%r15
01ff  1ff:	eb f0                	jmp    1f1 <poke_int3_handler+0x1f1>


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 11:21                       ` Peter Zijlstra
  2020-05-14 11:24                         ` Peter Zijlstra
  2020-05-14 11:35                         ` Peter Zijlstra
@ 2020-05-14 12:01                         ` Will Deacon
  2020-05-14 12:27                           ` Peter Zijlstra
  2020-05-14 12:20                         ` Peter Zijlstra
  3 siblings, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-14 12:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Marco Elver, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Wed, May 13, 2020 at 03:15:55PM +0200, Marco Elver wrote:
> > > > So far so good, except: both __no_sanitize_or_inline and
> > > > __no_kcsan_or_inline *do* avoid KCSAN instrumenting plain accesses, it
> > > > just doesn't avoid explicit kcsan_check calls, like those in
> > > > READ/WRITE_ONCE if KCSAN is enabled for the compilation unit. That's
> > > > just because macros won't be redefined just for __no_sanitize
> > > > functions. Similarly, READ_ONCE_NOCHECK does work as expected, and its
> > > > access is unchecked.
> > > >
> > > > This will have the expected result:
> > > > __no_sanitize_or_inline void foo(void) { x++; } // no data races reported
> > > >
> > > > This will not work as expected:
> > > > __no_sanitize_or_inline void foo(void) { READ_ONCE(x); }  // data
> > > > races are reported
> > > >
> > > > All this could be fixed if GCC devs would finally take my patch to
> > > > make -fsanitize=thread distinguish volatile [1], but then we have to
> > > > wait ~years for the new compilers to reach us. So please don't hold
> > > > your breath for this one any time soon.
> > > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544452.html
> > >
> > > Right, but that does not address the much larger issue of the attribute
> > > > vs inline trainwreck :/
> > 
> > Could you check if Clang is equally broken for you? I think GCC and
> > Clang have differing behaviour on this. No idea what it takes to fix
> > GCC though.
> 
> So I have some good and some maybe not so good news.
> 
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10. Any earlier gcc (I tried 5, 6, 7) does not build:
> 
> ../arch/x86/include/asm/ptrace.h:126:28: error: inlining failed in call to always_inline ‘user_mode’: function attribute mismatch
> 
> I dumped the poke_int3_handler output using objdump, find the attached
> files.
> 
> It looks like clang-10 doesn't want to turn UBSAN off :/ The GCC files
> look OK, no funny calls in those.
> 
> (the config has KASAN/UBSAN on, it looks like KCSAN and KASAN are
> mutually exclusive)
> 
> ---
> 
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 77c83833d91e..06d8db612efc 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -990,7 +990,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
>  	return 0;
>  }
>  
> -int noinstr poke_int3_handler(struct pt_regs *regs)
> +int noinstr __no_kcsan __no_sanitize_address __no_sanitize_undefined poke_int3_handler(struct pt_regs *regs)
>  {
>  	struct bp_patching_desc *desc;
>  	struct text_poke_loc *tp;
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index 2cb42d8bdedc..5e83aada6553 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -15,6 +15,13 @@
>  /* all clang versions usable with the kernel support KASAN ABI version 5 */
>  #define KASAN_ABI_VERSION 5
>  
> +#if __has_feature(undefined_sanitizer)

Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
(and damn is that hard for a Brit to type out!)
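
A minimal sketch of what that check could look like, assuming the macro
should expand to nothing when UBSAN is off. The fallback definition of
__has_feature() and the demo_* names are illustrative assumptions, not
part of the posted patch:

```c
/*
 * Sketch only: mirrors the shape of a compiler-clang.h style macro.
 * Compilers without __has_feature() get a stub that reports every
 * feature as absent.
 */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(undefined_behavior_sanitizer)
/* UBSAN is active for this translation unit: opt the function out. */
#define __no_sanitize_undefined __attribute__((no_sanitize("undefined")))
#define DEMO_UBSAN_VISIBLE 1	/* hypothetical flag, for illustration */
#else
/* No UBSAN: the annotation is a no-op. */
#define __no_sanitize_undefined
#define DEMO_UBSAN_VISIBLE 0	/* hypothetical flag, for illustration */
#endif

/* Hypothetical annotated function; the shift is the kind of thing UBSAN checks. */
__no_sanitize_undefined
static int demo_shift(int value, int amount)
{
	return value << amount;
}

/* Reports which branch of the #if above was taken. */
static int demo_ubsan_visible(void)
{
	return DEMO_UBSAN_VISIBLE;
}
```

Either way the annotated function builds; only the branch taken differs
between a UBSAN build and a plain one.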

Will


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 11:21                       ` Peter Zijlstra
                                           ` (2 preceding siblings ...)
  2020-05-14 12:01                         ` Will Deacon
@ 2020-05-14 12:20                         ` Peter Zijlstra
  3 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 12:20 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 01:21:42PM +0200, Peter Zijlstra wrote:
> Given the patch below (on top of tglx's entry-v5-the-rest tag); I did
> find that I could actually build alternative.o for gcc-{8,9,10} and
> indeed clang-10. Any earlier gcc (I tried 5, 6, 7) does not build:

Damn! I forgot the patch from https://lkml.kernel.org/r/20200513111447.GE3001@hirez.programming.kicks-ass.net

With that included, on a GCC-10 KCSAN+UBSAN build, I now get this, and
that is very much not okay. This is the thing Will complained about as
well I think.

Hohumm :-(
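
For context, the kcsan_disable_current() / kcsan_enable_current_nowarn()
pairs in the listing further down appear to come from data_race() being
expanded around each marked load. Below is a simplified user-space model
of that pattern, assuming data_race() brackets its expression with a
disable/enable pair. The model_* names and the call counters are
hypothetical stand-ins, not kernel API, and the macro relies on GNU
statement expressions (gcc or clang):

```c
/* Hypothetical stubs standing in for the KCSAN runtime hooks. */
static int model_disable_calls;
static int model_enable_calls;

static void model_kcsan_disable_current(void) { model_disable_calls++; }
static void model_kcsan_enable_current(void)  { model_enable_calls++; }

/*
 * Evaluate expr with race reporting switched off, then switch it back on.
 * Each use costs two out-of-line calls around the access.
 */
#define model_data_race(expr)					\
({								\
	__typeof__(({ expr; })) __v = ({			\
		model_kcsan_disable_current();			\
		expr;						\
	});							\
	model_kcsan_enable_current();				\
	__v;							\
})

/* A volatile load wrapped the way the patched __READ_ONCE() wraps it. */
#define model_read_once(x) \
	model_data_race(*(const volatile __typeof__(x) *)&(x))

static int model_shared = 42;

/* One marked read: expect exactly one disable and one enable call. */
static int model_reader(void)
{
	return model_read_once(model_shared);
}
```

In this model the function attribute has no say: the calls are emitted by
the macro expansion itself, so every marked access in the function pays
for the pair.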

---
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index d6d61c4455fa..ba89cabe5fcf 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -28,10 +28,6 @@ KASAN_SANITIZE_dumpstack_$(BITS).o			:= n
 KASAN_SANITIZE_stacktrace.o				:= n
 KASAN_SANITIZE_paravirt.o				:= n
 
-# With some compiler versions the generated code results in boot hangs, caused
-# by several compilation units. To be safe, disable all instrumentation.
-KCSAN_SANITIZE := n
-
 OBJECT_FILES_NON_STANDARD_test_nx.o			:= y
 OBJECT_FILES_NON_STANDARD_paravirt_patch.o		:= y
 
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 77c83833d91e..06d8db612efc 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -990,7 +990,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
 	return 0;
 }
 
-int noinstr poke_int3_handler(struct pt_regs *regs)
+int noinstr __no_kcsan __no_sanitize_address __no_sanitize_undefined poke_int3_handler(struct pt_regs *regs)
 {
 	struct bp_patching_desc *desc;
 	struct text_poke_loc *tp;
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 2cb42d8bdedc..c728ae9dcf96 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -15,6 +15,9 @@
 /* all clang versions usable with the kernel support KASAN ABI version 5 */
 #define KASAN_ABI_VERSION 5
 
+#define __no_sanitize_undefined \
+		__attribute__((no_sanitize("undefined")))
+
 #if __has_feature(address_sanitizer) || __has_feature(hwaddress_sanitizer)
 /* Emulate GCC's __SANITIZE_ADDRESS__ flag */
 #define __SANITIZE_ADDRESS__
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 7dd4e0349ef3..8196a121a78e 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -138,6 +138,12 @@
 #define KASAN_ABI_VERSION 3
 #endif
 
+#if __has_attribute(__no_sanitize_undefined__)
+#define __no_sanitize_undefined __attribute__((no_sanitize_undefined))
+#else
+#define __no_sanitize_undefined
+#endif
+
 #if __has_attribute(__no_sanitize_address__)
 #define __no_sanitize_address __attribute__((no_sanitize_address))
 #else
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3bb962959d8b..2ea532b19e75 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -241,12 +241,12 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  * atomicity or dependency ordering guarantees. Note that this may result
  * in tears!
  */
-#define __READ_ONCE(x)	(*(const volatile __unqual_scalar_typeof(x) *)&(x))
+#define __READ_ONCE(x)	data_race((*(const volatile __unqual_scalar_typeof(x) *)&(x)))
 
 #define __READ_ONCE_SCALAR(x)						\
 ({									\
 	typeof(x) *__xp = &(x);						\
-	__unqual_scalar_typeof(x) __x = data_race(__READ_ONCE(*__xp));	\
+	__unqual_scalar_typeof(x) __x = __READ_ONCE(*__xp);		\
 	kcsan_check_atomic_read(__xp, sizeof(*__xp));			\
 	smp_read_barrier_depends();					\
 	(typeof(x))__x;							\
@@ -260,14 +260,14 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
 
 #define __WRITE_ONCE(x, val)						\
 do {									\
-	*(volatile typeof(x) *)&(x) = (val);				\
+	data_race(*(volatile typeof(x) *)&(x) = (val));			\
 } while (0)
 
 #define __WRITE_ONCE_SCALAR(x, val)					\
 do {									\
 	typeof(x) *__xp = &(x);						\
 	kcsan_check_atomic_write(__xp, sizeof(*__xp));			\
-	data_race(({ __WRITE_ONCE(*__xp, val); 0; }));			\
+	__WRITE_ONCE(*__xp, val);					\
 } while (0)
 
 #define WRITE_ONCE(x, val)						\

---
0000 0000000000000000 <poke_int3_handler>:
0000    0:	41 55                	push   %r13
0002    2:	41 54                	push   %r12
0004    4:	55                   	push   %rbp
0005    5:	53                   	push   %rbx
0006    6:	48 83 ec 10          	sub    $0x10,%rsp
000a    a:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
0011   11:	00 00 
0013   13:	48 89 44 24 08       	mov    %rax,0x8(%rsp)
0018   18:	31 c0                	xor    %eax,%eax
001a   1a:	f6 87 88 00 00 00 03 	testb  $0x3,0x88(%rdi)
0021   21:	74 21                	je     44 <poke_int3_handler+0x44>
0023   23:	31 c0                	xor    %eax,%eax
0025   25:	48 8b 4c 24 08       	mov    0x8(%rsp),%rcx
002a   2a:	65 48 2b 0c 25 28 00 	sub    %gs:0x28,%rcx
0031   31:	00 00 
0033   33:	0f 85 79 01 00 00    	jne    1b2 <poke_int3_handler+0x1b2>
0039   39:	48 83 c4 10          	add    $0x10,%rsp
003d   3d:	5b                   	pop    %rbx
003e   3e:	5d                   	pop    %rbp
003f   3f:	41 5c                	pop    %r12
0041   41:	41 5d                	pop    %r13
0043   43:	c3                   	retq   
0044   44:	48 89 fb             	mov    %rdi,%rbx
0047   47:	e8 00 00 00 00       	callq  4c <poke_int3_handler+0x4c>
0048 			48: R_X86_64_PLT32	kcsan_disable_current-0x4
004c   4c:	4c 8b 2d 00 00 00 00 	mov    0x0(%rip),%r13        # 53 <poke_int3_handler+0x53>
004f 			4f: R_X86_64_PC32	.bss+0x101c
0053   53:	48 8d 6c 24 08       	lea    0x8(%rsp),%rbp
0058   58:	49 89 e4             	mov    %rsp,%r12
005b   5b:	48 89 e8             	mov    %rbp,%rax
005e   5e:	4c 29 e0             	sub    %r12,%rax
0061   61:	48 83 f8 08          	cmp    $0x8,%rax
0065   65:	0f 87 4c 01 00 00    	ja     1b7 <poke_int3_handler+0x1b7>
006b   6b:	4c 29 e5             	sub    %r12,%rbp
006e   6e:	4c 89 2c 24          	mov    %r13,(%rsp)
0072   72:	e8 00 00 00 00       	callq  77 <poke_int3_handler+0x77>
0073 			73: R_X86_64_PLT32	kcsan_enable_current_nowarn-0x4
0077   77:	48 83 fd 08          	cmp    $0x8,%rbp
007b   7b:	0f 87 53 01 00 00    	ja     1d4 <poke_int3_handler+0x1d4>
0081   81:	4c 8b 24 24          	mov    (%rsp),%r12
0085   85:	4d 85 e4             	test   %r12,%r12
0088   88:	74 99                	je     23 <poke_int3_handler+0x23>
008a   8a:	e8 00 00 00 00       	callq  8f <poke_int3_handler+0x8f>
008b 			8b: R_X86_64_PLT32	kcsan_disable_current-0x4
008f   8f:	4d 8d 6c 24 0c       	lea    0xc(%r12),%r13
0094   94:	41 8b 6c 24 0c       	mov    0xc(%r12),%ebp
0099   99:	e8 00 00 00 00       	callq  9e <poke_int3_handler+0x9e>
009a 			9a: R_X86_64_PLT32	kcsan_enable_current_nowarn-0x4
009e   9e:	85 ed                	test   %ebp,%ebp
00a0   a0:	74 81                	je     23 <poke_int3_handler+0x23>
00a2   a2:	8d 55 01             	lea    0x1(%rbp),%edx
00a5   a5:	89 e8                	mov    %ebp,%eax
00a7   a7:	f0 41 0f b1 55 00    	lock cmpxchg %edx,0x0(%r13)
00ad   ad:	89 c5                	mov    %eax,%ebp
00af   af:	75 ed                	jne    9e <poke_int3_handler+0x9e>
00b1   b1:	48 8b bb 80 00 00 00 	mov    0x80(%rbx),%rdi
00b8   b8:	49 63 44 24 08       	movslq 0x8(%r12),%rax
00bd   bd:	49 8b 0c 24          	mov    (%r12),%rcx
00c1   c1:	48 8d 57 ff          	lea    -0x1(%rdi),%rdx
00c5   c5:	83 f8 01             	cmp    $0x1,%eax
00c8   c8:	7f 1c                	jg     e6 <poke_int3_handler+0xe6>
00ca   ca:	48 63 31             	movslq (%rcx),%rsi
00cd   cd:	31 c0                	xor    %eax,%eax
00cf   cf:	48 81 c6 00 00 00 00 	add    $0x0,%rsi
00d2 			d2: R_X86_64_32S	_stext
00d6   d6:	48 39 f2             	cmp    %rsi,%rdx
00d9   d9:	74 3c                	je     117 <poke_int3_handler+0x117>
00db   db:	f0 41 ff 4c 24 0c    	lock decl 0xc(%r12)
00e1   e1:	e9 3f ff ff ff       	jmpq   25 <poke_int3_handler+0x25>
00e6   e6:	49 89 c9             	mov    %rcx,%r9
00e9   e9:	49 89 c0             	mov    %rax,%r8
00ec   ec:	49 d1 e8             	shr    %r8
00ef   ef:	4c 89 c1             	mov    %r8,%rcx
00f2   f2:	48 c1 e1 04          	shl    $0x4,%rcx
00f6   f6:	4c 01 c9             	add    %r9,%rcx
00f9   f9:	48 63 31             	movslq (%rcx),%rsi
00fc   fc:	48 81 c6 00 00 00 00 	add    $0x0,%rsi
00ff 			ff: R_X86_64_32S	_stext
0103  103:	48 39 f2             	cmp    %rsi,%rdx
0106  106:	0f 82 88 00 00 00    	jb     194 <poke_int3_handler+0x194>
010c  10c:	0f 87 93 00 00 00    	ja     1a5 <poke_int3_handler+0x1a5>
0112  112:	48 85 c9             	test   %rcx,%rcx
0115  115:	74 28                	je     13f <poke_int3_handler+0x13f>
0117  117:	0f b6 41 08          	movzbl 0x8(%rcx),%eax
011b  11b:	8d 70 34             	lea    0x34(%rax),%esi
011e  11e:	40 80 fe 1f          	cmp    $0x1f,%sil
0122  122:	76 02                	jbe    126 <poke_int3_handler+0x126>
0124  124:	0f 0b                	ud2    
0126  126:	40 0f b6 f6          	movzbl %sil,%esi
012a  12a:	48 0f be b6 00 00 00 	movsbq 0x0(%rsi),%rsi
0131  131:	00 
012e 			12e: R_X86_64_32S	.rodata
0132  132:	48 01 f2             	add    %rsi,%rdx
0135  135:	3c e8                	cmp    $0xe8,%al
0137  137:	74 29                	je     162 <poke_int3_handler+0x162>
0139  139:	77 08                	ja     143 <poke_int3_handler+0x143>
013b  13b:	3c cc                	cmp    $0xcc,%al
013d  13d:	75 e5                	jne    124 <poke_int3_handler+0x124>
013f  13f:	31 c0                	xor    %eax,%eax
0141  141:	eb 98                	jmp    db <poke_int3_handler+0xdb>
0143  143:	83 e0 fd             	and    $0xfffffffd,%eax
0146  146:	3c e9                	cmp    $0xe9,%al
0148  148:	75 da                	jne    124 <poke_int3_handler+0x124>
014a  14a:	48 63 41 04          	movslq 0x4(%rcx),%rax
014e  14e:	48 01 c2             	add    %rax,%rdx
0151  151:	b8 01 00 00 00       	mov    $0x1,%eax
0156  156:	48 89 93 80 00 00 00 	mov    %rdx,0x80(%rbx)
015d  15d:	e9 79 ff ff ff       	jmpq   db <poke_int3_handler+0xdb>
0162  162:	48 63 41 04          	movslq 0x4(%rcx),%rax
0166  166:	48 83 c7 04          	add    $0x4,%rdi
016a  16a:	48 01 c2             	add    %rax,%rdx
016d  16d:	48 8b 83 98 00 00 00 	mov    0x98(%rbx),%rax
0174  174:	48 8d 48 f8          	lea    -0x8(%rax),%rcx
0178  178:	48 89 8b 98 00 00 00 	mov    %rcx,0x98(%rbx)
017f  17f:	48 89 78 f8          	mov    %rdi,-0x8(%rax)
0183  183:	b8 01 00 00 00       	mov    $0x1,%eax
0188  188:	48 89 93 80 00 00 00 	mov    %rdx,0x80(%rbx)
018f  18f:	e9 47 ff ff ff       	jmpq   db <poke_int3_handler+0xdb>
0194  194:	4c 89 c0             	mov    %r8,%rax
0197  197:	48 85 c0             	test   %rax,%rax
019a  19a:	0f 85 49 ff ff ff    	jne    e9 <poke_int3_handler+0xe9>
01a0  1a0:	e9 36 ff ff ff       	jmpq   db <poke_int3_handler+0xdb>
01a5  1a5:	48 83 e8 01          	sub    $0x1,%rax
01a9  1a9:	4c 8d 49 10          	lea    0x10(%rcx),%r9
01ad  1ad:	48 d1 e8             	shr    %rax
01b0  1b0:	eb e5                	jmp    197 <poke_int3_handler+0x197>
01b2  1b2:	e8 00 00 00 00       	callq  1b7 <poke_int3_handler+0x1b7>
01b3 			1b3: R_X86_64_PLT32	__stack_chk_fail-0x4
01b7  1b7:	4c 01 e0             	add    %r12,%rax
01ba  1ba:	0f 82 ab fe ff ff    	jb     6b <poke_int3_handler+0x6b>
01c0  1c0:	4c 89 e6             	mov    %r12,%rsi
01c3  1c3:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
01c6 			1c6: R_X86_64_32S	.data+0x80
01ca  1ca:	e8 00 00 00 00       	callq  1cf <poke_int3_handler+0x1cf>
01cb 			1cb: R_X86_64_PLT32	__ubsan_handle_type_mismatch_v1-0x4
01cf  1cf:	e9 97 fe ff ff       	jmpq   6b <poke_int3_handler+0x6b>
01d4  1d4:	4c 01 e5             	add    %r12,%rbp
01d7  1d7:	0f 82 a4 fe ff ff    	jb     81 <poke_int3_handler+0x81>
01dd  1dd:	4c 89 e6             	mov    %r12,%rsi
01e0  1e0:	48 c7 c7 00 00 00 00 	mov    $0x0,%rdi
01e3 			1e3: R_X86_64_32S	.data+0x60
01e7  1e7:	e8 00 00 00 00       	callq  1ec <poke_int3_handler+0x1ec>
01e8 			1e8: R_X86_64_PLT32	__ubsan_handle_type_mismatch_v1-0x4
01ec  1ec:	e9 90 fe ff ff       	jmpq   81 <poke_int3_handler+0x81>

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 12:01                         ` Will Deacon
@ 2020-05-14 12:27                           ` Peter Zijlstra
  2020-05-14 13:07                             ` Marco Elver
  0 siblings, 1 reply; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 12:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: Marco Elver, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 01:01:04PM +0100, Will Deacon wrote:

> > +#if __has_feature(undefined_sanitizer)
> 
> Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
> (and damn is that hard for a Brit to type out!)

(I know right, it should be behaviour, dammit!)

I tried without the condition, eg.:

+#define __no_sanitize_undefined \
+               __attribute__((no_sanitize("undefined")))

and it still generated UBSAN gunk.
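
For reference, a minimal userspace sketch of the guarded definition under
discussion. The feature name undefined_behavior_sanitizer is Clang's; the
fallback definitions, UBSAN_ENABLED, load_int() and the check function are
made up for illustration. Whether the attribute really suppresses every
__ubsan_handle_*() call is exactly what is in question here, so this only
shows the shape of the macro, not a fix.

```c
#include <assert.h>

/* GCC before 14 has no __has_feature(); treat every feature as absent. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

/* Only emit the attribute if the compiler recognises it at all. */
#if __has_attribute(no_sanitize)
#define __no_sanitize_undefined __attribute__((no_sanitize("undefined")))
#else
#define __no_sanitize_undefined
#endif

/* Hypothetical flag: 1 when Clang reports that UBSAN is enabled. */
#if __has_feature(undefined_behavior_sanitizer)
#define UBSAN_ENABLED 1
#else
#define UBSAN_ENABLED 0
#endif

/* Hypothetical function we would like to keep free of UBSAN handlers. */
__no_sanitize_undefined int load_int(const int *p)
{
	return *p;
}

/* The attribute must not change what the function computes. */
void check_no_sanitize_sketch(void)
{
	int v = 5;

	assert(load_int(&v) == 5);
}
```

Building this with and without -fsanitize=undefined and grepping the
disassembly of load_int() for ubsan is one way to see what the attribute
does on a given compiler.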

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 12:27                           ` Peter Zijlstra
@ 2020-05-14 13:07                             ` Marco Elver
  2020-05-14 13:14                               ` Peter Zijlstra
  0 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-14 13:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, 14 May 2020 at 14:27, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, May 14, 2020 at 01:01:04PM +0100, Will Deacon wrote:
>
> > > +#if __has_feature(undefined_sanitizer)
> >
> > Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
> > (and damn is that hard for a Brit to type out!)
>
> (I know right, it should be behaviour, dammit!)
>
> I tried without the condition, eg.:
>
> +#define __no_sanitize_undefined \
> +               __attribute__((no_sanitize("undefined")))
>
> and it still generated UBSAN gunk.

Which ubsan calls are left? I'm trying to reproduce.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 13:07                             ` Marco Elver
@ 2020-05-14 13:14                               ` Peter Zijlstra
  0 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 13:14 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 03:07:08PM +0200, Marco Elver wrote:
> On Thu, 14 May 2020 at 14:27, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, May 14, 2020 at 01:01:04PM +0100, Will Deacon wrote:
> >
> > > > +#if __has_feature(undefined_sanitizer)
> > >
> > > Hmm, this might want to be __has_feature(undefined_behavior_sanitizer)
> > > (and damn is that hard for a Brit to type out!)
> >
> > (I know right, it should be behaviour, dammit!)
> >
> > I tried without the condition, eg.:
> >
> > +#define __no_sanitize_undefined \
> > +               __attribute__((no_sanitize("undefined")))
> >
> > and it still generated UBSAN gunk.
> 
> Which ubsan calls are left? I'm trying to reproduce.

To be more precise, the patches were on top of:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git entry-v5-the-rest

$ grep ubsan poke_int3_handler-clang10.asm
0074                    74: R_X86_64_PLT32      __ubsan_handle_load_invalid_value-0x4
0083                    83: R_X86_64_PLT32      __ubsan_handle_load_invalid_value-0x4
01c5                    1c5: R_X86_64_PLT32     __ubsan_handle_builtin_unreachable-0x4

I think the config was defconfig_x86-64 inspired with KASAN+UBSAN
enabled.

So I build with:

  touch arch/x86/kernel/alternative.c;
  make CC=clang-10 V=1 O=defconfig-build/ arch/x86/kernel/alternative.o

And then dump the output with:

  ./objdump-func.sh defconfig-build/arch/x86/kernel/alternative.o poke_int3_handler

$ # cat objdump-func.sh
#!/bin/bash
objdump -dr $1 | awk "/^\$/ { P=0; } /$2[^>]*>:\$/ { P=1; O=strtonum(\"0x\" \$1); } { if (P) { o=strtonum(\"0x\" \$1); printf(\"%04x \", o-O); print \$0; } }"


Hope that is enough to reproduce.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 11:05                               ` Will Deacon
@ 2020-05-14 13:35                                 ` Marco Elver
  2020-05-14 13:47                                   ` Peter Zijlstra
                                                     ` (5 more replies)
  2020-06-03 18:52                                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Borislav Petkov
  1 sibling, 6 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-14 13:35 UTC (permalink / raw)
  To: Will Deacon, kasan-dev
  Cc: Peter Zijlstra, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, 14 May 2020 at 13:05, Will Deacon <will@kernel.org> wrote:
>
> Hi Marco,
>
> On Thu, May 14, 2020 at 09:31:49AM +0200, Marco Elver wrote:
> > Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
> > Clang 11 though.
> >
> > Because without the data_race() around __*_ONCE,
> > arch_atomic_{read,set} will be broken for KCSAN, but we can't have
> > data_race() because it would still add
> > kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
> > compilation unit is instrumented). We can't make arch_atomic functions
> > __no_sanitize_or_inline, because even in code that we want to
> > sanitize, they should remain __always_inline (so they work properly in
> > __no_sanitize functions). Therefore, Clang 11 with support for
> > distinguishing volatiles will be the compiler that will satisfy all
> > the constraints.
> >
> > If this is what we want, let me prepare a series on top of
> > -tip/locking/kcsan with all the things I think we need.
>
> Stepping back a second, the locking/kcsan branch is at least functional at
> the moment by virtue of KCSAN_SANITIZE := n being used liberally in
> arch/x86/. However, I still think we want to do better than that because (a)
> it would be good to get more x86 coverage and (b) enabling this for arm64,
> where objtool is not yet available, will be fragile if we have to whitelist
> object files. There's also a fair bit of arm64 low-level code spread around
> drivers/, so it feels like we'd end up with a really bad case of whack-a-mole.
>
> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> and the behaviour for __always_inline is:
>
>   * An __always_inline function inlined into a __no_sanitize function is
>     not instrumented
>   * An __always_inline function inlined into an instrumented function is
>     instrumented
>   * You can't mark a function as both __always_inline __no_sanitize, because
>     __no_sanitize functions are never inlined
>
> GCC, on the other hand, may still inline __no_sanitize functions and then
> subsequently instrument them.
>
> So if we're willing to make KCSAN depend on Clang >= 7, then we could:
>
>   - Remove the data_race() from __{READ,WRITE}_ONCE()
>   - Wrap arch_atomic*() in data_race() when called from the instrumented
>     atomic wrappers
>
> At which point, I *think* everything works as expected. READ_ONCE_NOCHECK()
> won't generate any surprises, and Peter can happily use arch_atomic()
> from non-instrumented code.
>
> Thoughts? I don't see the need to support buggy compilers when enabling
> a new debug feature.

This is also a reply to
https://lkml.kernel.org/r/20200514122038.GH3001@hirez.programming.kicks-ass.net
-- the problem with __READ_ONCE would be solved with what Will
proposed above.

Let me try to spell out the requirements I see so far (this is for
KCSAN only though -- other sanitizers might be similar):

  1. __no_kcsan functions should not call anything, not even
kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE.
[Requires leaving data_race() off of these.]

  2. An __always_inline function inlined into a __no_sanitize function is
not instrumented. [Has always been satisfied by GCC and Clang.]

  3. An __always_inline function inlined into an instrumented function is
instrumented. [Has always been satisfied by GCC and Clang.]

  4. __no_kcsan functions should never be spuriously inlined into
instrumented functions, causing the accesses of the __no_kcsan
function to be instrumented. [Satisfied by Clang >= 7. All GCC
versions are broken.]

  5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
we'd need to add data_race() around the arch-calls in
atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
support (GCC 11 might get this as well).]

  6. never emit __tsan_func_{entry,exit}. [Clang supports disabling
this, GCC doesn't.]

  7. kernel is supported by compiler. [Clang >= 9 seems to build -tip
for me, anything below complains about lack of asm goto. GCC trivial.]
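
Requirement 1 can be illustrated with a small userspace sketch. The hook
names match the calls visible in the disassembly earlier in the thread, but
here they are stubs that only count invocations; the macros are simplified
stand-ins, not the kernel definitions, and the helper functions are made up
for the example.

```c
#include <assert.h>

/* Stand-ins for the KCSAN runtime hooks; here they only count calls. */
static int hook_calls;
static void kcsan_disable_current(void)       { hook_calls++; }
static void kcsan_enable_current_nowarn(void) { hook_calls++; }

/* Bare volatile load: no function calls, as requirement 1 demands. */
#define __READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))

/* Simplified data_race(): brackets the expression with both hooks. */
#define data_race(expr)						\
({								\
	kcsan_disable_current();				\
	__typeof__(expr) __v = (expr);				\
	kcsan_enable_current_nowarn();				\
	__v;							\
})

static int shared = 42;

/* Hook calls made while reading 'shared' without data_race(). */
int calls_for_plain_read(void)
{
	int before = hook_calls;

	(void)__READ_ONCE(shared);
	return hook_calls - before;
}

/* Hook calls made while reading 'shared' through data_race(). */
int calls_for_data_race_read(void)
{
	int before = hook_calls;

	(void)data_race(__READ_ONCE(shared));
	return hook_calls - before;
}

void check_requirement_1(void)
{
	assert(calls_for_plain_read() == 0);
	assert(calls_for_data_race_read() == 2);
}
```

The two calls in the data_race() case are what a __no_kcsan function must
not end up containing, which is why data_race() has to stay out of
__{READ,WRITE}_ONCE.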

So, because of #4 & #6 & #7 we're down to Clang >= 9. Because of #5
we'll have to make a choice between Clang >= 9 or Clang >= 11
(released in ~June). In an ideal world we might even fix GCC in
future.

That's not even considering the problems around UBSan and KASAN. But
maybe one step at a time?

Any preferences?

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 13:35                                 ` Marco Elver
@ 2020-05-14 13:47                                   ` Peter Zijlstra
  2020-05-14 13:50                                   ` Peter Zijlstra
                                                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 13:47 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, kasan-dev, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
>   2. An __always_inline function inlined into a __no_sanitize function is
> not instrumented. [Has always been satisfied by GCC and Clang.]

GCC <= 7 fails to compile in this case.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 13:35                                 ` Marco Elver
  2020-05-14 13:47                                   ` Peter Zijlstra
@ 2020-05-14 13:50                                   ` Peter Zijlstra
  2020-05-14 13:56                                   ` Peter Zijlstra
                                                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 13:50 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, kasan-dev, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
>   4. __no_kcsan functions should never be spuriously inlined into
> instrumented functions, causing the accesses of the __no_kcsan
> function to be instrumented. [Satisfied by Clang >= 7. All GCC
> versions are broken.]

The current noinstr annotation implies noinline because of a similar issue:
we need the function to be emitted in a specific section. So while yuck,
this is not an immediate issue for us.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 13:35                                 ` Marco Elver
  2020-05-14 13:47                                   ` Peter Zijlstra
  2020-05-14 13:50                                   ` Peter Zijlstra
@ 2020-05-14 13:56                                   ` Peter Zijlstra
  2020-05-14 14:24                                   ` Peter Zijlstra
                                                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 13:56 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, kasan-dev, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:

>   5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
> we'd need to add data_race() around the arch-calls in
> atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
> support (GCC 11 might get this as well).]

Putting the data_race() in atomic_{read,set} would 'break' any sanitized
user of arch_atomic_{read,set}(). Now it so happens there aren't any
such just now, but we need to be aware of that.
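
A toy model of that concern, as a userspace sketch. The counters and
model_tsan_read() are invented stand-ins for what the compiler and the
KCSAN runtime would do, and the data_race() in atomic_read() is open-coded;
none of this is the kernel implementation.

```c
#include <assert.h>

typedef struct { int counter; } atomic_t;

static int disable_depth;	/* > 0 while KCSAN checking is switched off */
static int checked_accesses;	/* loads KCSAN would treat as plain accesses */

static void kcsan_disable_current(void)       { disable_depth++; }
static void kcsan_enable_current_nowarn(void) { disable_depth--; }

/* Toy stand-in for the check the compiler inserts before a sanitized load. */
static void model_tsan_read(void)
{
	if (!disable_depth)
		checked_accesses++;
}

/* Modelled as if inlined into an instrumented caller. */
static inline int arch_atomic_read(const atomic_t *v)
{
	model_tsan_read();
	return *(const volatile int *)&v->counter;
}

/* Instrumented wrapper with an open-coded data_race() around the arch call. */
static inline int atomic_read(const atomic_t *v)
{
	int ret;

	kcsan_disable_current();
	ret = arch_atomic_read(v);
	kcsan_enable_current_nowarn();
	return ret;
}

/* Accesses left visible to the checker when going through the wrapper. */
int checked_via_wrapper(void)
{
	atomic_t a = { 7 };
	int before = checked_accesses;

	(void)atomic_read(&a);
	return checked_accesses - before;
}

/* Accesses left visible when sanitized code calls the arch helper directly. */
int checked_via_arch_call(void)
{
	atomic_t a = { 7 };
	int before = checked_accesses;

	(void)arch_atomic_read(&a);
	return checked_accesses - before;
}

void check_atomic_sketch(void)
{
	assert(checked_via_wrapper() == 0);
	assert(checked_via_arch_call() == 1);
}
```

In this model the direct arch_atomic_read() caller is the only one whose
load stays visible as a plain access, which is the case to keep in mind.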

I'm thinking the volatile thing is the nicest solution, but yes, that'll
make us depend on 11 everything.


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-13 13:58                     ` Marco Elver
  2020-05-14 11:21                       ` Peter Zijlstra
@ 2020-05-14 14:13                       ` Peter Zijlstra
  2020-05-14 14:20                         ` Marco Elver
  1 sibling, 1 reply; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 14:13 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:

> > Also, could not this compiler instrumentation live as a kernel specific
> > GCC-plugin instead of being part of GCC proper? Because in that case,
> > we'd have much better control over it.
> 
> I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
> see a way to affect TSAN instrumentation. FWIW Clang already has
> distinguish-volatile support (unreleased Clang 11).

Ah, I figured we would not use the built-in TSAN at all, and instead do a
complete replacement of the instrumentation with a plugin. AFAIU plugins are
able to emit instrumentation, but this isn't something I know a lot about.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 14:13                       ` Peter Zijlstra
@ 2020-05-14 14:20                         ` Marco Elver
  2020-05-15  9:20                           ` Peter Zijlstra
  0 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-14 14:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, 14 May 2020 at 16:13, Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:
>
> > > Also, could not this compiler instrumentation live as a kernel specific
> > > GCC-plugin instead of being part of GCC proper? Because in that case,
> > > we'd have much better control over it.
> >
> > I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
> > see a way to affect TSAN instrumentation. FWIW Clang already has
> > distinguish-volatile support (unreleased Clang 11).
>
> Ah, I figured we would not use the built-in TSAN at all, and instead do a
> complete replacement of the instrumentation with a plugin. AFAIU plugins are
> able to emit instrumentation, but this isn't something I know a lot about.

Interesting option. But it will likely not solve the no_sanitize and
inlining problem, because those are deeply tied to the optimization
pipelines.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 13:35                                 ` Marco Elver
                                                     ` (2 preceding siblings ...)
  2020-05-14 13:56                                   ` Peter Zijlstra
@ 2020-05-14 14:24                                   ` Peter Zijlstra
  2020-05-14 15:09                                     ` Thomas Gleixner
  2020-05-15 13:55                                     ` David Laight
  2020-05-14 15:38                                   ` Paul E. McKenney
  2020-05-22 16:08                                   ` [tip: locking/kcsan] kcsan: Restrict supported compilers tip-bot2 for Marco Elver
  5 siblings, 2 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-14 14:24 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, kasan-dev, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:

> Let me try to spell out the requirements I see so far (this is for
> KCSAN only though -- other sanitizers might be similar):
> 
>   1. __no_kcsan functions should not call anything, not even
> kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE.
> [Requires leaving data_race() off of these.]
> 
>   2. An __always_inline function inlined into a __no_sanitize function is
> not instrumented. [Has always been satisfied by GCC and Clang.]
> 
>   3. An __always_inline function inlined into an instrumented function is
> instrumented. [Has always been satisfied by GCC and Clang.]
> 
>   4. __no_kcsan functions should never be spuriously inlined into
> instrumented functions, causing the accesses of the __no_kcsan
> function to be instrumented. [Satisfied by Clang >= 7. All GCC
> versions are broken.]
> 
>   5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
> we'd need to add data_race() around the arch-calls in
> atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
> support (GCC 11 might get this as well).]
> 
>   6. never emit __tsan_func_{entry,exit}. [Clang supports disabling
> this, GCC doesn't.]
> 
>   7. kernel is supported by compiler. [Clang >= 9 seems to build -tip
> for me, anything below complains about lack of asm goto. GCC trivial.]
> 
> So, because of #4 & #6 & #7 we're down to Clang >= 9. Because of #5
> we'll have to make a choice between Clang >= 9 or Clang >= 11
> (released in ~June). In an ideal world we might even fix GCC in
> future.
> 
> That's not even considering the problems around UBSan and KASAN. But
> maybe one step at a time?

Exact same requirements, KASAN even has the data_race() problem through
READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.

> Any preferences?

I suppose DTRT, if we then write the Makefile rule like:

KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES

and set that to either 'y'/'n' depending on the compiler at hand
supporting enough magic to make it all work.

I suppose all the sanitize stuff is most important for developers and
we tend to have the latest compiler versions anyway, right?

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 14:24                                   ` Peter Zijlstra
@ 2020-05-14 15:09                                     ` Thomas Gleixner
  2020-05-14 15:29                                       ` Marco Elver
  2020-05-15 13:55                                     ` David Laight
  1 sibling, 1 reply; 127+ messages in thread
From: Thomas Gleixner @ 2020-05-14 15:09 UTC (permalink / raw)
  To: Peter Zijlstra, Marco Elver
  Cc: Will Deacon, kasan-dev, LKML, Paul E. McKenney, Ingo Molnar,
	Dmitry Vyukov

Peter Zijlstra <peterz@infradead.org> writes:
> On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
>> Any preferences?
>
> I suppose DTRT, if we then write the Makefile rule like:
>
> KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES
>
> and set that to either 'y'/'n' depending on the compiler at hand
> supporting enough magic to make it all work.
>
> I suppose all the sanitize stuff is most important for developers and
> we tend to have the latest compiler versions anyway, right?

Developers and CI/testing stuff. Yes we really should require a sane
compiler instead of introducing boatloads of horrible workarounds all
over the place which then break when the code changes slightly.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 15:09                                     ` Thomas Gleixner
@ 2020-05-14 15:29                                       ` Marco Elver
  2020-05-14 19:37                                         ` Thomas Gleixner
  0 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-14 15:29 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, Will Deacon, kasan-dev, LKML, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, 14 May 2020 at 17:09, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> Peter Zijlstra <peterz@infradead.org> writes:
> > On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
> >> Any preferences?
> >
> > I suppose DTRT, if we then write the Makefile rule like:
> >
> > KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES
> >
> > and set that to either 'y'/'n' depending on the compiler at hand
> > supporting enough magic to make it all work.
> >
> > I suppose all the sanitize stuff is most important for developers and
> > we tend to have the latest compiler versions anyway, right?
>
> Developers and CI/testing stuff. Yes we really should require a sane
> compiler instead of introducing boatloads of horrible workarounds all
> over the place which then break when the code changes slightly.

In which case, let me prepare a series on top of -tip for switching at
least KCSAN to Clang 11. If that's what we'll need, I don't see a
better option right now.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 13:35                                 ` Marco Elver
                                                     ` (3 preceding siblings ...)
  2020-05-14 14:24                                   ` Peter Zijlstra
@ 2020-05-14 15:38                                   ` Paul E. McKenney
  2020-05-22 16:08                                   ` [tip: locking/kcsan] kcsan: Restrict supported compilers tip-bot2 for Marco Elver
  5 siblings, 0 replies; 127+ messages in thread
From: Paul E. McKenney @ 2020-05-14 15:38 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, kasan-dev, Peter Zijlstra, LKML, Thomas Gleixner,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
> On Thu, 14 May 2020 at 13:05, Will Deacon <will@kernel.org> wrote:
> >
> > Hi Marco,
> >
> > On Thu, May 14, 2020 at 09:31:49AM +0200, Marco Elver wrote:
> > > Ouch. With the __{READ,WRITE}_ONCE requirement, we're going to need
> > > Clang 11 though.
> > >
> > > Because without the data_race() around __*_ONCE,
> > > arch_atomic_{read,set} will be broken for KCSAN, but we can't have
> > > data_race() because it would still add
> > > kcsan_{enable,disable}_current() calls to __no_sanitize functions (if
> > > compilation unit is instrumented). We can't make arch_atomic functions
> > > __no_sanitize_or_inline, because even in code that we want to
> > > sanitize, they should remain __always_inline (so they work properly in
> > > __no_sanitize functions). Therefore, Clang 11 with support for
> > > distinguishing volatiles will be the compiler that will satisfy all
> > > the constraints.
> > >
> > > If this is what we want, let me prepare a series on top of
> > > -tip/locking/kcsan with all the things I think we need.
> >
> > Stepping back a second, the locking/kcsan branch is at least functional at
> > the moment by virtue of KCSAN_SANITIZE := n being used liberally in
> > arch/x86/. However, I still think we want to do better than that because (a)
> > it would be good to get more x86 coverage and (b) enabling this for arm64,
> > where objtool is not yet available, will be fragile if we have to whitelist
> > object files. There's also a fair bit of arm64 low-level code spread around
> > drivers/, so it feels like we'd end up with a really bad case of whack-a-mole.
> >
> > Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> > and the behaviour for __always_inline is:
> >
> >   * An __always_inline function inlined into a __no_sanitize function is
> >     not instrumented
> >   * An __always_inline function inlined into an instrumented function is
> >     instrumented
> >   * You can't mark a function as both __always_inline __no_sanitize, because
> >     __no_sanitize functions are never inlined
> >
> > GCC, on the other hand, may still inline __no_sanitize functions and then
> > subsequently instrument them.
> >
> > So if we're willing to make KCSAN depend on Clang >= 7, then we could:
> >
> >   - Remove the data_race() from __{READ,WRITE}_ONCE()
> >   - Wrap arch_atomic*() in data_race() when called from the instrumented
> >     atomic wrappers
> >
> > At which point, I *think* everything works as expected. READ_ONCE_NOCHECK()
> > won't generate any surprises, and Peter can happily use arch_atomic()
> > from non-instrumented code.
> >
> > Thoughts? I don't see the need to support buggy compilers when enabling
> > a new debug feature.
> 
> This is also a reply to
> https://lkml.kernel.org/r/20200514122038.GH3001@hirez.programming.kicks-ass.net
> -- the problem with __READ_ONCE would be solved with what Will
> proposed above.
> 
> Let me try to spell out the requirements I see so far (this is for
> KCSAN only though -- other sanitizers might be similar):
> 
>   1. __no_kcsan functions should not call anything, not even
> kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE.
> [Requires leaving data_race() off of these.]
> 
>   2. An __always_inline function inlined into a __no_sanitize function is
> not instrumented. [Has always been satisfied by GCC and Clang.]
> 
>   3. An __always_inline function inlined into an instrumented function is
> instrumented. [Has always been satisfied by GCC and Clang.]
> 
>   4. __no_kcsan functions should never be spuriously inlined into
> instrumented functions, causing the accesses of the __no_kcsan
> function to be instrumented. [Satisfied by Clang >= 7. All GCC
> versions are broken.]
> 
>   5. we should not break atomic_{read,set} for KCSAN. [Because of #1,
> we'd need to add data_race() around the arch-calls in
> atomic_{read,set}; or rely on Clang 11's -tsan-distinguish-volatile
> support (GCC 11 might get this as well).]
> 
>   6. never emit __tsan_func_{entry,exit}. [Clang supports disabling
> this, GCC doesn't.]
> 
>   7. kernel is supported by compiler. [Clang >= 9 seems to build -tip
> for me, anything below complains about lack of asm goto. GCC trivial.]
> 
> So, because of #4 & #6 & #7 we're down to Clang >= 9. Because of #5
> we'll have to make a choice between Clang >= 9 or Clang >= 11
> (released in ~June). In an ideal world we might even fix GCC in
> future.
> 
> That's not even considering the problems around UBSan and KASAN. But
> maybe one step at a time?
> 
> Any preferences?

I am already having to choose where I run KCSAN based on what compiler
is available, so I cannot argue too hard against a dependency on a
specific compiler.  I reserve the right to ask for help installing it,
if need be though.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 15:29                                       ` Marco Elver
@ 2020-05-14 19:37                                         ` Thomas Gleixner
  0 siblings, 0 replies; 127+ messages in thread
From: Thomas Gleixner @ 2020-05-14 19:37 UTC (permalink / raw)
  To: Marco Elver
  Cc: Peter Zijlstra, Will Deacon, kasan-dev, LKML, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

Marco Elver <elver@google.com> writes:
> On Thu, 14 May 2020 at 17:09, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> Peter Zijlstra <peterz@infradead.org> writes:
>> > On Thu, May 14, 2020 at 03:35:58PM +0200, Marco Elver wrote:
>> >> Any preferences?
>> >
>> > I suppose DTRT, if we then write the Makefile rule like:
>> >
>> > KCSAN_SANITIZE := KCSAN_FUNCTION_ATTRIBUTES
>> >
>> > and set that to either 'y'/'n' depending on the compiler at hand
>> > supporting enough magic to make it all work.
>> >
>> > I suppose all the sanitize stuff is most important for developers and
>> > we tend to have the latest compiler versions anyway, right?
>>
>> Developers and CI/testing stuff. Yes we really should require a sane
>> compiler instead of introducing boatloads of horrible workarounds all
>> over the place which then break when the code changes slightly.
>
> In which case, let me prepare a series on top of -tip for switching at
> least KCSAN to Clang 11. If that's what we'll need, I don't see a
> better option right now.

And for a change, this time that might make the GCC people look at their
open bugs. :)

/me mumbles jumplabels and goes back to juggle patches

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 14:20                         ` Marco Elver
@ 2020-05-15  9:20                           ` Peter Zijlstra
  0 siblings, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-15  9:20 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 04:20:42PM +0200, Marco Elver wrote:
> On Thu, 14 May 2020 at 16:13, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Wed, May 13, 2020 at 03:58:30PM +0200, Marco Elver wrote:
> > > On Wed, 13 May 2020 at 15:24, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > > > Also, could not this compiler instrumentation live as a kernel specific
> > > > GCC-plugin instead of being part of GCC proper? Because in that case,
> > > > we'd have much better control over it.
> > >
> > > I'd like it if we could make it a GCC-plugin for GCC, but how? I don't
> > > see a way to affect TSAN instrumentation. FWIW Clang already has
> > > distinguish-volatile support (unreleased Clang 11).
> >
> > Ah, I figured we would not use the built-in TSAN at all, do a complete
> > replacement of the instrumentation with a plugin. AFAIU plugins are able
> > to emit instrumentation, but this isn't something I know a lot about.
> 
> Interesting option. But it will likely not solve the no_sanitize and
> inlining problem, because those are deeply tied to the optimization
> pipelines.

So I'm imagining that the instrumentation is added in a very late pass;
after all, all we want is to add instrumentation to any memops. I
imagine this is done right before doing register allocation and emitting
asm.

At this point we can look if the current function has a no_sanitize
attribute, no?

That is, this is done after all the optimization and inlining stages
anyway; why would we care about that?

Maybe I'm too naive of compiler internals; this really isn't my area :-)

^ permalink raw reply	[flat|nested] 127+ messages in thread

* RE: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 14:24                                   ` Peter Zijlstra
  2020-05-14 15:09                                     ` Thomas Gleixner
@ 2020-05-15 13:55                                     ` David Laight
  2020-05-15 14:04                                       ` Marco Elver
  2020-05-15 14:07                                       ` Peter Zijlstra
  1 sibling, 2 replies; 127+ messages in thread
From: David Laight @ 2020-05-15 13:55 UTC (permalink / raw)
  To: 'Peter Zijlstra', Marco Elver
  Cc: Will Deacon, kasan-dev, LKML, Thomas Gleixner, Paul E. McKenney,
	Ingo Molnar, Dmitry Vyukov

From: Peter Zijlstra
> Sent: 14 May 2020 15:25
..
> Exact same requirements, KASAN even has the data_race() problem through
> READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.

What happens if you implement READ_ONCE_NOCHECK() with an
asm() statement containing a memory load?

Is that enough to kill all the instrumentation?

	David


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-15 13:55                                     ` David Laight
@ 2020-05-15 14:04                                       ` Marco Elver
  2020-05-15 14:07                                       ` Peter Zijlstra
  1 sibling, 0 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-15 14:04 UTC (permalink / raw)
  To: David Laight
  Cc: Peter Zijlstra, Will Deacon, kasan-dev, LKML, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Dmitry Vyukov

On Fri, 15 May 2020 at 15:55, David Laight <David.Laight@aculab.com> wrote:
>
> From: Peter Zijlstra
> > Sent: 14 May 2020 15:25
> ..
> > Exact same requirements, KASAN even has the data_race() problem through
> > READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.
>
> What happens if you implement READ_ONCE_NOCHECK() with an
> asm() statement containing a memory load?
>
> Is that enough to kill all the instrumentation?

Yes, it is.

However, READ_ONCE_NOCHECK() for KASAN can be fixed if the problem is
randomly uninlined READ_ONCE_NOCHECK() in KASAN_SANITIZE := n
compilation units. KASAN's __no_kasan_or_inline is still conditionally
defined based on CONFIG_KASAN and not __SANITIZE_ADDRESS__. I'm about
to send a patch that does that for KASAN, since for KCSAN we've been
doing it for a while. Whether that was the exact problem Peter
observed, I can't tell.

Thanks,
-- Marco
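
A minimal sketch of the distinction Marco describes, assuming a user-space
build: the attribute is chosen from what the compiler reports for the
current translation unit rather than from a global config switch. The names
UNIT_HAS_ASAN, my_no_kasan_or_inline and read_word_nocheck are hypothetical,
and the exact attribute set in the instrumented branch is an assumption of
this sketch, not the kernel's definition.

```c
#include <assert.h>

/*
 * Detect whether THIS translation unit is built with AddressSanitizer.
 * GCC defines __SANITIZE_ADDRESS__; Clang exposes __has_feature().
 */
#if defined(__SANITIZE_ADDRESS__)
# define UNIT_HAS_ASAN 1
#elif defined(__has_feature)
# if __has_feature(address_sanitizer)
#  define UNIT_HAS_ASAN 1
# endif
#endif
#ifndef UNIT_HAS_ASAN
# define UNIT_HAS_ASAN 0
#endif

#if UNIT_HAS_ASAN
/* Instrumented unit: keep the helper out of line and uninstrumented. */
# define my_no_kasan_or_inline \
	__attribute__((__no_sanitize_address__)) \
	__attribute__((__noinline__)) __attribute__((__unused__))
#else
/* Uninstrumented unit: nothing to hide from, so always inline. */
# define my_no_kasan_or_inline inline __attribute__((__always_inline__))
#endif

static my_no_kasan_or_inline
unsigned long read_word_nocheck(const unsigned long *p)
{
	return *(const volatile unsigned long *)p;
}
```

Keyed this way, a unit built without the sanitizer flag takes the
always-inline branch even when the sanitizer is enabled elsewhere in the
build, which is the case Marco singles out for KASAN_SANITIZE := n units.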

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-15 13:55                                     ` David Laight
  2020-05-15 14:04                                       ` Marco Elver
@ 2020-05-15 14:07                                       ` Peter Zijlstra
  1 sibling, 0 replies; 127+ messages in thread
From: Peter Zijlstra @ 2020-05-15 14:07 UTC (permalink / raw)
  To: David Laight
  Cc: Marco Elver, Will Deacon, kasan-dev, LKML, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Dmitry Vyukov

On Fri, May 15, 2020 at 01:55:43PM +0000, David Laight wrote:
> From: Peter Zijlstra
> > Sent: 14 May 2020 15:25
> ..
> > Exact same requirements, KASAN even has the data_race() problem through
> > READ_ONCE_NOCHECK(), UBSAN doesn't and might be simpler because of it.
> 
> What happens if you implement READ_ONCE_NOCHECK() with an
> asm() statement containing a memory load?
> 
> Is that enough to kill all the instrumentation?

You'll have to implement it for all archs, but yes, I think that ought
to work.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-11 20:41 ` [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables Will Deacon
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
@ 2020-05-17  0:00   ` Guenter Roeck
  2020-05-17  0:07     ` Guenter Roeck
  1 sibling, 1 reply; 127+ messages in thread
From: Guenter Roeck @ 2020-05-17  0:00 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, elver, tglx, paulmck, mingo, peterz, David S. Miller

On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> Now that the page table allocator can free page table allocations
> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> to avoid needlessly wasting memory.
> 
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Something in the sparc32 patches in linux-next causes all my sparc32 emulations
to crash. bisect points to this patch, but reverting it doesn't help, and neither
does reverting the rest of the series.

Guenter

---
Bisect log:

# bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# bad: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
git bisect bad 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
# bad: [1ab4d6ff0a3ee4b29441d8b0076bc8d4734bd16e] Merge remote-tracking branch 'hwmon-staging/hwmon-next'
git bisect bad 1ab4d6ff0a3ee4b29441d8b0076bc8d4734bd16e
# good: [dccfae3ab84387c94f2efc574d41efae005eeee5] Merge remote-tracking branch 'tegra/for-next'
git bisect good dccfae3ab84387c94f2efc574d41efae005eeee5
# bad: [20f9d1287c9f0047b81497197c9f4893485bbe15] Merge remote-tracking branch 'djw-vfs/vfs-for-next'
git bisect bad 20f9d1287c9f0047b81497197c9f4893485bbe15
# bad: [6537897637b5b91f921cb0ac6c465a593f4a665e] Merge remote-tracking branch 'sparc-next/master'
git bisect bad 6537897637b5b91f921cb0ac6c465a593f4a665e
# good: [bca1583e0693e0ba76450b684c5910f7083eeef4] Merge remote-tracking branch 'mips/mips-next'
git bisect good bca1583e0693e0ba76450b684c5910f7083eeef4
# good: [1f12096aca212af8fad3ef58d5673cde691a1452] Merge the lockless page table walk rework into next
git bisect good 1f12096aca212af8fad3ef58d5673cde691a1452
# good: [23a457b8d57dc8d0cc1dbd1882993dd2fcc4b0c0] s390: nvme reipl
git bisect good 23a457b8d57dc8d0cc1dbd1882993dd2fcc4b0c0
# good: [f57f5010c0c3fe2d924a957ddf1d17fbebb54d47] Merge remote-tracking branch 'risc-v/for-next'
git bisect good f57f5010c0c3fe2d924a957ddf1d17fbebb54d47
# good: [1d5fd6c33b04e5d5b665446c3b56f2148f0f1272] sh: add missing DECLARE_EXPORT() for __ashiftrt_r4_xx
git bisect good 1d5fd6c33b04e5d5b665446c3b56f2148f0f1272
# bad: [8c8f3156dd40f8bdc58f2ac461374bc804c28e3b] sparc32: mm: Reduce allocation size for PMD and PTE tables
git bisect bad 8c8f3156dd40f8bdc58f2ac461374bc804c28e3b
# good: [8e958839e4b9fb6ea4385ff2c52d1333a3a618de] sparc32: mm: Restructure sparc32 MMU page-table layout
git bisect good 8e958839e4b9fb6ea4385ff2c52d1333a3a618de
# good: [3f407976ac2953116cb8880a7a18b63bcc81829d] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
git bisect good 3f407976ac2953116cb8880a7a18b63bcc81829d
# first bad commit: [8c8f3156dd40f8bdc58f2ac461374bc804c28e3b] sparc32: mm: Reduce allocation size for PMD and PTE tables

---
Log messages:

Lots of:

BUG: scheduling while atomic: kthreadd/2/0xffffffff
Modules linked in:
CPU: 0 PID: 2 Comm: kthreadd Tainted: G        W         5.7.0-rc5-next-20200515 #1
[f04f2c94 :
here+0x16c/0x250 ]
[f04f2df0 :
schedule+0x78/0x11c ]
[f003f100 :
kthreadd+0x188/0x1a4 ]
[f0008448 :
ret_from_kernel_thread+0xc/0x38 ]
[00000000 :
0x0 ]

followed by:

Kernel panic - not syncing: Aiee, killing interrupt handler!
CPU: 0 PID: 19 Comm: cryptomgr_test Tainted: G        W         5.7.0-rc5-next-20200515 #1
[f0024400 :
do_exit+0x7c8/0xa88 ]
[f0075540 :
__module_put_and_exit+0xc/0x18 ]
[f0221428 :
cryptomgr_test+0x28/0x48 ]
[f003edc0 :
kthread+0xf4/0x12c ]
[f0008448 :
ret_from_kernel_thread+0xc/0x38 ]
[00000000 :
0x0 ]

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-17  0:00   ` [PATCH v5 04/18] " Guenter Roeck
@ 2020-05-17  0:07     ` Guenter Roeck
  2020-05-18  8:37       ` Will Deacon
  0 siblings, 1 reply; 127+ messages in thread
From: Guenter Roeck @ 2020-05-17  0:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, elver, tglx, paulmck, mingo, peterz, David S. Miller

On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > Now that the page table allocator can free page table allocations
> > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > to avoid needlessly wasting memory.
> > 
> > Cc: "David S. Miller" <davem@davemloft.net>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> 
> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> does reverting the rest of the series.
> 
Actually, turns out I see the same pattern (lots of scheduling while atomic
followed by 'killing interrupt handler' in cryptomgr_test) with several
powerpc boot tests.  I am currently bisecting those crashes. I'll report
the results here as well, as soon as I have them.

Guenter

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-17  0:07     ` Guenter Roeck
@ 2020-05-18  8:37       ` Will Deacon
  2020-05-18  9:18         ` Mike Rapoport
                           ` (2 more replies)
  0 siblings, 3 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-18  8:37 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, elver, tglx, paulmck, mingo, peterz, David S. Miller, rppt

On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > Now that the page table allocator can free page table allocations
> > > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > to avoid needlessly wasting memory.
> > > 
> > > Cc: "David S. Miller" <davem@davemloft.net>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Will Deacon <will@kernel.org>
> > 
> > Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > does reverting the rest of the series.
> > 
> Actually, turns out I see the same pattern (lots of scheduling while atomic
> followed by 'killing interrupt handler' in cryptomgr_test) with several
> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> the results here as well, as soon as I have them.

FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
issues. However, linux-next is a different story, where I don't get very far
at all:

BUG: Bad page state in process swapper  pfn:005b4

If you're seeing this on powerpc too, I wonder if it's related to:

https://lore.kernel.org/r/20200514170327.31389-1-rppt@kernel.org

since I think it just hit -next and the diffstat is all over the place. I've
added Mike to CC just in case.

Will

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18  8:37       ` Will Deacon
@ 2020-05-18  9:18         ` Mike Rapoport
  2020-05-18  9:48         ` Guenter Roeck
  2020-05-20 17:03         ` Mike Rapoport
  2 siblings, 0 replies; 127+ messages in thread
From: Mike Rapoport @ 2020-05-18  9:18 UTC (permalink / raw)
  To: Will Deacon
  Cc: Guenter Roeck, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > > Now that the page table allocator can free page table allocations
> > > > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > > to avoid needlessly wasting memory.
> > > > 
> > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > Signed-off-by: Will Deacon <will@kernel.org>
> > > 
> > > Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > does reverting the rest of the series.
> > > 
> > Actually, turns out I see the same pattern (lots of scheduling while atomic
> > followed by 'killing interrupt handler' in cryptomgr_test) with several
> > powerpc boot tests.  I am currently bisecting those crashes. I'll report
> > the results here as well, as soon as I have them.
> 
> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> issues. However, linux-next is a different story, where I don't get very far
> at all:
> 
> BUG: Bad page state in process swapper  pfn:005b4
> 
> If you're seeing this on powerpc too, I wonder if it's related to:
> 
> https://lore.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
> 
> since I think it just hit -next and the diffstat is all over the place. I've
> added Mike to CC just in case.

Thanks, Will, I'll take a look.

> Will

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18  8:37       ` Will Deacon
  2020-05-18  9:18         ` Mike Rapoport
@ 2020-05-18  9:48         ` Guenter Roeck
  2020-05-18 14:23           ` Mike Rapoport
  2020-05-20 17:03         ` Mike Rapoport
  2 siblings, 1 reply; 127+ messages in thread
From: Guenter Roeck @ 2020-05-18  9:48 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, elver, tglx, paulmck, mingo, peterz, David S. Miller, rppt

On 5/18/20 1:37 AM, Will Deacon wrote:
> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>> Now that the page table allocator can free page table allocations
>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>> to avoid needlessly wasting memory.
>>>>
>>>> Cc: "David S. Miller" <davem@davemloft.net>
>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>
>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>> does reverting the rest of the series.
>>>
>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
>> the results here as well, as soon as I have them.
> 
> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> issues. However, linux-next is a different story, where I don't get very far
> at all:
> 
> BUG: Bad page state in process swapper  pfn:005b4
> 
> If you're seeing this on powerpc too, I wonder if it's related to:
> 
> https://lore.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
> 
> since I think it just hit -next and the diffstat is all over the place. I've
> added Mike to CC just in case.
> 

Here are the bisect results for ppc:

# bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# good: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
git bisect good 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
# good: [87f6f21783522e6d62127cf33ae5e95f50874beb] Merge remote-tracking branch 'spi/for-next'
git bisect good 87f6f21783522e6d62127cf33ae5e95f50874beb
# good: [5c428e8277d5d97c85126387d4e00aa5adde4400] Merge remote-tracking branch 'staging/staging-next'
git bisect good 5c428e8277d5d97c85126387d4e00aa5adde4400
# good: [f68de67ed934e7bdef4799fd7777c86f33f14982] Merge remote-tracking branch 'hyperv/hyperv-next'
git bisect good f68de67ed934e7bdef4799fd7777c86f33f14982
# bad: [54acd2dc52b069da59639eea0d0c92726f32fb01] mm/memblock: fix a typo in comment "implict"->"implicit"
git bisect bad 54acd2dc52b069da59639eea0d0c92726f32fb01
# good: [784a17aa58a529b84f7cc50f351ed4acf3bd11f3] mm: remove the pgprot argument to __vmalloc
git bisect good 784a17aa58a529b84f7cc50f351ed4acf3bd11f3
# good: [6cd8137ff37e9a37aee2d2a8889c8beb8eab192f] khugepaged: replace the usage of system(3) in the test
git bisect good 6cd8137ff37e9a37aee2d2a8889c8beb8eab192f
# bad: [6987da379826ed01b8a1cf046b67cc8cc10117cc] sparc: remove unnecessary includes
git bisect bad 6987da379826ed01b8a1cf046b67cc8cc10117cc
# good: [bc17b545388f64c09e83e367898e28f60277c584] mm/hugetlb: define a generic fallback for is_hugepage_only_range()
git bisect good bc17b545388f64c09e83e367898e28f60277c584
# good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
# bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
# good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
# good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
# first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's

I don't know if that is accurate either. Maybe things are so broken
that bisect gets confused, or the problem is due to interaction
between different patch series.

Guenter

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18  9:48         ` Guenter Roeck
@ 2020-05-18 14:23           ` Mike Rapoport
  2020-05-18 16:08             ` Guenter Roeck
  2020-05-18 18:09             ` Guenter Roeck
  0 siblings, 2 replies; 127+ messages in thread
From: Mike Rapoport @ 2020-05-18 14:23 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Ira Weiny, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> On 5/18/20 1:37 AM, Will Deacon wrote:
> > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>> Now that the page table allocator can free page table allocations
> >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>> to avoid needlessly wasting memory.
> >>>>
> >>>> Cc: "David S. Miller" <davem@davemloft.net>
> >>>> Cc: Peter Zijlstra <peterz@infradead.org>
> >>>> Signed-off-by: Will Deacon <will@kernel.org>
> >>>
> >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>> does reverting the rest of the series.
> >>>
> >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> >> the results here as well, as soon as I have them.
> > 
> > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > issues. However, linux-next is a different story, where I don't get very far
> > at all:
> > 
> > BUG: Bad page state in process swapper  pfn:005b4

This one seems to be due to commit 24aab577764f ("mm: memmap_init:
iterate over memblock regions rather that check each PFN") and reverting
it and partially reverting the next cleanup commits makes those
disappear. sparc32 boot still fails on today's linux-next and mmotm for me with

Run /sbin/init as init process
  with arguments:
    /sbin/init
  with environment:
    HOME=/
    TERM=linux
Starting init: /sbin/init exists but couldn't execute it (error -14)

I've tried to bisect mmotm and I've got the first bad commits in
different places in the middle of arch/kmap series [1] so I've added Ira
to CC as well :)

I'll continue to look into "bad page" on sparc32

[1] https://lore.kernel.org/dri-devel/20200507150004.1423069-11-ira.weiny@intel.com/

> Here are the bisect results for ppc:
> 
> # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect start 'HEAD' 'v5.7-rc5'

...

> # good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
> # bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
> # good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
> # good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
> git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
> # first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> 
> I don't know if that is accurate either. Maybe things are so broken
> that bisect gets confused, or the problem is due to interaction
> between different patch series.

My results with the workaround for sparc32 boot look similar:

# bad: [2bbf0589bfeb27800c730b76eacf34528eee5418] pci: test for unexpectedly disabled bridges
git bisect bad 2bbf0589bfeb27800c730b76eacf34528eee5418
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect good 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
# bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
# bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
# good: [e27369856a2d42ae4d84bc2c4ddac1e696c40d7c] mm: remove the prot argument from vm_map_ram
git bisect good e27369856a2d42ae4d84bc2c4ddac1e696c40d7c
# good: [6911f2b29f6daae2c4b51e6a37f794056d8afabd] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
git bisect good 6911f2b29f6daae2c4b51e6a37f794056d8afabd
# good: [8cef4726f20ae37c3cf3f7a449f5b8a088247a27] hugetlbfs: clean up command line processing
git bisect good 8cef4726f20ae37c3cf3f7a449f5b8a088247a27
# good: [94f38895e0a68ceac3ceece6528123ed3129cedd] arch/kmap: ensure kmap_prot visibility
git bisect good 94f38895e0a68ceac3ceece6528123ed3129cedd
# skip: [fcc77c28bf9155c681712b25c0f5e6125d10ba2e] kmap: consolidate kmap_prot definitions
git bisect skip fcc77c28bf9155c681712b25c0f5e6125d10ba2e
# bad: [175a67be7ee750b2aa2a4a2fedeff18fdce787ac] kmap-consolidate-kmap_prot-definitions-checkpatch-fixes
git bisect bad 175a67be7ee750b2aa2a4a2fedeff18fdce787ac
# bad: [54db8ed321d66a00b6c69bbd5bf7c59809b3fd42] drm: vmwgfx: include linux/highmem.h
git bisect bad 54db8ed321d66a00b6c69bbd5bf7c59809b3fd42
# bad: [6671299c829d19c6ceb0fd1a14b690f6115c6d3d] arch/kmap: define kmap_atomic_prot() for all arch's
git bisect bad 6671299c829d19c6ceb0fd1a14b690f6115c6d3d
# bad: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
git bisect bad f800fb6e517710e04391821e4b1908606c8a6b24
# first bad commit: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values


> Guenter

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 14:23           ` Mike Rapoport
@ 2020-05-18 16:08             ` Guenter Roeck
  2020-05-18 18:11               ` Ira Weiny
  2020-05-18 18:14               ` Ira Weiny
  2020-05-18 18:09             ` Guenter Roeck
  1 sibling, 2 replies; 127+ messages in thread
From: Guenter Roeck @ 2020-05-18 16:08 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Ira Weiny, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 05:23:10PM +0300, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> > On 5/18/20 1:37 AM, Will Deacon wrote:
> > > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > >>>> Now that the page table allocator can free page table allocations
> > >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > >>>> to avoid needlessly wasting memory.
> > >>>>
> > >>>> Cc: "David S. Miller" <davem@davemloft.net>
> > >>>> Cc: Peter Zijlstra <peterz@infradead.org>
> > >>>> Signed-off-by: Will Deacon <will@kernel.org>
> > >>>
> > >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > >>> does reverting the rest of the series.
> > >>>
> > >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > >> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> > >> the results here as well as soon as I have it.
> > > 
> > > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > > issues. However, linux-next is a different story, where I don't get very far
> > > at all:
> > > 
> > > BUG: Bad page state in process swapper  pfn:005b4
> 
> This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> iterate over memblock regions rather that check each PFN") and reverting
> it and partially reverting the next cleanup commits makes those
> disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> 
> Run /sbin/init as init process
>   with arguments:
>     /sbin/init
>   with environment:
>     HOME=/
>     TERM=linux
> Starting init: /sbin/init exists but couldn't execute it (error -14)
> 

Interesting; that is also seen on microblazeel:petalogix-ml605. Bisect there
suggests 'arch/kmap_atomic: consolidate duplicate code' as the culprit,
which is part of Ira's series.

Today's -next is even worse, unfortunately; now all microblaze boot tests
(both little and big endian) fail, plus everything that failed last
time, plus new compile failures. Another round of bisects ...

Guenter

> I've tried to bisect mmotm and I've got the first bad commits in
> different places in the middle of arch/kmap series [1] so I've added Ira
> to CC as well :)
> 
> I'll continue to look into "bad page" on sparc32
> 
> [1] https://lore.kernel.org/dri-devel/20200507150004.1423069-11-ira.weiny@intel.com/
> 
> > Here are the bisect results for ppc:
> > 
> > # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
> > # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> > git bisect start 'HEAD' 'v5.7-rc5'
> 
> ...
> 
> > # good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> > git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
> > # bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> > git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
> > # good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> > git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
> > # good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
> > git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
> > # first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> > 
> > I don't know if that is accurate either. Maybe things are so broken
> > that bisect gets confused, or the problem is due to interaction
> > between different patch series.
> 
> My results with the workaround for sparc32 boot look similar:
> 
> # bad: [2bbf0589bfeb27800c730b76eacf34528eee5418] pci: test for unexpectedly disabled bridges
> git bisect bad 2bbf0589bfeb27800c730b76eacf34528eee5418
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect good 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # good: [e27369856a2d42ae4d84bc2c4ddac1e696c40d7c] mm: remove the prot argument from vm_map_ram
> git bisect good e27369856a2d42ae4d84bc2c4ddac1e696c40d7c
> # good: [6911f2b29f6daae2c4b51e6a37f794056d8afabd] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
> git bisect good 6911f2b29f6daae2c4b51e6a37f794056d8afabd
> # good: [8cef4726f20ae37c3cf3f7a449f5b8a088247a27] hugetlbfs: clean up command line processing
> git bisect good 8cef4726f20ae37c3cf3f7a449f5b8a088247a27
> # good: [94f38895e0a68ceac3ceece6528123ed3129cedd] arch/kmap: ensure kmap_prot visibility
> git bisect good 94f38895e0a68ceac3ceece6528123ed3129cedd
> # skip: [fcc77c28bf9155c681712b25c0f5e6125d10ba2e] kmap: consolidate kmap_prot definitions
> git bisect skip fcc77c28bf9155c681712b25c0f5e6125d10ba2e
> # bad: [175a67be7ee750b2aa2a4a2fedeff18fdce787ac] kmap-consolidate-kmap_prot-definitions-checkpatch-fixes
> git bisect bad 175a67be7ee750b2aa2a4a2fedeff18fdce787ac
> # bad: [54db8ed321d66a00b6c69bbd5bf7c59809b3fd42] drm: vmwgfx: include linux/highmem.h
> git bisect bad 54db8ed321d66a00b6c69bbd5bf7c59809b3fd42
> # bad: [6671299c829d19c6ceb0fd1a14b690f6115c6d3d] arch/kmap: define kmap_atomic_prot() for all arch's
> git bisect bad 6671299c829d19c6ceb0fd1a14b690f6115c6d3d
> # bad: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
> git bisect bad f800fb6e517710e04391821e4b1908606c8a6b24
> # first bad commit: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
> 
> 
> > Guenter
> 
> -- 
> Sincerely yours,
> Mike.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 14:23           ` Mike Rapoport
  2020-05-18 16:08             ` Guenter Roeck
@ 2020-05-18 18:09             ` Guenter Roeck
  2020-05-18 18:21               ` Ira Weiny
  2020-05-18 19:15               ` Mike Rapoport
  1 sibling, 2 replies; 127+ messages in thread
From: Guenter Roeck @ 2020-05-18 18:09 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Ira Weiny, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On 5/18/20 7:23 AM, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
>> On 5/18/20 1:37 AM, Will Deacon wrote:
>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>> Now that the page table allocator can free page table allocations
>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>> to avoid needlessly wasting memory.
>>>>>>
>>>>>> Cc: "David S. Miller" <davem@davemloft.net>
>>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>>>
>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>> does reverting the rest of the series.
>>>>>
>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
>>>> the results here as well as soon as I have it.
>>>
>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>> issues. However, linux-next is a different story, where I don't get very far
>>> at all:
>>>
>>> BUG: Bad page state in process swapper  pfn:005b4
> 
> This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> iterate over memblock regions rather that check each PFN") and reverting
> it and partially reverting the next cleanup commits makes those
> disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> 
> Run /sbin/init as init process
>   with arguments:
>     /sbin/init
>   with environment:
>     HOME=/
>     TERM=linux
> Starting init: /sbin/init exists but couldn't execute it (error -14)
> 
> I've tried to bisect mmotm and I've got the first bad commits in
> different places in the middle of arch/kmap series [1] so I've added Ira
> to CC as well :)
> 
> I'll continue to look into "bad page" on sparc32
> 
> [1] https://lore.kernel.org/dri-devel/20200507150004.1423069-11-ira.weiny@intel.com/
> 
>> Here are the bisect results for ppc:
>>
>> # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
>> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
>> git bisect start 'HEAD' 'v5.7-rc5'
> 
> ...
> 
>> # good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
>> git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
>> # bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
>> git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
>> # good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
>> git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
>> # good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
>> git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
>> # first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
>>
>> I don't know if that is accurate either. Maybe things are so broken
>> that bisect gets confused, or the problem is due to interaction
>> between different patch series.
> 
> My results with the workaround for sparc32 boot look similar:
> 
> # bad: [2bbf0589bfeb27800c730b76eacf34528eee5418] pci: test for unexpectedly disabled bridges
> git bisect bad 2bbf0589bfeb27800c730b76eacf34528eee5418
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect good 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # good: [e27369856a2d42ae4d84bc2c4ddac1e696c40d7c] mm: remove the prot argument from vm_map_ram
> git bisect good e27369856a2d42ae4d84bc2c4ddac1e696c40d7c
> # good: [6911f2b29f6daae2c4b51e6a37f794056d8afabd] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
> git bisect good 6911f2b29f6daae2c4b51e6a37f794056d8afabd
> # good: [8cef4726f20ae37c3cf3f7a449f5b8a088247a27] hugetlbfs: clean up command line processing
> git bisect good 8cef4726f20ae37c3cf3f7a449f5b8a088247a27
> # good: [94f38895e0a68ceac3ceece6528123ed3129cedd] arch/kmap: ensure kmap_prot visibility
> git bisect good 94f38895e0a68ceac3ceece6528123ed3129cedd
> # skip: [fcc77c28bf9155c681712b25c0f5e6125d10ba2e] kmap: consolidate kmap_prot definitions
> git bisect skip fcc77c28bf9155c681712b25c0f5e6125d10ba2e
> # bad: [175a67be7ee750b2aa2a4a2fedeff18fdce787ac] kmap-consolidate-kmap_prot-definitions-checkpatch-fixes
> git bisect bad 175a67be7ee750b2aa2a4a2fedeff18fdce787ac
> # bad: [54db8ed321d66a00b6c69bbd5bf7c59809b3fd42] drm: vmwgfx: include linux/highmem.h
> git bisect bad 54db8ed321d66a00b6c69bbd5bf7c59809b3fd42
> # bad: [6671299c829d19c6ceb0fd1a14b690f6115c6d3d] arch/kmap: define kmap_atomic_prot() for all arch's
> git bisect bad 6671299c829d19c6ceb0fd1a14b690f6115c6d3d
> # bad: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
> git bisect bad f800fb6e517710e04391821e4b1908606c8a6b24
> # first bad commit: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
> 
> 

Below is another set of bisect results, from next-20200518. It points to one
of your commits. This is for microblaze (big endian) boot failures.

Guenter

---
# bad: [72bc15d0018ebfbc9c389539d636e2e9a9002b3b] Add linux-next specific files for 20200518
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# good: [b5b9a1a40fcf10db8f140c987b715e6816e1292d] Merge remote-tracking branch 'crypto/master'
git bisect good b5b9a1a40fcf10db8f140c987b715e6816e1292d
# good: [6a349e7cf4cec11b63ca8e3095c990e146f48784] Merge remote-tracking branch 'tip/auto-latest'
git bisect good 6a349e7cf4cec11b63ca8e3095c990e146f48784
# good: [0c5e27cea5e173afc1971ce9a521e022c288548c] Merge remote-tracking branch 'staging/staging-next'
git bisect good 0c5e27cea5e173afc1971ce9a521e022c288548c
# good: [7e90955569a080b17030161db6152917f3b0e061] Merge remote-tracking branch 'hyperv/hyperv-next'
git bisect good 7e90955569a080b17030161db6152917f3b0e061
# good: [c0218a9a3a60cf081f5545302d0fc28a8d68059b] fs/buffer.c: add debug print for __getblk_gfp() stall problem
git bisect good c0218a9a3a60cf081f5545302d0fc28a8d68059b
# good: [bcda3c9d968d3a8b596904fb2ff8009717ffb6ef] Merge branch 'akpm-current/current'
git bisect good bcda3c9d968d3a8b596904fb2ff8009717ffb6ef
# good: [5b271f59a6aee147db3d7137f6132f74977131c1] kernel: use show_stack_loglvl()
git bisect good 5b271f59a6aee147db3d7137f6132f74977131c1
# good: [dec7b12bacc0859e689c4a42714c7bf4d0b98cfd] mm/mmap.c: add more sanity checks to get_unmapped_area()
git bisect good dec7b12bacc0859e689c4a42714c7bf4d0b98cfd
# bad: [feda7bcd5e1846039cc1a999bf4090b1fee890e8] mm: fix build error for mips of process_madvise
git bisect bad feda7bcd5e1846039cc1a999bf4090b1fee890e8
# good: [0533da2f2fa20c28ac5b4573bd6bb0d445638c6a] x86/mm: simplify init_trampoline() and surrounding logic
git bisect good 0533da2f2fa20c28ac5b4573bd6bb0d445638c6a
# bad: [2b166035a0202b90f5860178b8ae43d41a42117f] mm: consolidate pud_index() and pud_offset() definitions
git bisect bad 2b166035a0202b90f5860178b8ae43d41a42117f
# bad: [01f489acfb0783379cc764d503477c0f6df49a0b] mm: consolidate pte_index() and pte_offset_*() definitions
git bisect bad 01f489acfb0783379cc764d503477c0f6df49a0b
# bad: [c57a43e52bf5fdc4152bb17db6e9c5d35569dcfd] mm: pgtable: add shortcuts for accessing kernel PMD and PTE
git bisect bad c57a43e52bf5fdc4152bb17db6e9c5d35569dcfd
# first bad commit: [c57a43e52bf5fdc4152bb17db6e9c5d35569dcfd] mm: pgtable: add shortcuts for accessing kernel PMD and PTE



^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 16:08             ` Guenter Roeck
@ 2020-05-18 18:11               ` Ira Weiny
  2020-05-18 18:14               ` Ira Weiny
  1 sibling, 0 replies; 127+ messages in thread
From: Ira Weiny @ 2020-05-18 18:11 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mike Rapoport, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 09:08:11AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 05:23:10PM +0300, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> > > On 5/18/20 1:37 AM, Will Deacon wrote:
> > > > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > > >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > >>>> Now that the page table allocator can free page table allocations
> > > >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > >>>> to avoid needlessly wasting memory.
> > > >>>>
> > > >>>> Cc: "David S. Miller" <davem@davemloft.net>
> > > >>>> Cc: Peter Zijlstra <peterz@infradead.org>
> > > >>>> Signed-off-by: Will Deacon <will@kernel.org>
> > > >>>
> > > >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > >>> does reverting the rest of the series.
> > > >>>
> > > >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > > >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > > >> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> > > >> the results here as well as soon as I have it.
> > > > 
> > > > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > > > issues. However, linux-next is a different story, where I don't get very far
> > > > at all:
> > > > 
> > > > BUG: Bad page state in process swapper  pfn:005b4
> > 
> > This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> > iterate over memblock regions rather that check each PFN") and reverting
> > it and partially reverting the next cleanup commits makes those
> > disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> > 
> > Run /sbin/init as init process
> >   with arguments:
> >     /sbin/init
> >   with environment:
> >     HOME=/
> >     TERM=linux
> > Starting init: /sbin/init exists but couldn't execute it (error -14)
> > 
> 
> Interesting; that is also seen on microblazeel:petalogix-ml605. Bisect there
> suggests 'arch/kmap_atomic: consolidate duplicate code' as the culprit,
> which is part of Ira's series.
> 
> Today's -next is even worse, unfortunately; now all microblaze boot tests
> (both little and big endian) fail, plus everything that failed last
> time, plus new compile failures. Another round of bisects ...

I've found this bug in microblaze for sure; still looking through the other archs...

commit 82c284b2bb74ca195dfcd35b70a175f010b9fd46 (HEAD -> lm-kmap17)
Author: Ira Weiny <ira.weiny@intel.com>
Date:   Mon May 18 11:01:10 2020 -0700

    microblaze/kmap: Don't enable pagefault/preempt twice
    
    The kunmap_atomic clean up failed to remove the pagefault/preempt
    enables on this path.
    
    Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>

diff --git a/arch/microblaze/mm/highmem.c b/arch/microblaze/mm/highmem.c
index ee8a422b2b76..92e0890416c9 100644
--- a/arch/microblaze/mm/highmem.c
+++ b/arch/microblaze/mm/highmem.c
@@ -57,11 +57,8 @@ void kunmap_atomic_high(void *kvaddr)
        int type;
        unsigned int idx;
 
-       if (vaddr < __fix_to_virt(FIX_KMAP_END)) {
-               pagefault_enable();
-               preempt_enable();
+       if (vaddr < __fix_to_virt(FIX_KMAP_END))
                return;
-       }
 
        type = kmap_atomic_idx();
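
For anyone following along, here is a minimal userspace sketch of why the
leftover enables hurt. It assumes the consolidated kunmap_atomic() wrapper
already re-enables pagefaults and preemption after calling the arch hook;
every model_* name below is a hypothetical stand-in, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's preempt and pagefault counters. */
static int model_preempt_count;
static int model_pagefault_count;

static void model_disable(void)
{
	model_pagefault_count++;
	model_preempt_count++;
}

static void model_enable(void)
{
	model_pagefault_count--;
	model_preempt_count--;
}

/*
 * Models the arch hook. leftover_enables selects the pre-fix behaviour,
 * where the early-return path still re-enables on its own.
 */
static void model_kunmap_atomic_high(bool lowmem, bool leftover_enables)
{
	if (lowmem) {
		if (leftover_enables)
			model_enable();
		return;
	}
	/* A highmem address would have its fixmap slot torn down here. */
}

/* Models the generic wrapper (assumed): it always re-enables on the way out. */
static void model_kunmap_atomic(bool lowmem, bool leftover_enables)
{
	model_kunmap_atomic_high(lowmem, leftover_enables);
	model_enable();
}

/* Returns the preempt count left after one map/unmap pair on a lowmem page. */
static int model_roundtrip(bool leftover_enables)
{
	model_preempt_count = 0;
	model_pagefault_count = 0;

	model_disable();		/* the kmap_atomic() side */
	assert(model_preempt_count == 1);

	model_kunmap_atomic(true, leftover_enables);
	return model_preempt_count;
}
```

In this model model_roundtrip(false) leaves the count balanced, while
model_roundtrip(true) underflows it by one for every lowmem unmap. That is
the kind of imbalance that could plausibly surface as the "scheduling while
atomic" noise reported earlier in the thread.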


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 16:08             ` Guenter Roeck
  2020-05-18 18:11               ` Ira Weiny
@ 2020-05-18 18:14               ` Ira Weiny
  1 sibling, 0 replies; 127+ messages in thread
From: Ira Weiny @ 2020-05-18 18:14 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mike Rapoport, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 09:08:11AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 05:23:10PM +0300, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> > > On 5/18/20 1:37 AM, Will Deacon wrote:
> > > > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > > >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > >>>> Now that the page table allocator can free page table allocations
> > > >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > >>>> to avoid needlessly wasting memory.
> > > >>>>
> > > >>>> Cc: "David S. Miller" <davem@davemloft.net>
> > > >>>> Cc: Peter Zijlstra <peterz@infradead.org>
> > > >>>> Signed-off-by: Will Deacon <will@kernel.org>
> > > >>>
> > > >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > >>> does reverting the rest of the series.
> > > >>>
> > > >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > > >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > > >> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> > > >> the results here as well as soon as I have it.
> > > > 
> > > > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > > > issues. However, linux-next is a different story, where I don't get very far
> > > > at all:
> > > > 
> > > > BUG: Bad page state in process swapper  pfn:005b4
> > 
> > This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> > iterate over memblock regions rather that check each PFN") and reverting
> > it and partially reverting the next cleanup commits makes those
> > disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> > 
> > Run /sbin/init as init process
> >   with arguments:
> >     /sbin/init
> >   with environment:
> >     HOME=/
> >     TERM=linux
> > Starting init: /sbin/init exists but couldn't execute it (error -14)
> > 
> 
> Interesting; that is also seen on microblazeel:petalogix-ml605. Bisect there
> suggests 'arch/kmap_atomic: consolidate duplicate code' as the culprit,
> which is part of Ira's series.
> 
> Today's -next is even worse, unfortunately; now all microblaze boot tests
> (both little and big endian) fail, plus everything that failed last
> time, plus new compile failures. Another round of bisects ...

Sparc had the same problem...


commit 6e5c523370c510f5fae3436b193ab5dabe0fef06 (HEAD -> lm-kmap17)
Author: Ira Weiny <ira.weiny@intel.com>
Date:   Mon May 18 11:13:16 2020 -0700

    arch/sparc: Don't enable pagefault/preempt twice
    
    The kunmap_atomic clean up failed to remove the pagefault/preempt
    enables on this path.
    
    Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
    Signed-off-by: Ira Weiny <ira.weiny@intel.com>

diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c
index d237d902f9c3..13fb197bb26c 100644
--- a/arch/sparc/mm/highmem.c
+++ b/arch/sparc/mm/highmem.c
@@ -86,11 +86,8 @@ void kunmap_atomic_high(void *kvaddr)
        unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
        int type;
 
-       if (vaddr < FIXADDR_START) { // FIXME
-               pagefault_enable();
-               preempt_enable();
+       if (vaddr < FIXADDR_START) // FIXME
                return;
-       }
 
        type = kmap_atomic_idx();
 

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 18:09             ` Guenter Roeck
@ 2020-05-18 18:21               ` Ira Weiny
  2020-05-18 19:15               ` Mike Rapoport
  1 sibling, 0 replies; 127+ messages in thread
From: Ira Weiny @ 2020-05-18 18:21 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mike Rapoport, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 11:09:46AM -0700, Guenter Roeck wrote:
> On 5/18/20 7:23 AM, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> >> On 5/18/20 1:37 AM, Will Deacon wrote:
> >>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>>>> Now that the page table allocator can free page table allocations
> >>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>>>> to avoid needlessly wasting memory.
> >>>>>>
> >>>>>> Cc: "David S. Miller" <davem@davemloft.net>
> >>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
> >>>>>> Signed-off-by: Will Deacon <will@kernel.org>
> >>>>>
> >>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>>>> does reverting the rest of the series.
> >>>>>
> >>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> >>>> the results here as well as soon as I have it.
> >>>
> >>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> >>> issues. However, linux-next is a different story, where I don't get very far
> >>> at all:
> >>>
> >>> BUG: Bad page state in process swapper  pfn:005b4
> > 
> > This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> > iterate over memblock regions rather that check each PFN") and reverting
> > it and partially reverting the next cleanup commits makes those
> > disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> > 
> > Run /sbin/init as init process
> >   with arguments:
> >     /sbin/init
> >   with environment:
> >     HOME=/
> >     TERM=linux
> > Starting init: /sbin/init exists but couldn't execute it (error -14)
> > 
> > I've tried to bisect mmotm and I've got the first bad commits in
> > different places in the middle of arch/kmap series [1] so I've added Ira
> > to CC as well :)
> > 
> > I'll continue to look into "bad page" on sparc32

mips is broken too.

Does anyone know what this FIXME was for?

...
        if (vaddr < FIXADDR_START) { // FIXME
...

I'm going to remove it...

Ira


^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 18:09             ` Guenter Roeck
  2020-05-18 18:21               ` Ira Weiny
@ 2020-05-18 19:15               ` Mike Rapoport
  2020-05-19 16:40                 ` Guenter Roeck
  1 sibling, 1 reply; 127+ messages in thread
From: Mike Rapoport @ 2020-05-18 19:15 UTC (permalink / raw)
  To: Guenter Roeck, Andrew Morton
  Cc: Ira Weiny, Will Deacon, linux-kernel, elver, tglx, paulmck,
	mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 11:09:46AM -0700, Guenter Roeck wrote:
> On 5/18/20 7:23 AM, Mike Rapoport wrote:
> 
> Below is another set of bisect results, from next-20200518. It points to one
> of your commits. This is for microblaze (big endian) boot failures.

The microblaze one was easy; as for sparc32, I still have no clue about
the root cause :(

Andrew, can you please fold it into "mm: pgtable: add shortcuts for
accessing kernel PMD and PTE"? 

From 167250de28aa526342641b2647294a755d234090 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <rppt@linux.ibm.com>
Date: Mon, 18 May 2020 22:08:10 +0300
Subject: [PATCH] microblaze: fix page table traversal in setup_rt_frame()

The replacement of long folded page table traversal with the direct access
to PMD entry wrongly used the kernel page table in setup_rt_frame()
function instead of the process (current->mm) page table.

Fix it.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
---
 arch/microblaze/kernel/signal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/microblaze/kernel/signal.c b/arch/microblaze/kernel/signal.c
index 28b1ec4b4e79..bdd6d0c86e16 100644
--- a/arch/microblaze/kernel/signal.c
+++ b/arch/microblaze/kernel/signal.c
@@ -194,7 +194,7 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
 
 	address = ((unsigned long)frame->tramp);
 #ifdef CONFIG_MMU
-	pmdp = pmd_off_k(address);
+	pmdp = pmd_off(current->mm, address);
 
 	preempt_disable();
 	ptep = pte_offset_map(pmdp, address);
-- 
2.26.2
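
To illustrate the difference between the two lookups, here is a small
userspace sketch. It assumes pmd_off_k() behaves like pmd_off() applied to
the kernel's own page table; the model_* types, the table size and the
trampoline address are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ENTRIES	8
#define MODEL_SHIFT	12

/* Hypothetical one-level "page table": one entry per 4K slot. */
struct model_mm {
	unsigned long pmd[MODEL_ENTRIES];
};

/* Stands in for the kernel page table; it maps no user addresses. */
static struct model_mm model_init_mm;

/* Look up the entry for addr in the given address space. */
static unsigned long *model_pmd_off(struct model_mm *mm, unsigned long addr)
{
	assert(mm != NULL);
	return &mm->pmd[(addr >> MODEL_SHIFT) % MODEL_ENTRIES];
}

/* Kernel-only shortcut: always walks the kernel table. */
static unsigned long *model_pmd_off_k(unsigned long addr)
{
	return model_pmd_off(&model_init_mm, addr);
}

/* Returns 1 if the chosen lookup finds the signal trampoline mapped. */
static int model_lookup_user_tramp(int use_kernel_table)
{
	struct model_mm task_mm = { { 0 } };
	unsigned long tramp = 0x3000;	/* made-up user address */
	unsigned long *pmdp;

	/* Only the process page table maps the signal frame. */
	*model_pmd_off(&task_mm, tramp) = 0xabc;

	pmdp = use_kernel_table ? model_pmd_off_k(tramp)
				: model_pmd_off(&task_mm, tramp);
	return *pmdp != 0;
}
```

Here model_lookup_user_tramp(0) finds the entry and model_lookup_user_tramp(1)
does not, which mirrors why a user address such as frame->tramp has to be
resolved through current->mm.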


> Guenter
> 
> ---

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18 19:15               ` Mike Rapoport
@ 2020-05-19 16:40                 ` Guenter Roeck
  0 siblings, 0 replies; 127+ messages in thread
From: Guenter Roeck @ 2020-05-19 16:40 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Ira Weiny, Will Deacon, linux-kernel, elver, tglx,
	paulmck, mingo, peterz, David S. Miller

On Mon, May 18, 2020 at 10:15:11PM +0300, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 11:09:46AM -0700, Guenter Roeck wrote:
> > On 5/18/20 7:23 AM, Mike Rapoport wrote:
> > 
> > Below is another set of bisect results, from next-20200518. It points to one
> > of your commits. This is for microblaze (big endian) boot failures.
> 
> The microblaze one was easy; as for sparc32, I still have no clue about
> the root cause :(
> 
> Andrew, can you please fold it into "mm: pgtable: add shortcuts for
> accessing kernel PMD and PTE"? 
> 
> From 167250de28aa526342641b2647294a755d234090 Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <rppt@linux.ibm.com>
> Date: Mon, 18 May 2020 22:08:10 +0300
> Subject: [PATCH] microblaze: fix page table traversal in setup_rt_frame()
> 
> The replacement of long folded page table traversal with the direct access
> to PMD entry wrongly used the kernel page table in setup_rt_frame()
> function instead of the process (current->mm) page table.
> 
> Fix it.
> 
> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>

Tested-by: Guenter Roeck <linux@roeck-us.net>

> ---
>  arch/microblaze/kernel/signal.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/microblaze/kernel/signal.c b/arch/microblaze/kernel/signal.c
> index 28b1ec4b4e79..bdd6d0c86e16 100644
> --- a/arch/microblaze/kernel/signal.c
> +++ b/arch/microblaze/kernel/signal.c
> @@ -194,7 +194,7 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
>  
>  	address = ((unsigned long)frame->tramp);
>  #ifdef CONFIG_MMU
> -	pmdp = pmd_off_k(address);
> +	pmdp = pmd_off(current->mm, address);
>  
>  	preempt_disable();
>  	ptep = pte_offset_map(pmdp, address);
> -- 
> 2.26.2
> 
> 
> > Guenter
> > 
> > ---
> 
> -- 
> Sincerely yours,
> Mike.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-18  8:37       ` Will Deacon
  2020-05-18  9:18         ` Mike Rapoport
  2020-05-18  9:48         ` Guenter Roeck
@ 2020-05-20 17:03         ` Mike Rapoport
  2020-05-20 19:03           ` Guenter Roeck
  2 siblings, 1 reply; 127+ messages in thread
From: Mike Rapoport @ 2020-05-20 17:03 UTC (permalink / raw)
  To: Will Deacon
  Cc: Guenter Roeck, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > > Now that the page table allocator can free page table allocations
> > > > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > > to avoid needlessly wasting memory.
> > > > 
> > > > Cc: "David S. Miller" <davem@davemloft.net>
> > > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > > Signed-off-by: Will Deacon <will@kernel.org>
> > > 
> > > Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > does reverting the rest of the series.
> > > 
> > Actually, turns out I see the same pattern (lots of scheduling while atomic
> > followed by 'killing interrupt handler' in cryptomgr_test) with several
> > powerpc boot tests.  I am currently bisecting those crashes. I'll report
> > the results here as well as soon as I have it.
> 
> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> issues. However, linux-next is a different story, where I don't get very far
> at all:
> 
> BUG: Bad page state in process swapper  pfn:005b4
 
This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
regions rather that check each PFN"). The commit sha is valid for
v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)

As it seems, sparc32 never registered the memory occupied by the kernel
image with memblock_add() and it only reserves this memory with
memblock_reserve().

I don't know what would happen on real HW, but with 

qemu-system-sparc -kernel /path/to/kernel

the memory occupied by the kernel is reserved in openbios and removed
from mem.available. The prom setup code in the kernel used mem.available
to set up the memory banks and essentially there is a hole for the
memory occupied by the kernel.

Later in bootmem_init() this memory is memblock_reserve()d.

Before the problematic commit, memmap initialization would call
__init_single_page() for the pages in that hole, then
free_low_memory_core_early() would mark them as reserved and everything
would be OK.

After the change in memmap initialization, the hole is skipped and the
page structs for it are not inited. And when they are passed from
memblock to page allocator as reserved it gets confused.

Simply registering the memory occupied by the kernel with memblock_add()
resolves this issue, at least for qemu-system-sparc, and I cannot see how
it can harm any other setup.

If all that makes sense I'll send a proper patch :)

diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
index 906eda1158b4..3cb3dffcbcdc 100644
--- a/arch/sparc/mm/init_32.c
+++ b/arch/sparc/mm/init_32.c
@@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
 	/* Reserve the kernel text/data/bss. */
 	size = (start_pfn << PAGE_SHIFT) - phys_base;
 	memblock_reserve(phys_base, size);
+	memblock_add(phys_base, size);
 
 	size = memblock_phys_mem_size() - memblock_reserved_size();
 	*pages_avail = (size >> PAGE_SHIFT) - high_pages;

> Will

-- 
Sincerely yours,
Mike.

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-20 17:03         ` Mike Rapoport
@ 2020-05-20 19:03           ` Guenter Roeck
  2020-05-20 19:51             ` Mike Rapoport
  0 siblings, 1 reply; 127+ messages in thread
From: Guenter Roeck @ 2020-05-20 19:03 UTC (permalink / raw)
  To: Mike Rapoport, Will Deacon
  Cc: linux-kernel, elver, tglx, paulmck, mingo, peterz, David S. Miller

On 5/20/20 10:03 AM, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>> Now that the page table allocator can free page table allocations
>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>> to avoid needlessly wasting memory.
>>>>>
>>>>> Cc: "David S. Miller" <davem@davemloft.net>
>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>>
>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>> does reverting the rest of the series.
>>>>
>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
>>> the results here as well as soon as I have it.
>>
>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>> issues. However, linux-next is a different story, where I don't get very far
>> at all:
>>
>> BUG: Bad page state in process swapper  pfn:005b4
>  
> This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
> regions rather that check each PFN"). The commit sha is valid for
> v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
> 
> As it seems, sparc32 never registered the memory occupied by the kernel
> image with memblock_add() and it only reserves this memory with
> memblock_reserve().
> 
> I don't know what would happen on real HW, but with 
> 
> qemu-system-sparc -kernel /path/to/kernel
> 
> the memory occupied by the kernel is reserved in openbios and removed
> from mem.available. The prom setup code in the kernel used mem.available
> to set up the memory banks and essentially there is a hole for the
> memory occupied by the kernel.
> 
> Later in bootmem_init() this memory is memblock_reserve()d.
> 
> Before the problematic commit, memmap initialization would call
> __init_single_page() for the pages in that hole, then
> free_low_memory_core_early() would mark them as reserved and everything
> would be OK.
> 
> After the change in memmap initialization, the hole is skipped and the
> page structs for it are not inited. And when they are passed from
> memblock to page allocator as reserved it gets confused.
> 
> Simply registering the memory occupied by the kernel with memblock_add()
> resolves this issue, at least for qemu-system-sparc, and I cannot see how
> it can harm any other setup.
> 
> If all that makes sense I'll send a proper patch :)
> 
> diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
> index 906eda1158b4..3cb3dffcbcdc 100644
> --- a/arch/sparc/mm/init_32.c
> +++ b/arch/sparc/mm/init_32.c
> @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
>  	/* Reserve the kernel text/data/bss. */
>  	size = (start_pfn << PAGE_SHIFT) - phys_base;
>  	memblock_reserve(phys_base, size);
> +	memblock_add(phys_base, size);
>  
>  	size = memblock_phys_mem_size() - memblock_reserved_size();
>  	*pages_avail = (size >> PAGE_SHIFT) - high_pages;
> 
>> Will
> 

With above patch applied on top of Ira's patch, I get:

BUG: spinlock recursion on CPU#0, S01syslogd/139
 lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
[f0067a64 :
do_raw_spin_lock+0xa8/0xd8 ]
[f00d5034 :
copy_page_range+0x328/0x804 ]
[f0025be4 :
dup_mm+0x334/0x434 ]
[f0027124 :
copy_process+0x1224/0x12b0 ]
[f0027344 :
_do_fork+0x54/0x30c ]
[f0027670 :
do_fork+0x5c/0x6c ]
[f000de44 :
sparc_do_fork+0x18/0x38 ]
[f000b7f4 :
do_syscall+0x34/0x40 ]
[5010cd4c :
0x5010cd4c ]

Looks like yet another problem.

I can not revert c03584e30534 because it results in a compile failure.

Guenter

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-20 19:03           ` Guenter Roeck
@ 2020-05-20 19:51             ` Mike Rapoport
  2020-05-21 23:02               ` Guenter Roeck
  0 siblings, 1 reply; 127+ messages in thread
From: Mike Rapoport @ 2020-05-20 19:51 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Will Deacon, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> On 5/20/20 10:03 AM, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> >> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>>> Now that the page table allocator can free page table allocations
> >>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>>> to avoid needlessly wasting memory.
> >>>>>
> >>>>> Cc: "David S. Miller" <davem@davemloft.net>
> >>>>> Cc: Peter Zijlstra <peterz@infradead.org>
> >>>>> Signed-off-by: Will Deacon <will@kernel.org>
> >>>>
> >>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>>> does reverting the rest of the series.
> >>>>
> >>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> >>> the results here as well as soon as I have it.
> >>
> >> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> >> issues. However, linux-next is a different story, where I don't get very far
> >> at all:
> >>
> >> BUG: Bad page state in process swapper  pfn:005b4
> >  
> > This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
> > regions rather that check each PFN"). The commit sha is valid for
> > v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
> > 
> > As it seems, sparc32 never registered the memory occupied by the kernel
> > image with memblock_add() and it only reserves this memory with
> > memblock_reserve().
> > 
> > I don't know what would happen on real HW, but with 
> > 
> > qemu-system-sparc -kernel /path/to/kernel
> > 
> > the memory occupied by the kernel is reserved in openbios and removed
> > from mem.available. The prom setup code in the kernel used mem.available
> > to set up the memory banks and essentially there is a hole for the
> > memory occupied by the kernel.
> > 
> > Later in bootmem_init() this memory is memblock_reserve()d.
> > 
> > Before the problematic commit, memmap initialization would call
> > __init_single_page() for the pages in that hole, then
> > free_low_memory_core_early() would mark them as reserved and everything
> > would be OK.
> > 
> > After the change in memmap initialization, the hole is skipped and the
> > page structs for it are not inited. And when they are passed from
> > memblock to page allocator as reserved it gets confused.
> > 
> > Simply registering the memory occupied by the kernel with memblock_add()
> > resolves this issue, at least for qemu-system-sparc, and I cannot see how
> > it can harm any other setup.
> > 
> > If all that makes sense I'll send a proper patch :)
> > 
> > diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
> > index 906eda1158b4..3cb3dffcbcdc 100644
> > --- a/arch/sparc/mm/init_32.c
> > +++ b/arch/sparc/mm/init_32.c
> > @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
> >  	/* Reserve the kernel text/data/bss. */
> >  	size = (start_pfn << PAGE_SHIFT) - phys_base;
> >  	memblock_reserve(phys_base, size);
> > +	memblock_add(phys_base, size);
> >  
> >  	size = memblock_phys_mem_size() - memblock_reserved_size();
> >  	*pages_avail = (size >> PAGE_SHIFT) - high_pages;
> > 
> >> Will
> > 
> 
> With above patch applied on top of Ira's patch, I get:
> 
> BUG: spinlock recursion on CPU#0, S01syslogd/139
>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> [f0067a64 :
> do_raw_spin_lock+0xa8/0xd8 ]
> [f00d5034 :
> copy_page_range+0x328/0x804 ]
> [f0025be4 :
> dup_mm+0x334/0x434 ]
> [f0027124 :
> copy_process+0x1224/0x12b0 ]
> [f0027344 :
> _do_fork+0x54/0x30c ]
> [f0027670 :
> do_fork+0x5c/0x6c ]
> [f000de44 :
> sparc_do_fork+0x18/0x38 ]
> [f000b7f4 :
> do_syscall+0x34/0x40 ]
> [5010cd4c :
> 0x5010cd4c ]
> 
> Looks like yet another problem.

I've checked the patch above on top of the mmots which already has Ira's
patches and it booted fine. I've used sparc32_defconfig to build the
kernel and qemu-system-sparc with default machine and CPU. 

> I can not revert c03584e30534 because it results in a compile failure.

Here's the "revert" of c03584e30534:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d001d61e64d5..c9d9d3f9ebf4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5901,6 +5901,42 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
 	return false;
 }
 
+#ifdef CONFIG_SPARSEMEM
+/* Skip PFNs that belong to non-present sections */
+static inline __meminit unsigned long next_pfn(unsigned long pfn)
+{
+	const unsigned long section_nr = pfn_to_section_nr(++pfn);
+
+	if (present_section_nr(section_nr))
+		return pfn;
+	return section_nr_to_pfn(next_present_section_nr(section_nr));
+}
+#else
+static inline __meminit unsigned long next_pfn(unsigned long pfn)
+{
+	return pfn++;
+}
+#endif
+
+#ifdef CONFIG_NODES_SPAN_OTHER_NODES
+/* Only safe to use early in boot when initialisation is single-threaded */
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
+{
+	int nid;
+
+	nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
+	if (nid >= 0 && nid != node)
+		return false;
+	return true;
+}
+
+#else
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
+{
+	return true;
+}
+#endif
+
 /*
  * Initially all pages are reserved - free ones are freed
  * up by memblock_free_all() once the early boot process is
@@ -5940,6 +5976,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		 * function.  They do not exist on hotplugged memory.
 		 */
 		if (context == MEMMAP_EARLY) {
+			if (!early_pfn_valid(pfn)) {
+				pfn = next_pfn(pfn);
+				continue;
+			}
+			if (!early_pfn_in_nid(pfn, nid)) {
+				pfn++;
+				continue;
+			}
 			if (overlap_memmap_init(zone, &pfn))
 				continue;
 			if (defer_init(nid, pfn, end_pfn))
@@ -6055,23 +6099,9 @@ static void __meminit zone_init_free_lists(struct zone *zone)
 }
 
 void __meminit __weak memmap_init(unsigned long size, int nid,
-				  unsigned long zone,
-				  unsigned long range_start_pfn)
+				  unsigned long zone, unsigned long start_pfn)
 {
-	unsigned long start_pfn, end_pfn;
-	unsigned long range_end_pfn = range_start_pfn + size;
-	int i;
-
-	for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
-		start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
-		end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
-
-		if (end_pfn > start_pfn) {
-			size = end_pfn - start_pfn;
-			memmap_init_zone(size, nid, zone, start_pfn,
-					 MEMMAP_EARLY, NULL);
-		}
-	}
+	memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
 }
 
 static int zone_batchsize(struct zone *zone)

> Guenter

-- 
Sincerely yours,
Mike.

* Re: [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
@ 2020-05-20 22:17     ` Borislav Petkov
  2020-05-20 22:30       ` Marco Elver
                         ` (2 more replies)
  0 siblings, 3 replies; 127+ messages in thread
From: Borislav Petkov @ 2020-05-20 22:17 UTC (permalink / raw)
  To: Will Deacon, Peter Zijlstra (Intel)
  Cc: linux-kernel, linux-tip-commits, Thomas Gleixner, Marco Elver, x86

Hi,

On Tue, May 12, 2020 at 02:36:53PM -0000, tip-bot2 for Will Deacon wrote:
> The following commit has been merged into the locking/kcsan branch of tip:
> 
> Commit-ID:     cdd28ad2d8110099e43527e96d059c5639809680
> Gitweb:        https://git.kernel.org/tip/cdd28ad2d8110099e43527e96d059c5639809680
> Author:        Will Deacon <will@kernel.org>
> AuthorDate:    Mon, 11 May 2020 21:41:49 +01:00
> Committer:     Thomas Gleixner <tglx@linutronix.de>
> CommitterDate: Tue, 12 May 2020 11:04:17 +02:00
> 
> READ_ONCE: Use data_race() to avoid KCSAN instrumentation
> 
> Rather than open-code the disabling/enabling of KCSAN across the guts of
> {READ,WRITE}_ONCE(), defer to the data_race() macro instead.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Marco Elver <elver@google.com>
> Link: https://lkml.kernel.org/r/20200511204150.27858-18-will@kernel.org

so this commit slows down the kernel build by between 50% and over 100%,
depending on the .config. I just bisected locking/kcsan and got

NOT_OK:	cdd28ad2d811 READ_ONCE: Use data_race() to avoid KCSAN instrumentation
OK:	88f1be32068d kcsan: Rework data_race() so that it can be used by READ_ONCE()

with a simple:

$ git clean -dqfx && mk defconfig
$ time make -j<NUM_CORES+1>

I'm not even booting the kernels - simply checking out the above commits
and building the target kernels. I.e., something in that commit is
making gcc go nuts in the compilation phases.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-20 22:17     ` Borislav Petkov
@ 2020-05-20 22:30       ` Marco Elver
  2020-05-21  7:25         ` Borislav Petkov
  2020-05-21  3:30       ` Nathan Chancellor
  2020-05-22 16:08       ` [tip: locking/kcsan] compiler.h: Avoid nested statement expression in data_race() tip-bot2 for Marco Elver
  2 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-05-20 22:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Will Deacon, Peter Zijlstra (Intel),
	LKML, linux-tip-commits, Thomas Gleixner, x86

On Thu, 21 May 2020 at 00:17, Borislav Petkov <bp@alien8.de> wrote:
>
> Hi,
>
> On Tue, May 12, 2020 at 02:36:53PM -0000, tip-bot2 for Will Deacon wrote:
> > The following commit has been merged into the locking/kcsan branch of tip:
> >
> > Commit-ID:     cdd28ad2d8110099e43527e96d059c5639809680
> > Gitweb:        https://git.kernel.org/tip/cdd28ad2d8110099e43527e96d059c5639809680
> > Author:        Will Deacon <will@kernel.org>
> > AuthorDate:    Mon, 11 May 2020 21:41:49 +01:00
> > Committer:     Thomas Gleixner <tglx@linutronix.de>
> > CommitterDate: Tue, 12 May 2020 11:04:17 +02:00
> >
> > READ_ONCE: Use data_race() to avoid KCSAN instrumentation
> >
> > Rather than open-code the disabling/enabling of KCSAN across the guts of
> > {READ,WRITE}_ONCE(), defer to the data_race() macro instead.
> >
> > Signed-off-by: Will Deacon <will@kernel.org>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Marco Elver <elver@google.com>
> > Link: https://lkml.kernel.org/r/20200511204150.27858-18-will@kernel.org
>
> so this commit slows down the kernel build by between 50% and over 100%,
> depending on the .config. I just bisected locking/kcsan and got
>
> NOT_OK: cdd28ad2d811 READ_ONCE: Use data_race() to avoid KCSAN instrumentation
> OK:     88f1be32068d kcsan: Rework data_race() so that it can be used by READ_ONCE()
>
> with a simple:
>
> $ git clean -dqfx && mk defconfig
> $ time make -j<NUM_CORES+1>
>
> I'm not even booting the kernels - simply checking out the above commits
> and building the target kernels. I.e., something in that commit is
> making gcc go nuts in the compilation phases.

This should be fixed when the series that includes this commit is applied:
https://lore.kernel.org/lkml/20200515150338.190344-9-elver@google.com/

Thanks,
-- Marco

* Re: [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-20 22:17     ` Borislav Petkov
  2020-05-20 22:30       ` Marco Elver
@ 2020-05-21  3:30       ` Nathan Chancellor
  2020-05-22 16:08       ` [tip: locking/kcsan] compiler.h: Avoid nested statement expression in data_race() tip-bot2 for Marco Elver
  2 siblings, 0 replies; 127+ messages in thread
From: Nathan Chancellor @ 2020-05-21  3:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Will Deacon, Peter Zijlstra (Intel),
	linux-kernel, linux-tip-commits, Thomas Gleixner, Marco Elver,
	x86, clang-built-linux

On Thu, May 21, 2020 at 12:17:12AM +0200, Borislav Petkov wrote:
> Hi,
> 
> On Tue, May 12, 2020 at 02:36:53PM -0000, tip-bot2 for Will Deacon wrote:
> > The following commit has been merged into the locking/kcsan branch of tip:
> > 
> > Commit-ID:     cdd28ad2d8110099e43527e96d059c5639809680
> > Gitweb:        https://git.kernel.org/tip/cdd28ad2d8110099e43527e96d059c5639809680
> > Author:        Will Deacon <will@kernel.org>
> > AuthorDate:    Mon, 11 May 2020 21:41:49 +01:00
> > Committer:     Thomas Gleixner <tglx@linutronix.de>
> > CommitterDate: Tue, 12 May 2020 11:04:17 +02:00
> > 
> > READ_ONCE: Use data_race() to avoid KCSAN instrumentation
> > 
> > Rather than open-code the disabling/enabling of KCSAN across the guts of
> > {READ,WRITE}_ONCE(), defer to the data_race() macro instead.
> > 
> > Signed-off-by: Will Deacon <will@kernel.org>
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Marco Elver <elver@google.com>
> > Link: https://lkml.kernel.org/r/20200511204150.27858-18-will@kernel.org
> 
> so this commit slows down the kernel build by between 50% and over 100%,
> depending on the .config. I just bisected locking/kcsan and got
> 
> NOT_OK:	cdd28ad2d811 READ_ONCE: Use data_race() to avoid KCSAN instrumentation
> OK:	88f1be32068d kcsan: Rework data_race() so that it can be used by READ_ONCE()
> 
> with a simple:
> 
> $ git clean -dqfx && mk defconfig
> $ time make -j<NUM_CORES+1>
> 
> I'm not even booting the kernels - simply checking out the above commits
> and building the target kernels. I.e., something in that commit is
> making gcc go nuts in the compilation phases.
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette

For what it's worth, I also noticed the same thing with clang. I only
verified the issue in one of my first build targets, an arm defconfig
build, which regressed from 2.5 minutes to 10+ minutes.

More details available on our issue tracker (Nick did some more
profiling on other configs with both clang and gcc):

https://github.com/ClangBuiltLinux/linux/issues/1032

More than happy to do further triage as time permits. I do note Marco's
message about the upcoming series to eliminate this but it would be nice
if this did not regress in the meantime.

Cheers,
Nathan

* Re: [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-20 22:30       ` Marco Elver
@ 2020-05-21  7:25         ` Borislav Petkov
  2020-05-21  9:37           ` Marco Elver
  0 siblings, 1 reply; 127+ messages in thread
From: Borislav Petkov @ 2020-05-21  7:25 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, Peter Zijlstra (Intel),
	LKML, linux-tip-commits, Thomas Gleixner, x86

On Thu, May 21, 2020 at 12:30:39AM +0200, Marco Elver wrote:
> This should be fixed when the series that includes this commit is applied:
> https://lore.kernel.org/lkml/20200515150338.190344-9-elver@google.com/

Yap, that fixes it.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

* Re: [tip: locking/kcsan] READ_ONCE: Use data_race() to avoid KCSAN instrumentation
  2020-05-21  7:25         ` Borislav Petkov
@ 2020-05-21  9:37           ` Marco Elver
  0 siblings, 0 replies; 127+ messages in thread
From: Marco Elver @ 2020-05-21  9:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Will Deacon, Peter Zijlstra (Intel),
	LKML, linux-tip-commits, Thomas Gleixner, x86

On Thu, 21 May 2020 at 09:26, Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, May 21, 2020 at 12:30:39AM +0200, Marco Elver wrote:
> > This should be fixed when the series that includes this commit is applied:
> > https://lore.kernel.org/lkml/20200515150338.190344-9-elver@google.com/
>
> Yap, that fixes it.
>
> Thx.

Thanks for confirming. I think Peter also mentioned that nested
statement expressions caused issues.

This probably also means we shouldn't have a nested "data_race()"
macro, to avoid any kind of nested statement expressions where
possible.

I will send a v2 of the above series to add that patch.

Thanks,
-- Marco

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-20 19:51             ` Mike Rapoport
@ 2020-05-21 23:02               ` Guenter Roeck
  2020-05-24 12:32                 ` Mike Rapoport
  0 siblings, 1 reply; 127+ messages in thread
From: Guenter Roeck @ 2020-05-21 23:02 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Will Deacon, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On 5/20/20 12:51 PM, Mike Rapoport wrote:
> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>> On 5/20/20 10:03 AM, Mike Rapoport wrote:
>>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>>> Now that the page table allocator can free page table allocations
>>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>>> to avoid needlessly wasting memory.
>>>>>>>
>>>>>>> Cc: "David S. Miller" <davem@davemloft.net>
>>>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>>>>
>>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>>> does reverting the rest of the series.
>>>>>>
>>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
>>>>> the results here as well as soon as I have it.
>>>>
>>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>>> issues. However, linux-next is a different story, where I don't get very far
>>>> at all:
>>>>
>>>> BUG: Bad page state in process swapper  pfn:005b4
>>>  
>>> This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
>>> regions rather that check each PFN"). The commit sha is valid for
>>> v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
>>>
>>> As it seems, sparc32 never registered the memory occupied by the kernel
>>> image with memblock_add() and it only reserves this memory with
>>> memblock_reserve().
>>>
>>> I don't know what would happen on real HW, but with 
>>>
>>> qemu-system-sparc -kernel /path/to/kernel
>>>
>>> the memory occupied by the kernel is reserved in openbios and removed
>>> from mem.available. The prom setup code in the kernel used mem.available
>>> to set up the memory banks and essentially there is a hole for the
>>> memory occupied by the kernel.
>>>
>>> Later in bootmem_init() this memory is memblock_reserve()d.
>>>
>>> Before the problematic commit, memmap initialization would call
>>> __init_single_page() for the pages in that hole, then
>>> free_low_memory_core_early() would mark them as reserved and everything
>>> would be OK.
>>>
>>> After the change in memmap initialization, the hole is skipped and the
>>> page structs for it are not inited. And when they are passed from
>>> memblock to page allocator as reserved it gets confused.
>>>
>>> Simply registering the memory occupied by the kernel with memblock_add()
>>> resolves this issue, at least for qemu-system-sparc, and I cannot see how
>>> it can harm any other setup.
>>>
>>> If all that makes sense I'll send a proper patch :)
>>>
>>> diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
>>> index 906eda1158b4..3cb3dffcbcdc 100644
>>> --- a/arch/sparc/mm/init_32.c
>>> +++ b/arch/sparc/mm/init_32.c
>>> @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
>>>  	/* Reserve the kernel text/data/bss. */
>>>  	size = (start_pfn << PAGE_SHIFT) - phys_base;
>>>  	memblock_reserve(phys_base, size);
>>> +	memblock_add(phys_base, size);
>>>  
>>>  	size = memblock_phys_mem_size() - memblock_reserved_size();
>>>  	*pages_avail = (size >> PAGE_SHIFT) - high_pages;
>>>
>>>> Will
>>>
>>
>> With above patch applied on top of Ira's patch, I get:
>>
>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>> [f0067a64 :
>> do_raw_spin_lock+0xa8/0xd8 ]
>> [f00d5034 :
>> copy_page_range+0x328/0x804 ]
>> [f0025be4 :
>> dup_mm+0x334/0x434 ]
>> [f0027124 :
>> copy_process+0x1224/0x12b0 ]
>> [f0027344 :
>> _do_fork+0x54/0x30c ]
>> [f0027670 :
>> do_fork+0x5c/0x6c ]
>> [f000de44 :
>> sparc_do_fork+0x18/0x38 ]
>> [f000b7f4 :
>> do_syscall+0x34/0x40 ]
>> [5010cd4c :
>> 0x5010cd4c ]
>>
>> Looks like yet another problem.
> 
> I've checked the patch above on top of the mmots which already has Ira's
> patches and it booted fine. I've used sparc32_defconfig to build the
> kernel and qemu-system-sparc with default machine and CPU. 
> 

Try sparc32_defconfig+SMP.

>> I can not revert c03584e30534 because it results in a compile failure.
> 
> Here's the "revert" of c03584e30534:
> 

Same problem (spinlock recursion) after applying it.

Guenter

* [tip: locking/kcsan] compiler.h: Avoid nested statement expression in data_race()
  2020-05-20 22:17     ` Borislav Petkov
  2020-05-20 22:30       ` Marco Elver
  2020-05-21  3:30       ` Nathan Chancellor
@ 2020-05-22 16:08       ` tip-bot2 for Marco Elver
  2 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Marco Elver @ 2020-05-22 16:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Borislav Petkov, Nathan Chancellor, Marco Elver,
	Peter Zijlstra (Intel),
	Will Deacon, Nick Desaulniers, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     aa7d8a2ee1e9b80e36ce2aa0d817c14ab3e23157
Gitweb:        https://git.kernel.org/tip/aa7d8a2ee1e9b80e36ce2aa0d817c14ab3e23157
Author:        Marco Elver <elver@google.com>
AuthorDate:    Thu, 21 May 2020 16:20:45 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Fri, 22 May 2020 15:24:21 +02:00

compiler.h: Avoid nested statement expression in data_race()

It appears that compilers have trouble with nested statement
expressions. Therefore, remove one level of statement expression nesting
from the data_race() macro. This will help avoid potential problems
in the future as its usage increases.

Reported-by: Borislav Petkov <bp@suse.de>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lkml.kernel.org/r/20200520221712.GA21166@zn.tnic
Link: https://lkml.kernel.org/r/20200521142047.169334-10-elver@google.com
---
 include/linux/compiler.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 7444f02..379a507 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -211,12 +211,12 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
  */
 #define data_race(expr)							\
 ({									\
-	__kcsan_disable_current();					\
-	({								\
-		__unqual_scalar_typeof(({ expr; })) __v = ({ expr; });	\
-		__kcsan_enable_current();				\
-		__v;							\
+	__unqual_scalar_typeof(({ expr; })) __v = ({			\
+		__kcsan_disable_current();				\
+		expr;							\
 	});								\
+	__kcsan_enable_current();					\
+	__v;								\
 })
 
 /*

* [tip: locking/kcsan] kcsan: Remove 'noinline' from __no_kcsan_or_inline
  2020-05-13 18:54                         ` Marco Elver
  2020-05-13 21:25                           ` Will Deacon
@ 2020-05-22 16:08                           ` tip-bot2 for Marco Elver
  1 sibling, 0 replies; 127+ messages in thread
From: tip-bot2 for Marco Elver @ 2020-05-22 16:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Marco Elver, Borislav Petkov, Peter Zijlstra (Intel),
	Will Deacon, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     f487a549ea30ee894055d8d20e81c1996a6e10a0
Gitweb:        https://git.kernel.org/tip/f487a549ea30ee894055d8d20e81c1996a6e10a0
Author:        Marco Elver <elver@google.com>
AuthorDate:    Thu, 21 May 2020 16:20:41 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Fri, 22 May 2020 15:12:39 +02:00

kcsan: Remove 'noinline' from __no_kcsan_or_inline

Some compilers incorrectly inline small __no_kcsan functions, which then
results in instrumenting the accesses. For this reason, the 'noinline'
attribute was added to __no_kcsan_or_inline. All known versions of GCC
are affected by this. Supported versions of Clang are unaffected, and
never inline a no_sanitize function.

However, the attribute 'noinline' in __no_kcsan_or_inline causes
unexpected code generation in functions that are __no_kcsan and call a
__no_kcsan_or_inline function.

In certain situations it is expected that the __no_kcsan_or_inline
function is actually inlined by the __no_kcsan function, and *no* calls
are emitted. By removing the 'noinline' attribute, give the compiler
the ability to inline and generate the expected code in __no_kcsan
functions.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/CANpmjNNOpJk0tprXKB_deiNAv_UmmORf1-2uajLhnLWQQ1hvoA@mail.gmail.com
Link: https://lkml.kernel.org/r/20200521142047.169334-6-elver@google.com
---
 include/linux/compiler.h | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e24cc3a..17c98b2 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -276,11 +276,9 @@ do {									\
 #ifdef __SANITIZE_THREAD__
 /*
  * Rely on __SANITIZE_THREAD__ instead of CONFIG_KCSAN, to avoid not inlining in
- * compilation units where instrumentation is disabled. The attribute 'noinline'
- * is required for older compilers, where implicit inlining of very small
- * functions renders __no_sanitize_thread ineffective.
+ * compilation units where instrumentation is disabled.
  */
-# define __no_kcsan_or_inline __no_kcsan noinline notrace __maybe_unused
+# define __no_kcsan_or_inline __no_kcsan notrace __maybe_unused
 # define __no_sanitize_or_inline __no_kcsan_or_inline
 #else
 # define __no_kcsan_or_inline __always_inline

^ permalink raw reply	[flat|nested] 127+ messages in thread

* [tip: locking/kcsan] kcsan: Restrict supported compilers
  2020-05-14 13:35                                 ` Marco Elver
                                                     ` (4 preceding siblings ...)
  2020-05-14 15:38                                   ` Paul E. McKenney
@ 2020-05-22 16:08                                   ` tip-bot2 for Marco Elver
  5 siblings, 0 replies; 127+ messages in thread
From: tip-bot2 for Marco Elver @ 2020-05-22 16:08 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Marco Elver, Borislav Petkov, Peter Zijlstra (Intel),
	Will Deacon, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     0d473b1d6e5c240f8ffed02715c718024802d0fa
Gitweb:        https://git.kernel.org/tip/0d473b1d6e5c240f8ffed02715c718024802d0fa
Author:        Marco Elver <elver@google.com>
AuthorDate:    Thu, 21 May 2020 16:20:42 +02:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Fri, 22 May 2020 14:46:02 +02:00

kcsan: Restrict supported compilers

The first version of Clang that supports -tsan-distinguish-volatile will
be able to support KCSAN. The first Clang release to do so will be
Clang 11. This is due to satisfying all the following requirements:

1. Never emit calls to __tsan_func_{entry,exit}.

2. __no_kcsan functions should not call anything, not even
   kcsan_{enable,disable}_current(), when using __{READ,WRITE}_ONCE => Requires
   leaving them plain!

3. Support atomic_{read,set}*() with KCSAN, which rely on
   arch_atomic_{read,set}*() using __{READ,WRITE}_ONCE() => Because of
   #2, rely on Clang 11's -tsan-distinguish-volatile support. We will
   double-instrument atomic_{read,set}*(), but that's reasonable given
   it's still lower cost than the data_race() variant due to avoiding 2
   extra calls (kcsan_{en,dis}able_current() calls).

4. __always_inline functions inlined into __no_kcsan functions are never
   instrumented.

5. __always_inline functions inlined into instrumented functions are
   instrumented.

6. __no_kcsan_or_inline functions may be inlined into __no_kcsan functions =>
   Implies leaving 'noinline' off of __no_kcsan_or_inline.

7. Because of #6, __no_kcsan and __no_kcsan_or_inline functions should never be
   spuriously inlined into instrumented functions, causing the accesses of the
   __no_kcsan function to be instrumented.

Older versions of Clang do not satisfy #3. The latest GCC currently
doesn't support at least #1, #3, and #7.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com
Link: https://lkml.kernel.org/r/20200521142047.169334-7-elver@google.com
---
 lib/Kconfig.kcsan |  9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig.kcsan b/lib/Kconfig.kcsan
index b5d88ac..5ee88e5 100644
--- a/lib/Kconfig.kcsan
+++ b/lib/Kconfig.kcsan
@@ -3,6 +3,12 @@
 config HAVE_ARCH_KCSAN
 	bool
 
+config HAVE_KCSAN_COMPILER
+	def_bool CC_IS_CLANG && $(cc-option,-fsanitize=thread -mllvm -tsan-distinguish-volatile=1)
+	help
+	  For the list of compilers that support KCSAN, please see
+	  <file:Documentation/dev-tools/kcsan.rst>.
+
 config KCSAN_KCOV_BROKEN
 	def_bool KCOV && CC_HAS_SANCOV_TRACE_PC
 	depends on CC_IS_CLANG
@@ -15,7 +21,8 @@ config KCSAN_KCOV_BROKEN
 
 menuconfig KCSAN
 	bool "KCSAN: dynamic data race detector"
-	depends on HAVE_ARCH_KCSAN && DEBUG_KERNEL && !KASAN
+	depends on HAVE_ARCH_KCSAN && HAVE_KCSAN_COMPILER
+	depends on DEBUG_KERNEL && !KASAN
 	depends on !KCSAN_KCOV_BROKEN
 	select STACKTRACE
 	help

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-21 23:02               ` Guenter Roeck
@ 2020-05-24 12:32                 ` Mike Rapoport
  2020-05-24 14:01                   ` Guenter Roeck
  2020-05-26 13:26                   ` Will Deacon
  0 siblings, 2 replies; 127+ messages in thread
From: Mike Rapoport @ 2020-05-24 12:32 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Will Deacon, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> >> On 5/20/20 10:03 AM, Mike Rapoport wrote:
> >>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> >>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>>>>> Now that the page table allocator can free page table allocations
> >>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>>>>> to avoid needlessly wasting memory.
> >>>>>>>
> >>>>>>> Cc: "David S. Miller" <davem@davemloft.net>
> >>>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
> >>>>>>> Signed-off-by: Will Deacon <will@kernel.org>
> >>>>>>
> >>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>>>>> does reverting the rest of the series.
> >>>>>>
> >>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >>>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> >>>>> the results here as well as soon as I have it.
> >>>>
> >>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> >>>> issues. However, linux-next is a different story, where I don't get very far
> >>>> at all:
> >>>>
> >>>> BUG: Bad page state in process swapper  pfn:005b4
> >>
> >> With above patch applied on top of Ira's patch, I get:
> >>
> >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> >>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> >> [f0067a64 :
> >> do_raw_spin_lock+0xa8/0xd8 ]
> >> [f00d5034 :
> >> copy_page_range+0x328/0x804 ]
> >> [f0025be4 :
> >> dup_mm+0x334/0x434 ]
> >> [f0027124 :
> >> copy_process+0x1224/0x12b0 ]
> >> [f0027344 :
> >> _do_fork+0x54/0x30c ]
> >> [f0027670 :
> >> do_fork+0x5c/0x6c ]
> >> [f000de44 :
> >> sparc_do_fork+0x18/0x38 ]
> >> [f000b7f4 :
> >> do_syscall+0x34/0x40 ]
> >> [5010cd4c :
> >> 0x5010cd4c ]
> >>
> >> Looks like yet another problem.
> > 
> > I've checked the patch above on top of the mmots which already has Ira's
> > patches and it booted fine. I've used sparc32_defconfig to build the
> > kernel and qemu-system-sparc with default machine and CPU. 
> > 
> 
> Try sparc32_defconfig+SMP.
 
I see a different problem, but this could be related:

INIT: version 2.86 booting
rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
	(detected by 0, t=5252 jiffies, g=-935, q=3)
rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
rcu_sched       R  running task        0    10      2 0x00000000

I'm running a bit old debian [1] with qemu-img-sparc.

My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
allocation size for PMD and PTE tables"). The commit ID is valid for
next-20200522.

If I revert this commit and fix up the page table initialization [2]
that I broke, the build with CONFIG_SMP=n works fine, but the build with
CONFIG_SMP=y does not work even if I add nosmp to the kernel command
line.

[1] https://people.debian.org/~aurel32/qemu/sparc/debian_etch_sparc_small.qcow2
[2] sparc32 meminit fixup:

diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
index e45160839f79..eb2946b1df8a 100644
--- a/arch/sparc/mm/init_32.c
+++ b/arch/sparc/mm/init_32.c
@@ -192,6 +192,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
 	/* Reserve the kernel text/data/bss. */
 	size = (start_pfn << PAGE_SHIFT) - phys_base;
 	memblock_reserve(phys_base, size);
+	memblock_add(phys_base, size);
 
 	size = memblock_phys_mem_size() - memblock_reserved_size();
 	*pages_avail = (size >> PAGE_SHIFT) - high_pages;
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index 75b56bdd38ef..6cb1ea2d2b5c 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -304,7 +304,7 @@ static void __init srmmu_nocache_init(void)
 		pgd = pgd_offset_k(vaddr);
 		p4d = p4d_offset(__nocache_fix(pgd), vaddr);
 		pud = pud_offset(__nocache_fix(p4d), vaddr);
-		pmd = pmd_offset(__nocache_fix(pud), vaddr);
+		pmd = pmd_offset(__nocache_fix(pgd), vaddr);
 		pte = pte_offset_kernel(__nocache_fix(pmd), vaddr);
 
 		pteval = ((paddr >> 4) | SRMMU_ET_PTE | SRMMU_PRIV);

> Guenter

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-24 12:32                 ` Mike Rapoport
@ 2020-05-24 14:01                   ` Guenter Roeck
  2020-05-26 13:26                   ` Will Deacon
  1 sibling, 0 replies; 127+ messages in thread
From: Guenter Roeck @ 2020-05-24 14:01 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Will Deacon, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On 5/24/20 5:32 AM, Mike Rapoport wrote:
> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>>>> On 5/20/20 10:03 AM, Mike Rapoport wrote:
>>>>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>>>>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>>>>> Now that the page table allocator can free page table allocations
>>>>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>>>>> to avoid needlessly wasting memory.
>>>>>>>>>
>>>>>>>>> Cc: "David S. Miller" <davem@davemloft.net>
>>>>>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>>>>>>>> Signed-off-by: Will Deacon <will@kernel.org>
>>>>>>>>
>>>>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>>>>> does reverting the rest of the series.
>>>>>>>>
>>>>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>>>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
>>>>>>> the results here as well as soon as I have it.
>>>>>>
>>>>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>>>>> issues. However, linux-next is a different story, where I don't get very far
>>>>>> at all:
>>>>>>
>>>>>> BUG: Bad page state in process swapper  pfn:005b4
>>>>
>>>> With above patch applied on top of Ira's patch, I get:
>>>>
>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>>>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>>>> [f0067a64 :
>>>> do_raw_spin_lock+0xa8/0xd8 ]
>>>> [f00d5034 :
>>>> copy_page_range+0x328/0x804 ]
>>>> [f0025be4 :
>>>> dup_mm+0x334/0x434 ]
>>>> [f0027124 :
>>>> copy_process+0x1224/0x12b0 ]
>>>> [f0027344 :
>>>> _do_fork+0x54/0x30c ]
>>>> [f0027670 :
>>>> do_fork+0x5c/0x6c ]
>>>> [f000de44 :
>>>> sparc_do_fork+0x18/0x38 ]
>>>> [f000b7f4 :
>>>> do_syscall+0x34/0x40 ]
>>>> [5010cd4c :
>>>> 0x5010cd4c ]
>>>>
>>>> Looks like yet another problem.
>>>
>>> I've checked the patch above on top of the mmots which already has Ira's
>>> patches and it booted fine. I've used sparc32_defconfig to build the
>>> kernel and qemu-system-sparc with default machine and CPU. 
>>>
>>
>> Try sparc32_defconfig+SMP.
>  
> I see a different problem, but this could be related:
> 
> INIT: version 2.86 booting
> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> 	(detected by 0, t=5252 jiffies, g=-935, q=3)
> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> rcu_sched       R  running task        0    10      2 0x00000000
> 
> I'm running a bit old debian [1] with qemu-img-sparc.
> 
> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> allocation size for PMD and PTE tables"). The commit ID is valid for
> next-20200522.
> 
Here is what I currently get:

next-20200522:
	All builds/tests crash
next-20200522 plus upstream commit 0cfc8a8d70dc ("sparc32: fix page table traversal in srmmu_nocache_init()"):
	nosmp images (sparc32_defconfig) boot fine
	smp images (sparc32_defconfig+SMP) crash with "BUG: Bad page state"
next-20200522 plus 0cfc8a8d70dc plus memblock_add() from below:
	smp images crash with spinlock recursion as above
next-20200522 plus 0cfc8a8d70dc plus revert of 8c8f3156dd40:
	smp images crash with "BUG: Bad page state"
next-20200522 plus 0cfc8a8d70dc plus revert of 8c8f3156dd40 plus memblock_add():
	All builds/tests pass

This is with my root file system. I tried the debian image but I seem to be
missing some command line option needed to make it work.

Guenter

^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-24 12:32                 ` Mike Rapoport
  2020-05-24 14:01                   ` Guenter Roeck
@ 2020-05-26 13:26                   ` Will Deacon
  2020-05-26 14:01                     ` Will Deacon
  1 sibling, 1 reply; 127+ messages in thread
From: Will Deacon @ 2020-05-26 13:26 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Guenter Roeck, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> > On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> > >> On 5/20/20 10:03 AM, Mike Rapoport wrote:
> > >>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> > >>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > >>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > >>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > >>>>>>> Now that the page table allocator can free page table allocations
> > >>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > >>>>>>> to avoid needlessly wasting memory.
> > >>>>>>>
> > >>>>>>> Cc: "David S. Miller" <davem@davemloft.net>
> > >>>>>>> Cc: Peter Zijlstra <peterz@infradead.org>
> > >>>>>>> Signed-off-by: Will Deacon <will@kernel.org>
> > >>>>>>
> > >>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > >>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > >>>>>> does reverting the rest of the series.
> > >>>>>>
> > >>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > >>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > >>>>> powerpc boot tests.  I am currently bisecting those crashes. I'll report
> > >>>>> the results here as well as soon as I have it.
> > >>>>
> > >>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > >>>> issues. However, linux-next is a different story, where I don't get very far
> > >>>> at all:
> > >>>>
> > >>>> BUG: Bad page state in process swapper  pfn:005b4
> > >>
> > >> With above patch applied on top of Ira's patch, I get:
> > >>
> > >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> > >>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> > >> [f0067a64 :
> > >> do_raw_spin_lock+0xa8/0xd8 ]
> > >> [f00d5034 :
> > >> copy_page_range+0x328/0x804 ]
> > >> [f0025be4 :
> > >> dup_mm+0x334/0x434 ]
> > >> [f0027124 :
> > >> copy_process+0x1224/0x12b0 ]
> > >> [f0027344 :
> > >> _do_fork+0x54/0x30c ]
> > >> [f0027670 :
> > >> do_fork+0x5c/0x6c ]
> > >> [f000de44 :
> > >> sparc_do_fork+0x18/0x38 ]
> > >> [f000b7f4 :
> > >> do_syscall+0x34/0x40 ]
> > >> [5010cd4c :
> > >> 0x5010cd4c ]
> > >>
> > >> Looks like yet another problem.
> > > 
> > > I've checked the patch above on top of the mmots which already has Ira's
> > > patches and it booted fine. I've used sparc32_defconfig to build the
> > > kernel and qemu-system-sparc with default machine and CPU. 
> > > 
> > 
> > Try sparc32_defconfig+SMP.
>  
> I see a different problem, but this could be related:
> 
> INIT: version 2.86 booting
> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> 	(detected by 0, t=5252 jiffies, g=-935, q=3)
> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> rcu_sched       R  running task        0    10      2 0x00000000
> 
> I'm running a bit old debian [1] with qemu-img-sparc.
> 
> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> allocation size for PMD and PTE tables"). The commit ID is valid for
> next-20200522.

Can you try the diff below please?

Will

--->8

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index c861c0f0df73..7c05c0dea511 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -363,20 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
 
 	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
 		return NULL;
+
 	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
-	if (!pgtable_pte_page_ctor(page)) {
-		__free_page(page);
+	if (!PageTable(page) && !pgtable_pte_page_ctor(page))
 		return NULL;
-	}
+
 	return ptep;
 }
 
 void pte_free(struct mm_struct *mm, pgtable_t ptep)
 {
-	struct page *page;
-
-	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
-	pgtable_pte_page_dtor(page);
 	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
 }
 
diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..97458119cce8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
 # Default to 4 for wider testing, though 8 might be more appropriate.
 # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
 # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
+# SPARC32 allocates multiple pte tables within a single page, and therefore
+# a per-page lock leads to problems when multiple tables need to be locked
+# at the same time (e.g. copy_page_range()).
 # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
 #
 config SPLIT_PTLOCK_CPUS
@@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
 	default "999999" if !MMU
 	default "999999" if ARM && !CPU_CACHE_VIPT
 	default "999999" if PARISC && !PA20
+	default "999999" if SPARC32
 	default "4"
 
 config ARCH_ENABLE_SPLIT_PMD_PTLOCK

^ permalink raw reply	[flat|nested] 127+ messages in thread
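The mm/Kconfig comment in the diff above can be illustrated with a little
address arithmetic. This is a rough sketch under stated assumptions: the
page shift and table size below are illustrative constants, and
sketch_lock_index() and sketch_share_lock() are hypothetical helpers, not
kernel functions. With a per-page lock, any two pte tables carved out of
the same page resolve to the same lock, so code that needs two tables
locked at once would take one spinlock twice:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */
#define SKETCH_TABLE_SIZE 256	/* assumed pte table size, illustrative only */

/* A per-page lock is selected by the page frame that holds the table. */
static uintptr_t sketch_lock_index(uintptr_t table_addr)
{
	return table_addr >> SKETCH_PAGE_SHIFT;
}

/* True when both tables would end up guarded by the same per-page lock. */
static bool sketch_share_lock(uintptr_t a, uintptr_t b)
{
	return sketch_lock_index(a) == sketch_lock_index(b);
}
```

That situation would be consistent with the "spinlock recursion" report
quoted earlier in the thread, and it does not arise when a single mm-wide
page_table_lock is used instead.
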

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-26 13:26                   ` Will Deacon
@ 2020-05-26 14:01                     ` Will Deacon
  2020-05-26 15:21                       ` Mike Rapoport
  2020-05-26 16:18                       ` Guenter Roeck
  0 siblings, 2 replies; 127+ messages in thread
From: Will Deacon @ 2020-05-26 14:01 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Guenter Roeck, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> > On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> > > On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > > > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> > > >> With above patch applied on top of Ira's patch, I get:
> > > >>
> > > >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> > > >>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > > >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> > > >> [f0067a64 :
> > > >> do_raw_spin_lock+0xa8/0xd8 ]
> > > >> [f00d5034 :
> > > >> copy_page_range+0x328/0x804 ]
> > > >> [f0025be4 :
> > > >> dup_mm+0x334/0x434 ]
> > > >> [f0027124 :
> > > >> copy_process+0x1224/0x12b0 ]
> > > >> [f0027344 :
> > > >> _do_fork+0x54/0x30c ]
> > > >> [f0027670 :
> > > >> do_fork+0x5c/0x6c ]
> > > >> [f000de44 :
> > > >> sparc_do_fork+0x18/0x38 ]
> > > >> [f000b7f4 :
> > > >> do_syscall+0x34/0x40 ]
> > > >> [5010cd4c :
> > > >> 0x5010cd4c ]
> > > >>
> > > >> Looks like yet another problem.
> > > > 
> > > > I've checked the patch above on top of the mmots which already has Ira's
> > > > patches and it booted fine. I've used sparc32_defconfig to build the
> > > > kernel and qemu-system-sparc with default machine and CPU. 
> > > > 
> > > 
> > > Try sparc32_defconfig+SMP.
> >  
> > I see a different problem, but this could be related:
> > 
> > INIT: version 2.86 booting
> > rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> > 	(detected by 0, t=5252 jiffies, g=-935, q=3)
> > rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> > rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > rcu: RCU grace-period kthread stack dump:
> > rcu_sched       R  running task        0    10      2 0x00000000
> > 
> > I'm running a bit old debian [1] with qemu-img-sparc.
> > 
> > My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> > allocation size for PMD and PTE tables"). The commit ID is valid for
> > next-20200522.
> 
> Can you try the diff below please?

Actually, that's racy. New version below!

Will

--->8

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index c861c0f0df73..068029471aa4 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
 
 	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
 		return NULL;
+
 	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
-	if (!pgtable_pte_page_ctor(page)) {
-		__free_page(page);
-		return NULL;
+
+	spin_lock(&mm->page_table_lock);
+	if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
+		page_ref_dec(page);
+		ptep = NULL;
 	}
+	spin_unlock(&mm->page_table_lock);
+
 	return ptep;
 }
 
@@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
 	struct page *page;
 
 	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
-	pgtable_pte_page_dtor(page);
+
+	spin_lock(&mm->page_table_lock);
+	if (page_ref_dec_return(page) == 1)
+		pgtable_pte_page_dtor(page);
+	spin_unlock(&mm->page_table_lock);
+
 	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
 }
 
diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..97458119cce8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
 # Default to 4 for wider testing, though 8 might be more appropriate.
 # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
 # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
+# SPARC32 allocates multiple pte tables within a single page, and therefore
+# a per-page lock leads to problems when multiple tables need to be locked
+# at the same time (e.g. copy_page_range()).
 # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
 #
 config SPLIT_PTLOCK_CPUS
@@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
 	default "999999" if !MMU
 	default "999999" if ARM && !CPU_CACHE_VIPT
 	default "999999" if PARISC && !PA20
+	default "999999" if SPARC32
 	default "4"
 
 config ARCH_ENABLE_SPLIT_PMD_PTLOCK

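The constructor/destructor handling in the srmmu.c hunk above follows a
"first user constructs, last user destroys" pattern under a lock. Here is a
minimal user-space sketch of that pattern, assuming a pthread mutex in
place of mm->page_table_lock; struct sketch_page, sketch_table_get() and
sketch_table_put() are hypothetical names. The sketch counts users from
zero for simplicity, whereas the patch compares the page refcount against
2 and 1:

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for a page that can hold several pte tables. */
struct sketch_page {
	int users;		/* tables currently allocated from this page */
	bool constructed;	/* stand-in for the page-table ctor state */
	int ctor_calls;
	int dtor_calls;
};

static pthread_mutex_t sketch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called when a table is allocated: only the first user runs the ctor. */
static void sketch_table_get(struct sketch_page *pg)
{
	pthread_mutex_lock(&sketch_lock);
	if (++pg->users == 1) {
		pg->constructed = true;
		pg->ctor_calls++;
	}
	pthread_mutex_unlock(&sketch_lock);
}

/* Called when a table is freed: only the last user runs the dtor. */
static void sketch_table_put(struct sketch_page *pg)
{
	pthread_mutex_lock(&sketch_lock);
	if (--pg->users == 0) {
		pg->constructed = false;
		pg->dtor_calls++;
	}
	pthread_mutex_unlock(&sketch_lock);
}
```

Doing the count update and the ctor/dtor call under the same lock is what
keeps two concurrent callers from both seeing themselves as the first or
the last user.
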
^ permalink raw reply	[flat|nested] 127+ messages in thread

* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-26 14:01                     ` Will Deacon
@ 2020-05-26 15:21                       ` Mike Rapoport
  2020-05-26 16:18                       ` Guenter Roeck
  1 sibling, 0 replies; 127+ messages in thread
From: Mike Rapoport @ 2020-05-26 15:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: Guenter Roeck, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Tue, May 26, 2020 at 03:01:27PM +0100, Will Deacon wrote:
> On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
> > On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> > > On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> > > > On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > > > > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> > > > >> With above patch applied on top of Ira's patch, I get:
> > > > >>
> > > > >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> > > > >>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > > > >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> > > > >> [f0067a64 :
> > > > >> do_raw_spin_lock+0xa8/0xd8 ]
> > > > >> [f00d5034 :
> > > > >> copy_page_range+0x328/0x804 ]
> > > > >> [f0025be4 :
> > > > >> dup_mm+0x334/0x434 ]
> > > > >> [f0027124 :
> > > > >> copy_process+0x1224/0x12b0 ]
> > > > >> [f0027344 :
> > > > >> _do_fork+0x54/0x30c ]
> > > > >> [f0027670 :
> > > > >> do_fork+0x5c/0x6c ]
> > > > >> [f000de44 :
> > > > >> sparc_do_fork+0x18/0x38 ]
> > > > >> [f000b7f4 :
> > > > >> do_syscall+0x34/0x40 ]
> > > > >> [5010cd4c :
> > > > >> 0x5010cd4c ]
> > > > >>
> > > > >> Looks like yet another problem.
> > > > > 
> > > > > I've checked the patch above on top of the mmots which already has Ira's
> > > > > patches and it booted fine. I've used sparc32_defconfig to build the
> > > > > kernel and qemu-system-sparc with default machine and CPU. 
> > > > > 
> > > > 
> > > > Try sparc32_defconfig+SMP.
> > >  
> > > I see a different problem, but this could be related:
> > > 
> > > INIT: version 2.86 booting
> > > rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> > > 	(detected by 0, t=5252 jiffies, g=-935, q=3)
> > > rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> > > rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > > rcu: RCU grace-period kthread stack dump:
> > > rcu_sched       R  running task        0    10      2 0x00000000
> > > 
> > > I'm running a bit old debian [1] with qemu-img-sparc.
> > > 
> > > My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> > > allocation size for PMD and PTE tables"). The commit ID is valid for
> > > next-20200522.
> > 
> > Can you try the diff below please?
> 
> Actually, that's racy. New version below!

Well, both versions worked for me with a sparc32_defconfig+SMP build when
I ran qemu-system-sparc with the default machine (SS-5), which does not
allow SMP.

I could not check with actual SMP because
qemu-system-sparc -M SS-10 -smp 2 and qemu-system-sparc -M SS-20 -smp 2
fail early with an exception even in v5.7-rc7...

> Will
> 
> --->8
> 
> diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
> index c861c0f0df73..068029471aa4 100644
> --- a/arch/sparc/mm/srmmu.c
> +++ b/arch/sparc/mm/srmmu.c
> @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
>  
>  	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
>  		return NULL;
> +
>  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> -	if (!pgtable_pte_page_ctor(page)) {
> -		__free_page(page);
> -		return NULL;
> +
> +	spin_lock(&mm->page_table_lock);
> +	if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
> +		page_ref_dec(page);
> +		ptep = NULL;
>  	}
> +	spin_unlock(&mm->page_table_lock);
> +
>  	return ptep;
>  }
>  
> @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
>  	struct page *page;
>  
>  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> -	pgtable_pte_page_dtor(page);
> +
> +	spin_lock(&mm->page_table_lock);
> +	if (page_ref_dec_return(page) == 1)
> +		pgtable_pte_page_dtor(page);
> +	spin_unlock(&mm->page_table_lock);
> +
>  	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
>  }
>  
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c1acc34c1c35..97458119cce8 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
>  # Default to 4 for wider testing, though 8 might be more appropriate.
>  # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
>  # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
> +# SPARC32 allocates multiple pte tables within a single page, and therefore
> +# a per-page lock leads to problems when multiple tables need to be locked
> +# at the same time (e.g. copy_page_range()).
>  # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
>  #
>  config SPLIT_PTLOCK_CPUS
> @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
>  	default "999999" if !MMU
>  	default "999999" if ARM && !CPU_CACHE_VIPT
>  	default "999999" if PARISC && !PA20
> +	default "999999" if SPARC32
>  	default "4"
>  
>  config ARCH_ENABLE_SPLIT_PMD_PTLOCK
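
For anyone skimming the diff above, the counting scheme it relies on can
be modelled in plain user-space C. This is only a sketch under
assumptions, not the kernel code: struct model_page, model_ctor(),
model_dtor(), model_table_alloc() and model_table_free() are hypothetical
stand-ins, and the mm->page_table_lock locking from the diff is left out
because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page that holds several pte tables. */
struct model_page {
	int refcount;          /* starts at 1: the allocator's own reference */
	bool constructed;      /* set while the per-page constructor is in effect */
	bool ctor_should_fail; /* knob to exercise the failure path */
};

/* Stand-in for the per-page constructor, which is allowed to fail. */
bool model_ctor(struct model_page *page)
{
	if (page->ctor_should_fail)
		return false;
	page->constructed = true;
	return true;
}

/* Stand-in for the per-page destructor. */
void model_dtor(struct model_page *page)
{
	page->constructed = false;
}

/* A new table takes a reference; only the first table constructs the page. */
bool model_table_alloc(struct model_page *page)
{
	if (++page->refcount == 2 && !model_ctor(page)) {
		--page->refcount; /* undo the reference on constructor failure */
		return false;
	}
	return true;
}

/* A table drops its reference; only the last table destructs the page. */
void model_table_free(struct model_page *page)
{
	assert(page->refcount > 1); /* there must be a table to free */
	if (--page->refcount == 1)
		model_dtor(page);
}
```

The count starts at 1, matching the diff where the first table brings it
to 2, so the constructor runs once per page however many tables share it,
and the destructor runs only when the last table goes away.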

-- 
Sincerely yours,
Mike.


* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-26 14:01                     ` Will Deacon
  2020-05-26 15:21                       ` Mike Rapoport
@ 2020-05-26 16:18                       ` Guenter Roeck
  2020-05-26 16:29                         ` Mike Rapoport
  1 sibling, 1 reply; 127+ messages in thread
From: Guenter Roeck @ 2020-05-26 16:18 UTC (permalink / raw)
  To: Will Deacon, Mike Rapoport
  Cc: linux-kernel, elver, tglx, paulmck, mingo, peterz, David S. Miller

On 5/26/20 7:01 AM, Will Deacon wrote:
> On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
>> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
>>> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
>>>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
>>>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>>>>>> With above patch applied on top of Ira's patch, I get:
>>>>>>
>>>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>>>>>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>>>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>>>>>> [f0067a64 :
>>>>>> do_raw_spin_lock+0xa8/0xd8 ]
>>>>>> [f00d5034 :
>>>>>> copy_page_range+0x328/0x804 ]
>>>>>> [f0025be4 :
>>>>>> dup_mm+0x334/0x434 ]
>>>>>> [f0027124 :
>>>>>> copy_process+0x1224/0x12b0 ]
>>>>>> [f0027344 :
>>>>>> _do_fork+0x54/0x30c ]
>>>>>> [f0027670 :
>>>>>> do_fork+0x5c/0x6c ]
>>>>>> [f000de44 :
>>>>>> sparc_do_fork+0x18/0x38 ]
>>>>>> [f000b7f4 :
>>>>>> do_syscall+0x34/0x40 ]
>>>>>> [5010cd4c :
>>>>>> 0x5010cd4c ]
>>>>>>
>>>>>> Looks like yet another problem.
>>>>>
>>>>> I've checked the patch above on top of the mmots which already has Ira's
>>>>> patches and it booted fine. I've used sparc32_defconfig to build the
>>>>> kernel and qemu-system-sparc with default machine and CPU. 
>>>>>
>>>>
>>>> Try sparc32_defconfig+SMP.
>>>  
>>> I see a different problem, but this could be related:
>>>
>>> INIT: version 2.86 booting
>>> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>>> 	(detected by 0, t=5252 jiffies, g=-935, q=3)
>>> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
>>> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>>> rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
>>> rcu: RCU grace-period kthread stack dump:
>>> rcu_sched       R  running task        0    10      2 0x00000000
>>>
>>> I'm running a bit old debian [1] with qemu-img-sparc.
>>>
>>> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
>>> allocation size for PMD and PTE tables"). The commit ID is valid for
>>> next-20200522.
>>
>> Can you try the diff below please?
> 
> Actually, that's racy. New version below!
> 

Applied on top of next-20200526, with defconfig+SMP, I still get:

BUG: Bad page state in process swapper/0  pfn:0069f

many times. Did I have to revert something else? Sorry, I lost track.


Note that "-smp 2" on SS-10 works for me (with the same page state
messages).

Guenter


> Will
> 
> --->8
> 
> diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
> index c861c0f0df73..068029471aa4 100644
> --- a/arch/sparc/mm/srmmu.c
> +++ b/arch/sparc/mm/srmmu.c
> @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
>  
>  	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
>  		return NULL;
> +
>  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> -	if (!pgtable_pte_page_ctor(page)) {
> -		__free_page(page);
> -		return NULL;
> +
> +	spin_lock(&mm->page_table_lock);
> +	if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
> +		page_ref_dec(page);
> +		ptep = NULL;
>  	}
> +	spin_unlock(&mm->page_table_lock);
> +
>  	return ptep;
>  }
>  
> @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
>  	struct page *page;
>  
>  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> -	pgtable_pte_page_dtor(page);
> +
> +	spin_lock(&mm->page_table_lock);
> +	if (page_ref_dec_return(page) == 1)
> +		pgtable_pte_page_dtor(page);
> +	spin_unlock(&mm->page_table_lock);
> +
>  	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
>  }
>  
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c1acc34c1c35..97458119cce8 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
>  # Default to 4 for wider testing, though 8 might be more appropriate.
>  # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
>  # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
> +# SPARC32 allocates multiple pte tables within a single page, and therefore
> +# a per-page lock leads to problems when multiple tables need to be locked
> +# at the same time (e.g. copy_page_range()).
>  # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
>  #
>  config SPLIT_PTLOCK_CPUS
> @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
>  	default "999999" if !MMU
>  	default "999999" if ARM && !CPU_CACHE_VIPT
>  	default "999999" if PARISC && !PA20
> +	default "999999" if SPARC32
>  	default "4"
>  
>  config ARCH_ENABLE_SPLIT_PMD_PTLOCK
> 



* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-26 16:18                       ` Guenter Roeck
@ 2020-05-26 16:29                         ` Mike Rapoport
  2020-05-26 17:15                           ` Guenter Roeck
  0 siblings, 1 reply; 127+ messages in thread
From: Mike Rapoport @ 2020-05-26 16:29 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Will Deacon, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On Tue, May 26, 2020 at 09:18:54AM -0700, Guenter Roeck wrote:
> On 5/26/20 7:01 AM, Will Deacon wrote:
> > On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
> >> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> >>> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> >>>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
> >>>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> >>>>>> With above patch applied on top of Ira's patch, I get:
> >>>>>>
> >>>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
> >>>>>>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> >>>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> >>>>>> [f0067a64 :
> >>>>>> do_raw_spin_lock+0xa8/0xd8 ]
> >>>>>> [f00d5034 :
> >>>>>> copy_page_range+0x328/0x804 ]
> >>>>>> [f0025be4 :
> >>>>>> dup_mm+0x334/0x434 ]
> >>>>>> [f0027124 :
> >>>>>> copy_process+0x1224/0x12b0 ]
> >>>>>> [f0027344 :
> >>>>>> _do_fork+0x54/0x30c ]
> >>>>>> [f0027670 :
> >>>>>> do_fork+0x5c/0x6c ]
> >>>>>> [f000de44 :
> >>>>>> sparc_do_fork+0x18/0x38 ]
> >>>>>> [f000b7f4 :
> >>>>>> do_syscall+0x34/0x40 ]
> >>>>>> [5010cd4c :
> >>>>>> 0x5010cd4c ]
> >>>>>>
> >>>>>> Looks like yet another problem.
> >>>>>
> >>>>> I've checked the patch above on top of the mmots which already has Ira's
> >>>>> patches and it booted fine. I've used sparc32_defconfig to build the
> >>>>> kernel and qemu-system-sparc with default machine and CPU. 
> >>>>>
> >>>>
> >>>> Try sparc32_defconfig+SMP.
> >>>  
> > >>> I see a different problem, but this could be related:
> >>>
> >>> INIT: version 2.86 booting
> >>> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> >>> 	(detected by 0, t=5252 jiffies, g=-935, q=3)
> >>> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> >>> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> >>> rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> >>> rcu: RCU grace-period kthread stack dump:
> >>> rcu_sched       R  running task        0    10      2 0x00000000
> >>>
> >>> I'm running a bit old debian [1] with qemu-img-sparc.
> >>>
> >>> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> >>> allocation size for PMD and PTE tables"). The commit ID is valid for
> >>> next-20200522.
> >>
> >> Can you try the diff below please?
> > 
> > Actually, that's racy. New version below!
> > 
> 
> Applied on top of next-20200526, with defconfig+SMP, I still get:
> 
> BUG: Bad page state in process swapper/0  pfn:0069f
> 
> many times. Did I have to revert something else ? Sorry, I lost track.
 
The bad page messages are fixed by [1], but this is not in mmotm or
linux-next. This is not related to SMP hangs.

[1] https://lore.kernel.org/lkml/20200524165358.27188-1-rppt@kernel.org/

> Note that "-smp 2" on SS-10 works for me (with the same page state
> messages).
> 
> Guenter
> 
> 
> > Will
> > 
> > --->8
> > 
> > diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
> > index c861c0f0df73..068029471aa4 100644
> > --- a/arch/sparc/mm/srmmu.c
> > +++ b/arch/sparc/mm/srmmu.c
> > @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
> >  
> >  	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
> >  		return NULL;
> > +
> >  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> > -	if (!pgtable_pte_page_ctor(page)) {
> > -		__free_page(page);
> > -		return NULL;
> > +
> > +	spin_lock(&mm->page_table_lock);
> > +	if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
> > +		page_ref_dec(page);
> > +		ptep = NULL;
> >  	}
> > +	spin_unlock(&mm->page_table_lock);
> > +
> >  	return ptep;
> >  }
> >  
> > @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
> >  	struct page *page;
> >  
> >  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> > -	pgtable_pte_page_dtor(page);
> > +
> > +	spin_lock(&mm->page_table_lock);
> > +	if (page_ref_dec_return(page) == 1)
> > +		pgtable_pte_page_dtor(page);
> > +	spin_unlock(&mm->page_table_lock);
> > +
> >  	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
> >  }
> >  
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index c1acc34c1c35..97458119cce8 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
> >  # Default to 4 for wider testing, though 8 might be more appropriate.
> >  # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
> >  # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
> > +# SPARC32 allocates multiple pte tables within a single page, and therefore
> > +# a per-page lock leads to problems when multiple tables need to be locked
> > +# at the same time (e.g. copy_page_range()).
> >  # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
> >  #
> >  config SPLIT_PTLOCK_CPUS
> > @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
> >  	default "999999" if !MMU
> >  	default "999999" if ARM && !CPU_CACHE_VIPT
> >  	default "999999" if PARISC && !PA20
> > +	default "999999" if SPARC32
> >  	default "4"
> >  
> >  config ARCH_ENABLE_SPLIT_PMD_PTLOCK
> > 
> 
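
The Kconfig comment in the diff is easier to see with numbers. A rough
sketch, assuming a 4096-byte page and 256-byte pte tables (both sizes are
assumptions for illustration, as are all the model_* names): with a
per-page lock, two tables that live in the same page end up with the same
lock, so taking both at once, as the comment says copy_page_range() needs
to, means taking one lock twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE  4096u /* assumed page size */
#define MODEL_TABLE_SIZE 256u  /* assumed pte table size; several fit in a page */

/* Number of tables, and hence lock users, packed into one page. */
unsigned int model_tables_per_page(void)
{
	assert(MODEL_PAGE_SIZE % MODEL_TABLE_SIZE == 0);
	return MODEL_PAGE_SIZE / MODEL_TABLE_SIZE;
}

/* With a per-page lock, a table's lock is identified by its page. */
uintptr_t model_lock_id(uintptr_t table_addr)
{
	return table_addr / MODEL_PAGE_SIZE;
}

/* True when two distinct tables would share one lock. */
bool model_locks_alias(uintptr_t table_a, uintptr_t table_b)
{
	return table_a != table_b &&
	       model_lock_id(table_a) == model_lock_id(table_b);
}
```

With these numbers, tables at 0x10000 and 0x10100 share a lock while
tables at 0x10000 and 0x11000 do not. Falling back to the mm-wide
page_table_lock, as the Kconfig hunk does, sidesteps the aliasing.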

-- 
Sincerely yours,
Mike.


* Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables
  2020-05-26 16:29                         ` Mike Rapoport
@ 2020-05-26 17:15                           ` Guenter Roeck
  0 siblings, 0 replies; 127+ messages in thread
From: Guenter Roeck @ 2020-05-26 17:15 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Will Deacon, linux-kernel, elver, tglx, paulmck, mingo, peterz,
	David S. Miller

On 5/26/20 9:29 AM, Mike Rapoport wrote:
> On Tue, May 26, 2020 at 09:18:54AM -0700, Guenter Roeck wrote:
>> On 5/26/20 7:01 AM, Will Deacon wrote:
>>> On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
>>>> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
>>>>> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
>>>>>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
>>>>>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>>>>>>>> With above patch applied on top of Ira's patch, I get:
>>>>>>>>
>>>>>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>>>>>>>  lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>>>>>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>>>>>>>> [f0067a64 :
>>>>>>>> do_raw_spin_lock+0xa8/0xd8 ]
>>>>>>>> [f00d5034 :
>>>>>>>> copy_page_range+0x328/0x804 ]
>>>>>>>> [f0025be4 :
>>>>>>>> dup_mm+0x334/0x434 ]
>>>>>>>> [f0027124 :
>>>>>>>> copy_process+0x1224/0x12b0 ]
>>>>>>>> [f0027344 :
>>>>>>>> _do_fork+0x54/0x30c ]
>>>>>>>> [f0027670 :
>>>>>>>> do_fork+0x5c/0x6c ]
>>>>>>>> [f000de44 :
>>>>>>>> sparc_do_fork+0x18/0x38 ]
>>>>>>>> [f000b7f4 :
>>>>>>>> do_syscall+0x34/0x40 ]
>>>>>>>> [5010cd4c :
>>>>>>>> 0x5010cd4c ]
>>>>>>>>
>>>>>>>> Looks like yet another problem.
>>>>>>>
>>>>>>> I've checked the patch above on top of the mmots which already has Ira's
>>>>>>> patches and it booted fine. I've used sparc32_defconfig to build the
>>>>>>> kernel and qemu-system-sparc with default machine and CPU. 
>>>>>>>
>>>>>>
>>>>>> Try sparc32_defconfig+SMP.
>>>>>  
>>>>> I see a different problem, but this could be related:
>>>>>
>>>>> INIT: version 2.86 booting
>>>>> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>>>>> 	(detected by 0, t=5252 jiffies, g=-935, q=3)
>>>>> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
>>>>> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>>>>> rcu: 	Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
>>>>> rcu: RCU grace-period kthread stack dump:
>>>>> rcu_sched       R  running task        0    10      2 0x00000000
>>>>>
>>>>> I'm running a bit old debian [1] with qemu-img-sparc.
>>>>>
>>>>> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
>>>>> allocation size for PMD and PTE tables"). The commit ID is valid for
>>>>> next-20200522.
>>>>
>>>> Can you try the diff below please?
>>>
>>> Actually, that's racy. New version below!
>>>
>>
>> Applied on top of next-20200526, with defconfig+SMP, I still get:
>>
>> BUG: Bad page state in process swapper/0  pfn:0069f
>>
>> many times. Did I have to revert something else ? Sorry, I lost track.
>  
> The bad page messages are fixed by [1], but this is not in mmotm or
> linux-next. This is not related to SMP hangs.
> 
> [1] https://lore.kernel.org/lkml/20200524165358.27188-1-rppt@kernel.org/
> 

With that applied, all boot tests pass for me (including tests with
"-smp 2" on SS-10).

Guenter

>> Note that "-smp 2" on SS-10 works for me (with the same page state
>> messages).
>>
>> Guenter
>>
>>
>>> Will
>>>
>>> --->8
>>>
>>> diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
>>> index c861c0f0df73..068029471aa4 100644
>>> --- a/arch/sparc/mm/srmmu.c
>>> +++ b/arch/sparc/mm/srmmu.c
>>> @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
>>>  
>>>  	if ((ptep = pte_alloc_one_kernel(mm)) == 0)
>>>  		return NULL;
>>> +
>>>  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
>>> -	if (!pgtable_pte_page_ctor(page)) {
>>> -		__free_page(page);
>>> -		return NULL;
>>> +
>>> +	spin_lock(&mm->page_table_lock);
>>> +	if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
>>> +		page_ref_dec(page);
>>> +		ptep = NULL;
>>>  	}
>>> +	spin_unlock(&mm->page_table_lock);
>>> +
>>>  	return ptep;
>>>  }
>>>  
>>> @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
>>>  	struct page *page;
>>>  
>>>  	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
>>> -	pgtable_pte_page_dtor(page);
>>> +
>>> +	spin_lock(&mm->page_table_lock);
>>> +	if (page_ref_dec_return(page) == 1)
>>> +		pgtable_pte_page_dtor(page);
>>> +	spin_unlock(&mm->page_table_lock);
>>> +
>>>  	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
>>>  }
>>>  
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index c1acc34c1c35..97458119cce8 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
>>>  # Default to 4 for wider testing, though 8 might be more appropriate.
>>>  # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
>>>  # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
>>> +# SPARC32 allocates multiple pte tables within a single page, and therefore
>>> +# a per-page lock leads to problems when multiple tables need to be locked
>>> +# at the same time (e.g. copy_page_range()).
>>>  # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
>>>  #
>>>  config SPLIT_PTLOCK_CPUS
>>> @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
>>>  	default "999999" if !MMU
>>>  	default "999999" if ARM && !CPU_CACHE_VIPT
>>>  	default "999999" if PARISC && !PA20
>>> +	default "999999" if SPARC32
>>>  	default "4"
>>>  
>>>  config ARCH_ENABLE_SPLIT_PMD_PTLOCK
>>>
>>
> 



* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-05-14 11:05                               ` Will Deacon
  2020-05-14 13:35                                 ` Marco Elver
@ 2020-06-03 18:52                                 ` Borislav Petkov
  2020-06-03 19:23                                   ` Marco Elver
  1 sibling, 1 reply; 127+ messages in thread
From: Borislav Petkov @ 2020-06-03 18:52 UTC (permalink / raw)
  To: Will Deacon
  Cc: Marco Elver, Peter Zijlstra, LKML, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Dmitry Vyukov

On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> and the behaviour for __always_inline is:
> 
>   * An __always_inline function inlined into a __no_sanitize function is
>     not instrumented
>   * An __always_inline function inlined into an instrumented function is
>     instrumented
>   * You can't mark a function as both __always_inline __no_sanitize, because
>     __no_sanitize functions are never inlined
> 
> GCC, on the other hand, may still inline __no_sanitize functions and then
> subsequently instrument them.

Yeah, about that: I've been looking for a way to trigger this so that
I can show preprocessed source to gcc people. So do you guys have a
.config or somesuch I can try?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-03 18:52                                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Borislav Petkov
@ 2020-06-03 19:23                                   ` Marco Elver
  2020-06-03 22:05                                     ` Borislav Petkov
  2020-06-08 17:32                                     ` Martin Liška
  0 siblings, 2 replies; 127+ messages in thread
From: Marco Elver @ 2020-06-03 19:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Will Deacon, Peter Zijlstra, LKML, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Dmitry Vyukov



On Wed, 03 Jun 2020, Borislav Petkov wrote:

> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
> > Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> > and the behaviour for __always_inline is:
> > 
> >   * An __always_inline function inlined into a __no_sanitize function is
> >     not instrumented
> >   * An __always_inline function inlined into an instrumented function is
> >     instrumented
> >   * You can't mark a function as both __always_inline __no_sanitize, because
> >     __no_sanitize functions are never inlined
> > 
> > GCC, on the other hand, may still inline __no_sanitize functions and then
> > subsequently instrument them.
> 
> Yeah, about that: I've been looking for a way to trigger this so that
> I can show preprocessed source to gcc people. So do you guys have a
> .config or somesuch I can try?

For example take this:

	int x;

	static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
	  x++;
	}

	void sanitize_this(void) {
	  do_not_sanitize();
	}

Then

	gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
	objdump -D example.o

will show that do_not_sanitize() was inlined into sanitize_this() and is
instrumented. (With Clang this doesn't happen.)

Hope this is enough.

Thanks,
-- Marco


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-03 19:23                                   ` Marco Elver
@ 2020-06-03 22:05                                     ` Borislav Petkov
  2020-06-08 17:32                                     ` Martin Liška
  1 sibling, 0 replies; 127+ messages in thread
From: Borislav Petkov @ 2020-06-03 22:05 UTC (permalink / raw)
  To: Marco Elver
  Cc: Will Deacon, Peter Zijlstra, LKML, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Dmitry Vyukov

On Wed, Jun 03, 2020 at 09:23:53PM +0200, Marco Elver wrote:
> Hope this is enough.

Thanks - it is. :-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-03 19:23                                   ` Marco Elver
  2020-06-03 22:05                                     ` Borislav Petkov
@ 2020-06-08 17:32                                     ` Martin Liška
  2020-06-08 19:56                                       ` Marco Elver
  1 sibling, 1 reply; 127+ messages in thread
From: Martin Liška @ 2020-06-08 17:32 UTC (permalink / raw)
  To: Marco Elver, Borislav Petkov
  Cc: Will Deacon, Peter Zijlstra, LKML, Thomas Gleixner,
	Paul E. McKenney, Ingo Molnar, Dmitry Vyukov, ndesaulniers

On 6/3/20 9:23 PM, Marco Elver wrote:
> 
> 
> On Wed, 03 Jun 2020, Borislav Petkov wrote:
> 
>> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
>>> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
>>> and the behaviour for __always_inline is:
>>>
>>>    * An __always_inline function inlined into a __no_sanitize function is
>>>      not instrumented
>>>    * An __always_inline function inlined into an instrumented function is
>>>      instrumented
>>>    * You can't mark a function as both __always_inline __no_sanitize, because
>>>      __no_sanitize functions are never inlined
>>>
>>> GCC, on the other hand, may still inline __no_sanitize functions and then
>>> subsequently instrument them.
>>
>> Yeah, about that: I've been looking for a way to trigger this so that
>> I can show preprocessed source to gcc people. So do you guys have a
>> .config or somesuch I can try?
> 
> For example take this:
> 
> 	int x;
> 
> 	static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
> 	  x++;
> 	}
> 
> 	void sanitize_this(void) {
> 	  do_not_sanitize();
> 	}
> 
> Then
> 
> 	gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
> 	objdump -D example.o

Hello.

Thank you for the example. It seems to me that Clang does not inline a no_sanitize_* function
into one which is instrumented. Is that documented behavior ([1] doesn't mention it)?
If so, we can do the same in GCC.

Thanks,
Martin

[1] https://clang.llvm.org/docs/AttributeReference.html#no-sanitize

> 
> will show that do_not_sanitize() was inlined into sanitize_this() and is
> instrumented. (With Clang this doesn't happen.)
> 
> Hope this is enough.
> 
> Thanks,
> -- Marco
> 



* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-08 17:32                                     ` Martin Liška
@ 2020-06-08 19:56                                       ` Marco Elver
  2020-06-09 11:55                                         ` Martin Liška
  0 siblings, 1 reply; 127+ messages in thread
From: Marco Elver @ 2020-06-08 19:56 UTC (permalink / raw)
  To: Martin Liška
  Cc: Borislav Petkov, Will Deacon, Peter Zijlstra, LKML,
	Thomas Gleixner, Paul E. McKenney, Ingo Molnar, Dmitry Vyukov,
	Nick Desaulniers

On Mon, 8 Jun 2020 at 19:32, Martin Liška <mliska@suse.cz> wrote:
>
> On 6/3/20 9:23 PM, Marco Elver wrote:
> >
> >
> > On Wed, 03 Jun 2020, Borislav Petkov wrote:
> >
> >> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
> >>> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
> >>> and the behaviour for __always_inline is:
> >>>
> >>>    * An __always_inline function inlined into a __no_sanitize function is
> >>>      not instrumented
> >>>    * An __always_inline function inlined into an instrumented function is
> >>>      instrumented
> >>>    * You can't mark a function as both __always_inline __no_sanitize, because
> >>>      __no_sanitize functions are never inlined
> >>>
> >>> GCC, on the other hand, may still inline __no_sanitize functions and then
> >>> subsequently instrument them.
> >>
> >> Yeah, about that: I've been looking for a way to trigger this so that
> >> I can show preprocessed source to gcc people. So do you guys have a
> >> .config or somesuch I can try?
> >
> > For example take this:
> >
> >       int x;
> >
> >       static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
> >         x++;
> >       }
> >
> >       void sanitize_this(void) {
> >         do_not_sanitize();
> >       }
> >
> > Then
> >
> >       gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
> >       objdump -D example.o
>
> Hello.
>
> Thank you for the example. It seems to me that Clang does not inline a no_sanitize_* function
> into one which is instrumented. Is it a documented behavior ([1] doesn't mention that)?
> If so, we can do the same in GCC.

It is not explicitly mentioned in [1]. But the contract of
"no_sanitize" is "that a particular instrumentation or set of
instrumentations should not be applied". That contract is broken if a
function is instrumented, however that may happen. It sadly does
happen with GCC when a function is inlined, presumably because the
sanitizer passes for TSAN/ASAN/MSAN run after the optimizer. This
definitely can't change, also because it currently gives us the
property that __always_inline functions are instrumented according to
the function they are inlined into (a property we want).

The easy fix to no_sanitize seems to be to do what Clang does, and
never inline no_sanitize functions (with or without "inline"
attribute).  always_inline functions should remain unchanged
(specifying no_sanitize on an always_inline function is an error).
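
As a sketch of what this means at the source level (assuming GCC or
Clang attribute syntax; the function names other than those from my
earlier example are made up for illustration), pairing
no_sanitize_thread with noinline is one way to keep the body out of
instrumented callers on a compiler that would otherwise inline it, while
an always_inline helper simply follows its caller:

```c
#include <assert.h>

int x;

/*
 * noinline is the stopgap here: the body stays out of line, so an
 * instrumented caller cannot pull it in and instrument it afterwards.
 */
static __attribute__((noinline, no_sanitize_thread)) void do_not_sanitize(void)
{
	x++;
}

/* No no_sanitize here: instrumented according to the caller it lands in. */
static inline __attribute__((always_inline)) void follows_caller(void)
{
	x++;
}

/* Built with -fsanitize=thread, this and the inlined helper are instrumented. */
void sanitize_this(void)
{
	follows_caller();
	do_not_sanitize();
}

/* Not instrumented; the inlined helper is expected to follow suit. */
__attribute__((no_sanitize_thread)) void leave_this_alone(void)
{
	follows_caller();
}

/* Driver: the attributes affect instrumentation only, never the result. */
int run_both(void)
{
	int before = x;

	sanitize_this();
	leave_this_alone();
	assert(x == before + 3);
	return x;
}
```

Compiling this with -O3 -fsanitize=thread and looking at the objdump
output, as in the earlier example, is how one would check which bodies
actually carry instrumentation.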

Note this applies to all sanitizers (TSAN/ASAN/MSAN) and their
no_sanitize attribute that GCC has.

The list of requirements was also summarized in more detail here:
https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/

Hope that makes sense. (I also need to send a v2 for param
tsan-distinguish-volatile, but haven't gotten around to it yet --
hopefully soon. And then we also need a param
tsan-instrument-func-entry-exit, which LLVM has for TSAN. One step at
a time though.)

Thanks,
-- Marco


> Thanks,
> Martin
>
> [1] https://clang.llvm.org/docs/AttributeReference.html#no-sanitize
>
> >
> > will show that do_not_sanitize() was inlined into sanitize_this() and is
> > instrumented. (With Clang this doesn't happen.)
> >
> > Hope this is enough.
> >
> > Thanks,
> > -- Marco
> >
>


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-08 19:56                                       ` Marco Elver
@ 2020-06-09 11:55                                         ` Martin Liška
  2020-06-09 12:36                                           ` Martin Liška
  0 siblings, 1 reply; 127+ messages in thread
From: Martin Liška @ 2020-06-09 11:55 UTC (permalink / raw)
  To: Marco Elver
  Cc: Borislav Petkov, Will Deacon, Peter Zijlstra, LKML,
	Thomas Gleixner, Paul E. McKenney, Ingo Molnar, Dmitry Vyukov,
	Nick Desaulniers

On 6/8/20 9:56 PM, Marco Elver wrote:
> On Mon, 8 Jun 2020 at 19:32, Martin Liška <mliska@suse.cz> wrote:
>>
>> On 6/3/20 9:23 PM, Marco Elver wrote:
>>>
>>>
>>> On Wed, 03 Jun 2020, Borislav Petkov wrote:
>>>
>>>> On Thu, May 14, 2020 at 12:05:38PM +0100, Will Deacon wrote:
>>>>> Talking off-list, Clang >= 7 is pretty reasonable wrt inlining decisions
>>>>> and the behaviour for __always_inline is:
>>>>>
>>>>>     * An __always_inline function inlined into a __no_sanitize function is
>>>>>       not instrumented
>>>>>     * An __always_inline function inlined into an instrumented function is
>>>>>       instrumented
>>>>>     * You can't mark a function as both __always_inline __no_sanitize, because
>>>>>       __no_sanitize functions are never inlined
>>>>>
>>>>> GCC, on the other hand, may still inline __no_sanitize functions and then
>>>>> subsequently instrument them.
>>>>
>>>> Yeah, about that: I've been looking for a way to trigger this so that
>>>> I can show preprocessed source to gcc people. So do you guys have a
>>>> .config or somesuch I can try?
>>>
>>> For example take this:
>>>
>>>        int x;
>>>
>>>        static inline __attribute__((no_sanitize_thread)) void do_not_sanitize(void) {
>>>          x++;
>>>        }
>>>
>>>        void sanitize_this(void) {
>>>          do_not_sanitize();
>>>        }
>>>
>>> Then
>>>
>>>        gcc-10 -O3 -fsanitize=thread -o example.o -c example.c
>>>        objdump -D example.o
>>
>> Hello.
>>
>> Thank you for the example. It seems to me that Clang does not inline a no_sanitize_* function
>> into one which is instrumented. Is it a documented behavior ([1] doesn't mention that)?
>> If so, we can do the same in GCC.
> 
> It is not explicitly mentioned in [1]. But the contract of
> "no_sanitize" is "that a particular instrumentation or set of
> instrumentations should not be applied". That contract is broken if a
> function is instrumented, however that may happen. It sadly does
> happen with GCC when a function is inlined. Presumably because the
> sanitizer passes for TSAN/ASAN/MSAN run after the optimizer -- this
> definitely can't change. Also because it currently gives us the
> property that __always_inline functions are instrumented according to
> the function they are inlined into (a property we want).
> 
> The easy fix to no_sanitize seems to be to do what Clang does, and
> never inline no_sanitize functions (with or without "inline"
> attribute).  always_inline functions should remain unchanged
> (specifying no_sanitize on an always_inline function is an error).

Hello.

Works for me, and I've just sent a patch for that:
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html

> 
> Note this applies to all sanitizers (TSAN/ASAN/MSAN) and their
> no_sanitize attribute that GCC has.

Sure.

> 
> The list of requirements was also summarized in more detail here:
> https://lore.kernel.org/lkml/CANpmjNMTsY_8241bS7=XAfqvZHFLrVEkv_uM4aDUWE_kh3Rvbw@mail.gmail.com/
> 
> Hope that makes sense. (I also need to send a v2 for param
> tsan-distinguish-volatile, but haven't gotten around to it yet --
> hopefully soon.

The patch is approved now.

> And then we also need a param
> tsan-instrument-func-entry-exit, which LLVM has for TSAN. One step at
> a time though.)

Yes, please send a patch for it.

Martin

> 
> Thanks,
> -- Marco
> 
> 
>> Thanks,
>> Martin
>>
>> [1] https://clang.llvm.org/docs/AttributeReference.html#no-sanitize
>>
>>>
>>> will show that do_not_sanitize() was inlined into sanitize_this() and is
>>> instrumented. (With Clang this doesn't happen.)
>>>
>>> Hope this is enough.
>>>
>>> Thanks,
>>> -- Marco
>>>
>>
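
To make the rule discussed above concrete, here is a minimal sketch in the
spirit of Marco's example. The explicit noinline and the bump() helper are
additions for illustration and do not appear in the thread; noinline simply
forces, on any compiler, the out-of-line behaviour that Clang is described
as providing and that the GCC patch is meant to adopt.

```c
/* Shared counter, standing in for data touched by both kinds of code. */
int x;

/*
 * always_inline helper (hypothetical): per the discussion, it should follow
 * the instrumentation policy of whichever function it is inlined into.
 */
static inline __attribute__((always_inline)) void bump(void)
{
	x++;
}

/*
 * noinline is an assumption added here: it keeps this function out of line
 * even on a compiler that would otherwise inline it into an instrumented
 * caller and then instrument the inlined body.
 */
static __attribute__((noinline, no_sanitize_thread)) void do_not_sanitize(void)
{
	bump();	/* inlined into a no_sanitize function */
}

void sanitize_this(void)
{
	bump();	/* inlined into a function that may be instrumented */
	do_not_sanitize();
}
```

Built and inspected with the same gcc and objdump commands as in the
example above, the expectation is that TSAN calls show up in
sanitize_this() but not in do_not_sanitize().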



* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-09 11:55                                         ` Martin Liška
@ 2020-06-09 12:36                                           ` Martin Liška
  2020-06-09 13:45                                             ` Marco Elver
  0 siblings, 1 reply; 127+ messages in thread
From: Martin Liška @ 2020-06-09 12:36 UTC (permalink / raw)
  To: Marco Elver
  Cc: Borislav Petkov, Will Deacon, Peter Zijlstra, LKML,
	Thomas Gleixner, Paul E. McKenney, Ingo Molnar, Dmitry Vyukov,
	Nick Desaulniers

On 6/9/20 1:55 PM, Martin Liška wrote:
> Works for me, and I've just sent a patch for that:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html

The patch has landed in master.

Martin


* Re: [PATCH v5 00/18] Rework READ_ONCE() to improve codegen
  2020-06-09 12:36                                           ` Martin Liška
@ 2020-06-09 13:45                                             ` Marco Elver
  0 siblings, 0 replies; 127+ messages in thread
From: Marco Elver @ 2020-06-09 13:45 UTC (permalink / raw)
  To: Martin Liška
  Cc: Borislav Petkov, Will Deacon, Peter Zijlstra, LKML,
	Thomas Gleixner, Paul E. McKenney, Ingo Molnar, Dmitry Vyukov,
	Nick Desaulniers

On Tue, 9 Jun 2020 at 14:36, Martin Liška <mliska@suse.cz> wrote:
>
> On 6/9/20 1:55 PM, Martin Liška wrote:
> > Works for me, and I've just sent a patch for that:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547618.html
>
> The patch has landed in master.

Great, thank you for turning this around so quickly!

I've just sent v3 of the tsan-distinguish-volatile patch:
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547633.html -- I
think there is only the func-entry-exit param left.

Thanks,
-- Marco
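
For context on the params mentioned above, a sketch of the kind of access
that tsan-distinguish-volatile is concerned with. The read_once_int() macro
is a hypothetical stand-in for a READ_ONCE()-style volatile load, not the
kernel's definition, and how the param is spelled on the command line is
left open here since the thread only gives its name.

```c
/* Hypothetical stand-in for a READ_ONCE()-style access: a volatile load. */
#define read_once_int(p) (*(const volatile int *)(p))

/* Variable assumed to be shared between threads. */
int shared;

/* Plain access: the compiler is free to merge, tear, or repeat it. */
int plain_read(void)
{
	return shared;
}

/* Marked access: the volatile cast asks for exactly one load. */
int marked_read(void)
{
	return read_once_int(&shared);
}
```

With such a param enabled, the intent is that the sanitizer can tell
marked_read() apart from plain_read() instead of treating both loads alike.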




Thread overview: 127+ messages
2020-05-11 20:41 [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Will Deacon
2020-05-11 20:41 ` [PATCH v5 01/18] sparc32: mm: Fix argument checking in __srmmu_get_nocache() Will Deacon
2020-05-12 14:37   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 02/18] sparc32: mm: Restructure sparc32 MMU page-table layout Will Deacon
2020-05-12 14:37   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 03/18] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page * Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-17  0:00   ` [PATCH v5 04/18] " Guenter Roeck
2020-05-17  0:07     ` Guenter Roeck
2020-05-18  8:37       ` Will Deacon
2020-05-18  9:18         ` Mike Rapoport
2020-05-18  9:48         ` Guenter Roeck
2020-05-18 14:23           ` Mike Rapoport
2020-05-18 16:08             ` Guenter Roeck
2020-05-18 18:11               ` Ira Weiny
2020-05-18 18:14               ` Ira Weiny
2020-05-18 18:09             ` Guenter Roeck
2020-05-18 18:21               ` Ira Weiny
2020-05-18 19:15               ` Mike Rapoport
2020-05-19 16:40                 ` Guenter Roeck
2020-05-20 17:03         ` Mike Rapoport
2020-05-20 19:03           ` Guenter Roeck
2020-05-20 19:51             ` Mike Rapoport
2020-05-21 23:02               ` Guenter Roeck
2020-05-24 12:32                 ` Mike Rapoport
2020-05-24 14:01                   ` Guenter Roeck
2020-05-26 13:26                   ` Will Deacon
2020-05-26 14:01                     ` Will Deacon
2020-05-26 15:21                       ` Mike Rapoport
2020-05-26 16:18                       ` Guenter Roeck
2020-05-26 16:29                         ` Mike Rapoport
2020-05-26 17:15                           ` Guenter Roeck
2020-05-11 20:41 ` [PATCH v5 05/18] compiler/gcc: Raise minimum GCC version for kernel builds to 4.8 Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 06/18] netfilter: Avoid assigning 'const' pointer to non-const pointer Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 07/18] net: tls: " Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 08/18] fault_inject: Don't rely on "return value" from WRITE_ONCE() Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 09/18] arm64: csum: Disable KASAN for do_csum() Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 10/18] READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE() Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 11/18] READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 12/18] READ_ONCE: Drop pointer qualifiers when reading from scalar types Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 13/18] locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 14/18] arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 15/18] gcov: Remove old GCC 3.4 support Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 16/18] kcsan: Rework data_race() so that it can be used by READ_ONCE() Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-11 20:41 ` [PATCH v5 17/18] READ_ONCE: Use data_race() to avoid KCSAN instrumentation Will Deacon
2020-05-12  8:23   ` Peter Zijlstra
2020-05-12  9:49     ` Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-20 22:17     ` Borislav Petkov
2020-05-20 22:30       ` Marco Elver
2020-05-21  7:25         ` Borislav Petkov
2020-05-21  9:37           ` Marco Elver
2020-05-21  3:30       ` Nathan Chancellor
2020-05-22 16:08       ` [tip: locking/kcsan] compiler.h: Avoid nested statement expression in data_race() tip-bot2 for Marco Elver
2020-05-11 20:41 ` [PATCH v5 18/18] linux/compiler.h: Remove redundant '#else' Will Deacon
2020-05-12 14:36   ` [tip: locking/kcsan] " tip-bot2 for Will Deacon
2020-05-12  8:18 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Peter Zijlstra
2020-05-12 17:53   ` Marco Elver
2020-05-12 18:55     ` Marco Elver
2020-05-12 19:07     ` Peter Zijlstra
2020-05-12 20:31       ` Marco Elver
2020-05-13 11:10         ` Peter Zijlstra
2020-05-13 11:14           ` Peter Zijlstra
2020-05-13 11:48           ` Marco Elver
2020-05-13 12:32             ` Peter Zijlstra
2020-05-13 12:40               ` Will Deacon
2020-05-13 13:15                 ` Marco Elver
2020-05-13 13:24                   ` Peter Zijlstra
2020-05-13 13:58                     ` Marco Elver
2020-05-14 11:21                       ` Peter Zijlstra
2020-05-14 11:24                         ` Peter Zijlstra
2020-05-14 11:35                         ` Peter Zijlstra
2020-05-14 12:01                         ` Will Deacon
2020-05-14 12:27                           ` Peter Zijlstra
2020-05-14 13:07                             ` Marco Elver
2020-05-14 13:14                               ` Peter Zijlstra
2020-05-14 12:20                         ` Peter Zijlstra
2020-05-14 14:13                       ` Peter Zijlstra
2020-05-14 14:20                         ` Marco Elver
2020-05-15  9:20                           ` Peter Zijlstra
2020-05-13 16:50                   ` Will Deacon
2020-05-13 17:32                     ` Marco Elver
2020-05-13 17:47                       ` Will Deacon
2020-05-13 18:54                         ` Marco Elver
2020-05-13 21:25                           ` Will Deacon
2020-05-14  7:31                             ` Marco Elver
2020-05-14 11:05                               ` Will Deacon
2020-05-14 13:35                                 ` Marco Elver
2020-05-14 13:47                                   ` Peter Zijlstra
2020-05-14 13:50                                   ` Peter Zijlstra
2020-05-14 13:56                                   ` Peter Zijlstra
2020-05-14 14:24                                   ` Peter Zijlstra
2020-05-14 15:09                                     ` Thomas Gleixner
2020-05-14 15:29                                       ` Marco Elver
2020-05-14 19:37                                         ` Thomas Gleixner
2020-05-15 13:55                                     ` David Laight
2020-05-15 14:04                                       ` Marco Elver
2020-05-15 14:07                                       ` Peter Zijlstra
2020-05-14 15:38                                   ` Paul E. McKenney
2020-05-22 16:08                                   ` [tip: locking/kcsan] kcsan: Restrict supported compilers tip-bot2 for Marco Elver
2020-06-03 18:52                                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen Borislav Petkov
2020-06-03 19:23                                   ` Marco Elver
2020-06-03 22:05                                     ` Borislav Petkov
2020-06-08 17:32                                     ` Martin Liška
2020-06-08 19:56                                       ` Marco Elver
2020-06-09 11:55                                         ` Martin Liška
2020-06-09 12:36                                           ` Martin Liška
2020-06-09 13:45                                             ` Marco Elver
2020-05-22 16:08                           ` [tip: locking/kcsan] kcsan: Remove 'noinline' from __no_kcsan_or_inline tip-bot2 for Marco Elver
2020-05-13 13:21                 ` [PATCH v5 00/18] Rework READ_ONCE() to improve codegen David Laight
2020-05-13 16:32                   ` Thomas Gleixner
2020-05-12 21:14       ` Will Deacon
2020-05-12 22:00         ` Marco Elver
