* [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations
@ 2017-03-14 19:24 Dmitry Vyukov
  2017-03-14 19:24 ` [PATCH 1/3] kasan: allow kasan_check_read/write() to accept pointers to volatiles Dmitry Vyukov
                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-14 19:24 UTC (permalink / raw)
  To: mark.rutland, peterz, aryabinin, mingo
  Cc: will.deacon, akpm, kasan-dev, linux-mm, x86, linux-kernel, Dmitry Vyukov

KASAN uses compiler instrumentation to intercept all memory accesses.
But it does not see memory accesses done in assembly code.
One notable user of assembly code is atomic operations. Frequently,
for example, an atomic reference decrement is the last access to an
object and a good candidate for a racy use-after-free.

Atomic operations are defined in arch files, but KASAN instrumentation
is required for several archs that support KASAN. Later we will need
similar hooks for KMSAN (uninit use detector) and KTSAN (data race
detector).

This change introduces wrappers around atomic operations that can be
used to add KASAN/KMSAN/KTSAN instrumentation across several archs,
and adds KASAN checks to them.

This series switches only x86 to the wrappers; arm64 will be converted
later. We also plan to instrument bitops in a similar way.
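
The resulting pattern looks roughly like this (a simplified sketch of
atomic_read(); the real definitions are in patches 2/3 and 3/3):

static __always_inline int arch_atomic_read(const atomic_t *v)
{
	/* The actual arch access; deliberately not instrumented again. */
	return READ_ONCE_NOCHECK((v)->counter);
}

/* Generic wrapper provided by <asm-generic/atomic-instrumented.h>: */
static __always_inline int atomic_read(const atomic_t *v)
{
	kasan_check_read(v, sizeof(*v));
	return arch_atomic_read(v);
}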

Within a day it found its first bug:

BUG: KASAN: use-after-free in atomic_dec_and_test
arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
Write of size 4 by task syz-executor6/25698
CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
 atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
 put_task_struct include/linux/sched/task.h:93 [inline]
 put_ctx+0xcf/0x110 kernel/events/core.c:1131
 perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
 perf_release+0x37/0x50 kernel/events/core.c:4338
 __fput+0x332/0x800 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x197/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xb38/0x29c0 kernel/exit.c:880
 do_group_exit+0x149/0x420 kernel/exit.c:984
 get_signal+0x7e0/0x1820 kernel/signal.c:2318
 do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
 exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
 syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
 do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x4458d9
RSP: 002b:00007f3f07187cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004458d9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8
RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f3f071889c0 R15: 00007f3f07188700
Object at ffff880079c30140, in cache task_struct size: 5376
Allocated:
PID = 25681
 kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
 alloc_task_struct_node kernel/fork.c:153 [inline]
 dup_task_struct kernel/fork.c:495 [inline]
 copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
 copy_process kernel/fork.c:1531 [inline]
 _do_fork+0x200/0x1010 kernel/fork.c:1994
 SYSC_clone kernel/fork.c:2104 [inline]
 SyS_clone+0x37/0x50 kernel/fork.c:2098
 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
 return_from_SYSCALL_64+0x0/0x7a
Freed:
PID = 25681
 __cache_free mm/slab.c:3514 [inline]
 kmem_cache_free+0x71/0x240 mm/slab.c:3774
 free_task_struct kernel/fork.c:158 [inline]
 free_task+0x151/0x1d0 kernel/fork.c:370
 copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
 copy_process kernel/fork.c:1531 [inline]
 _do_fork+0x200/0x1010 kernel/fork.c:1994
 SYSC_clone kernel/fork.c:2104 [inline]
 SyS_clone+0x37/0x50 kernel/fork.c:2098
 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
 return_from_SYSCALL_64+0x0/0x7a

Dmitry Vyukov (3):
  kasan: allow kasan_check_read/write() to accept pointers to volatiles
  asm-generic, x86: wrap atomic operations
  asm-generic: add KASAN instrumentation to atomic operations

 arch/x86/include/asm/atomic.h             | 100 ++++++------
 arch/x86/include/asm/atomic64_32.h        |  86 ++++++-----
 arch/x86/include/asm/atomic64_64.h        |  90 +++++------
 arch/x86/include/asm/cmpxchg.h            |  12 +-
 arch/x86/include/asm/cmpxchg_32.h         |   8 +-
 arch/x86/include/asm/cmpxchg_64.h         |   4 +-
 include/asm-generic/atomic-instrumented.h | 246 ++++++++++++++++++++++++++++++
 include/linux/kasan-checks.h              |  10 +-
 mm/kasan/kasan.c                          |   4 +-
 9 files changed, 411 insertions(+), 149 deletions(-)
 create mode 100644 include/asm-generic/atomic-instrumented.h

-- 
2.12.0.367.g23dc2f6d3c-goog

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/3] kasan: allow kasan_check_read/write() to accept pointers to volatiles
  2017-03-14 19:24 [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
@ 2017-03-14 19:24 ` Dmitry Vyukov
  2017-03-14 19:24 ` [PATCH 2/3] asm-generic, x86: wrap atomic operations Dmitry Vyukov
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-14 19:24 UTC (permalink / raw)
  To: mark.rutland, peterz, aryabinin, mingo
  Cc: will.deacon, akpm, kasan-dev, linux-mm, x86, linux-kernel, Dmitry Vyukov

Currently kasan_check_read/write() accept 'const void *'; make them
accept 'const volatile void *' instead. This is required for
instrumentation of atomic operations, and there is no reason to
disallow volatile pointers anyway.
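
For instance (a minimal sketch; the lock word and the helper are
hypothetical, and xchg() stands for the instrumented wrapper added
later in this series):

static unsigned long grab_lock_word(volatile unsigned long *lock_word)
{
	/*
	 * xchg()/cmpxchg() users may legitimately pass pointers to
	 * volatile objects. The instrumented wrapper forwards lock_word
	 * to kasan_check_write(); with a plain 'const void *' prototype
	 * gcc warns about discarding the 'volatile' qualifier, while
	 * 'const volatile void *' compiles cleanly.
	 */
	return xchg(lock_word, 1UL);
}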

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
---
 include/linux/kasan-checks.h | 10 ++++++----
 mm/kasan/kasan.c             |  4 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/linux/kasan-checks.h b/include/linux/kasan-checks.h
index b7f8aced7870..41960fecf783 100644
--- a/include/linux/kasan-checks.h
+++ b/include/linux/kasan-checks.h
@@ -2,11 +2,13 @@
 #define _LINUX_KASAN_CHECKS_H
 
 #ifdef CONFIG_KASAN
-void kasan_check_read(const void *p, unsigned int size);
-void kasan_check_write(const void *p, unsigned int size);
+void kasan_check_read(const volatile void *p, unsigned int size);
+void kasan_check_write(const volatile void *p, unsigned int size);
 #else
-static inline void kasan_check_read(const void *p, unsigned int size) { }
-static inline void kasan_check_write(const void *p, unsigned int size) { }
+static inline void kasan_check_read(const volatile void *p, unsigned int size)
+{ }
+static inline void kasan_check_write(const volatile void *p, unsigned int size)
+{ }
 #endif
 
 #endif
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index 98b27195e38b..db46e66eb1d4 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -333,13 +333,13 @@ static void check_memory_region(unsigned long addr,
 	check_memory_region_inline(addr, size, write, ret_ip);
 }
 
-void kasan_check_read(const void *p, unsigned int size)
+void kasan_check_read(const volatile void *p, unsigned int size)
 {
 	check_memory_region((unsigned long)p, size, false, _RET_IP_);
 }
 EXPORT_SYMBOL(kasan_check_read);
 
-void kasan_check_write(const void *p, unsigned int size)
+void kasan_check_write(const volatile void *p, unsigned int size)
 {
 	check_memory_region((unsigned long)p, size, true, _RET_IP_);
 }
-- 
2.12.0.367.g23dc2f6d3c-goog

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-14 19:24 [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
  2017-03-14 19:24 ` [PATCH 1/3] kasan: allow kasan_check_read/write() to accept pointers to volatiles Dmitry Vyukov
@ 2017-03-14 19:24 ` Dmitry Vyukov
  2017-03-20 16:41   ` Andrey Ryabinin
                     ` (2 more replies)
  2017-03-14 19:24 ` [PATCH 3/3] asm-generic: add KASAN instrumentation to " Dmitry Vyukov
  2017-03-30 22:30 ` [PATCH 0/3] x86, kasan: add KASAN checks " Andrew Morton
  3 siblings, 3 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-14 19:24 UTC (permalink / raw)
  To: mark.rutland, peterz, aryabinin, mingo
  Cc: will.deacon, akpm, kasan-dev, linux-mm, x86, linux-kernel, Dmitry Vyukov

KASAN uses compiler instrumentation to intercept all memory accesses.
But it does not see memory accesses done in assembly code.
One notable user of assembly code is atomic operations. Frequently,
for example, an atomic reference decrement is the last access to an
object and a good candidate for a racy use-after-free.

Atomic operations are defined in arch files, but KASAN instrumentation
is required for several archs that support KASAN. Later we will need
similar hooks for KMSAN (uninit use detector) and KTSAN (data race
detector).

This change introduces wrappers around atomic operations that can be
used to add KASAN/KMSAN/KTSAN instrumentation across several archs.
This patch switches only x86 to the wrappers; arm64 will be converted
later.
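
For example, with this conversion the x86 ATOMIC_FETCH_OP() helpers
expand to roughly the following (sketch shown for op = or). Note that
the body calls the arch_ variants, so each access is instrumented only
once, at the atomic_*() wrapper level:

static inline int arch_atomic_fetch_or(int i, atomic_t *v)
{
	int old, val = arch_atomic_read(v);

	for (;;) {
		old = arch_atomic_cmpxchg(v, val, val | i);
		if (old == val)
			break;
		val = old;
	}
	return old;
}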

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
---
 arch/x86/include/asm/atomic.h             | 100 +++++++-------
 arch/x86/include/asm/atomic64_32.h        |  86 ++++++------
 arch/x86/include/asm/atomic64_64.h        |  90 ++++++-------
 arch/x86/include/asm/cmpxchg.h            |  12 +-
 arch/x86/include/asm/cmpxchg_32.h         |   8 +-
 arch/x86/include/asm/cmpxchg_64.h         |   4 +-
 include/asm-generic/atomic-instrumented.h | 210 ++++++++++++++++++++++++++++++
 7 files changed, 367 insertions(+), 143 deletions(-)

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..95dd167eb3af 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -16,36 +16,46 @@
 #define ATOMIC_INIT(i)	{ (i) }
 
 /**
- * atomic_read - read atomic variable
+ * arch_atomic_read - read atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically reads the value of @v.
  */
-static __always_inline int atomic_read(const atomic_t *v)
+static __always_inline int arch_atomic_read(const atomic_t *v)
 {
-	return READ_ONCE((v)->counter);
+	/*
+	 * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
+	 * instrumentation. Double instrumentation is unnecessary.
+	 */
+	return READ_ONCE_NOCHECK((v)->counter);
 }
 
 /**
- * atomic_set - set atomic variable
+ * arch_atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static __always_inline void atomic_set(atomic_t *v, int i)
+static __always_inline void arch_atomic_set(atomic_t *v, int i)
 {
+	/*
+	 * We could use WRITE_ONCE_NOCHECK() if it exists, similar to
+	 * READ_ONCE_NOCHECK() in arch_atomic_read(). But there is no such
+	 * thing at the moment, and introducing it for this case does not
+	 * worth it.
+	 */
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic_add - add integer to atomic variable
+ * arch_atomic_add - add integer to atomic variable
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic_add(int i, atomic_t *v)
+static __always_inline void arch_atomic_add(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		     : "+m" (v->counter)
@@ -53,13 +63,13 @@ static __always_inline void atomic_add(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub - subtract integer from atomic variable
+ * arch_atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
  * Atomically subtracts @i from @v.
  */
-static __always_inline void atomic_sub(int i, atomic_t *v)
+static __always_inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		     : "+m" (v->counter)
@@ -67,7 +77,7 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub_and_test - subtract value from variable and test result
+ * arch_atomic_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
@@ -75,63 +85,63 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
+static __always_inline bool arch_atomic_sub_and_test(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic_inc - increment atomic variable
+ * arch_atomic_inc - increment atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic_inc(atomic_t *v)
+static __always_inline void arch_atomic_inc(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "incl %0"
 		     : "+m" (v->counter));
 }
 
 /**
- * atomic_dec - decrement atomic variable
+ * arch_atomic_dec - decrement atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically decrements @v by 1.
  */
-static __always_inline void atomic_dec(atomic_t *v)
+static __always_inline void arch_atomic_dec(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "decl %0"
 		     : "+m" (v->counter));
 }
 
 /**
- * atomic_dec_and_test - decrement and test
+ * arch_atomic_dec_and_test - decrement and test
  * @v: pointer of type atomic_t
  *
  * Atomically decrements @v by 1 and
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static __always_inline bool atomic_dec_and_test(atomic_t *v)
+static __always_inline bool arch_atomic_dec_and_test(atomic_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decl", v->counter, "%0", e);
 }
 
 /**
- * atomic_inc_and_test - increment and test
+ * arch_atomic_inc_and_test - increment and test
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static __always_inline bool atomic_inc_and_test(atomic_t *v)
+static __always_inline bool arch_atomic_inc_and_test(atomic_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incl", v->counter, "%0", e);
 }
 
 /**
- * atomic_add_negative - add and test if negative
+ * arch_atomic_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
@@ -139,60 +149,60 @@ static __always_inline bool atomic_inc_and_test(atomic_t *v)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static __always_inline bool atomic_add_negative(int i, atomic_t *v)
+static __always_inline bool arch_atomic_add_negative(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addl", v->counter, "er", i, "%0", s);
 }
 
 /**
- * atomic_add_return - add integer and return
+ * arch_atomic_add_return - add integer and return
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v and returns @i + @v
  */
-static __always_inline int atomic_add_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_add_return(int i, atomic_t *v)
 {
 	return i + xadd(&v->counter, i);
 }
 
 /**
- * atomic_sub_return - subtract integer and return
+ * arch_atomic_sub_return - subtract integer and return
  * @v: pointer of type atomic_t
  * @i: integer value to subtract
  *
  * Atomically subtracts @i from @v and returns @v - @i
  */
-static __always_inline int atomic_sub_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_sub_return(int i, atomic_t *v)
 {
-	return atomic_add_return(-i, v);
+	return arch_atomic_add_return(-i, v);
 }
 
-#define atomic_inc_return(v)  (atomic_add_return(1, v))
-#define atomic_dec_return(v)  (atomic_sub_return(1, v))
+#define arch_atomic_inc_return(v)  (arch_atomic_add_return(1, v))
+#define arch_atomic_dec_return(v)  (arch_atomic_sub_return(1, v))
 
-static __always_inline int atomic_fetch_add(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_add(int i, atomic_t *v)
 {
 	return xadd(&v->counter, i);
 }
 
-static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_sub(int i, atomic_t *v)
 {
 	return xadd(&v->counter, -i);
 }
 
-static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+static __always_inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
 {
-	return cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, old, new);
 }
 
-static inline int atomic_xchg(atomic_t *v, int new)
+static inline int arch_atomic_xchg(atomic_t *v, int new)
 {
-	return xchg(&v->counter, new);
+	return arch_xchg(&v->counter, new);
 }
 
 #define ATOMIC_OP(op)							\
-static inline void atomic_##op(int i, atomic_t *v)			\
+static inline void arch_atomic_##op(int i, atomic_t *v)			\
 {									\
 	asm volatile(LOCK_PREFIX #op"l %1,%0"				\
 			: "+m" (v->counter)				\
@@ -201,11 +211,11 @@ static inline void atomic_##op(int i, atomic_t *v)			\
 }
 
 #define ATOMIC_FETCH_OP(op, c_op)					\
-static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_##op(int i, atomic_t *v)		\
 {									\
-	int old, val = atomic_read(v);					\
+	int old, val = arch_atomic_read(v);				\
 	for (;;) {							\
-		old = atomic_cmpxchg(v, val, val c_op i);		\
+		old = arch_atomic_cmpxchg(v, val, val c_op i);		\
 		if (old == val)						\
 			break;						\
 		val = old;						\
@@ -226,7 +236,7 @@ ATOMIC_OPS(xor, ^)
 #undef ATOMIC_OP
 
 /**
- * __atomic_add_unless - add unless the number is already a given value
+ * __arch_atomic_add_unless - add unless the number is already a given value
  * @v: pointer of type atomic_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -234,14 +244,14 @@ ATOMIC_OPS(xor, ^)
  * Atomically adds @a to @v, so long as @v was not already @u.
  * Returns the old value of @v.
  */
-static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
+static __always_inline int __arch_atomic_add_unless(atomic_t *v, int a, int u)
 {
 	int c, old;
-	c = atomic_read(v);
+	c = arch_atomic_read(v);
 	for (;;) {
 		if (unlikely(c == (u)))
 			break;
-		old = atomic_cmpxchg((v), c, c + (a));
+		old = arch_atomic_cmpxchg((v), c, c + (a));
 		if (likely(old == c))
 			break;
 		c = old;
@@ -250,13 +260,13 @@ static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
 }
 
 /**
- * atomic_inc_short - increment of a short integer
+ * arch_atomic_inc_short - increment of a short integer
  * @v: pointer to type int
  *
  * Atomically adds 1 to @v
  * Returns the new value of @u
  */
-static __always_inline short int atomic_inc_short(short int *v)
+static __always_inline short int arch_atomic_inc_short(short int *v)
 {
 	asm(LOCK_PREFIX "addw $1, %0" : "+m" (*v));
 	return *v;
@@ -268,4 +278,6 @@ static __always_inline short int atomic_inc_short(short int *v)
 # include <asm/atomic64_64.h>
 #endif
 
+#include <asm-generic/atomic-instrumented.h>
+
 #endif /* _ASM_X86_ATOMIC_H */
diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h
index 71d7705fb303..4a48b20000f3 100644
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -61,7 +61,7 @@ ATOMIC64_DECL(add_unless);
 #undef ATOMIC64_EXPORT
 
 /**
- * atomic64_cmpxchg - cmpxchg atomic64 variable
+ * arch_atomic64_cmpxchg - cmpxchg atomic64 variable
  * @v: pointer to type atomic64_t
  * @o: expected value
  * @n: new value
@@ -70,20 +70,21 @@ ATOMIC64_DECL(add_unless);
  * the old value.
  */
 
-static inline long long atomic64_cmpxchg(atomic64_t *v, long long o, long long n)
+static inline long long arch_atomic64_cmpxchg(atomic64_t *v, long long o,
+					      long long n)
 {
-	return cmpxchg64(&v->counter, o, n);
+	return arch_cmpxchg64(&v->counter, o, n);
 }
 
 /**
- * atomic64_xchg - xchg atomic64 variable
+ * arch_atomic64_xchg - xchg atomic64 variable
  * @v: pointer to type atomic64_t
  * @n: value to assign
  *
  * Atomically xchgs the value of @v to @n and returns
  * the old value.
  */
-static inline long long atomic64_xchg(atomic64_t *v, long long n)
+static inline long long arch_atomic64_xchg(atomic64_t *v, long long n)
 {
 	long long o;
 	unsigned high = (unsigned)(n >> 32);
@@ -95,13 +96,13 @@ static inline long long atomic64_xchg(atomic64_t *v, long long n)
 }
 
 /**
- * atomic64_set - set atomic64 variable
+ * arch_atomic64_set - set atomic64 variable
  * @v: pointer to type atomic64_t
  * @i: value to assign
  *
  * Atomically sets the value of @v to @n.
  */
-static inline void atomic64_set(atomic64_t *v, long long i)
+static inline void arch_atomic64_set(atomic64_t *v, long long i)
 {
 	unsigned high = (unsigned)(i >> 32);
 	unsigned low = (unsigned)i;
@@ -111,12 +112,12 @@ static inline void atomic64_set(atomic64_t *v, long long i)
 }
 
 /**
- * atomic64_read - read atomic64 variable
+ * arch_atomic64_read - read atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically reads the value of @v and returns it.
  */
-static inline long long atomic64_read(const atomic64_t *v)
+static inline long long arch_atomic64_read(const atomic64_t *v)
 {
 	long long r;
 	alternative_atomic64(read, "=&A" (r), "c" (v) : "memory");
@@ -124,13 +125,13 @@ static inline long long atomic64_read(const atomic64_t *v)
  }
 
 /**
- * atomic64_add_return - add and return
+ * arch_atomic64_add_return - add and return
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v and returns @i + *@v
  */
-static inline long long atomic64_add_return(long long i, atomic64_t *v)
+static inline long long arch_atomic64_add_return(long long i, atomic64_t *v)
 {
 	alternative_atomic64(add_return,
 			     ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -141,7 +142,7 @@ static inline long long atomic64_add_return(long long i, atomic64_t *v)
 /*
  * Other variants with different arithmetic operators:
  */
-static inline long long atomic64_sub_return(long long i, atomic64_t *v)
+static inline long long arch_atomic64_sub_return(long long i, atomic64_t *v)
 {
 	alternative_atomic64(sub_return,
 			     ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -149,7 +150,7 @@ static inline long long atomic64_sub_return(long long i, atomic64_t *v)
 	return i;
 }
 
-static inline long long atomic64_inc_return(atomic64_t *v)
+static inline long long arch_atomic64_inc_return(atomic64_t *v)
 {
 	long long a;
 	alternative_atomic64(inc_return, "=&A" (a),
@@ -157,7 +158,7 @@ static inline long long atomic64_inc_return(atomic64_t *v)
 	return a;
 }
 
-static inline long long atomic64_dec_return(atomic64_t *v)
+static inline long long arch_atomic64_dec_return(atomic64_t *v)
 {
 	long long a;
 	alternative_atomic64(dec_return, "=&A" (a),
@@ -166,13 +167,13 @@ static inline long long atomic64_dec_return(atomic64_t *v)
 }
 
 /**
- * atomic64_add - add integer to atomic64 variable
+ * arch_atomic64_add - add integer to atomic64 variable
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v.
  */
-static inline long long atomic64_add(long long i, atomic64_t *v)
+static inline long long arch_atomic64_add(long long i, atomic64_t *v)
 {
 	__alternative_atomic64(add, add_return,
 			       ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -181,13 +182,13 @@ static inline long long atomic64_add(long long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub - subtract the atomic64 variable
+ * arch_atomic64_sub - subtract the atomic64 variable
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
  * Atomically subtracts @i from @v.
  */
-static inline long long atomic64_sub(long long i, atomic64_t *v)
+static inline long long arch_atomic64_sub(long long i, atomic64_t *v)
 {
 	__alternative_atomic64(sub, sub_return,
 			       ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -196,7 +197,7 @@ static inline long long atomic64_sub(long long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub_and_test - subtract value from variable and test result
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
@@ -204,46 +205,46 @@ static inline long long atomic64_sub(long long i, atomic64_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static inline int atomic64_sub_and_test(long long i, atomic64_t *v)
+static inline int arch_atomic64_sub_and_test(long long i, atomic64_t *v)
 {
-	return atomic64_sub_return(i, v) == 0;
+	return arch_atomic64_sub_return(i, v) == 0;
 }
 
 /**
- * atomic64_inc - increment atomic64 variable
+ * arch_atomic64_inc - increment atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1.
  */
-static inline void atomic64_inc(atomic64_t *v)
+static inline void arch_atomic64_inc(atomic64_t *v)
 {
 	__alternative_atomic64(inc, inc_return, /* no output */,
 			       "S" (v) : "memory", "eax", "ecx", "edx");
 }
 
 /**
- * atomic64_dec - decrement atomic64 variable
+ * arch_atomic64_dec - decrement atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1.
  */
-static inline void atomic64_dec(atomic64_t *v)
+static inline void arch_atomic64_dec(atomic64_t *v)
 {
 	__alternative_atomic64(dec, dec_return, /* no output */,
 			       "S" (v) : "memory", "eax", "ecx", "edx");
 }
 
 /**
- * atomic64_dec_and_test - decrement and test
+ * arch_atomic64_dec_and_test - decrement and test
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1 and
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static inline int atomic64_dec_and_test(atomic64_t *v)
+static inline int arch_atomic64_dec_and_test(atomic64_t *v)
 {
-	return atomic64_dec_return(v) == 0;
+	return arch_atomic64_dec_return(v) == 0;
 }
 
 /**
@@ -254,13 +255,13 @@ static inline int atomic64_dec_and_test(atomic64_t *v)
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static inline int atomic64_inc_and_test(atomic64_t *v)
+static inline int arch_atomic64_inc_and_test(atomic64_t *v)
 {
-	return atomic64_inc_return(v) == 0;
+	return arch_atomic64_inc_return(v) == 0;
 }
 
 /**
- * atomic64_add_negative - add and test if negative
+ * arch_atomic64_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
@@ -268,13 +269,13 @@ static inline int atomic64_inc_and_test(atomic64_t *v)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static inline int atomic64_add_negative(long long i, atomic64_t *v)
+static inline int arch_atomic64_add_negative(long long i, atomic64_t *v)
 {
-	return atomic64_add_return(i, v) < 0;
+	return arch_atomic64_add_return(i, v) < 0;
 }
 
 /**
- * atomic64_add_unless - add unless the number is a given value
+ * arch_atomic64_add_unless - add unless the number is a given value
  * @v: pointer of type atomic64_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -282,7 +283,8 @@ static inline int atomic64_add_negative(long long i, atomic64_t *v)
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns non-zero if the add was done, zero otherwise.
  */
-static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
+static inline int arch_atomic64_add_unless(atomic64_t *v, long long a,
+					   long long u)
 {
 	unsigned low = (unsigned)u;
 	unsigned high = (unsigned)(u >> 32);
@@ -293,7 +295,7 @@ static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
 }
 
 
-static inline int atomic64_inc_not_zero(atomic64_t *v)
+static inline int arch_atomic64_inc_not_zero(atomic64_t *v)
 {
 	int r;
 	alternative_atomic64(inc_not_zero, "=&a" (r),
@@ -301,7 +303,7 @@ static inline int atomic64_inc_not_zero(atomic64_t *v)
 	return r;
 }
 
-static inline long long atomic64_dec_if_positive(atomic64_t *v)
+static inline long long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	long long r;
 	alternative_atomic64(dec_if_positive, "=&A" (r),
@@ -313,25 +315,25 @@ static inline long long atomic64_dec_if_positive(atomic64_t *v)
 #undef __alternative_atomic64
 
 #define ATOMIC64_OP(op, c_op)						\
-static inline void atomic64_##op(long long i, atomic64_t *v)		\
+static inline void arch_atomic64_##op(long long i, atomic64_t *v)	\
 {									\
 	long long old, c = 0;						\
-	while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c)		\
+	while ((old = arch_atomic64_cmpxchg(v, c, c c_op i)) != c)	\
 		c = old;						\
 }
 
 #define ATOMIC64_FETCH_OP(op, c_op)					\
-static inline long long atomic64_fetch_##op(long long i, atomic64_t *v)	\
+static inline long long arch_atomic64_fetch_##op(long long i, atomic64_t *v) \
 {									\
 	long long old, c = 0;						\
-	while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c)		\
+	while ((old = arch_atomic64_cmpxchg(v, c, c c_op i)) != c)	\
 		c = old;						\
 	return old;							\
 }
 
 ATOMIC64_FETCH_OP(add, +)
 
-#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
+#define arch_atomic64_fetch_sub(i, v)	arch_atomic64_fetch_add(-(i), (v))
 
 #define ATOMIC64_OPS(op, c_op)						\
 	ATOMIC64_OP(op, c_op)						\
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 89ed2f6ae2f7..de9555d35cb0 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -16,31 +16,31 @@
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-static inline long atomic64_read(const atomic64_t *v)
+static inline long arch_atomic64_read(const atomic64_t *v)
 {
-	return READ_ONCE((v)->counter);
+	return READ_ONCE_NOCHECK((v)->counter);
 }
 
 /**
- * atomic64_set - set atomic64 variable
+ * arch_atomic64_set - set atomic64 variable
  * @v: pointer to type atomic64_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static inline void atomic64_set(atomic64_t *v, long i)
+static inline void arch_atomic64_set(atomic64_t *v, long i)
 {
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic64_add - add integer to atomic64 variable
+ * arch_atomic64_add - add integer to atomic64 variable
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic64_add(long i, atomic64_t *v)
+static __always_inline void arch_atomic64_add(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "addq %1,%0"
 		     : "=m" (v->counter)
@@ -48,13 +48,13 @@ static __always_inline void atomic64_add(long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub - subtract the atomic64 variable
+ * arch_atomic64_sub - subtract the atomic64 variable
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
  * Atomically subtracts @i from @v.
  */
-static inline void atomic64_sub(long i, atomic64_t *v)
+static inline void arch_atomic64_sub(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "subq %1,%0"
 		     : "=m" (v->counter)
@@ -62,7 +62,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub_and_test - subtract value from variable and test result
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
@@ -70,18 +70,18 @@ static inline void atomic64_sub(long i, atomic64_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static inline bool atomic64_sub_and_test(long i, atomic64_t *v)
+static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic64_inc - increment atomic64 variable
+ * arch_atomic64_inc - increment atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic64_inc(atomic64_t *v)
+static __always_inline void arch_atomic64_inc(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "incq %0"
 		     : "=m" (v->counter)
@@ -89,12 +89,12 @@ static __always_inline void atomic64_inc(atomic64_t *v)
 }
 
 /**
- * atomic64_dec - decrement atomic64 variable
+ * arch_atomic64_dec - decrement atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1.
  */
-static __always_inline void atomic64_dec(atomic64_t *v)
+static __always_inline void arch_atomic64_dec(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "decq %0"
 		     : "=m" (v->counter)
@@ -102,33 +102,33 @@ static __always_inline void atomic64_dec(atomic64_t *v)
 }
 
 /**
- * atomic64_dec_and_test - decrement and test
+ * arch_atomic64_dec_and_test - decrement and test
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1 and
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static inline bool atomic64_dec_and_test(atomic64_t *v)
+static inline bool arch_atomic64_dec_and_test(atomic64_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e);
 }
 
 /**
- * atomic64_inc_and_test - increment and test
+ * arch_atomic64_inc_and_test - increment and test
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static inline bool atomic64_inc_and_test(atomic64_t *v)
+static inline bool arch_atomic64_inc_and_test(atomic64_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e);
 }
 
 /**
- * atomic64_add_negative - add and test if negative
+ * arch_atomic64_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
@@ -136,53 +136,53 @@ static inline bool atomic64_inc_and_test(atomic64_t *v)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static inline bool atomic64_add_negative(long i, atomic64_t *v)
+static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
 }
 
 /**
- * atomic64_add_return - add and return
+ * arch_atomic64_add_return - add and return
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v and returns @i + @v
  */
-static __always_inline long atomic64_add_return(long i, atomic64_t *v)
+static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v)
 {
 	return i + xadd(&v->counter, i);
 }
 
-static inline long atomic64_sub_return(long i, atomic64_t *v)
+static inline long arch_atomic64_sub_return(long i, atomic64_t *v)
 {
-	return atomic64_add_return(-i, v);
+	return arch_atomic64_add_return(-i, v);
 }
 
-static inline long atomic64_fetch_add(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_add(long i, atomic64_t *v)
 {
 	return xadd(&v->counter, i);
 }
 
-static inline long atomic64_fetch_sub(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_sub(long i, atomic64_t *v)
 {
 	return xadd(&v->counter, -i);
 }
 
-#define atomic64_inc_return(v)  (atomic64_add_return(1, (v)))
-#define atomic64_dec_return(v)  (atomic64_sub_return(1, (v)))
+#define arch_atomic64_inc_return(v)  (arch_atomic64_add_return(1, (v)))
+#define arch_atomic64_dec_return(v)  (arch_atomic64_sub_return(1, (v)))
 
-static inline long atomic64_cmpxchg(atomic64_t *v, long old, long new)
+static inline long arch_atomic64_cmpxchg(atomic64_t *v, long old, long new)
 {
-	return cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, old, new);
 }
 
-static inline long atomic64_xchg(atomic64_t *v, long new)
+static inline long arch_atomic64_xchg(atomic64_t *v, long new)
 {
-	return xchg(&v->counter, new);
+	return arch_xchg(&v->counter, new);
 }
 
 /**
- * atomic64_add_unless - add unless the number is a given value
+ * arch_atomic64_add_unless - add unless the number is a given value
  * @v: pointer of type atomic64_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -190,14 +190,14 @@ static inline long atomic64_xchg(atomic64_t *v, long new)
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns the old value of @v.
  */
-static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
+static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
 {
 	long c, old;
-	c = atomic64_read(v);
+	c = arch_atomic64_read(v);
 	for (;;) {
 		if (unlikely(c == (u)))
 			break;
-		old = atomic64_cmpxchg((v), c, c + (a));
+		old = arch_atomic64_cmpxchg((v), c, c + (a));
 		if (likely(old == c))
 			break;
 		c = old;
@@ -205,24 +205,24 @@ static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
 	return c != (u);
 }
 
-#define atomic64_inc_not_zero(v) atomic64_add_unless((v), 1, 0)
+#define arch_atomic64_inc_not_zero(v) arch_atomic64_add_unless((v), 1, 0)
 
 /*
- * atomic64_dec_if_positive - decrement by 1 if old value positive
+ * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
  * @v: pointer of type atomic_t
  *
  * The function returns the old value of *v minus 1, even if
  * the atomic variable, v, was not decremented.
  */
-static inline long atomic64_dec_if_positive(atomic64_t *v)
+static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	long c, old, dec;
-	c = atomic64_read(v);
+	c = arch_atomic64_read(v);
 	for (;;) {
 		dec = c - 1;
 		if (unlikely(dec < 0))
 			break;
-		old = atomic64_cmpxchg((v), c, dec);
+		old = arch_atomic64_cmpxchg((v), c, dec);
 		if (likely(old == c))
 			break;
 		c = old;
@@ -231,7 +231,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 }
 
 #define ATOMIC64_OP(op)							\
-static inline void atomic64_##op(long i, atomic64_t *v)			\
+static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
 {									\
 	asm volatile(LOCK_PREFIX #op"q %1,%0"				\
 			: "+m" (v->counter)				\
@@ -240,11 +240,11 @@ static inline void atomic64_##op(long i, atomic64_t *v)			\
 }
 
 #define ATOMIC64_FETCH_OP(op, c_op)					\
-static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+static inline long arch_atomic64_fetch_##op(long i, atomic64_t *v)	\
 {									\
-	long old, val = atomic64_read(v);				\
+	long old, val = arch_atomic64_read(v);				\
 	for (;;) {							\
-		old = atomic64_cmpxchg(v, val, val c_op i);		\
+		old = arch_atomic64_cmpxchg(v, val, val c_op i);	\
 		if (old == val)						\
 			break;						\
 		val = old;						\
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index 97848cdfcb1a..75f09809666f 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -74,7 +74,7 @@ extern void __add_wrong_size(void)
  * use "asm volatile" and "memory" clobbers to prevent gcc from moving
  * information around.
  */
-#define xchg(ptr, v)	__xchg_op((ptr), (v), xchg, "")
+#define arch_xchg(ptr, v)	__xchg_op((ptr), (v), xchg, "")
 
 /*
  * Atomic compare and exchange.  Compare OLD with MEM, if identical,
@@ -144,13 +144,13 @@ extern void __add_wrong_size(void)
 # include <asm/cmpxchg_64.h>
 #endif
 
-#define cmpxchg(ptr, old, new)						\
+#define arch_cmpxchg(ptr, old, new)					\
 	__cmpxchg(ptr, old, new, sizeof(*(ptr)))
 
-#define sync_cmpxchg(ptr, old, new)					\
+#define arch_sync_cmpxchg(ptr, old, new)				\
 	__sync_cmpxchg(ptr, old, new, sizeof(*(ptr)))
 
-#define cmpxchg_local(ptr, old, new)					\
+#define arch_cmpxchg_local(ptr, old, new)				\
 	__cmpxchg_local(ptr, old, new, sizeof(*(ptr)))
 
 /*
@@ -179,10 +179,10 @@ extern void __add_wrong_size(void)
 	__ret;								\
 })
 
-#define cmpxchg_double(p1, p2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \
 	__cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2)
 
-#define cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
 	__cmpxchg_double(, p1, p2, o1, o2, n1, n2)
 
 #endif	/* ASM_X86_CMPXCHG_H */
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index e4959d023af8..d897291d2bf9 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -35,10 +35,10 @@ static inline void set_64bit(volatile u64 *ptr, u64 value)
 }
 
 #ifdef CONFIG_X86_CMPXCHG64
-#define cmpxchg64(ptr, o, n)						\
+#define arch_cmpxchg64(ptr, o, n)					\
 	((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \
 					 (unsigned long long)(n)))
-#define cmpxchg64_local(ptr, o, n)					\
+#define arch_cmpxchg64_local(ptr, o, n)					\
 	((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \
 					       (unsigned long long)(n)))
 #endif
@@ -75,7 +75,7 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
  * to simulate the cmpxchg8b on the 80386 and 80486 CPU.
  */
 
-#define cmpxchg64(ptr, o, n)					\
+#define arch_cmpxchg64(ptr, o, n)				\
 ({								\
 	__typeof__(*(ptr)) __ret;				\
 	__typeof__(*(ptr)) __old = (o);				\
@@ -92,7 +92,7 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 	__ret; })
 
 
-#define cmpxchg64_local(ptr, o, n)				\
+#define arch_cmpxchg64_local(ptr, o, n)				\
 ({								\
 	__typeof__(*(ptr)) __ret;				\
 	__typeof__(*(ptr)) __old = (o);				\
diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index caa23a34c963..fafaebacca2d 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -6,13 +6,13 @@ static inline void set_64bit(volatile u64 *ptr, u64 val)
 	*ptr = val;
 }
 
-#define cmpxchg64(ptr, o, n)						\
+#define arch_cmpxchg64(ptr, o, n)					\
 ({									\
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg((ptr), (o), (n));					\
 })
 
-#define cmpxchg64_local(ptr, o, n)					\
+#define arch_cmpxchg64_local(ptr, o, n)					\
 ({									\
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_local((ptr), (o), (n));					\
diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
new file mode 100644
index 000000000000..41e5e6ffdfc8
--- /dev/null
+++ b/include/asm-generic/atomic-instrumented.h
@@ -0,0 +1,210 @@
+#ifndef _LINUX_ATOMIC_INSTRUMENTED_H
+#define _LINUX_ATOMIC_INSTRUMENTED_H
+
+static __always_inline int atomic_read(const atomic_t *v)
+{
+	return arch_atomic_read(v);
+}
+
+static __always_inline long long atomic64_read(const atomic64_t *v)
+{
+	return arch_atomic64_read(v);
+}
+
+
+static __always_inline void atomic_set(atomic_t *v, int i)
+{
+	arch_atomic_set(v, i);
+}
+
+static __always_inline void atomic64_set(atomic64_t *v, long long i)
+{
+	arch_atomic64_set(v, i);
+}
+
+static __always_inline int atomic_xchg(atomic_t *v, int i)
+{
+	return arch_atomic_xchg(v, i);
+}
+
+static __always_inline long long atomic64_xchg(atomic64_t *v, long long i)
+{
+	return arch_atomic64_xchg(v, i);
+}
+
+static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+	return arch_atomic_cmpxchg(v, old, new);
+}
+
+static __always_inline long long atomic64_cmpxchg(atomic64_t *v, long long old,
+						  long long new)
+{
+	return arch_atomic64_cmpxchg(v, old, new);
+}
+
+static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+	return __arch_atomic_add_unless(v, a, u);
+}
+
+
+static __always_inline bool atomic64_add_unless(atomic64_t *v, long long a,
+						long long u)
+{
+	return arch_atomic64_add_unless(v, a, u);
+}
+
+static __always_inline short int atomic_inc_short(short int *v)
+{
+	return arch_atomic_inc_short(v);
+}
+
+#define __INSTR_VOID1(op, sz)						\
+static __always_inline void atomic##sz##_##op(atomic##sz##_t *v)	\
+{									\
+	arch_atomic##sz##_##op(v);					\
+}
+
+#define INSTR_VOID1(op)	\
+__INSTR_VOID1(op,);	\
+__INSTR_VOID1(op, 64)
+
+INSTR_VOID1(inc);
+INSTR_VOID1(dec);
+
+#undef __INSTR_VOID1
+#undef INSTR_VOID1
+
+#define __INSTR_VOID2(op, sz, type)					\
+static __always_inline void atomic##sz##_##op(type i, atomic##sz##_t *v)\
+{									\
+	arch_atomic##sz##_##op(i, v);					\
+}
+
+#define INSTR_VOID2(op)		\
+__INSTR_VOID2(op, , int);	\
+__INSTR_VOID2(op, 64, long long)
+
+INSTR_VOID2(add);
+INSTR_VOID2(sub);
+INSTR_VOID2(and);
+INSTR_VOID2(or);
+INSTR_VOID2(xor);
+
+#undef __INSTR_VOID2
+#undef INSTR_VOID2
+
+#define __INSTR_RET1(op, sz, type, rtype)				\
+static __always_inline rtype atomic##sz##_##op(atomic##sz##_t *v)	\
+{									\
+	return arch_atomic##sz##_##op(v);				\
+}
+
+#define INSTR_RET1(op)		\
+__INSTR_RET1(op, , int, int);	\
+__INSTR_RET1(op, 64, long long, long long)
+
+INSTR_RET1(inc_return);
+INSTR_RET1(dec_return);
+__INSTR_RET1(inc_not_zero, 64, long long, long long);
+__INSTR_RET1(dec_if_positive, 64, long long, long long);
+
+#define INSTR_RET_BOOL1(op)	\
+__INSTR_RET1(op, , int, bool);	\
+__INSTR_RET1(op, 64, long long, bool)
+
+INSTR_RET_BOOL1(dec_and_test);
+INSTR_RET_BOOL1(inc_and_test);
+
+#undef __INSTR_RET1
+#undef INSTR_RET1
+#undef INSTR_RET_BOOL1
+
+#define __INSTR_RET2(op, sz, type, rtype)				\
+static __always_inline rtype atomic##sz##_##op(type i, atomic##sz##_t *v) \
+{									\
+	return arch_atomic##sz##_##op(i, v);				\
+}
+
+#define INSTR_RET2(op)		\
+__INSTR_RET2(op, , int, int);	\
+__INSTR_RET2(op, 64, long long, long long)
+
+INSTR_RET2(add_return);
+INSTR_RET2(sub_return);
+INSTR_RET2(fetch_add);
+INSTR_RET2(fetch_sub);
+INSTR_RET2(fetch_and);
+INSTR_RET2(fetch_or);
+INSTR_RET2(fetch_xor);
+
+#define INSTR_RET_BOOL2(op)		\
+__INSTR_RET2(op, , int, bool);		\
+__INSTR_RET2(op, 64, long long, bool)
+
+INSTR_RET_BOOL2(sub_and_test);
+INSTR_RET_BOOL2(add_negative);
+
+#undef __INSTR_RET2
+#undef INSTR_RET2
+#undef INSTR_RET_BOOL2
+
+/*
+ * In the following macros we need to be careful to not clash with arch_ macros.
+ * arch_xchg() can be defined as an extended statement expression as well,
+ * if we define a __ptr variable, and arch_xchg() also defines __ptr variable,
+ * and we pass __ptr as an argument to arch_xchg(), it will use own __ptr
+ * instead of ours. This leads to unpleasant crashes. To avoid the problem
+ * the following macros declare variables with lots of underscores.
+ */
+
+#define xchg(ptr, v)					\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	arch_xchg(____ptr, (v));			\
+})
+
+#define cmpxchg(ptr, old, new)				\
+({							\
+	__typeof__(ptr) ___ptr = (ptr);			\
+	arch_cmpxchg(___ptr, (old), (new));		\
+})
+
+#define sync_cmpxchg(ptr, old, new)			\
+({							\
+	__typeof__(ptr) ___ptr = (ptr);			\
+	arch_sync_cmpxchg(___ptr, (old), (new));	\
+})
+
+#define cmpxchg_local(ptr, old, new)			\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	arch_cmpxchg_local(____ptr, (old), (new));	\
+})
+
+#define cmpxchg64(ptr, old, new)			\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	arch_cmpxchg64(____ptr, (old), (new));		\
+})
+
+#define cmpxchg64_local(ptr, old, new)			\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	arch_cmpxchg64_local(____ptr, (old), (new));	\
+})
+
+#define cmpxchg_double(p1, p2, o1, o2, n1, n2)				\
+({									\
+	__typeof__(p1) ____p1 = (p1);					\
+	arch_cmpxchg_double(____p1, (p2), (o1), (o2), (n1), (n2));	\
+})
+
+#define cmpxchg_double_local(p1, p2, o1, o2, n1, n2)			\
+({									\
+	__typeof__(p1) ____p1 = (p1);					\
+	arch_cmpxchg_double_local(____p1, (p2), (o1), (o2), (n1), (n2));\
+})
+
+#endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
-- 
2.12.0.367.g23dc2f6d3c-goog

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/3] asm-generic: add KASAN instrumentation to atomic operations
  2017-03-14 19:24 [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
  2017-03-14 19:24 ` [PATCH 1/3] kasan: allow kasan_check_read/write() to accept pointers to volatiles Dmitry Vyukov
  2017-03-14 19:24 ` [PATCH 2/3] asm-generic, x86: wrap atomic operations Dmitry Vyukov
@ 2017-03-14 19:24 ` Dmitry Vyukov
  2017-03-30 22:30 ` [PATCH 0/3] x86, kasan: add KASAN checks " Andrew Morton
  3 siblings, 0 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-14 19:24 UTC (permalink / raw)
  To: mark.rutland, peterz, aryabinin, mingo
  Cc: will.deacon, akpm, kasan-dev, linux-mm, x86, linux-kernel, Dmitry Vyukov

KASAN uses compiler instrumentation to intercept all memory accesses.
But it does not see memory accesses done in assembly code.
One notable user of assembly code is atomic operations. Frequently,
for example, an atomic reference decrement is the last access to an
object and a good candidate for a racy use-after-free.

Add manual KASAN checks to atomic operations.
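
For illustration, the class of bugs this makes visible (cf. the
use-after-free report in the cover letter), sketched with a
hypothetical refcounted object:

#include <linux/atomic.h>
#include <linux/slab.h>

struct foo {
	atomic_t refcount;
	int data;
};

static void foo_put(struct foo *p)
{
	/*
	 * The decrement is now preceded by
	 * kasan_check_write(&p->refcount, sizeof(atomic_t)), so if p has
	 * already been freed on another CPU, KASAN reports the
	 * use-after-free here instead of letting the write silently land
	 * in freed memory.
	 */
	if (atomic_dec_and_test(&p->refcount))
		kfree(p);
}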

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Cc: x86@kernel.org
---
 include/asm-generic/atomic-instrumented.h | 36 +++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
index 41e5e6ffdfc8..951bcd083925 100644
--- a/include/asm-generic/atomic-instrumented.h
+++ b/include/asm-generic/atomic-instrumented.h
@@ -1,50 +1,72 @@
+/*
+ * This file provides wrappers with KASAN instrumentation for atomic operations.
+ * To use this functionality an arch's atomic.h file needs to define all
+ * atomic operations with arch_ prefix (e.g. arch_atomic_read()) and include
+ * this file at the end. This file provides atomic_read() that forwards to
+ * arch_atomic_read() for actual atomic operation.
+ * Note: if an arch atomic operation is implemented by means of other atomic
+ * operations (e.g. atomic_read()/atomic_cmpxchg() loop), then it needs to use
+ * arch_ variants (i.e. arch_atomic_read()/arch_atomic_cmpxchg()) to avoid
+ * double instrumentation.
+ */
 #ifndef _LINUX_ATOMIC_INSTRUMENTED_H
 #define _LINUX_ATOMIC_INSTRUMENTED_H
 
+#include <linux/kasan-checks.h>
+
 static __always_inline int atomic_read(const atomic_t *v)
 {
+	kasan_check_read(v, sizeof(*v));
 	return arch_atomic_read(v);
 }
 
 static __always_inline long long atomic64_read(const atomic64_t *v)
 {
+	kasan_check_read(v, sizeof(*v));
 	return arch_atomic64_read(v);
 }
 
 
 static __always_inline void atomic_set(atomic_t *v, int i)
 {
+	kasan_check_write(v, sizeof(*v));
 	arch_atomic_set(v, i);
 }
 
 static __always_inline void atomic64_set(atomic64_t *v, long long i)
 {
+	kasan_check_write(v, sizeof(*v));
 	arch_atomic64_set(v, i);
 }
 
 static __always_inline int atomic_xchg(atomic_t *v, int i)
 {
+	kasan_check_write(v, sizeof(*v));
 	return arch_atomic_xchg(v, i);
 }
 
 static __always_inline long long atomic64_xchg(atomic64_t *v, long long i)
 {
+	kasan_check_write(v, sizeof(*v));
 	return arch_atomic64_xchg(v, i);
 }
 
 static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
 {
+	kasan_check_write(v, sizeof(*v));
 	return arch_atomic_cmpxchg(v, old, new);
 }
 
 static __always_inline long long atomic64_cmpxchg(atomic64_t *v, long long old,
 						  long long new)
 {
+	kasan_check_write(v, sizeof(*v));
 	return arch_atomic64_cmpxchg(v, old, new);
 }
 
 static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
 {
+	kasan_check_write(v, sizeof(*v));
 	return __arch_atomic_add_unless(v, a, u);
 }
 
@@ -52,17 +74,20 @@ static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
 static __always_inline bool atomic64_add_unless(atomic64_t *v, long long a,
 						long long u)
 {
+	kasan_check_write(v, sizeof(*v));
 	return arch_atomic64_add_unless(v, a, u);
 }
 
 static __always_inline short int atomic_inc_short(short int *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	return arch_atomic_inc_short(v);
 }
 
 #define __INSTR_VOID1(op, sz)						\
 static __always_inline void atomic##sz##_##op(atomic##sz##_t *v)	\
 {									\
+	kasan_check_write(v, sizeof(*v));				\
 	arch_atomic##sz##_##op(v);					\
 }
 
@@ -79,6 +104,7 @@ INSTR_VOID1(dec);
 #define __INSTR_VOID2(op, sz, type)					\
 static __always_inline void atomic##sz##_##op(type i, atomic##sz##_t *v)\
 {									\
+	kasan_check_write(v, sizeof(*v));				\
 	arch_atomic##sz##_##op(i, v);					\
 }
 
@@ -98,6 +124,7 @@ INSTR_VOID2(xor);
 #define __INSTR_RET1(op, sz, type, rtype)				\
 static __always_inline rtype atomic##sz##_##op(atomic##sz##_t *v)	\
 {									\
+	kasan_check_write(v, sizeof(*v));				\
 	return arch_atomic##sz##_##op(v);				\
 }
 
@@ -124,6 +151,7 @@ INSTR_RET_BOOL1(inc_and_test);
 #define __INSTR_RET2(op, sz, type, rtype)				\
 static __always_inline rtype atomic##sz##_##op(type i, atomic##sz##_t *v) \
 {									\
+	kasan_check_write(v, sizeof(*v));				\
 	return arch_atomic##sz##_##op(i, v);				\
 }
 
@@ -162,48 +190,56 @@ INSTR_RET_BOOL2(add_negative);
 #define xchg(ptr, v)					\
 ({							\
 	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
 	arch_xchg(____ptr, (v));			\
 })
 
 #define cmpxchg(ptr, old, new)				\
 ({							\
 	__typeof__(ptr) ___ptr = (ptr);			\
+	kasan_check_write(___ptr, sizeof(*___ptr));	\
 	arch_cmpxchg(___ptr, (old), (new));		\
 })
 
 #define sync_cmpxchg(ptr, old, new)			\
 ({							\
 	__typeof__(ptr) ___ptr = (ptr);			\
+	kasan_check_write(___ptr, sizeof(*___ptr));	\
 	arch_sync_cmpxchg(___ptr, (old), (new));	\
 })
 
 #define cmpxchg_local(ptr, old, new)			\
 ({							\
 	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
 	arch_cmpxchg_local(____ptr, (old), (new));	\
 })
 
 #define cmpxchg64(ptr, old, new)			\
 ({							\
 	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
 	arch_cmpxchg64(____ptr, (old), (new));		\
 })
 
 #define cmpxchg64_local(ptr, old, new)			\
 ({							\
 	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
 	arch_cmpxchg64_local(____ptr, (old), (new));	\
 })
 
 #define cmpxchg_double(p1, p2, o1, o2, n1, n2)				\
 ({									\
 	__typeof__(p1) ____p1 = (p1);					\
+	kasan_check_write(____p1, 2 * sizeof(*____p1));			\
 	arch_cmpxchg_double(____p1, (p2), (o1), (o2), (n1), (n2));	\
 })
 
 #define cmpxchg_double_local(p1, p2, o1, o2, n1, n2)			\
 ({									\
 	__typeof__(p1) ____p1 = (p1);					\
+	kasan_check_write(____p1, 2 * sizeof(*____p1));			\
 	arch_cmpxchg_double_local(____p1, (p2), (o1), (o2), (n1), (n2));\
 })
 
-- 
2.12.0.367.g23dc2f6d3c-goog

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-14 19:24 ` [PATCH 2/3] asm-generic, x86: wrap atomic operations Dmitry Vyukov
@ 2017-03-20 16:41   ` Andrey Ryabinin
  2017-03-20 17:17   ` Mark Rutland
  2017-03-24  6:52   ` Ingo Molnar
  2 siblings, 0 replies; 27+ messages in thread
From: Andrey Ryabinin @ 2017-03-20 16:41 UTC (permalink / raw)
  To: Dmitry Vyukov, mark.rutland, peterz, mingo
  Cc: will.deacon, akpm, kasan-dev, linux-mm, x86, linux-kernel

On 03/14/2017 10:24 PM, Dmitry Vyukov wrote:

> diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
> index 14635c5ea025..95dd167eb3af 100644
> --- a/arch/x86/include/asm/atomic.h
> +++ b/arch/x86/include/asm/atomic.h
> @@ -16,36 +16,46 @@
>  #define ATOMIC_INIT(i)	{ (i) }
>  
>  /**
> - * atomic_read - read atomic variable
> + * arch_atomic_read - read atomic variable
>   * @v: pointer of type atomic_t
>   *
>   * Atomically reads the value of @v.
>   */
> -static __always_inline int atomic_read(const atomic_t *v)
> +static __always_inline int arch_atomic_read(const atomic_t *v)
>  {
> -	return READ_ONCE((v)->counter);
> +	/*
> +	 * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
> +	 * instrumentation. Double instrumentation is unnecessary.
> +	 */
> +	return READ_ONCE_NOCHECK((v)->counter);

This is a kinda questionable "optimization". READ_ONCE_NOCHECK() is actually
a function call to __read_once_size_nocheck(), so this _NOCHECK adds some
bloat to the kernel.
E.g. on my .config, removing _NOCHECK() saves ~400K of .text:

size vmlinux_read
   text    data     bss     dec     hex filename
40548235        16418838        23289856        80256929        4c89fa1 vmlinux_read
size vmlinux_read_nocheck
   text    data     bss     dec     hex filename
40938418        16451958        23289856        80680232        4cf1528 vmlinux_read_nocheck
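
For context, a paraphrased sketch of why the two flavours differ in generated
code (not the literal include/linux/compiler.h text; details may vary by
kernel and compiler version):

	/* READ_ONCE() path: always inlined and instrumented like any other code. */
	static __always_inline
	void __read_once_size(const volatile void *p, void *res, int size)
	{
		/* size-switched volatile copy of *p into *res */
	}

	/* READ_ONCE_NOCHECK() path: marked __no_sanitize_address, which (with the
	 * GCC versions of that time) keeps it from being inlined into instrumented
	 * callers, so every call site pays for a real call -- hence the ~400K of
	 * extra .text above. */
	static __no_sanitize_address __maybe_unused
	void __read_once_size_nocheck(const volatile void *p, void *res, int size)
	{
		/* same size-switched copy, just uninstrumented */
	}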

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-14 19:24 ` [PATCH 2/3] asm-generic, x86: wrap atomic operations Dmitry Vyukov
  2017-03-20 16:41   ` Andrey Ryabinin
@ 2017-03-20 17:17   ` Mark Rutland
  2017-03-21  9:25     ` Andrey Ryabinin
  2017-03-24  6:52   ` Ingo Molnar
  2 siblings, 1 reply; 27+ messages in thread
From: Mark Rutland @ 2017-03-20 17:17 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: peterz, aryabinin, mingo, will.deacon, akpm, kasan-dev, linux-mm,
	x86, linux-kernel

Hi,

On Tue, Mar 14, 2017 at 08:24:13PM +0100, Dmitry Vyukov wrote:
>  /**
> - * atomic_read - read atomic variable
> + * arch_atomic_read - read atomic variable
>   * @v: pointer of type atomic_t
>   *
>   * Atomically reads the value of @v.
>   */
> -static __always_inline int atomic_read(const atomic_t *v)
> +static __always_inline int arch_atomic_read(const atomic_t *v)
>  {
> -	return READ_ONCE((v)->counter);
> +	/*
> +	 * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
> +	 * instrumentation. Double instrumentation is unnecessary.
> +	 */
> +	return READ_ONCE_NOCHECK((v)->counter);
>  }

Just to check, we do this to avoid duplicate reports, right?

If so, double instrumentation isn't solely "unnecessary"; it has a
functional difference, and we should explicitly describe that in the
comment.

... or are duplicate reports suppressed somehow?

[...]

> +static __always_inline void arch_atomic_set(atomic_t *v, int i)
>  {
> +	/*
> +	 * We could use WRITE_ONCE_NOCHECK() if it exists, similar to
> +	 * READ_ONCE_NOCHECK() in arch_atomic_read(). But there is no such
> +	 * thing at the moment, and introducing it for this case is not
> +	 * worth it.
> +	 */
>  	WRITE_ONCE(v->counter, i);
>  }

If we are trying to avoid duplicate reports, we should do the same here.

[...]

> +static __always_inline short int atomic_inc_short(short int *v)
> +{
> +	return arch_atomic_inc_short(v);
> +}

This is x86-specific, and AFAICT, not used anywhere.

Given that it is arch-specific, I don't think it should be instrumented
here. If it isn't used, we could get rid of it entirely...

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-20 17:17   ` Mark Rutland
@ 2017-03-21  9:25     ` Andrey Ryabinin
  2017-03-21 10:41       ` Mark Rutland
  0 siblings, 1 reply; 27+ messages in thread
From: Andrey Ryabinin @ 2017-03-21  9:25 UTC (permalink / raw)
  To: Mark Rutland, Dmitry Vyukov
  Cc: peterz, mingo, will.deacon, akpm, kasan-dev, linux-mm, x86, linux-kernel

On 03/20/2017 08:17 PM, Mark Rutland wrote:
> Hi,
> 
> On Tue, Mar 14, 2017 at 08:24:13PM +0100, Dmitry Vyukov wrote:
>>  /**
>> - * atomic_read - read atomic variable
>> + * arch_atomic_read - read atomic variable
>>   * @v: pointer of type atomic_t
>>   *
>>   * Atomically reads the value of @v.
>>   */
>> -static __always_inline int atomic_read(const atomic_t *v)
>> +static __always_inline int arch_atomic_read(const atomic_t *v)
>>  {
>> -	return READ_ONCE((v)->counter);
>> +	/*
>> +	 * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
>> +	 * instrumentation. Double instrumentation is unnecessary.
>> +	 */
>> +	return READ_ONCE_NOCHECK((v)->counter);
>>  }
> 
> Just to check, we do this to avoid duplicate reports, right?
> 
> If so, double instrumentation isn't solely "unnecessary"; it has a
> functional difference, and we should explicitly describe that in the
> comment.
> 
> ... or are duplicate reports supressed somehow?
> 

They are not suppressed yet. But I think we should just switch KASAN to a single-shot mode,
i.e. report only the first error. A single bug quite often triggers multiple invalid memory
accesses, causing a storm in dmesg. Also, an out-of-bounds write might corrupt metadata, so the
next report will print bogus alloc/free stacktraces.
In most cases we need to look only at the first report, so reporting anything after the first
is just counterproductive.
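
Roughly what I have in mind -- an untested sketch only (names hypothetical,
not the current mm/kasan/report.c code):

	/* One-shot reporting: only the first report is printed, later ones are dropped. */
	static atomic_t kasan_report_seen = ATOMIC_INIT(0);

	static bool kasan_report_allowed(void)
	{
		/* The first caller flips 0 -> 1 and wins; everyone else bails out. */
		return atomic_cmpxchg(&kasan_report_seen, 0, 1) == 0;
	}

kasan_report() would then return early whenever kasan_report_allowed() is false.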

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-21  9:25     ` Andrey Ryabinin
@ 2017-03-21 10:41       ` Mark Rutland
  2017-03-21 18:06         ` Dmitry Vyukov
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Rutland @ 2017-03-21 10:41 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Dmitry Vyukov, peterz, mingo, will.deacon, akpm, kasan-dev,
	linux-mm, x86, linux-kernel

On Tue, Mar 21, 2017 at 12:25:06PM +0300, Andrey Ryabinin wrote:
> On 03/20/2017 08:17 PM, Mark Rutland wrote:
> > Hi,
> > 
> > On Tue, Mar 14, 2017 at 08:24:13PM +0100, Dmitry Vyukov wrote:
> >>  /**
> >> - * atomic_read - read atomic variable
> >> + * arch_atomic_read - read atomic variable
> >>   * @v: pointer of type atomic_t
> >>   *
> >>   * Atomically reads the value of @v.
> >>   */
> >> -static __always_inline int atomic_read(const atomic_t *v)
> >> +static __always_inline int arch_atomic_read(const atomic_t *v)
> >>  {
> >> -	return READ_ONCE((v)->counter);
> >> +	/*
> >> +	 * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
> >> +	 * instrumentation. Double instrumentation is unnecessary.
> >> +	 */
> >> +	return READ_ONCE_NOCHECK((v)->counter);
> >>  }
> > 
> > Just to check, we do this to avoid duplicate reports, right?
> > 
> > If so, double instrumentation isn't solely "unnecessary"; it has a
> > functional difference, and we should explicitly describe that in the
> > comment.
> > 
> > ... or are duplicate reports supressed somehow?
> 
> They are not suppressed yet. But I think we should just switch kasan
> to single shot mode, i.e. report only the first error. Single bug
> quite often has multiple invalid memory accesses causing storm in
> dmesg. Also write OOB might corrupt metadata so the next report will
> print bogus alloc/free stacktraces.
> In most cases we need to look only at the first report, so reporting
> anything after the first is just counterproductive.

FWIW, that sounds sane to me.

Given that, I agree with your comment regarding READ_ONCE{,_NOCHECK}().

If anyone really wants all the reports, we could have a boot-time option
to do that.
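
(A hypothetical sketch of such an option -- parameter name made up, nothing
like it exists yet:)

	static bool kasan_multi_shot __read_mostly;

	static int __init kasan_set_multi_shot(char *str)
	{
		kasan_multi_shot = true;
		return 1;
	}
	__setup("kasan_multi_shot", kasan_set_multi_shot);

The one-shot guard Andrey describes would then be skipped whenever
kasan_multi_shot is set.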

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-21 10:41       ` Mark Rutland
@ 2017-03-21 18:06         ` Dmitry Vyukov
  2017-03-21 21:20           ` Arnd Bergmann
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-21 18:06 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Andrey Ryabinin, Peter Zijlstra, Ingo Molnar, Will Deacon,
	Andrew Morton, kasan-dev, linux-mm, x86, LKML

On Tue, Mar 21, 2017 at 11:41 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Mar 21, 2017 at 12:25:06PM +0300, Andrey Ryabinin wrote:
>> On 03/20/2017 08:17 PM, Mark Rutland wrote:
>> > Hi,
>> >
>> > On Tue, Mar 14, 2017 at 08:24:13PM +0100, Dmitry Vyukov wrote:
>> >>  /**
>> >> - * atomic_read - read atomic variable
>> >> + * arch_atomic_read - read atomic variable
>> >>   * @v: pointer of type atomic_t
>> >>   *
>> >>   * Atomically reads the value of @v.
>> >>   */
>> >> -static __always_inline int atomic_read(const atomic_t *v)
>> >> +static __always_inline int arch_atomic_read(const atomic_t *v)
>> >>  {
>> >> -  return READ_ONCE((v)->counter);
>> >> +  /*
>> >> +   * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
>> >> +   * instrumentation. Double instrumentation is unnecessary.
>> >> +   */
>> >> +  return READ_ONCE_NOCHECK((v)->counter);
>> >>  }
>> >
>> > Just to check, we do this to avoid duplicate reports, right?
>> >
>> > If so, double instrumentation isn't solely "unnecessary"; it has a
>> > functional difference, and we should explicitly describe that in the
>> > comment.
>> >
>> > ... or are duplicate reports supressed somehow?
>>
>> They are not suppressed yet. But I think we should just switch kasan
>> to single shot mode, i.e. report only the first error. Single bug
>> quite often has multiple invalid memory accesses causing storm in
>> dmesg. Also write OOB might corrupt metadata so the next report will
>> print bogus alloc/free stacktraces.
>> In most cases we need to look only at the first report, so reporting
>> anything after the first is just counterproductive.
>
> FWIW, that sounds sane to me.
>
> Given that, I agree with your comment regarding READ_ONCE{,_NOCHECK}().
>
> If anyone really wants all the reports, we could have a boot-time option
> to do that.


I don't mind changing READ_ONCE_NOCHECK to READ_ONCE. But I don't have
a strong preference either way.

We could do:
#define arch_atomic_read_is_already_instrumented 1
and then skip instrumentation in asm-generic if it's defined. But I
don't think it's worth it.
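
(If we did go that way, the generic wrapper would look something like this --
illustrative only:)

	/* In the arch header, next to arch_atomic_read(): */
	#define arch_atomic_read_is_already_instrumented 1

	/* In include/asm-generic/atomic-instrumented.h: */
	static __always_inline int atomic_read(const atomic_t *v)
	{
	#ifndef arch_atomic_read_is_already_instrumented
		kasan_check_read(v, sizeof(*v));
	#endif
		return arch_atomic_read(v);
	}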

There is no functional difference, it's only an optimization (now
somewhat questionable). As Andrey said, one can get a splash of
reports anyway, and it's the first one that is important. We use KASAN
with panic_on_warn=1 so we don't even see the rest.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-21 18:06         ` Dmitry Vyukov
@ 2017-03-21 21:20           ` Arnd Bergmann
  2017-03-22 10:42             ` Dmitry Vyukov
  0 siblings, 1 reply; 27+ messages in thread
From: Arnd Bergmann @ 2017-03-21 21:20 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mark Rutland, Andrey Ryabinin, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML

On Tue, Mar 21, 2017 at 7:06 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Mar 21, 2017 at 11:41 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> On Tue, Mar 21, 2017 at 12:25:06PM +0300, Andrey Ryabinin wrote:
>
> I don't mind changing READ_ONCE_NOCHECK to READ_ONCE. But I don't have
> strong preference either way.
>
> We could do:
> #define arch_atomic_read_is_already_instrumented 1
> and then skip instrumentation in asm-generic if it's defined. But I
> don't think it's worth it.
>
> There is no functional difference, it's only an optimization (now
> somewhat questionable). As Andrey said, one can get a splash of
> reports anyway, and it's the first one that is important. We use KASAN
> with panic_on_warn=1 so we don't even see the rest.

I'm getting a couple of new stack size warnings that are all the result
of the _NOCHECK.

/git/arm-soc/mm/page_alloc.c: In function 'show_free_areas':
/git/arm-soc/mm/page_alloc.c:4685:1: error: the frame size of 3368
bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
 }
/git/arm-soc/lib/atomic64_test.c: In function 'test_atomic':
/git/arm-soc/lib/atomic64_test.c:148:1: error: the frame size of 6528
bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
 }
 ^
/git/arm-soc/lib/atomic64_test.c: In function 'test_atomic64':
/git/arm-soc/lib/atomic64_test.c:243:1: error: the frame size of 7112
bytes is larger than 3072 bytes [-Werror=frame-larger-than=]

This is with my previous set of patches already applied, so
READ_ONCE should not cause problems. Reverting
the READ_ONCE_NOCHECK() in atomic_read() and atomic64_read()
back to READ_ONCE() makes these warnings go away.

I also get a build failure as a result of your patch, but this one is
not addressed by using READ_ONCE():

In file included from /git/arm-soc/arch/x86/include/asm/atomic.h:7:0,
                 from /git/arm-soc/include/linux/atomic.h:4,
                 from /git/arm-soc/arch/x86/include/asm/thread_info.h:53,
                 from /git/arm-soc/include/linux/thread_info.h:25,
                 from /git/arm-soc/arch/x86/include/asm/preempt.h:6,
                 from /git/arm-soc/include/linux/preempt.h:80,
                 from /git/arm-soc/include/linux/spinlock.h:50,
                 from /git/arm-soc/include/linux/mmzone.h:7,
                 from /git/arm-soc/include/linux/gfp.h:5,
                 from /git/arm-soc/include/linux/mm.h:9,
                 from /git/arm-soc/mm/slub.c:12:
/git/arm-soc/mm/slub.c: In function '__slab_free':
/git/arm-soc/arch/x86/include/asm/cmpxchg.h:174:2: error: 'asm'
operand has impossible constraints
  asm volatile(pfx "cmpxchg%c4b %2; sete %0"   \
  ^
/git/arm-soc/arch/x86/include/asm/cmpxchg.h:183:2: note: in expansion
of macro '__cmpxchg_double'
  __cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2)
  ^~~~~~~~~~~~~~~~
/git/arm-soc/include/asm-generic/atomic-instrumented.h:236:2: note: in
expansion of macro 'arch_cmpxchg_double'
  arch_cmpxchg_double(____p1, (p2), (o1), (o2), (n1), (n2)); \
  ^~~~~~~~~~~~~~~~~~~
/git/arm-soc/mm/slub.c:385:7: note: in expansion of macro 'cmpxchg_double'
   if (cmpxchg_double(&page->freelist, &page->counters,
       ^~~~~~~~~~~~~~
/git/arm-soc/scripts/Makefile.build:308: recipe for target 'mm/slub.o' failed

http://pastebin.com/raw/qXVpi9Ev has the defconfig file I used, and I get the
error with any gcc version I tried (4.9 through 7.0.1).

      Arnd

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-21 21:20           ` Arnd Bergmann
@ 2017-03-22 10:42             ` Dmitry Vyukov
  2017-03-22 11:30               ` Arnd Bergmann
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-22 10:42 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Andrey Ryabinin, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML

On Tue, Mar 21, 2017 at 10:20 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tue, Mar 21, 2017 at 7:06 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Tue, Mar 21, 2017 at 11:41 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>>> On Tue, Mar 21, 2017 at 12:25:06PM +0300, Andrey Ryabinin wrote:
>>
>> I don't mind changing READ_ONCE_NOCHECK to READ_ONCE. But I don't have
>> strong preference either way.
>>
>> We could do:
>> #define arch_atomic_read_is_already_instrumented 1
>> and then skip instrumentation in asm-generic if it's defined. But I
>> don't think it's worth it.
>>
>> There is no functional difference, it's only an optimization (now
>> somewhat questionable). As Andrey said, one can get a splash of
>> reports anyway, and it's the first one that is important. We use KASAN
>> with panic_on_warn=1 so we don't even see the rest.
>
> I'm getting couple of new stack size warnings that are all the result
> of the _NOCHECK.
>
> /git/arm-soc/mm/page_alloc.c: In function 'show_free_areas':
> /git/arm-soc/mm/page_alloc.c:4685:1: error: the frame size of 3368
> bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
>  }
> /git/arm-soc/lib/atomic64_test.c: In function 'test_atomic':
> /git/arm-soc/lib/atomic64_test.c:148:1: error: the frame size of 6528
> bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
>  }
>  ^
> /git/arm-soc/lib/atomic64_test.c: In function 'test_atomic64':
> /git/arm-soc/lib/atomic64_test.c:243:1: error: the frame size of 7112
> bytes is larger than 3072 bytes [-Werror=frame-larger-than=]
>
> This is with my previous set of patches already applied, so
> READ_ONCE should not cause problems. Reverting
> the READ_ONCE_NOCHECK() in atomic_read() and atomic64_read()
> back to READ_ONCE()
>
> I also get a build failure as a result of your patch, but this one is
> not addressed by using READ_ONCE():
>
> In file included from /git/arm-soc/arch/x86/include/asm/atomic.h:7:0,
>                  from /git/arm-soc/include/linux/atomic.h:4,
>                  from /git/arm-soc/arch/x86/include/asm/thread_info.h:53,
>                  from /git/arm-soc/include/linux/thread_info.h:25,
>                  from /git/arm-soc/arch/x86/include/asm/preempt.h:6,
>                  from /git/arm-soc/include/linux/preempt.h:80,
>                  from /git/arm-soc/include/linux/spinlock.h:50,
>                  from /git/arm-soc/include/linux/mmzone.h:7,
>                  from /git/arm-soc/include/linux/gfp.h:5,
>                  from /git/arm-soc/include/linux/mm.h:9,
>                  from /git/arm-soc/mm/slub.c:12:
> /git/arm-soc/mm/slub.c: In function '__slab_free':
> /git/arm-soc/arch/x86/include/asm/cmpxchg.h:174:2: error: 'asm'
> operand has impossible constraints
>   asm volatile(pfx "cmpxchg%c4b %2; sete %0"   \
>   ^
> /git/arm-soc/arch/x86/include/asm/cmpxchg.h:183:2: note: in expansion
> of macro '__cmpxchg_double'
>   __cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2)
>   ^~~~~~~~~~~~~~~~
> /git/arm-soc/include/asm-generic/atomic-instrumented.h:236:2: note: in
> expansion of macro 'arch_cmpxchg_double'
>   arch_cmpxchg_double(____p1, (p2), (o1), (o2), (n1), (n2)); \
>   ^~~~~~~~~~~~~~~~~~~
> /git/arm-soc/mm/slub.c:385:7: note: in expansion of macro 'cmpxchg_double'
>    if (cmpxchg_double(&page->freelist, &page->counters,
>        ^~~~~~~~~~~~~~
> /git/arm-soc/scripts/Makefile.build:308: recipe for target 'mm/slub.o' failed
>
> http://pastebin.com/raw/qXVpi9Ev has the defconfig file I used, and I get the
> error with any gcc version I tried (4.9 through 7.0.1).


Initially I tested with my stock gcc 4.8.4 (Ubuntu
4.8.4-2ubuntu1~14.04.3) and, amusingly, it works. But I can reproduce
the bug with 7.0.1.
Filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80148
Will think about a kernel fix.
Thanks!

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-22 10:42             ` Dmitry Vyukov
@ 2017-03-22 11:30               ` Arnd Bergmann
  2017-03-22 12:14                 ` Dmitry Vyukov
  0 siblings, 1 reply; 27+ messages in thread
From: Arnd Bergmann @ 2017-03-22 11:30 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mark Rutland, Andrey Ryabinin, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML

On Wed, Mar 22, 2017 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Tue, Mar 21, 2017 at 10:20 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Tue, Mar 21, 2017 at 7:06 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>
> Initially I've tested with my stock gcc 4.8.4 (Ubuntu
> 4.8.4-2ubuntu1~14.04.3) and amusingly it works. But I can reproduce
> the bug with 7.0.1.

It's probably because gcc-4.8 didn't support KASAN yet, so the added
check had no effect.

      Arnd

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-22 11:30               ` Arnd Bergmann
@ 2017-03-22 12:14                 ` Dmitry Vyukov
  2017-03-22 12:48                   ` Arnd Bergmann
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-22 12:14 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mark Rutland, Andrey Ryabinin, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML

On Wed, Mar 22, 2017 at 12:30 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Wed, Mar 22, 2017 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Tue, Mar 21, 2017 at 10:20 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>> On Tue, Mar 21, 2017 at 7:06 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>> Initially I've tested with my stock gcc 4.8.4 (Ubuntu
>> 4.8.4-2ubuntu1~14.04.3) and amusingly it works. But I can reproduce
>> the bug with 7.0.1.
>
> It's probably because gcc-4.8 didn't support KASAN yet, so the added
> check had no effect.

I've tested without KASAN with both compilers.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-22 12:14                 ` Dmitry Vyukov
@ 2017-03-22 12:48                   ` Arnd Bergmann
  0 siblings, 0 replies; 27+ messages in thread
From: Arnd Bergmann @ 2017-03-22 12:48 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mark Rutland, Andrey Ryabinin, Peter Zijlstra, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML

On Wed, Mar 22, 2017 at 1:14 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Mar 22, 2017 at 12:30 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Wed, Mar 22, 2017 at 11:42 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Tue, Mar 21, 2017 at 10:20 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>>> On Tue, Mar 21, 2017 at 7:06 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>
>>> Initially I've tested with my stock gcc 4.8.4 (Ubuntu
>>> 4.8.4-2ubuntu1~14.04.3) and amusingly it works. But I can reproduce
>>> the bug with 7.0.1.
>>
>> It's probably because gcc-4.8 didn't support KASAN yet, so the added
>> check had no effect.
>
> I've tested without KASAN with both compilers.

Ah ok. I had not realized that this happened even without KASAN. I only saw this
problem in one out of hundreds of defconfig builds and assumed it was related,
since it came from a KASAN change.

      Arnd

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-14 19:24 ` [PATCH 2/3] asm-generic, x86: wrap atomic operations Dmitry Vyukov
  2017-03-20 16:41   ` Andrey Ryabinin
  2017-03-20 17:17   ` Mark Rutland
@ 2017-03-24  6:52   ` Ingo Molnar
  2017-03-24  7:14     ` Dmitry Vyukov
  2 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-03-24  6:52 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: mark.rutland, peterz, aryabinin, mingo, will.deacon, akpm,
	kasan-dev, linux-mm, x86, linux-kernel, Peter Zijlstra,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin


* Dmitry Vyukov <dvyukov@google.com> wrote:

> KASAN uses compiler instrumentation to intercept all memory accesses.
> But it does not see memory accesses done in assembly code.
> One notable user of assembly code is atomic operations. Frequently,
> for example, an atomic reference decrement is the last access to an
> object and a good candidate for a racy use-after-free.
> 
> Atomic operations are defined in arch files, but KASAN instrumentation
> is required for several archs that support KASAN. Later we will need
> similar hooks for KMSAN (uninit use detector) and KTSAN (data race
> detector).
> 
> This change introduces wrappers around atomic operations that can be
> used to add KASAN/KMSAN/KTSAN instrumentation across several archs.
> This patch uses the wrappers only for x86 arch. Arm64 will be switched
> later.
> 
> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Will Deacon <will.deacon@arm.com>,
> Cc: Andrew Morton <akpm@linux-foundation.org>,
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>,
> Cc: Ingo Molnar <mingo@redhat.com>,
> Cc: kasan-dev@googlegroups.com
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Cc: x86@kernel.org
> ---
>  arch/x86/include/asm/atomic.h             | 100 +++++++-------
>  arch/x86/include/asm/atomic64_32.h        |  86 ++++++------
>  arch/x86/include/asm/atomic64_64.h        |  90 ++++++-------
>  arch/x86/include/asm/cmpxchg.h            |  12 +-
>  arch/x86/include/asm/cmpxchg_32.h         |   8 +-
>  arch/x86/include/asm/cmpxchg_64.h         |   4 +-
>  include/asm-generic/atomic-instrumented.h | 210 ++++++++++++++++++++++++++++++
>  7 files changed, 367 insertions(+), 143 deletions(-)

Ugh, that's disgusting really...

> 
> diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
> index 14635c5ea025..95dd167eb3af 100644
> --- a/arch/x86/include/asm/atomic.h
> +++ b/arch/x86/include/asm/atomic.h
> @@ -16,36 +16,46 @@
>  #define ATOMIC_INIT(i)	{ (i) }
>  
>  /**
> - * atomic_read - read atomic variable
> + * arch_atomic_read - read atomic variable
>   * @v: pointer of type atomic_t
>   *
>   * Atomically reads the value of @v.
>   */
> -static __always_inline int atomic_read(const atomic_t *v)
> +static __always_inline int arch_atomic_read(const atomic_t *v)
>  {
> -	return READ_ONCE((v)->counter);
> +	/*
> +	 * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
> +	 * instrumentation. Double instrumentation is unnecessary.
> +	 */
> +	return READ_ONCE_NOCHECK((v)->counter);
>  }

Firstly, the patch is way too large; please split off the new documentation parts
of the patch to reduce the size and to make it easier to read!

Secondly, the next patch should do the rename to arch_atomic_*() pattern - and 
nothing else:

>  
>  /**
> - * atomic_set - set atomic variable
> + * arch_atomic_set - set atomic variable
>   * @v: pointer of type atomic_t
>   * @i: required value
>   *
>   * Atomically sets the value of @v to @i.
>   */
> -static __always_inline void atomic_set(atomic_t *v, int i)
> +static __always_inline void arch_atomic_set(atomic_t *v, int i)


Third, the prototype CPP complications:

> +#define __INSTR_VOID1(op, sz)						\
> +static __always_inline void atomic##sz##_##op(atomic##sz##_t *v)	\
> +{									\
> +	arch_atomic##sz##_##op(v);					\
> +}
> +
> +#define INSTR_VOID1(op)	\
> +__INSTR_VOID1(op,);	\
> +__INSTR_VOID1(op, 64)
> +
> +INSTR_VOID1(inc);
> +INSTR_VOID1(dec);
> +
> +#undef __INSTR_VOID1
> +#undef INSTR_VOID1
> +
> +#define __INSTR_VOID2(op, sz, type)					\
> +static __always_inline void atomic##sz##_##op(type i, atomic##sz##_t *v)\
> +{									\
> +	arch_atomic##sz##_##op(i, v);					\
> +}
> +
> +#define INSTR_VOID2(op)		\
> +__INSTR_VOID2(op, , int);	\
> +__INSTR_VOID2(op, 64, long long)
> +
> +INSTR_VOID2(add);
> +INSTR_VOID2(sub);
> +INSTR_VOID2(and);
> +INSTR_VOID2(or);
> +INSTR_VOID2(xor);
> +
> +#undef __INSTR_VOID2
> +#undef INSTR_VOID2
> +
> +#define __INSTR_RET1(op, sz, type, rtype)				\
> +static __always_inline rtype atomic##sz##_##op(atomic##sz##_t *v)	\
> +{									\
> +	return arch_atomic##sz##_##op(v);				\
> +}
> +
> +#define INSTR_RET1(op)		\
> +__INSTR_RET1(op, , int, int);	\
> +__INSTR_RET1(op, 64, long long, long long)
> +
> +INSTR_RET1(inc_return);
> +INSTR_RET1(dec_return);
> +__INSTR_RET1(inc_not_zero, 64, long long, long long);
> +__INSTR_RET1(dec_if_positive, 64, long long, long long);
> +
> +#define INSTR_RET_BOOL1(op)	\
> +__INSTR_RET1(op, , int, bool);	\
> +__INSTR_RET1(op, 64, long long, bool)
> +
> +INSTR_RET_BOOL1(dec_and_test);
> +INSTR_RET_BOOL1(inc_and_test);
> +
> +#undef __INSTR_RET1
> +#undef INSTR_RET1
> +#undef INSTR_RET_BOOL1
> +
> +#define __INSTR_RET2(op, sz, type, rtype)				\
> +static __always_inline rtype atomic##sz##_##op(type i, atomic##sz##_t *v) \
> +{									\
> +	return arch_atomic##sz##_##op(i, v);				\
> +}
> +
> +#define INSTR_RET2(op)		\
> +__INSTR_RET2(op, , int, int);	\
> +__INSTR_RET2(op, 64, long long, long long)
> +
> +INSTR_RET2(add_return);
> +INSTR_RET2(sub_return);
> +INSTR_RET2(fetch_add);
> +INSTR_RET2(fetch_sub);
> +INSTR_RET2(fetch_and);
> +INSTR_RET2(fetch_or);
> +INSTR_RET2(fetch_xor);
> +
> +#define INSTR_RET_BOOL2(op)		\
> +__INSTR_RET2(op, , int, bool);		\
> +__INSTR_RET2(op, 64, long long, bool)
> +
> +INSTR_RET_BOOL2(sub_and_test);
> +INSTR_RET_BOOL2(add_negative);
> +
> +#undef __INSTR_RET2
> +#undef INSTR_RET2
> +#undef INSTR_RET_BOOL2

are just utterly disgusting and turn perfectly readable code into an unreadable,
unmaintainable mess.

You need to find some better, cleaner solution please, or convince me that no such 
solution is possible. NAK for the time being.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-24  6:52   ` Ingo Molnar
@ 2017-03-24  7:14     ` Dmitry Vyukov
  2017-03-24  8:39       ` Dmitry Vyukov
  2017-03-24 10:57       ` Ingo Molnar
  0 siblings, 2 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-24  7:14 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Peter Zijlstra, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Peter Zijlstra, Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Fri, Mar 24, 2017 at 7:52 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Dmitry Vyukov <dvyukov@google.com> wrote:
>
>> KASAN uses compiler instrumentation to intercept all memory accesses.
>> But it does not see memory accesses done in assembly code.
>> One notable user of assembly code is atomic operations. Frequently,
>> for example, an atomic reference decrement is the last access to an
>> object and a good candidate for a racy use-after-free.
>>
>> Atomic operations are defined in arch files, but KASAN instrumentation
>> is required for several archs that support KASAN. Later we will need
>> similar hooks for KMSAN (uninit use detector) and KTSAN (data race
>> detector).
>>
>> This change introduces wrappers around atomic operations that can be
>> used to add KASAN/KMSAN/KTSAN instrumentation across several archs.
>> This patch uses the wrappers only for x86 arch. Arm64 will be switched
>> later.
>>
>> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Will Deacon <will.deacon@arm.com>,
>> Cc: Andrew Morton <akpm@linux-foundation.org>,
>> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>,
>> Cc: Ingo Molnar <mingo@redhat.com>,
>> Cc: kasan-dev@googlegroups.com
>> Cc: linux-mm@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: x86@kernel.org
>> ---
>>  arch/x86/include/asm/atomic.h             | 100 +++++++-------
>>  arch/x86/include/asm/atomic64_32.h        |  86 ++++++------
>>  arch/x86/include/asm/atomic64_64.h        |  90 ++++++-------
>>  arch/x86/include/asm/cmpxchg.h            |  12 +-
>>  arch/x86/include/asm/cmpxchg_32.h         |   8 +-
>>  arch/x86/include/asm/cmpxchg_64.h         |   4 +-
>>  include/asm-generic/atomic-instrumented.h | 210 ++++++++++++++++++++++++++++++
>>  7 files changed, 367 insertions(+), 143 deletions(-)
>
> Ugh, that's disgusting really...
>
>>
>> diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
>> index 14635c5ea025..95dd167eb3af 100644
>> --- a/arch/x86/include/asm/atomic.h
>> +++ b/arch/x86/include/asm/atomic.h
>> @@ -16,36 +16,46 @@
>>  #define ATOMIC_INIT(i)       { (i) }
>>
>>  /**
>> - * atomic_read - read atomic variable
>> + * arch_atomic_read - read atomic variable
>>   * @v: pointer of type atomic_t
>>   *
>>   * Atomically reads the value of @v.
>>   */
>> -static __always_inline int atomic_read(const atomic_t *v)
>> +static __always_inline int arch_atomic_read(const atomic_t *v)
>>  {
>> -     return READ_ONCE((v)->counter);
>> +     /*
>> +      * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
>> +      * instrumentation. Double instrumentation is unnecessary.
>> +      */
>> +     return READ_ONCE_NOCHECK((v)->counter);
>>  }

Hello Ingo,

> Firstly, the patch is way too large, please split off new the documentation parts
> of the patch to reduce the size and to make it easier to read!
>
> Secondly, the next patch should do the rename to arch_atomic_*() pattern - and
> nothing else:

Next after what? Please provide the full list of patches as you see them.
How do we avoid build breakage if we do only the rename in a separate patch?



>>  /**
>> - * atomic_set - set atomic variable
>> + * arch_atomic_set - set atomic variable
>>   * @v: pointer of type atomic_t
>>   * @i: required value
>>   *
>>   * Atomically sets the value of @v to @i.
>>   */
>> -static __always_inline void atomic_set(atomic_t *v, int i)
>> +static __always_inline void arch_atomic_set(atomic_t *v, int i)
>
>
> Third, the prototype CPP complications:
>
>> +#define __INSTR_VOID1(op, sz)                                                \
>> +static __always_inline void atomic##sz##_##op(atomic##sz##_t *v)     \
>> +{                                                                    \
>> +     arch_atomic##sz##_##op(v);                                      \
>> +}
>> +
>> +#define INSTR_VOID1(op)      \
>> +__INSTR_VOID1(op,);  \
>> +__INSTR_VOID1(op, 64)
>> +
>> +INSTR_VOID1(inc);
>> +INSTR_VOID1(dec);
>> +
>> +#undef __INSTR_VOID1
>> +#undef INSTR_VOID1
>> +
>> +#define __INSTR_VOID2(op, sz, type)                                  \
>> +static __always_inline void atomic##sz##_##op(type i, atomic##sz##_t *v)\
>> +{                                                                    \
>> +     arch_atomic##sz##_##op(i, v);                                   \
>> +}
>> +
>> +#define INSTR_VOID2(op)              \
>> +__INSTR_VOID2(op, , int);    \
>> +__INSTR_VOID2(op, 64, long long)
>> +
>> +INSTR_VOID2(add);
>> +INSTR_VOID2(sub);
>> +INSTR_VOID2(and);
>> +INSTR_VOID2(or);
>> +INSTR_VOID2(xor);
>> +
>> +#undef __INSTR_VOID2
>> +#undef INSTR_VOID2
>> +
>> +#define __INSTR_RET1(op, sz, type, rtype)                            \
>> +static __always_inline rtype atomic##sz##_##op(atomic##sz##_t *v)    \
>> +{                                                                    \
>> +     return arch_atomic##sz##_##op(v);                               \
>> +}
>> +
>> +#define INSTR_RET1(op)               \
>> +__INSTR_RET1(op, , int, int);        \
>> +__INSTR_RET1(op, 64, long long, long long)
>> +
>> +INSTR_RET1(inc_return);
>> +INSTR_RET1(dec_return);
>> +__INSTR_RET1(inc_not_zero, 64, long long, long long);
>> +__INSTR_RET1(dec_if_positive, 64, long long, long long);
>> +
>> +#define INSTR_RET_BOOL1(op)  \
>> +__INSTR_RET1(op, , int, bool);       \
>> +__INSTR_RET1(op, 64, long long, bool)
>> +
>> +INSTR_RET_BOOL1(dec_and_test);
>> +INSTR_RET_BOOL1(inc_and_test);
>> +
>> +#undef __INSTR_RET1
>> +#undef INSTR_RET1
>> +#undef INSTR_RET_BOOL1
>> +
>> +#define __INSTR_RET2(op, sz, type, rtype)                            \
>> +static __always_inline rtype atomic##sz##_##op(type i, atomic##sz##_t *v) \
>> +{                                                                    \
>> +     return arch_atomic##sz##_##op(i, v);                            \
>> +}
>> +
>> +#define INSTR_RET2(op)               \
>> +__INSTR_RET2(op, , int, int);        \
>> +__INSTR_RET2(op, 64, long long, long long)
>> +
>> +INSTR_RET2(add_return);
>> +INSTR_RET2(sub_return);
>> +INSTR_RET2(fetch_add);
>> +INSTR_RET2(fetch_sub);
>> +INSTR_RET2(fetch_and);
>> +INSTR_RET2(fetch_or);
>> +INSTR_RET2(fetch_xor);
>> +
>> +#define INSTR_RET_BOOL2(op)          \
>> +__INSTR_RET2(op, , int, bool);               \
>> +__INSTR_RET2(op, 64, long long, bool)
>> +
>> +INSTR_RET_BOOL2(sub_and_test);
>> +INSTR_RET_BOOL2(add_negative);
>> +
>> +#undef __INSTR_RET2
>> +#undef INSTR_RET2
>> +#undef INSTR_RET_BOOL2
>
> Are just utterly disgusting that turn perfectly readable code into an unreadable,
> unmaintainable mess.
>
> You need to find some better, cleaner solution please, or convince me that no such
> solution is possible. NAK for the time being.

Well, I can just write all the functions out as is. Does that better conform to
kernel style? I've just looked at the x86 atomic.h and it uses macros
for a similar purpose (ATOMIC_OP/ATOMIC_FETCH_OP), so I thought that
must be idiomatic kernel style...
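
For example, the two wrappers that INSTR_VOID2(add) generates would be written
out by hand as (this is just the macro expansion above, nothing new):

	static __always_inline void atomic_add(int i, atomic_t *v)
	{
		arch_atomic_add(i, v);
	}

	static __always_inline void atomic64_add(long long i, atomic64_t *v)
	{
		arch_atomic64_add(i, v);
	}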

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-24  7:14     ` Dmitry Vyukov
@ 2017-03-24  8:39       ` Dmitry Vyukov
  2017-03-24 10:57       ` Ingo Molnar
  1 sibling, 0 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-24  8:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Peter Zijlstra, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Peter Zijlstra, Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Fri, Mar 24, 2017 at 8:14 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Fri, Mar 24, 2017 at 7:52 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>
>> * Dmitry Vyukov <dvyukov@google.com> wrote:
>>
>>> KASAN uses compiler instrumentation to intercept all memory accesses.
>>> But it does not see memory accesses done in assembly code.
>>> One notable user of assembly code is atomic operations. Frequently,
>>> for example, an atomic reference decrement is the last access to an
>>> object and a good candidate for a racy use-after-free.
>>>
>>> Atomic operations are defined in arch files, but KASAN instrumentation
>>> is required for several archs that support KASAN. Later we will need
>>> similar hooks for KMSAN (uninit use detector) and KTSAN (data race
>>> detector).
>>>
>>> This change introduces wrappers around atomic operations that can be
>>> used to add KASAN/KMSAN/KTSAN instrumentation across several archs.
>>> This patch uses the wrappers only for x86 arch. Arm64 will be switched
>>> later.
>>>
>>> Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Cc: Will Deacon <will.deacon@arm.com>,
>>> Cc: Andrew Morton <akpm@linux-foundation.org>,
>>> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>,
>>> Cc: Ingo Molnar <mingo@redhat.com>,
>>> Cc: kasan-dev@googlegroups.com
>>> Cc: linux-mm@kvack.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Cc: x86@kernel.org
>>> ---
>>>  arch/x86/include/asm/atomic.h             | 100 +++++++-------
>>>  arch/x86/include/asm/atomic64_32.h        |  86 ++++++------
>>>  arch/x86/include/asm/atomic64_64.h        |  90 ++++++-------
>>>  arch/x86/include/asm/cmpxchg.h            |  12 +-
>>>  arch/x86/include/asm/cmpxchg_32.h         |   8 +-
>>>  arch/x86/include/asm/cmpxchg_64.h         |   4 +-
>>>  include/asm-generic/atomic-instrumented.h | 210 ++++++++++++++++++++++++++++++
>>>  7 files changed, 367 insertions(+), 143 deletions(-)
>>
>> Ugh, that's disgusting really...
>>
>>>
>>> diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
>>> index 14635c5ea025..95dd167eb3af 100644
>>> --- a/arch/x86/include/asm/atomic.h
>>> +++ b/arch/x86/include/asm/atomic.h
>>> @@ -16,36 +16,46 @@
>>>  #define ATOMIC_INIT(i)       { (i) }
>>>
>>>  /**
>>> - * atomic_read - read atomic variable
>>> + * arch_atomic_read - read atomic variable
>>>   * @v: pointer of type atomic_t
>>>   *
>>>   * Atomically reads the value of @v.
>>>   */
>>> -static __always_inline int atomic_read(const atomic_t *v)
>>> +static __always_inline int arch_atomic_read(const atomic_t *v)
>>>  {
>>> -     return READ_ONCE((v)->counter);
>>> +     /*
>>> +      * We use READ_ONCE_NOCHECK() because atomic_read() contains KASAN
>>> +      * instrumentation. Double instrumentation is unnecessary.
>>> +      */
>>> +     return READ_ONCE_NOCHECK((v)->counter);
>>>  }
>
> Hello Ingo,
>
>> Firstly, the patch is way too large, please split off new the documentation parts
>> of the patch to reduce the size and to make it easier to read!
>>
>> Secondly, the next patch should do the rename to arch_atomic_*() pattern - and
>> nothing else:
>
> Next after what? Please provide full list of patches as you see them.
> How do we avoid build breakage if we do only the rename in a separate patch?
>
>
>
>>>  /**
>>> - * atomic_set - set atomic variable
>>> + * arch_atomic_set - set atomic variable
>>>   * @v: pointer of type atomic_t
>>>   * @i: required value
>>>   *
>>>   * Atomically sets the value of @v to @i.
>>>   */
>>> -static __always_inline void atomic_set(atomic_t *v, int i)
>>> +static __always_inline void arch_atomic_set(atomic_t *v, int i)
>>
>>
>> Third, the prototype CPP complications:
>>
>>> +#define __INSTR_VOID1(op, sz)                                                \
>>> +static __always_inline void atomic##sz##_##op(atomic##sz##_t *v)     \
>>> +{                                                                    \
>>> +     arch_atomic##sz##_##op(v);                                      \
>>> +}
>>> +
>>> +#define INSTR_VOID1(op)      \
>>> +__INSTR_VOID1(op,);  \
>>> +__INSTR_VOID1(op, 64)
>>> +
>>> +INSTR_VOID1(inc);
>>> +INSTR_VOID1(dec);
>>> +
>>> +#undef __INSTR_VOID1
>>> +#undef INSTR_VOID1
>>> +
>>> +#define __INSTR_VOID2(op, sz, type)                                  \
>>> +static __always_inline void atomic##sz##_##op(type i, atomic##sz##_t *v)\
>>> +{                                                                    \
>>> +     arch_atomic##sz##_##op(i, v);                                   \
>>> +}
>>> +
>>> +#define INSTR_VOID2(op)              \
>>> +__INSTR_VOID2(op, , int);    \
>>> +__INSTR_VOID2(op, 64, long long)
>>> +
>>> +INSTR_VOID2(add);
>>> +INSTR_VOID2(sub);
>>> +INSTR_VOID2(and);
>>> +INSTR_VOID2(or);
>>> +INSTR_VOID2(xor);
>>> +
>>> +#undef __INSTR_VOID2
>>> +#undef INSTR_VOID2
>>> +
>>> +#define __INSTR_RET1(op, sz, type, rtype)                            \
>>> +static __always_inline rtype atomic##sz##_##op(atomic##sz##_t *v)    \
>>> +{                                                                    \
>>> +     return arch_atomic##sz##_##op(v);                               \
>>> +}
>>> +
>>> +#define INSTR_RET1(op)               \
>>> +__INSTR_RET1(op, , int, int);        \
>>> +__INSTR_RET1(op, 64, long long, long long)
>>> +
>>> +INSTR_RET1(inc_return);
>>> +INSTR_RET1(dec_return);
>>> +__INSTR_RET1(inc_not_zero, 64, long long, long long);
>>> +__INSTR_RET1(dec_if_positive, 64, long long, long long);
>>> +
>>> +#define INSTR_RET_BOOL1(op)  \
>>> +__INSTR_RET1(op, , int, bool);       \
>>> +__INSTR_RET1(op, 64, long long, bool)
>>> +
>>> +INSTR_RET_BOOL1(dec_and_test);
>>> +INSTR_RET_BOOL1(inc_and_test);
>>> +
>>> +#undef __INSTR_RET1
>>> +#undef INSTR_RET1
>>> +#undef INSTR_RET_BOOL1
>>> +
>>> +#define __INSTR_RET2(op, sz, type, rtype)                            \
>>> +static __always_inline rtype atomic##sz##_##op(type i, atomic##sz##_t *v) \
>>> +{                                                                    \
>>> +     return arch_atomic##sz##_##op(i, v);                            \
>>> +}
>>> +
>>> +#define INSTR_RET2(op)               \
>>> +__INSTR_RET2(op, , int, int);        \
>>> +__INSTR_RET2(op, 64, long long, long long)
>>> +
>>> +INSTR_RET2(add_return);
>>> +INSTR_RET2(sub_return);
>>> +INSTR_RET2(fetch_add);
>>> +INSTR_RET2(fetch_sub);
>>> +INSTR_RET2(fetch_and);
>>> +INSTR_RET2(fetch_or);
>>> +INSTR_RET2(fetch_xor);
>>> +
>>> +#define INSTR_RET_BOOL2(op)          \
>>> +__INSTR_RET2(op, , int, bool);               \
>>> +__INSTR_RET2(op, 64, long long, bool)
>>> +
>>> +INSTR_RET_BOOL2(sub_and_test);
>>> +INSTR_RET_BOOL2(add_negative);
>>> +
>>> +#undef __INSTR_RET2
>>> +#undef INSTR_RET2
>>> +#undef INSTR_RET_BOOL2
>>
>> Are just utterly disgusting that turn perfectly readable code into an unreadable,
>> unmaintainable mess.
>>
>> You need to find some better, cleaner solution please, or convince me that no such
>> solution is possible. NAK for the time being.
>
> Well, I can just write all functions as is. Does it better confirm to
> kernel style? I've just looked at the x86 atomic.h and it uses macros
> for similar purpose (ATOMIC_OP/ATOMIC_FETCH_OP), so I thought that
> must be idiomatic kernel style...


Stephen Rothwell reported that this patch conflicts with:
  a9ebf306f52c ("locking/atomic: Introduce atomic_try_cmpxchg()")
  e6790e4b5d5e ("locking/atomic/x86: Use atomic_try_cmpxchg()")
Does it make sense to base my patch on the tree where these patches
were added and then submit it to that tree?

I've also sent 2 fixes for this patch; if I resend this, I should also
squash these fixes, right?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-24  7:14     ` Dmitry Vyukov
  2017-03-24  8:39       ` Dmitry Vyukov
@ 2017-03-24 10:57       ` Ingo Molnar
  2017-03-24 12:46         ` Dmitry Vyukov
  1 sibling, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-03-24 10:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mark Rutland, Peter Zijlstra, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Peter Zijlstra, Linus Torvalds, Thomas Gleixner, H. Peter Anvin


* Dmitry Vyukov <dvyukov@google.com> wrote:

> > Are just utterly disgusting that turn perfectly readable code into an 
> > unreadable, unmaintainable mess.
> >
> > You need to find some better, cleaner solution please, or convince me that no 
> > such solution is possible. NAK for the time being.
> 
> Well, I can just write all functions as is. Does it better confirm to kernel 
> style?

I think writing the prototypes out as-is, properly organized, beats any of these 
macro based solutions.

> [...] I've just looked at the x86 atomic.h and it uses macros for similar 
> purpose (ATOMIC_OP/ATOMIC_FETCH_OP), so I thought that must be idiomatic kernel 
> style...

Mind fixing those too while at it?

And please squash any bug fixes and re-send a clean series against latest upstream 
or so.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-24 10:57       ` Ingo Molnar
@ 2017-03-24 12:46         ` Dmitry Vyukov
  2017-03-28  7:52           ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-24 12:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mark Rutland, Peter Zijlstra, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Peter Zijlstra, Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Fri, Mar 24, 2017 at 11:57 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Dmitry Vyukov <dvyukov@google.com> wrote:
>
>> > Are just utterly disgusting that turn perfectly readable code into an
>> > unreadable, unmaintainable mess.
>> >
>> > You need to find some better, cleaner solution please, or convince me that no
>> > such solution is possible. NAK for the time being.
>>
>> Well, I can just write all functions as is. Does it better confirm to kernel
>> style?
>
> I think writing the prototypes out as-is, properly organized, beats any of these
> macro based solutions.

You mean write out the prototypes, but use what for definitions? Macros again?

>> [...] I've just looked at the x86 atomic.h and it uses macros for similar
>> purpose (ATOMIC_OP/ATOMIC_FETCH_OP), so I thought that must be idiomatic kernel
>> style...
>
> Mind fixing those too while at it?

I don't mind once I understand how exactly you want it to look.

> And please squash any bug fixes and re-send a clean series against latest upstream
> or so.
>
> Thanks,
>
>         Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-24 12:46         ` Dmitry Vyukov
@ 2017-03-28  7:52           ` Ingo Molnar
  2017-03-28  9:27             ` Peter Zijlstra
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-03-28  7:52 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Mark Rutland, Peter Zijlstra, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Peter Zijlstra, Linus Torvalds, Thomas Gleixner, H. Peter Anvin


* Dmitry Vyukov <dvyukov@google.com> wrote:

> On Fri, Mar 24, 2017 at 11:57 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> >> > Are just utterly disgusting that turn perfectly readable code into an
> >> > unreadable, unmaintainable mess.
> >> >
> >> > You need to find some better, cleaner solution please, or convince me that no
> >> > such solution is possible. NAK for the time being.
> >>
> >> Well, I can just write all functions as is. Does it better confirm to kernel
> >> style?
> >
> > I think writing the prototypes out as-is, properly organized, beats any of these
> > macro based solutions.
> 
> You mean write out the prototypes, but use what for definitions? Macros again?

No, regular C code.

I don't see the point of generating all this code via CPP - it's certainly not 
making it more readable to me. I.e. this patch I commented on is a step backwards 
for readability.

I'd prefer repetition and a higher overall line count over complex CPP constructs.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-28  7:52           ` Ingo Molnar
@ 2017-03-28  9:27             ` Peter Zijlstra
  2017-03-28  9:46               ` Dmitry Vyukov
  2017-03-28  9:51               ` Ingo Molnar
  0 siblings, 2 replies; 27+ messages in thread
From: Peter Zijlstra @ 2017-03-28  9:27 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dmitry Vyukov, Mark Rutland, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Tue, Mar 28, 2017 at 09:52:32AM +0200, Ingo Molnar wrote:

> No, regular C code.
> 
> I don't see the point of generating all this code via CPP - it's certainly not 
> making it more readable to me. I.e. this patch I commented on is a step backwards 
> for readability.

Note that much of the atomic stuff we have today is all CPP already.

x86 is the exception because it's 'weird', but most other archs are
almost pure CPP -- check Alpha for example, or asm-generic/atomic.h.

Also, look at linux/atomic.h: it's a giant maze of CPP.

The CPP helps us generate functions, reduces endless copy/paste (which
induces random differences -- read: bugs) and constructs variants
depending on the architecture input.

Yes, the CPP is a pain, but writing all that out explicitly is more of a
pain.



I've not yet looked too hard at the patches under consideration, and I
really wish we could get the compiler to do the right thing here, but
reducing the endless copy/paste that would otherwise result is
something I've found to be very valuable.

Not to mention that adding additional atomic ops became trivial (the set is
now nearly complete, so that's not much of an argument anymore -- but it
was; it's what kept me sane while sanitizing the atomic ops across all our
25+ architectures).

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-28  9:27             ` Peter Zijlstra
@ 2017-03-28  9:46               ` Dmitry Vyukov
  2017-03-28  9:51               ` Ingo Molnar
  1 sibling, 0 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-28  9:46 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Mark Rutland, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Tue, Mar 28, 2017 at 11:27 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Mar 28, 2017 at 09:52:32AM +0200, Ingo Molnar wrote:
>
>> No, regular C code.
>>
>> I don't see the point of generating all this code via CPP - it's certainly not
>> making it more readable to me. I.e. this patch I commented on is a step backwards
>> for readability.
>
> Note that much of the atomic stuff we have today is all CPP already.
>
> x86 is the exception because its 'weird', but most other archs are
> almost pure CPP -- check Alpha for example, or asm-generic/atomic.h.
>
> Also, look at linux/atomic.h, its a giant maze of CPP.
>
> The CPP help us generate functions, reduces endless copy/paste (which
> induces random differences -- read bugs) and construct variants
> depending on the architecture input.
>
> Yes, the CPP is a pain, but writing all that out explicitly is more of a
> pain.
>
>
>
> I've not yet looked too hard at these patches under consideration; and I
> really wish we could get the compiler to do the right thing here, but
> reducing the endless copy/paste that's otherwise the result of this, is
> something I've found to be very valuable.
>
> Not to mention that adding additional atomic ops got trivial (the set is
> now near complete, so that's not much of an argument anymore -- but it
> was, its what kept me sane sanitizing the atomic ops across all our 25+
> architectures).


I am almost done with Ingo's proposal, including de-macro-ifying the x86
atomic ops code.
I am ready to do either of them; I think both have pros and cons and
there is no perfect solution. But please agree on something.


While we are here, one thing I noticed is that 32-bit atomic code
uses 'long long' for 64-bit operands, while 64-bit code uses 'long'
for 64-bit operands. This sorta worked more or less before, but
ultimately it makes it impossible to write portable code (e.g. you
don't know what format specifier to use to print the return value of
atomic64_read(), nor what local variable type to use to avoid compiler
warnings). With try_cmpxchg it becomes worse, because 'long *' is
not convertible to 'long long *', so it is not possible to write any
portable code that uses it: if you declare the 'old' variable as 'long',
the 32-bit code won't compile; if you declare it as 'long long', the
64-bit code won't compile. I think we need to switch to a single type
for 64-bit operands/return values, e.g. 'long long'.
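
To make it concrete, a hedged sketch of a caller (illustrative only; assumes
the atomic64_try_cmpxchg(v, &old, new) form from the patches above):

	static void example_inc(atomic64_t *v)
	{
		/* Which type for 'old'? atomic64_read() returns 'long' on 64-bit x86
		 * but 'long long' on 32-bit, so there is no single portable choice. */
		long long old = atomic64_read(v);

		/* No portable format specifier either: "%lld" warns on 64-bit and
		 * "%ld" warns on 32-bit unless we cast every time. */
		pr_info("counter = %lld\n", (long long)atomic64_read(v));

		/* And '&old' must be 'long *' on 64-bit but 'long long *' on 32-bit,
		 * so this line cannot compile for both builds. */
		while (!atomic64_try_cmpxchg(v, &old, old + 1))
			;
	}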

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-28  9:27             ` Peter Zijlstra
  2017-03-28  9:46               ` Dmitry Vyukov
@ 2017-03-28  9:51               ` Ingo Molnar
  2017-03-28  9:56                 ` Dmitry Vyukov
  1 sibling, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-03-28  9:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Vyukov, Mark Rutland, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Mar 28, 2017 at 09:52:32AM +0200, Ingo Molnar wrote:
> 
> > No, regular C code.
> > 
> > I don't see the point of generating all this code via CPP - it's certainly not 
> > making it more readable to me. I.e. this patch I commented on is a step backwards 
> > for readability.
> 
> Note that much of the atomic stuff we have today is all CPP already.

Yeah, but there it's implementational: we pick up arch primitives depending on 
whether they are defined, such as:

#ifndef atomic_read_acquire
# define atomic_read_acquire(v)		smp_load_acquire(&(v)->counter)
#endif

> x86 is the exception because its 'weird', but most other archs are
> almost pure CPP -- check Alpha for example, or asm-generic/atomic.h.

include/asm-generic/atomic.h looks pretty clean and readable overall.

> Also, look at linux/atomic.h, its a giant maze of CPP.

Nah, that's OK, much of it is essentially __weak inlines implemented via CPP - 
i.e. CPP is filling in a missing compiler feature.

But this patch I replied to appears to add instrumentation wrappery via CPP which 
looks like excessive and avoidable obfuscation to me.

If it's much more readable and much more compact than the C version then maybe, 
but I'd like to see the C version first and see ...

> The CPP help us generate functions, reduces endless copy/paste (which induces 
> random differences -- read bugs) and construct variants depending on the 
> architecture input.
> 
> Yes, the CPP is a pain, but writing all that out explicitly is more of a
> pain.

So I'm not convinced that it's true in this case.

Could we see the C version and compare? I could be wrong about it all.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-28  9:51               ` Ingo Molnar
@ 2017-03-28  9:56                 ` Dmitry Vyukov
  2017-03-28 10:15                   ` Ingo Molnar
  0 siblings, 1 reply; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-28  9:56 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Mark Rutland, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Tue, Mar 28, 2017 at 11:51 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
>> On Tue, Mar 28, 2017 at 09:52:32AM +0200, Ingo Molnar wrote:
>>
>> > No, regular C code.
>> >
>> > I don't see the point of generating all this code via CPP - it's certainly not
>> > making it more readable to me. I.e. this patch I commented on is a step backwards
>> > for readability.
>>
>> Note that much of the atomic stuff we have today is all CPP already.
>
> Yeah, but there it's implementational: we pick up arch primitives depending on
> whether they are defined, such as:
>
> #ifndef atomic_read_acquire
> # define atomic_read_acquire(v)         smp_load_acquire(&(v)->counter)
> #endif
>
>> x86 is the exception because its 'weird', but most other archs are
>> almost pure CPP -- check Alpha for example, or asm-generic/atomic.h.
>
> include/asm-generic/atomic.h looks pretty clean and readable overall.
>
>> Also, look at linux/atomic.h, its a giant maze of CPP.
>
> Nah, that's OK, much of is is essentially __weak inlines implemented via CPP -
> i.e. CPP is filling in a missing compiler feature.
>
> But this patch I replied to appears to add instrumentation wrappery via CPP which
> looks like excessive and avoidable obfuscation to me.
>
> If it's much more readable and much more compact than the C version then maybe,
> but I'd like to see the C version first and see ...
>
>> The CPP helps us generate functions, reduces endless copy/paste (which induces
>> random differences -- read: bugs) and constructs variants depending on the
>> architecture input.
>>
>> Yes, the CPP is a pain, but writing all that out explicitly is more of a
>> pain.
>
> So I'm not convinced that it's true in this case.
>
> Could we see the C version and compare? I could be wrong about it all.

Here it is (without instrumentation):
https://gist.github.com/dvyukov/e33d580f701019e0cd99429054ff1f9a

Instrumentation will add for each function:

 static __always_inline void atomic64_set(atomic64_t *v, long long i)
 {
+       kasan_check_write(v, sizeof(*v));
        arch_atomic64_set(v, i);
 }
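
(For a read-side op the wrapper comes out along the same lines -- a sketch of
the intended pattern rather than the exact patch text:

 static __always_inline int atomic_read(const atomic_t *v)
 {
+       /* check the access before delegating to the arch implementation */
+       kasan_check_read(v, sizeof(*v));
        return arch_atomic_read(v);
 }

i.e. plain reads get kasan_check_read(), writes and RMW ops get
kasan_check_write(), and the real implementation sits behind the arch_
prefix.)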

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-28  9:56                 ` Dmitry Vyukov
@ 2017-03-28 10:15                   ` Ingo Molnar
  2017-03-28 16:29                     ` Dmitry Vyukov
  0 siblings, 1 reply; 27+ messages in thread
From: Ingo Molnar @ 2017-03-28 10:15 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Mark Rutland, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin


* Dmitry Vyukov <dvyukov@google.com> wrote:

> > So I'm not convinced that it's true in this case.
> >
> > Could we see the C version and compare? I could be wrong about it all.
> 
> Here it is (without instrumentation):
> https://gist.github.com/dvyukov/e33d580f701019e0cd99429054ff1f9a

Could you please include the full patch so that it can be discussed via email and 
such?

> Instrumentation will add for each function:
> 
>  static __always_inline void atomic64_set(atomic64_t *v, long long i)
>  {
> +       kasan_check_write(v, sizeof(*v));
>         arch_atomic64_set(v, i);
>  }

That in itself looks sensible and readable.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/3] asm-generic, x86: wrap atomic operations
  2017-03-28 10:15                   ` Ingo Molnar
@ 2017-03-28 16:29                     ` Dmitry Vyukov
  0 siblings, 0 replies; 27+ messages in thread
From: Dmitry Vyukov @ 2017-03-28 16:29 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Mark Rutland, Andrey Ryabinin, Ingo Molnar,
	Will Deacon, Andrew Morton, kasan-dev, linux-mm, x86, LKML,
	Linus Torvalds, Thomas Gleixner, H. Peter Anvin

On Tue, Mar 28, 2017 at 12:15 PM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Dmitry Vyukov <dvyukov@google.com> wrote:
>
>> > So I'm not convinced that it's true in this case.
>> >
>> > Could we see the C version and compare? I could be wrong about it all.
>>
>> Here it is (without instrumentation):
>> https://gist.github.com/dvyukov/e33d580f701019e0cd99429054ff1f9a
>
> Could you please include the full patch so that it can be discussed via email and
> such?


Mailed the whole series.


>> Instrumentation will add for each function:
>>
>>  static __always_inline void atomic64_set(atomic64_t *v, long long i)
>>  {
>> +       kasan_check_write(v, sizeof(*v));
>>         arch_atomic64_set(v, i);
>>  }
>
> That in itself looks sensible and readable.
>
> Thanks,
>
>         Ingo

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations
  2017-03-14 19:24 [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
                   ` (2 preceding siblings ...)
  2017-03-14 19:24 ` [PATCH 3/3] asm-generic: add KASAN instrumentation to " Dmitry Vyukov
@ 2017-03-30 22:30 ` Andrew Morton
  3 siblings, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2017-03-30 22:30 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: mark.rutland, peterz, aryabinin, mingo, will.deacon, kasan-dev,
	linux-mm, x86, linux-kernel

On Tue, 14 Mar 2017 20:24:11 +0100 Dmitry Vyukov <dvyukov@google.com> wrote:

> KASAN uses compiler instrumentation to intercept all memory accesses.
> But it does not see memory accesses done in assembly code.
> One notable user of assembly code is atomic operations. Frequently,
> for example, an atomic reference decrement is the last access to an
> object and a good candidate for a racy use-after-free.

I'm getting a pile of build errors from this patchset (and related
patches).  Due to messed-up merge fixing, probably.  Plus, the review
process has been a bit bumpy.

So I'll drop

kasan-allow-kasan_check_read-write-to-accept-pointers-to-volatiles.patch
asm-generic-x86-wrap-atomic-operations.patch
asm-generic-x86-wrap-atomic-operations-fix.patch
asm-generic-add-kasan-instrumentation-to-atomic-operations.patch
asm-generic-fix-compilation-failure-in-cmpxchg_double.patch
x86-remove-unused-atomic_inc_short.patch
x86-asm-generic-add-kasan-instrumentation-to-bitops.patch

for now.  Please resend (against -mm or linux-next) when the dust has
settled.

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-03-30 22:30 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-14 19:24 [PATCH 0/3] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
2017-03-14 19:24 ` [PATCH 1/3] kasan: allow kasan_check_read/write() to accept pointers to volatiles Dmitry Vyukov
2017-03-14 19:24 ` [PATCH 2/3] asm-generic, x86: wrap atomic operations Dmitry Vyukov
2017-03-20 16:41   ` Andrey Ryabinin
2017-03-20 17:17   ` Mark Rutland
2017-03-21  9:25     ` Andrey Ryabinin
2017-03-21 10:41       ` Mark Rutland
2017-03-21 18:06         ` Dmitry Vyukov
2017-03-21 21:20           ` Arnd Bergmann
2017-03-22 10:42             ` Dmitry Vyukov
2017-03-22 11:30               ` Arnd Bergmann
2017-03-22 12:14                 ` Dmitry Vyukov
2017-03-22 12:48                   ` Arnd Bergmann
2017-03-24  6:52   ` Ingo Molnar
2017-03-24  7:14     ` Dmitry Vyukov
2017-03-24  8:39       ` Dmitry Vyukov
2017-03-24 10:57       ` Ingo Molnar
2017-03-24 12:46         ` Dmitry Vyukov
2017-03-28  7:52           ` Ingo Molnar
2017-03-28  9:27             ` Peter Zijlstra
2017-03-28  9:46               ` Dmitry Vyukov
2017-03-28  9:51               ` Ingo Molnar
2017-03-28  9:56                 ` Dmitry Vyukov
2017-03-28 10:15                   ` Ingo Molnar
2017-03-28 16:29                     ` Dmitry Vyukov
2017-03-14 19:24 ` [PATCH 3/3] asm-generic: add KASAN instrumentation to " Dmitry Vyukov
2017-03-30 22:30 ` [PATCH 0/3] x86, kasan: add KASAN checks " Andrew Morton
