linux-kernel.vger.kernel.org archive mirror
* [PATCH] x86, kasan: add KASAN checks to atomic operations
@ 2017-03-06 12:42 Dmitry Vyukov
  2017-03-06 12:50 ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-06 12:42 UTC (permalink / raw)
  To: akpm, aryabinin
  Cc: peterz, mingo, Dmitry Vyukov, kasan-dev, linux-mm, linux-kernel

KASAN uses compiler instrumentation to intercept all memory accesses.
But it does not see memory accesses done in assembly code.
One notable user of assembly code is atomic operations. Frequently,
for example, an atomic reference decrement is the last access to an
object and a good candidate for a racy use-after-free.

Add manual KASAN checks to atomic operations.
Note: we need checks only before asm blocks and don't need them
in atomic functions composed of other atomic functions
(e.g. load-cmpxchg loops).
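
For illustration (not part of the patch), such a loop is built entirely
from already-instrumented primitives; roughly, as in the existing
__atomic_add_unless():

static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
{
	int c, old;

	c = atomic_read(v);
	for (;;) {
		if (unlikely(c == u))
			break;
		/* ends up in the instrumented cmpxchg() */
		old = atomic_cmpxchg(v, c, c + a);
		if (likely(old == c))
			break;
		c = old;
	}
	return c;
}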

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org

---
Within a day it has found its first bug:

==================================================================
BUG: KASAN: use-after-free in atomic_dec_and_test
arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
BUG: KASAN: use-after-free in put_task_struct
include/linux/sched/task.h:93 [inline] at addr ffff880079c30158
BUG: KASAN: use-after-free in put_ctx+0xcf/0x110
kernel/events/core.c:1131 at addr ffff880079c30158
Write of size 4 by task syz-executor6/25698
CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x2fb/0x3fd lib/dump_stack.c:52
 kasan_object_err+0x1c/0x90 mm/kasan/report.c:166
 print_address_description mm/kasan/report.c:208 [inline]
 kasan_report_error mm/kasan/report.c:292 [inline]
 kasan_report.part.2+0x1b0/0x460 mm/kasan/report.c:314
 kasan_report+0x21/0x30 mm/kasan/report.c:301
 check_memory_region_inline mm/kasan/kasan.c:326 [inline]
 check_memory_region+0x139/0x190 mm/kasan/kasan.c:333
 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
 atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
 put_task_struct include/linux/sched/task.h:93 [inline]
 put_ctx+0xcf/0x110 kernel/events/core.c:1131
 perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
 perf_release+0x37/0x50 kernel/events/core.c:4338
 __fput+0x332/0x800 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x197/0x260 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0xb38/0x29c0 kernel/exit.c:880
 do_group_exit+0x149/0x420 kernel/exit.c:984
 get_signal+0x7e0/0x1820 kernel/signal.c:2318
 do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
 exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
 syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
 do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x4458d9
RSP: 002b:00007f3f07187cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004458d9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8
RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f3f071889c0 R15: 00007f3f07188700
Object at ffff880079c30140, in cache task_struct size: 5376
Allocated:
PID = 25681
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_kmalloc+0xaa/0xd0 mm/kasan/kasan.c:616
 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:555
 kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
 alloc_task_struct_node kernel/fork.c:153 [inline]
 dup_task_struct kernel/fork.c:495 [inline]
 copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
 copy_process kernel/fork.c:1531 [inline]
 _do_fork+0x200/0x1010 kernel/fork.c:1994
 SYSC_clone kernel/fork.c:2104 [inline]
 SyS_clone+0x37/0x50 kernel/fork.c:2098
 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
 return_from_SYSCALL_64+0x0/0x7a
Freed:
PID = 25681
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_slab_free+0x6f/0xb0 mm/kasan/kasan.c:589
 __cache_free mm/slab.c:3514 [inline]
 kmem_cache_free+0x71/0x240 mm/slab.c:3774
 free_task_struct kernel/fork.c:158 [inline]
 free_task+0x151/0x1d0 kernel/fork.c:370
 copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
 copy_process kernel/fork.c:1531 [inline]
 _do_fork+0x200/0x1010 kernel/fork.c:1994
 SYSC_clone kernel/fork.c:2104 [inline]
 SyS_clone+0x37/0x50 kernel/fork.c:2098
 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
 return_from_SYSCALL_64+0x0/0x7a
---
 arch/x86/include/asm/atomic.h      | 11 +++++++++++
 arch/x86/include/asm/atomic64_64.h | 10 ++++++++++
 arch/x86/include/asm/cmpxchg.h     |  4 ++++
 3 files changed, 25 insertions(+)

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..64f0a7fb9b2f 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -2,6 +2,7 @@
 #define _ASM_X86_ATOMIC_H
 
 #include <linux/compiler.h>
+#include <linux/kasan-checks.h>
 #include <linux/types.h>
 #include <asm/alternative.h>
 #include <asm/cmpxchg.h>
@@ -47,6 +48,7 @@ static __always_inline void atomic_set(atomic_t *v, int i)
  */
 static __always_inline void atomic_add(int i, atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		     : "+m" (v->counter)
 		     : "ir" (i));
@@ -61,6 +63,7 @@ static __always_inline void atomic_add(int i, atomic_t *v)
  */
 static __always_inline void atomic_sub(int i, atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		     : "+m" (v->counter)
 		     : "ir" (i));
@@ -77,6 +80,7 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
  */
 static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, "er", i, "%0", e);
 }
 
@@ -88,6 +92,7 @@ static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
  */
 static __always_inline void atomic_inc(atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "incl %0"
 		     : "+m" (v->counter));
 }
@@ -100,6 +105,7 @@ static __always_inline void atomic_inc(atomic_t *v)
  */
 static __always_inline void atomic_dec(atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "decl %0"
 		     : "+m" (v->counter));
 }
@@ -114,6 +120,7 @@ static __always_inline void atomic_dec(atomic_t *v)
  */
 static __always_inline bool atomic_dec_and_test(atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decl", v->counter, "%0", e);
 }
 
@@ -127,6 +134,7 @@ static __always_inline bool atomic_dec_and_test(atomic_t *v)
  */
 static __always_inline bool atomic_inc_and_test(atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incl", v->counter, "%0", e);
 }
 
@@ -141,6 +149,7 @@ static __always_inline bool atomic_inc_and_test(atomic_t *v)
  */
 static __always_inline bool atomic_add_negative(int i, atomic_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addl", v->counter, "er", i, "%0", s);
 }
 
@@ -194,6 +203,7 @@ static inline int atomic_xchg(atomic_t *v, int new)
 #define ATOMIC_OP(op)							\
 static inline void atomic_##op(int i, atomic_t *v)			\
 {									\
+	kasan_check_write(v, sizeof(*v));				\
 	asm volatile(LOCK_PREFIX #op"l %1,%0"				\
 			: "+m" (v->counter)				\
 			: "ir" (i)					\
@@ -258,6 +268,7 @@ static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
  */
 static __always_inline short int atomic_inc_short(short int *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm(LOCK_PREFIX "addw $1, %0" : "+m" (*v));
 	return *v;
 }
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 89ed2f6ae2f7..13fe8ff5a126 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -2,6 +2,7 @@
 #define _ASM_X86_ATOMIC64_64_H
 
 #include <linux/types.h>
+#include <linux/kasan-checks.h>
 #include <asm/alternative.h>
 #include <asm/cmpxchg.h>
 
@@ -42,6 +43,7 @@ static inline void atomic64_set(atomic64_t *v, long i)
  */
 static __always_inline void atomic64_add(long i, atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "addq %1,%0"
 		     : "=m" (v->counter)
 		     : "er" (i), "m" (v->counter));
@@ -56,6 +58,7 @@ static __always_inline void atomic64_add(long i, atomic64_t *v)
  */
 static inline void atomic64_sub(long i, atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "subq %1,%0"
 		     : "=m" (v->counter)
 		     : "er" (i), "m" (v->counter));
@@ -72,6 +75,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
  */
 static inline bool atomic64_sub_and_test(long i, atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, "er", i, "%0", e);
 }
 
@@ -83,6 +87,7 @@ static inline bool atomic64_sub_and_test(long i, atomic64_t *v)
  */
 static __always_inline void atomic64_inc(atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "incq %0"
 		     : "=m" (v->counter)
 		     : "m" (v->counter));
@@ -96,6 +101,7 @@ static __always_inline void atomic64_inc(atomic64_t *v)
  */
 static __always_inline void atomic64_dec(atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	asm volatile(LOCK_PREFIX "decq %0"
 		     : "=m" (v->counter)
 		     : "m" (v->counter));
@@ -111,6 +117,7 @@ static __always_inline void atomic64_dec(atomic64_t *v)
  */
 static inline bool atomic64_dec_and_test(atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e);
 }
 
@@ -124,6 +131,7 @@ static inline bool atomic64_dec_and_test(atomic64_t *v)
  */
 static inline bool atomic64_inc_and_test(atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e);
 }
 
@@ -138,6 +146,7 @@ static inline bool atomic64_inc_and_test(atomic64_t *v)
  */
 static inline bool atomic64_add_negative(long i, atomic64_t *v)
 {
+	kasan_check_write(v, sizeof(*v));
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
 }
 
@@ -233,6 +242,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 #define ATOMIC64_OP(op)							\
 static inline void atomic64_##op(long i, atomic64_t *v)			\
 {									\
+	kasan_check_write(v, sizeof(*v));				\
 	asm volatile(LOCK_PREFIX #op"q %1,%0"				\
 			: "+m" (v->counter)				\
 			: "er" (i)					\
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index 97848cdfcb1a..a10e7fb09210 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -2,6 +2,7 @@
 #define ASM_X86_CMPXCHG_H
 
 #include <linux/compiler.h>
+#include <linux/kasan-checks.h>
 #include <asm/cpufeatures.h>
 #include <asm/alternative.h> /* Provides LOCK_PREFIX */
 
@@ -41,6 +42,7 @@ extern void __add_wrong_size(void)
 #define __xchg_op(ptr, arg, op, lock)					\
 	({								\
 	        __typeof__ (*(ptr)) __ret = (arg);			\
+		kasan_check_write((void *)(ptr), sizeof(*(ptr)));	\
 		switch (sizeof(*(ptr))) {				\
 		case __X86_CASE_B:					\
 			asm volatile (lock #op "b %b0, %1\n"		\
@@ -86,6 +88,7 @@ extern void __add_wrong_size(void)
 	__typeof__(*(ptr)) __ret;					\
 	__typeof__(*(ptr)) __old = (old);				\
 	__typeof__(*(ptr)) __new = (new);				\
+	kasan_check_write((void *)(ptr), sizeof(*(ptr)));		\
 	switch (size) {							\
 	case __X86_CASE_B:						\
 	{								\
@@ -171,6 +174,7 @@ extern void __add_wrong_size(void)
 	BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long));			\
 	VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long)));		\
 	VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2));	\
+	kasan_check_write((void *)(p1), 2 * sizeof(*(p1)));		\
 	asm volatile(pfx "cmpxchg%c4b %2; sete %0"			\
 		     : "=a" (__ret), "+d" (__old2),			\
 		       "+m" (*(p1)), "+m" (*(p2))			\
-- 
2.12.0.rc1.440.g5b76565f74-goog


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 12:42 [PATCH] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
@ 2017-03-06 12:50 ` Dmitry Vyukov
  2017-03-06 12:58   ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-06 12:50 UTC (permalink / raw)
  To: Andrew Morton, Andrey Ryabinin
  Cc: Peter Zijlstra, Ingo Molnar, Dmitry Vyukov, kasan-dev, linux-mm, LKML

On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> KASAN uses compiler instrumentation to intercept all memory accesses.
> But it does not see memory accesses done in assembly code.
> One notable user of assembly code is atomic operations. Frequently,
> for example, an atomic reference decrement is the last access to an
> object and a good candidate for a racy use-after-free.
>
> Add manual KASAN checks to atomic operations.
> Note: we need checks only before asm blocks and don't need them
> in atomic functions composed of other atomic functions
> (e.g. load-cmpxchg loops).

Peter also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 12:50 ` Dmitry Vyukov
@ 2017-03-06 12:58   ` Peter Zijlstra
  2017-03-06 13:01     ` Peter Zijlstra
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-06 12:58 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrew Morton, Andrey Ryabinin, Ingo Molnar, kasan-dev, linux-mm, LKML

On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > KASAN uses compiler instrumentation to intercept all memory accesses.
> > But it does not see memory accesses done in assembly code.
> > One notable user of assembly code is atomic operations. Frequently,
> > for example, an atomic reference decrement is the last access to an
> > object and a good candidate for a racy use-after-free.
> >
> > Add manual KASAN checks to atomic operations.
> > Note: we need checks only before asm blocks and don't need them
> > in atomic functions composed of other atomic functions
> > (e.g. load-cmpxchg loops).
> 
> Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
> 

> >  static __always_inline void atomic_add(int i, atomic_t *v)
> >  {
> > +       kasan_check_write(v, sizeof(*v));
> >         asm volatile(LOCK_PREFIX "addl %1,%0"
> >                      : "+m" (v->counter)
> >                      : "ir" (i));


So the problem is doing load/stores from asm bits, and GCC
(traditionally) doesn't try to interpret APP asm bits.

However, could we not write a GCC plugin that does exactly that?
Something that interprets the APP asm bits and generates these KASAN
bits that go with it?


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 12:58   ` Peter Zijlstra
@ 2017-03-06 13:01     ` Peter Zijlstra
  2017-03-06 14:24       ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-06 13:01 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrew Morton, Andrey Ryabinin, Ingo Molnar, kasan-dev, linux-mm, LKML

On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > > KASAN uses compiler instrumentation to intercept all memory accesses.
> > > But it does not see memory accesses done in assembly code.
> > > One notable user of assembly code is atomic operations. Frequently,
> > > for example, an atomic reference decrement is the last access to an
> > > object and a good candidate for a racy use-after-free.
> > >
> > > Add manual KASAN checks to atomic operations.
> > > Note: we need checks only before asm blocks and don't need them
> > > in atomic functions composed of other atomic functions
> > > (e.g. load-cmpxchg loops).
> > 
> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
> > 
> 
> > >  static __always_inline void atomic_add(int i, atomic_t *v)
> > >  {
> > > +       kasan_check_write(v, sizeof(*v));
> > >         asm volatile(LOCK_PREFIX "addl %1,%0"
> > >                      : "+m" (v->counter)
> > >                      : "ir" (i));
> 
> 
> So the problem is doing load/stores from asm bits, and GCC
> (traditionally) doesn't try and interpret APP asm bits.
> 
> However, could we not write a GCC plugin that does exactly that?
> Something that interprets the APP asm bits and generates these KASAN
> bits that go with it?

Another suspect is the per-cpu stuff; that's all asm foo as well.


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 13:01     ` Peter Zijlstra
@ 2017-03-06 14:24       ` Dmitry Vyukov
  2017-03-06 15:20         ` Peter Zijlstra
                           ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-06 14:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andrew Morton, Andrey Ryabinin, Ingo Molnar, kasan-dev, linux-mm,
	LKML, x86, Mark Rutland

On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
>> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
>> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> > > KASAN uses compiler instrumentation to intercept all memory accesses.
>> > > But it does not see memory accesses done in assembly code.
>> > > One notable user of assembly code is atomic operations. Frequently,
>> > > for example, an atomic reference decrement is the last access to an
>> > > object and a good candidate for a racy use-after-free.
>> > >
>> > > Add manual KASAN checks to atomic operations.
>> > > Note: we need checks only before asm blocks and don't need them
>> > > in atomic functions composed of other atomic functions
>> > > (e.g. load-cmpxchg loops).
>> >
>> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
>> >
>>
>> > >  static __always_inline void atomic_add(int i, atomic_t *v)
>> > >  {
>> > > +       kasan_check_write(v, sizeof(*v));
>> > >         asm volatile(LOCK_PREFIX "addl %1,%0"
>> > >                      : "+m" (v->counter)
>> > >                      : "ir" (i));
>>
>>
>> So the problem is doing load/stores from asm bits, and GCC
>> (traditionally) doesn't try and interpret APP asm bits.
>>
>> However, could we not write a GCC plugin that does exactly that?
>> Something that interprets the APP asm bits and generates these KASAN
>> bits that go with it?
>
> Another suspect is the per-cpu stuff, that's all asm foo as well.


+x86, Mark

Let me provide more context and design alternatives.

There are also other archs, at least arm64 for now.
There are also other tools. For KTSAN (race detector) we will
absolutely need to hook into atomic ops. For KMSAN (uses of uninit
values) we also need to understand atomic ops at least to some degree.
Both of them will require different instrumentation.
For KASAN we are also more interested in cases where it's more likely
that an object is touched only by asm, but not by normal memory
accesses (otherwise we would report the bug on the normal access,
which is fine; this makes atomic ops stand out in my opinion).

We could involve the compiler (and by compiler I mean clang, because
we are not going to touch gcc; any volunteers?).
However, it's unclear if it will be simpler or not. There will
definitely be a problem with uaccess asm blocks. Currently KASAN
relies on the fact that it does not see uaccess accesses and that user
addresses are considered bad by KASAN. There can also be a problem
with offsets/sizes: it's not possible to figure out what exactly an
asm block touches, we can only assume that it directly dereferences
the passed pointer. However, for example, bitops touch the pointer at
an offset. Looking at the current x86 impl, we should be able to
handle it because the offset is computed outside of the asm blocks
(see the sketch below). But it's unclear if we hit this problem in
other places.
I also see that arm64 bitops are implemented in .S files, and we won't
be able to instrument them in the compiler.
There can also be other problems. Is it possible that some asm blocks
accept e.g. physical addresses? KASAN would consider them bad.
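
For example, a simplified sketch of the constant-bit set_bit() path on
x86 (illustrative only; the real code also has a non-constant "bts"
path): the byte offset nr >> 3 is applied in C, so a check placed in
front of the asm can cover the byte that is actually written:

static __always_inline void set_bit_sketch(long nr, volatile unsigned long *addr)
{
	void *byte = (void *)addr + (nr >> 3);	/* offset computed outside the asm */

	kasan_check_write(byte, 1);		/* hypothetical placement of the check */
	asm volatile(LOCK_PREFIX "orb %1,%0"
		     : "+m" (*(volatile char *)byte)
		     : "iq" ((unsigned char)(1 << (nr & 7)))
		     : "memory");
}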

We could also provide a parallel implementation of atomic ops based on
the new compiler builtins (__atomic_load_n and friends):
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
and enable it under KASAN. The nice thing about it is that it would
automatically support arm64, KMSAN and KTSAN.
But it's more work.
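
As a rough sketch (the memory-ordering arguments below are only
illustrative; getting them right is part of the extra work):

static __always_inline int atomic_read(const atomic_t *v)
{
	return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
}

static __always_inline void atomic_add(int i, atomic_t *v)
{
	/* the compiler sees the access and can instrument it */
	__atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
}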

Re per-cpu asm. I would say that it's less critical than atomic ops.
Static per-cpu slots are not subject to use-after-free. Dynamic slots
can be subject to use-after-free and it would be nice to catch bugs
there. However, I think we will need to add manual
poisoning/unpoisoning of dynamic slots as well.

Bottom line:
1. Involving compiler looks quite complex, hard to deploy, and it's
unclear if it will actually make things easier.
2. This patch is the simplest short-term option (I am leaning towards
adding bitops to this patch and leaving percpu out for now).
3. Providing an implementation of atomic ops based on compiler
builtins looks like a nice option for other archs and tools, but is
more work. If you consider this as a good solution, we can move
straight to this option.


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 14:24       ` Dmitry Vyukov
@ 2017-03-06 15:20         ` Peter Zijlstra
  2017-03-06 16:04           ` Mark Rutland
  2017-03-06 15:33         ` Peter Zijlstra
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-06 15:20 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrew Morton, Andrey Ryabinin, Ingo Molnar, kasan-dev, linux-mm,
	LKML, x86, Mark Rutland, Will Deacon

On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> We could also provide a parallel implementation of atomic ops based on
> the new compiler builtins (__atomic_load_n and friends):
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> and enable it under KSAN. The nice thing about it is that it will
> automatically support arm64 and KMSAN and KTSAN.
> But it's more work.

There's a summary out there somewhere, I think Will knows, that explains
how the C/C++ memory model and the Linux kernel memory model differ and
how it's going to be 'interesting' to make using the C/C++ builtin crud
with the kernel 'correct'.


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 14:24       ` Dmitry Vyukov
  2017-03-06 15:20         ` Peter Zijlstra
@ 2017-03-06 15:33         ` Peter Zijlstra
  2017-03-06 16:20         ` Mark Rutland
  2017-03-06 16:48         ` Andrey Ryabinin
  3 siblings, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-06 15:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Andrew Morton, Andrey Ryabinin, Ingo Molnar, kasan-dev, linux-mm,
	LKML, x86, Mark Rutland

On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> We could involve compiler (and by compiler I mean clang, because we
> are not going to touch gcc, any volunteers?).

FWIW, clang isn't even close to being a viable compiler for the kernel.
It lacks far too many features, _IF_ you can get it to compile in the
first place.


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 15:20         ` Peter Zijlstra
@ 2017-03-06 16:04           ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2017-03-06 16:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Vyukov, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86, Will Deacon

On Mon, Mar 06, 2017 at 04:20:13PM +0100, Peter Zijlstra wrote:
> On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> > We could also provide a parallel implementation of atomic ops based on
> > the new compiler builtins (__atomic_load_n and friends):
> > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > and enable it under KSAN. The nice thing about it is that it will
> > automatically support arm64 and KMSAN and KTSAN.
> > But it's more work.
> 
> There's a summary out there somewhere, I think Will knows, that explain
> how the C/C++ memory model and the Linux Kernel Memory model differ and
> how its going to be 'interesting' to make using the C/C++ builtin crud
> with the kernel 'correct.

Trivially, the C++ model doesn't feature I/O ordering [1]...

Otherwise Will pointed out a few details in [2].

Thanks,
Mark.

[1] https://lwn.net/Articles/698014/
[2] http://lwn.net/Articles/691295/


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 14:24       ` Dmitry Vyukov
  2017-03-06 15:20         ` Peter Zijlstra
  2017-03-06 15:33         ` Peter Zijlstra
@ 2017-03-06 16:20         ` Mark Rutland
  2017-03-06 16:27           ` Dmitry Vyukov
  2017-03-06 20:35           ` Peter Zijlstra
  2017-03-06 16:48         ` Andrey Ryabinin
  3 siblings, 2 replies; 25+ messages in thread
From: Mark Rutland @ 2017-03-06 16:20 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86, will.deacon

Hi,

[roping in Will, since he loves atomics]

On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> > On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
> >> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> >> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> >> > > KASAN uses compiler instrumentation to intercept all memory accesses.
> >> > > But it does not see memory accesses done in assembly code.
> >> > > One notable user of assembly code is atomic operations. Frequently,
> >> > > for example, an atomic reference decrement is the last access to an
> >> > > object and a good candidate for a racy use-after-free.
> >> > >
> >> > > Add manual KASAN checks to atomic operations.
> >> > > Note: we need checks only before asm blocks and don't need them
> >> > > in atomic functions composed of other atomic functions
> >> > > (e.g. load-cmpxchg loops).
> >> >
> >> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
> >> >
> >>
> >> > >  static __always_inline void atomic_add(int i, atomic_t *v)
> >> > >  {
> >> > > +       kasan_check_write(v, sizeof(*v));
> >> > >         asm volatile(LOCK_PREFIX "addl %1,%0"
> >> > >                      : "+m" (v->counter)
> >> > >                      : "ir" (i));
> >>
> >>
> >> So the problem is doing load/stores from asm bits, and GCC
> >> (traditionally) doesn't try and interpret APP asm bits.
> >>
> >> However, could we not write a GCC plugin that does exactly that?
> >> Something that interprets the APP asm bits and generates these KASAN
> >> bits that go with it?
> >
> > Another suspect is the per-cpu stuff, that's all asm foo as well.

Unfortunately, I think that manual annotation is the only way to handle
these (as we already do for the kernel part of the uaccess sequences),
since we hide things from the compiler or otherwise trick it into doing
what we want.
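
For reference, a simplified sketch of what those existing uaccess
annotations look like on x86 (size and overflow checks omitted):

static __always_inline unsigned long
copy_from_user(void *to, const void __user *from, unsigned long n)
{
	might_fault();
	/* manual check: the copy itself runs in uninstrumented code */
	kasan_check_write(to, n);
	return _copy_from_user(to, from, n);
}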

> +x86, Mark
> 
> Let me provide more context and design alternatives.
> 
> There are also other archs, at least arm64 for now.
> There are also other tools. For KTSAN (race detector) we will
> absolutely need to hook into atomic ops. For KMSAN (uses of unit
> values) we also need to understand atomic ops at least to some degree.
> Both of them will require different instrumentation.
> For KASAN we are also more interested in cases where it's more likely
> that an object is touched only by an asm, but not by normal memory
> accesses (otherwise we would report the bug on the normal access,
> which is fine, this makes atomic ops stand out in my opinion).
> 
> We could involve compiler (and by compiler I mean clang, because we
> are not going to touch gcc, any volunteers?).

I don't think there's much you'll be able to do within the compiler,
assuming you mean to derive this from the asm block inputs and outputs.

Those can hide address generation (e.g. with the per-cpu stuff), which
the compiler may erroneously detect as racing.

Those may also take fake inputs (e.g. the sp input to arm64's
__my_cpu_offset()) which may confuse matters.
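
For reference, a sketch of that helper (roughly as in
arch/arm64/include/asm/percpu.h at this point): the stack read is a fake
"Q" input used only to create a hazard against barrier(), not an access
a tool should check:

static inline unsigned long __my_cpu_offset(void)
{
	unsigned long off;

	/* the "Q" operand is never actually accessed by the instruction */
	asm("mrs %0, tpidr_el1" : "=r" (off) :
		"Q" (*(const unsigned long *)current_stack_pointer));

	return off;
}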

Parsing the assembly itself will be *extremely* painful due to the way
that's set up for run-time patching.

> However, it's unclear if it will be simpler or not. There will
> definitely will be a problem with uaccess asm blocks. Currently KASAN
> relies of the fact that it does not see uaccess accesses and the user
> addresses are considered bad by KASAN. There can also be a problem
> with offsets/sizes, it's not possible to figure out what exactly an
> asm block touches, we can only assume that it directly dereferences
> the passed pointer. However, for example, bitops touch the pointer
> with offset. Looking at the current x86 impl, we should be able to
> handle it because the offset is computed outside of asm blocks. But
> it's unclear if we hit this problem in other places.

As above, I think you'd see more fun with the percpu stuff, since the
pointer passed into those is "fake", with a percpu pointer accessing
different addresses depending on the CPU it is executed on.

> I also see that arm64 bitops are implemented in .S files. And we won't
> be able to instrument them in compiler.
> There can also be other problems. Is it possible that some asm blocks
> accept e.g. physical addresses? KASAN would consider them as bad.

I'm not sure I follow what you mean here.

I can imagine physical addresses being passed into asm statements that
don't access memory (e.g. for setting up the base registers for page
tables).
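
A hypothetical example of that kind of asm: the operand is a physical
address that the instruction only loads into a system register, so
checking it against the KASAN shadow would be wrong:

static inline void set_ttbr0_sketch(phys_addr_t pgd_phys)
{
	/* pgd_phys is written to a register, never dereferenced here */
	asm volatile("msr ttbr0_el1, %0" : : "r" (pgd_phys));
	asm volatile("isb");
}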

> We could also provide a parallel implementation of atomic ops based on
> the new compiler builtins (__atomic_load_n and friends):
> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> and enable it under KSAN. The nice thing about it is that it will
> automatically support arm64 and KMSAN and KTSAN.
> But it's more work.

These don't permit runtime patching, and there are some differences
between the C11 and Linux kernel memory models, so at least in the near
term, I don't imagine we'd be likely to use this.

> Re per-cpu asm. I would say that it's less critical than atomic ops.
> Static per-cpu slots are not subject to use-after-free. Dynamic slots
> can be subject to use-after-free and it would be nice to catch bugs
> there. However, I think we will need to add manual
> poisoning/unpoisoning of dynamic slots as well.
> 
> Bottom line:
> 1. Involving compiler looks quite complex, hard to deploy, and it's
> unclear if it will actually make things easier.
> 2. This patch is the simplest short-term option (I am leaning towards
> adding bitops to this patch and leaving percpu out for now).
> 3. Providing an implementation of atomic ops based on compiler
> builtins looks like a nice option for other archs and tools, but is
> more work. If you consider this as a good solution, we can move
> straight to this option.

Having *only* seen the assembly snippet at the top of this mail, I can't
say whether this is the simplest implementation.

However, I do think that annotation of this sort is the only reasonable
way to handle this.

Thanks,
Mark.


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 16:20         ` Mark Rutland
@ 2017-03-06 16:27           ` Dmitry Vyukov
  2017-03-06 17:25             ` Mark Rutland
  2017-03-06 20:35           ` Peter Zijlstra
  1 sibling, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-06 16:27 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86, Will Deacon

On Mon, Mar 6, 2017 at 5:20 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> [roping in Will, since he loves atomics]
>
> On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
>> On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> > On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
>> >> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
>> >> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> >> > > KASAN uses compiler instrumentation to intercept all memory accesses.
>> >> > > But it does not see memory accesses done in assembly code.
>> >> > > One notable user of assembly code is atomic operations. Frequently,
>> >> > > for example, an atomic reference decrement is the last access to an
>> >> > > object and a good candidate for a racy use-after-free.
>> >> > >
>> >> > > Add manual KASAN checks to atomic operations.
>> >> > > Note: we need checks only before asm blocks and don't need them
>> >> > > in atomic functions composed of other atomic functions
>> >> > > (e.g. load-cmpxchg loops).
>> >> >
>> >> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
>> >> >
>> >>
>> >> > >  static __always_inline void atomic_add(int i, atomic_t *v)
>> >> > >  {
>> >> > > +       kasan_check_write(v, sizeof(*v));
>> >> > >         asm volatile(LOCK_PREFIX "addl %1,%0"
>> >> > >                      : "+m" (v->counter)
>> >> > >                      : "ir" (i));
>> >>
>> >>
>> >> So the problem is doing load/stores from asm bits, and GCC
>> >> (traditionally) doesn't try and interpret APP asm bits.
>> >>
>> >> However, could we not write a GCC plugin that does exactly that?
>> >> Something that interprets the APP asm bits and generates these KASAN
>> >> bits that go with it?
>> >
>> > Another suspect is the per-cpu stuff, that's all asm foo as well.
>
> Unfortunately, I think that manual annotation is the only way to handle
> these (as we already do for kernel part of the uaccess sequences), since
> we hide things from the compiler or otherwise trick it into doing what
> we want.
>
>> +x86, Mark
>>
>> Let me provide more context and design alternatives.
>>
>> There are also other archs, at least arm64 for now.
>> There are also other tools. For KTSAN (race detector) we will
>> absolutely need to hook into atomic ops. For KMSAN (uses of unit
>> values) we also need to understand atomic ops at least to some degree.
>> Both of them will require different instrumentation.
>> For KASAN we are also more interested in cases where it's more likely
>> that an object is touched only by an asm, but not by normal memory
>> accesses (otherwise we would report the bug on the normal access,
>> which is fine, this makes atomic ops stand out in my opinion).
>>
>> We could involve compiler (and by compiler I mean clang, because we
>> are not going to touch gcc, any volunteers?).
>
> I don't think there's much you'll be able to do within the compiler,
> assuming you mean to derive this from the asm block inputs and outputs.
>
> Those can hide address-generation (e.g. with per-cpu stuff), which the
> compiler may erroneously be detected as racing.
>
> Those may also take fake inputs (e.g. the sp input to arm64's
> __my_cpu_offset()) which may confuse matters.
>
> Parsing the assembly itself will be *extremely* painful due to the way
> that's set up for run-time patching.
>
>> However, it's unclear if it will be simpler or not. There will
>> definitely will be a problem with uaccess asm blocks. Currently KASAN
>> relies of the fact that it does not see uaccess accesses and the user
>> addresses are considered bad by KASAN. There can also be a problem
>> with offsets/sizes, it's not possible to figure out what exactly an
>> asm block touches, we can only assume that it directly dereferences
>> the passed pointer. However, for example, bitops touch the pointer
>> with offset. Looking at the current x86 impl, we should be able to
>> handle it because the offset is computed outside of asm blocks. But
>> it's unclear if we hit this problem in other places.
>
> As above, I think you'd see more fun for the percpu stuff, since the
> pointer passed into those is "fake", with a percpu pointer accessing
> different addresses dependent on the CPU it is executed on.
>
>> I also see that arm64 bitops are implemented in .S files. And we won't
>> be able to instrument them in compiler.
>> There can also be other problems. Is it possible that some asm blocks
>> accept e.g. physical addresses? KASAN would consider them as bad.
>
> I'm not sure I follow what you mean here.
>
> I can imagine physical addresses being passed into asm statements that
> don't access memory (e.g. for setting up the base registers for page
> tables).
>
>> We could also provide a parallel implementation of atomic ops based on
>> the new compiler builtins (__atomic_load_n and friends):
>> https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
>> and enable it under KSAN. The nice thing about it is that it will
>> automatically support arm64 and KMSAN and KTSAN.
>> But it's more work.
>
> These don't permit runtime patching, and there are some differences
> between the C11 and Linux kernel memory models, so at least in the near
> term, I don't imagine we'd be likely to use this.
>
>> Re per-cpu asm. I would say that it's less critical than atomic ops.
>> Static per-cpu slots are not subject to use-after-free. Dynamic slots
>> can be subject to use-after-free and it would be nice to catch bugs
>> there. However, I think we will need to add manual
>> poisoning/unpoisoning of dynamic slots as well.
>>
>> Bottom line:
>> 1. Involving compiler looks quite complex, hard to deploy, and it's
>> unclear if it will actually make things easier.
>> 2. This patch is the simplest short-term option (I am leaning towards
>> adding bitops to this patch and leaving percpu out for now).
>> 3. Providing an implementation of atomic ops based on compiler
>> builtins looks like a nice option for other archs and tools, but is
>> more work. If you consider this as a good solution, we can move
>> straight to this option.
>
> Having *only* seen the assembly snippet at the top of this mail, I can't
> say whether this is the simplest implementation.
>
> However, I do think that annotation of this sort is the only reasonable
> way to handle this.


Here is the whole patch:
https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/X76pwg_tAwAJ


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 14:24       ` Dmitry Vyukov
                           ` (2 preceding siblings ...)
  2017-03-06 16:20         ` Mark Rutland
@ 2017-03-06 16:48         ` Andrey Ryabinin
  3 siblings, 0 replies; 25+ messages in thread
From: Andrey Ryabinin @ 2017-03-06 16:48 UTC (permalink / raw)
  To: Dmitry Vyukov, Peter Zijlstra
  Cc: Andrew Morton, Ingo Molnar, kasan-dev, linux-mm, LKML, x86, Mark Rutland

On 03/06/2017 05:24 PM, Dmitry Vyukov wrote:

> Let me provide more context and design alternatives.
> 
> There are also other archs, at least arm64 for now.
> There are also other tools. For KTSAN (race detector) we will
> absolutely need to hook into atomic ops. For KMSAN (uses of unit
> values) we also need to understand atomic ops at least to some degree.
> Both of them will require different instrumentation.
> For KASAN we are also more interested in cases where it's more likely
> that an object is touched only by an asm, but not by normal memory
> accesses (otherwise we would report the bug on the normal access,
> which is fine, this makes atomic ops stand out in my opinion).
> 
> We could involve compiler (and by compiler I mean clang, because we
> are not going to touch gcc, any volunteers?).

We tried this with gcc about 3 years ago. Here is the patch - https://gcc.gnu.org/ml/gcc-patches/2014-05/msg02447.html
The problem is that a memory block in an "m" constraint doesn't actually mean
that the inline asm will access it. It only means that the asm block *may* access that memory (or part of it).
This causes false positives. As I vaguely remember, I hit a false positive in FPU-related code.

This problem gave birth to another idea - add a new constraint to strictly mark the memory accessed
inside an asm block. See https://gcc.gnu.org/ml/gcc/2014-09/msg00237.html
But it all came to nothing.
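
A made-up illustration of the false-positive problem: the "+m" operand
below tells the compiler that the asm may read/write *p, but at run time
the store can be skipped entirely, so instrumentation keyed off the
constraint would report an access that never happens:

static inline void store_unless_skipped(int *p, int val, int skip)
{
	asm volatile("testl %2, %2\n\t"
		     "jnz 1f\n\t"
		     "movl %1, %0\n"
		     "1:"
		     : "+m" (*p)
		     : "r" (val), "r" (skip)
		     : "cc");
}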



> However, it's unclear if it will be simpler or not. There will
> definitely will be a problem with uaccess asm blocks. Currently KASAN
> relies of the fact that it does not see uaccess accesses and the user
> addresses are considered bad by KASAN. There can also be a problem
> with offsets/sizes, it's not possible to figure out what exactly an
> asm block touches, we can only assume that it directly dereferences
> the passed pointer. However, for example, bitops touch the pointer
> with offset. Looking at the current x86 impl, we should be able to
> handle it because the offset is computed outside of asm blocks. But
> it's unclear if we hit this problem in other places.
>
> I also see that arm64 bitops are implemented in .S files. And we won't
> be able to instrument them in compiler.
> There can also be other problems. Is it possible that some asm blocks
> accept e.g. physical addresses? KASAN would consider them as bad.
> 


* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 16:27           ` Dmitry Vyukov
@ 2017-03-06 17:25             ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2017-03-06 17:25 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Peter Zijlstra, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86, Will Deacon

On Mon, Mar 06, 2017 at 05:27:44PM +0100, Dmitry Vyukov wrote:
> On Mon, Mar 6, 2017 at 5:20 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Mon, Mar 06, 2017 at 03:24:23PM +0100, Dmitry Vyukov wrote:
> >> On Mon, Mar 6, 2017 at 2:01 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> >> > On Mon, Mar 06, 2017 at 01:58:51PM +0100, Peter Zijlstra wrote:
> >> >> On Mon, Mar 06, 2017 at 01:50:47PM +0100, Dmitry Vyukov wrote:
> >> >> > On Mon, Mar 6, 2017 at 1:42 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> >> >> > > KASAN uses compiler instrumentation to intercept all memory accesses.
> >> >> > > But it does not see memory accesses done in assembly code.
> >> >> > > One notable user of assembly code is atomic operations. Frequently,
> >> >> > > for example, an atomic reference decrement is the last access to an
> >> >> > > object and a good candidate for a racy use-after-free.
> >> >> > >
> >> >> > > Add manual KASAN checks to atomic operations.
> >> >> > > Note: we need checks only before asm blocks and don't need them
> >> >> > > in atomic functions composed of other atomic functions
> >> >> > > (e.g. load-cmpxchg loops).
> >> >> >
> >> >> > Peter, also pointed me at arch/x86/include/asm/bitops.h. Will add them in v2.
> >> >> >
> >> >>
> >> >> > >  static __always_inline void atomic_add(int i, atomic_t *v)
> >> >> > >  {
> >> >> > > +       kasan_check_write(v, sizeof(*v));
> >> >> > >         asm volatile(LOCK_PREFIX "addl %1,%0"
> >> >> > >                      : "+m" (v->counter)
> >> >> > >                      : "ir" (i));

> >> Bottom line:
> >> 1. Involving the compiler looks quite complex, hard to deploy, and it's
> >> unclear if it will actually make things easier.
> >> 2. This patch is the simplest short-term option (I am leaning towards
> >> adding bitops to this patch and leaving percpu out for now).
> >> 3. Providing an implementation of atomic ops based on compiler
> >> builtins looks like a nice option for other archs and tools, but is
> >> more work. If you consider this a good solution, we can move
> >> straight to this option.
> >
> > Having *only* seen the assembly snippet at the top of this mail, I can't
> > say whether this is the simplest implementation.
> >
> > However, I do think that annotation of this sort is the only reasonable
> > way to handle this.
> 
> Here is the whole patch:
> https://groups.google.com/d/msg/kasan-dev/3sNHjjb4GCI/X76pwg_tAwAJ

I see.

Given we'd have to instrument each architecture's atomics in an
identical fashion, maybe we should follow the example of spinlocks:
add an arch_ prefix to the arch-specific implementations and place the
instrumentation in a common wrapper.

i.e. have something like:

static __always_inline void atomic_inc(atomic_t *v)
{
	kasan_check_write(v, sizeof(*v)); 
	arch_atomic_inc(v);
}

... in asm-generic somewhere.
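
For a value-returning op or a plain read the wrapper would look much the
same (a sketch, using the existing kasan_check_* helpers):

static __always_inline int atomic_read(const atomic_t *v)
{
	kasan_check_read(v, sizeof(*v));
	return arch_atomic_read(v);
}

static __always_inline int atomic_add_return(int i, atomic_t *v)
{
	kasan_check_write(v, sizeof(*v));
	return arch_atomic_add_return(i, v);
}

Each arch's <asm/atomic.h> would then just pull the wrapper header in at
the end.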

It's more churn initially, but it should be a saving overall, and I
imagine for KMSAN or other things we may want more instrumentation
anyway...

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 16:20         ` Mark Rutland
  2017-03-06 16:27           ` Dmitry Vyukov
@ 2017-03-06 20:35           ` Peter Zijlstra
  2017-03-08 13:42             ` Dmitry Vyukov
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-06 20:35 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Dmitry Vyukov, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86, will.deacon

On Mon, Mar 06, 2017 at 04:20:18PM +0000, Mark Rutland wrote:
> > >> So the problem is doing load/stores from asm bits, and GCC
> > >> (traditionally) doesn't try and interpret APP asm bits.
> > >>
> > >> However, could we not write a GCC plugin that does exactly that?
> > >> Something that interprets the APP asm bits and generates these KASAN
> > >> bits that go with it?

> I don't think there's much you'll be able to do within the compiler,
> assuming you mean to derive this from the asm block inputs and outputs.

Nah, I was thinking about a full asm interpreter.

> Those can hide address-generation (e.g. with per-cpu stuff), which the
> compiler may erroneously detect as racing.
> 
> Those may also take fake inputs (e.g. the sp input to arm64's
> __my_cpu_offset()) which may confuse matters.
> 
> Parsing the assembly itself will be *extremely* painful due to the way
> that's set up for run-time patching.

Argh, yah, completely forgot about all that alternative and similar
nonsense :/

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-06 20:35           ` Peter Zijlstra
@ 2017-03-08 13:42             ` Dmitry Vyukov
  2017-03-08 15:20               ` Mark Rutland
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-08 13:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mark Rutland, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86, Will Deacon

[-- Attachment #1: Type: text/plain, Size: 2602 bytes --]

On Mon, Mar 6, 2017 at 9:35 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Mon, Mar 06, 2017 at 04:20:18PM +0000, Mark Rutland wrote:
>> > >> So the problem is doing load/stores from asm bits, and GCC
>> > >> (traditionally) doesn't try and interpret APP asm bits.
>> > >>
>> > >> However, could we not write a GCC plugin that does exactly that?
>> > >> Something that interprets the APP asm bits and generates these KASAN
>> > >> bits that go with it?
>
>> I don't think there's much you'll be able to do within the compiler,
>> assuming you mean to derive this from the asm block inputs and outputs.
>
> Nah, I was thinking about a full asm interpreter.
>
>> Those can hide address-generation (e.g. with per-cpu stuff), which the
>> compiler may erroneously detect as racing.
>>
>> Those may also take fake inputs (e.g. the sp input to arm64's
>> __my_cpu_offset()) which may confuse matters.
>>
>> Parsing the assembly itself will be *extremely* painful due to the way
>> that's set up for run-time patching.
>
> Argh, yah, completely forgot about all that alternative and similar
> nonsense :/



I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
consequently x86/arm64) initially, it becomes more realistic. For the
tools we don't care about absolute efficiency and this gets rid of
Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
Re (3) I think rmb/wmb can be reasonably replaced with
atomic_thread_fence(acquire/release). Re (5) situation with
correctness becomes better very quickly as more people use them in
user-space. Since KASAN is not intended to be used in production (or
at least such build is expected to crash), we can afford to shake out
any remaining correctness issues in such build. (1) I don't fully
understand, what exactly is the problem with seq_cst?

I've sketched a patch that does it, and did some testing with/without
KASAN on x86_64.

In short, it adds include/linux/atomic_compiler.h which is included
from include/linux/atomic.h when CONFIG_COMPILER_ATOMIC is defined;
and <asm/atomic.h> is not included when CONFIG_COMPILER_ATOMIC is
defined.
For bitops it is similar except that only parts of asm/bitops.h are
selectively disabled when CONFIG_COMPILER_ATOMIC, because it also
defines other stuff.
asm/barriers.h is left intact for now. We don't need it for KASAN. But
for KTSAN we can do similar thing -- selectively disable some of the
barriers in asm/barriers.h (e.g. leaving dma_rmb/wmb per arch).

Such a change would allow us to support atomic ops for multiple arches
for all of KASAN/KTSAN/KMSAN.

Thoughts?

[-- Attachment #2: atomic_compiler.patch --]
[-- Type: text/x-patch, Size: 17432 bytes --]

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..7bcb10544fc1 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -1,6 +1,10 @@
 #ifndef _ASM_X86_ATOMIC_H
 #define _ASM_X86_ATOMIC_H
 
+#ifdef CONFIG_COMPILER_ATOMIC
+#error "should not be included"
+#endif
+
 #include <linux/compiler.h>
 #include <linux/types.h>
 #include <asm/alternative.h>
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 854022772c5b..e42b85f1ed75 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -68,6 +68,7 @@
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void
 set_bit(long nr, volatile unsigned long *addr)
 {
@@ -81,6 +82,7 @@ set_bit(long nr, volatile unsigned long *addr)
 			: BITOP_ADDR(addr) : "Ir" (nr) : "memory");
 	}
 }
+#endif
 
 /**
  * __set_bit - Set a bit in memory
@@ -106,6 +108,7 @@ static __always_inline void __set_bit(long nr, volatile unsigned long *addr)
  * you should call smp_mb__before_atomic() and/or smp_mb__after_atomic()
  * in order to ensure changes are visible on other processors.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void
 clear_bit(long nr, volatile unsigned long *addr)
 {
@@ -119,6 +122,7 @@ clear_bit(long nr, volatile unsigned long *addr)
 			: "Ir" (nr));
 	}
 }
+#endif
 
 /*
  * clear_bit_unlock - Clears a bit in memory
@@ -128,17 +132,20 @@ clear_bit(long nr, volatile unsigned long *addr)
  * clear_bit() is atomic and implies release semantics before the memory
  * operation. It can be used for an unlock.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
 {
 	barrier();
 	clear_bit(nr, addr);
 }
+#endif
 
 static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
 {
 	asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
 }
 
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
 {
 	bool negative;
@@ -151,6 +158,7 @@ static __always_inline bool clear_bit_unlock_is_negative_byte(long nr, volatile
 
 // Let everybody know we have it
 #define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+#endif
 
 /*
  * __clear_bit_unlock - Clears a bit in memory
@@ -193,6 +201,7 @@ static __always_inline void __change_bit(long nr, volatile unsigned long *addr)
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline void change_bit(long nr, volatile unsigned long *addr)
 {
 	if (IS_IMMEDIATE(nr)) {
@@ -205,6 +214,7 @@ static __always_inline void change_bit(long nr, volatile unsigned long *addr)
 			: "Ir" (nr));
 	}
 }
+#endif
 
 /**
  * test_and_set_bit - Set a bit and return its old value
@@ -214,10 +224,12 @@ static __always_inline void change_bit(long nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "bts", *addr, "Ir", nr, "%0", c);
 }
+#endif
 
 /**
  * test_and_set_bit_lock - Set a bit and return its old value for lock
@@ -226,11 +238,13 @@ static __always_inline bool test_and_set_bit(long nr, volatile unsigned long *ad
  *
  * This is the same as test_and_set_bit on x86.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline bool
 test_and_set_bit_lock(long nr, volatile unsigned long *addr)
 {
 	return test_and_set_bit(nr, addr);
 }
+#endif
 
 /**
  * __test_and_set_bit - Set a bit and return its old value
@@ -260,10 +274,12 @@ static __always_inline bool __test_and_set_bit(long nr, volatile unsigned long *
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "btr", *addr, "Ir", nr, "%0", c);
 }
+#endif
 
 /**
  * __test_and_clear_bit - Clear a bit and return its old value
@@ -313,10 +329,12 @@ static __always_inline bool __test_and_change_bit(long nr, volatile unsigned lon
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
+#ifndef CONFIG_COMPILER_BITOPS
 static __always_inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "btc", *addr, "Ir", nr, "%0", c);
 }
+#endif
 
 static __always_inline bool constant_test_bit(long nr, const volatile unsigned long *addr)
 {
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 898dba2e2e2c..33a87ed3c150 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -63,7 +63,7 @@ struct saved_msrs {
 /*
  * Be very careful with includes. This header is prone to include loops.
  */
-#include <asm/atomic.h>
+#include <linux/atomic.h>
 #include <linux/tracepoint-defs.h>
 
 extern struct tracepoint __tracepoint_read_msr;
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index e71835bf60a9..5e02d01007d1 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -1,7 +1,14 @@
 /* Atomic operations usable in machine independent code */
 #ifndef _LINUX_ATOMIC_H
 #define _LINUX_ATOMIC_H
+
+#if defined(CONFIG_KASAN)
+#define CONFIG_COMPILER_ATOMIC
+#include <linux/atomic_compiler.h>
+#else
 #include <asm/atomic.h>
+#endif
+
 #include <asm/barrier.h>
 
 /*
diff --git a/include/linux/atomic_compiler.h b/include/linux/atomic_compiler.h
new file mode 100644
index 000000000000..4039761449dd
--- /dev/null
+++ b/include/linux/atomic_compiler.h
@@ -0,0 +1,339 @@
+#ifndef _LINUX_ATOMIC_COMPILER_H
+#define _LINUX_ATOMIC_COMPILER_H
+
+#include <linux/types.h>
+
+/* The 32-bit atomic type */
+
+#define ATOMIC_INIT(i)	{ (i) }
+
+static inline int atomic_read(const atomic_t *v)
+{
+	return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
+}
+
+static inline void atomic_set(atomic_t *v, int i)
+{
+	__atomic_store_n(&v->counter, i, __ATOMIC_RELAXED);
+}
+
+static inline void atomic_add(int i, atomic_t *v)
+{
+	__atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
+}
+
+static inline void atomic_sub(int i, atomic_t *v)
+{
+	__atomic_fetch_sub(&v->counter, i, __ATOMIC_RELAXED);
+}
+
+static inline bool atomic_sub_and_test(int i, atomic_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, i, __ATOMIC_ACQ_REL) == i;
+}
+
+#define atomic_inc(v)  (atomic_add(1, v))
+#define atomic_dec(v)  (atomic_sub(1, v))
+
+static inline bool atomic_dec_and_test(atomic_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, 1, __ATOMIC_ACQ_REL) == 1;
+}
+
+static inline bool atomic_inc_and_test(atomic_t *v)
+{
+	return __atomic_fetch_add(&v->counter, 1, __ATOMIC_ACQ_REL) == -1;
+}
+
+static inline bool atomic_add_negative(int i, atomic_t *v)
+{
+	return __atomic_fetch_add(&v->counter, i, __ATOMIC_ACQ_REL) + i < 0;
+}
+
+static inline int atomic_add_return(int i, atomic_t *v)
+{
+	return __atomic_fetch_add(&v->counter, i, __ATOMIC_ACQ_REL) + i;
+}
+
+static inline int atomic_sub_return(int i, atomic_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, i, __ATOMIC_ACQ_REL) - i;
+}
+
+#define atomic_inc_return(v)  (atomic_add_return(1, v))
+#define atomic_dec_return(v)  (atomic_sub_return(1, v))
+
+static inline int atomic_fetch_add(int i, atomic_t *v)
+{
+	return __atomic_fetch_add(&v->counter, i, __ATOMIC_ACQ_REL);
+}
+
+static inline int atomic_fetch_sub(int i, atomic_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, i, __ATOMIC_ACQ_REL);
+}
+
+static inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+	__atomic_compare_exchange_n(&v->counter, &old, new, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+	return old;
+}
+
+static inline int atomic_xchg(atomic_t *v, int new)
+{
+	return __atomic_exchange_n(&v->counter, new, __ATOMIC_ACQ_REL);
+}
+
+static inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+	int c, old;
+	c = atomic_read(v);
+	for (;;) {
+		if (unlikely(c == u))
+			break;
+		old = atomic_cmpxchg(v, c, c + a);
+		if (likely(old == c))
+			break;
+		c = old;
+	}
+	return c;
+}
+
+#define ATOMIC_OP(op)							\
+static inline void atomic_##op(int i, atomic_t *v)			\
+{									\
+	__atomic_fetch_##op(&(v)->counter, i, __ATOMIC_RELAXED);	\
+}
+
+#define ATOMIC_FETCH_OP(op, c_op)					\
+static inline int atomic_fetch_##op(int i, atomic_t *v)			\
+{									\
+	return __atomic_fetch_##op(&(v)->counter, i, __ATOMIC_ACQ_REL); \
+}
+
+#define ATOMIC_OPS(op, c_op)						\
+	ATOMIC_OP(op)							\
+	ATOMIC_FETCH_OP(op, c_op)
+
+ATOMIC_OPS(and, &)
+ATOMIC_OPS(or , |)
+ATOMIC_OPS(xor, ^)
+
+#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP
+
+/* The 64-bit atomic type */
+
+#ifndef CONFIG_64BIT
+typedef struct {
+	u64 __aligned(8) counter;
+} atomic64_t;
+#endif
+
+#define ATOMIC64_INIT(i)	{ (i) }
+
+static inline u64 atomic64_read(const atomic64_t *v)
+{
+	return __atomic_load_n(&v->counter, __ATOMIC_RELAXED);
+}
+
+static inline void atomic64_set(atomic64_t *v, u64 i)
+{
+	__atomic_store_n(&v->counter, i, __ATOMIC_RELAXED);
+}
+
+static inline void atomic64_add(u64 i, atomic64_t *v)
+{
+	__atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
+}
+
+static inline void atomic64_sub(u64 i, atomic64_t *v)
+{
+	__atomic_fetch_sub(&v->counter, i, __ATOMIC_RELAXED);
+}
+
+static inline bool atomic64_sub_and_test(u64 i, atomic64_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, i, __ATOMIC_ACQ_REL) == i;
+}
+
+#define atomic64_inc(v)  (atomic64_add(1, v))
+#define atomic64_dec(v)  (atomic64_sub(1, v))
+
+static inline bool atomic64_dec_and_test(atomic64_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, 1, __ATOMIC_ACQ_REL) == 1;
+}
+
+static inline bool atomic64_inc_and_test(atomic64_t *v)
+{
+	return __atomic_fetch_add(&v->counter, 1, __ATOMIC_ACQ_REL) == -1;
+}
+
+static inline bool atomic64_add_negative(u64 i, atomic64_t *v)
+{
+	return (s64)(__atomic_fetch_add(&v->counter, i, __ATOMIC_ACQ_REL) + i) < 0;
+}
+
+static inline u64 atomic64_add_return(u64 i, atomic64_t *v)
+{
+	return __atomic_fetch_add(&v->counter, i, __ATOMIC_ACQ_REL) + i;
+}
+
+static inline u64 atomic64_sub_return(u64 i, atomic64_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, i, __ATOMIC_ACQ_REL) - i;
+}
+
+#define atomic64_inc_return(v)  (atomic64_add_return(1, (v)))
+#define atomic64_dec_return(v)  (atomic64_sub_return(1, (v)))
+
+static inline u64 atomic64_fetch_add(u64 i, atomic64_t *v)
+{
+	return __atomic_fetch_add(&v->counter, i, __ATOMIC_ACQ_REL);
+}
+
+static inline u64 atomic64_fetch_sub(u64 i, atomic64_t *v)
+{
+	return __atomic_fetch_sub(&v->counter, i, __ATOMIC_ACQ_REL);
+}
+
+static inline u64 atomic64_cmpxchg(atomic64_t *v, u64 old, u64 new)
+{
+	__atomic_compare_exchange_n(&v->counter, &old, new, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+	return old;
+}
+
+static inline u64 atomic64_xchg(atomic64_t *v, u64 new)
+{
+	return __atomic_exchange_n(&v->counter, new, __ATOMIC_ACQ_REL);
+}
+
+static inline bool atomic64_add_unless(atomic64_t *v, u64 a, u64 u)
+{
+	u64 c, old;
+	c = atomic64_read(v);
+	for (;;) {
+		if (unlikely(c == u))
+			break;
+		old = atomic64_cmpxchg(v, c, c + a);
+		if (likely(old == c))
+			break;
+		c = old;
+	}
+	return c != u;
+}
+
+#define atomic64_inc_not_zero(v) atomic64_add_unless((v), 1, 0)
+
+static inline s64 atomic64_dec_if_positive(atomic64_t *v)
+{
+	s64 c, old, dec;
+	c = atomic64_read(v);
+	for (;;) {
+		dec = c - 1;
+		if (unlikely(dec < 0))
+			break;
+		old = atomic64_cmpxchg(v, c, dec);
+		if (likely(old == c))
+			break;
+		c = old;
+	}
+	return dec;
+}
+
+#define ATOMIC64_OP(op)							\
+static inline void atomic64_##op(u64 i, atomic64_t *v)			\
+{									\
+	__atomic_fetch_##op(&(v)->counter, i, __ATOMIC_RELAXED);	\
+}
+
+#define ATOMIC64_FETCH_OP(op, c_op)					\
+static inline u64 atomic64_fetch_##op(u64 i, atomic64_t *v)		\
+{									\
+	return __atomic_fetch_##op(&(v)->counter, i, __ATOMIC_ACQ_REL); \
+}
+
+#define ATOMIC64_OPS(op, c_op)						\
+	ATOMIC64_OP(op)							\
+	ATOMIC64_FETCH_OP(op, c_op)
+
+ATOMIC64_OPS(and, &)
+ATOMIC64_OPS(or, |)
+ATOMIC64_OPS(xor, ^)
+
+#undef ATOMIC64_OPS
+#undef ATOMIC64_FETCH_OP
+#undef ATOMIC64_OP
+
+/* Cmpxchg */
+
+#define xchg(ptr, v) __atomic_exchange_n((ptr), (v), __ATOMIC_ACQ_REL)
+#define xadd(ptr, inc) __atomic_add_fetch((ptr), (inc), __ATOMIC_ACQ_REL)
+
+#define cmpxchg(ptr, old, new) ({								\
+	typeof(old) tmp = old;									\
+	__atomic_compare_exchange_n((ptr), &tmp, (new), 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);	\
+	tmp;											\
+})
+
+#define sync_cmpxchg	cmpxchg
+#define cmpxchg_local	cmpxchg
+#define cmpxchg64	cmpxchg
+#define cmpxchg64_local	cmpxchg
+
+/*
+typedef struct {
+	long v1, v2;
+} __cmpxchg_double_struct;
+
+#define cmpxchg_double(p1, p2, o1, o2, n1, n2)				\
+({									\
+	__cmpxchg_double_struct old = {(long)o1, (long)o2};		\
+	__cmpxchg_double_struct new = {(long)n1, (long)n2};		\
+	BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long));			\
+	BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long));			\
+	VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long)));		\
+	VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2));	\
+	__atomic_compare_exchange_n((__int128 *)(p1), (__int128 *)&old, *(__int128 *)&new, 0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);	\
+})
+
+#define cmpxchg_double_local	cmpxchg_double
+#define system_has_cmpxchg_double() 1
+*/
+
+#undef CONFIG_HAVE_CMPXCHG_DOUBLE
+
+/* 16-bit atomic ops */
+
+static inline short int atomic_inc_short(short int *v)
+{
+	return __atomic_fetch_add(v, 1, __ATOMIC_ACQ_REL) + 1;
+}
+
+/* Barriers */
+/*
+#define barrier()	__atomic_signal_fence(__ATOMIC_SEQ_CST)
+#define mb()		__atomic_thread_fence(__ATOMIC_SEQ_CST)
+#define rmb()		__atomic_thread_fence(__ATOMIC_ACQUIRE)
+#define wmb()		__atomic_thread_fence(__ATOMIC_RELEASE)
+
+#define __smp_mb()	mb()
+#define __smp_rmb()	rmb()
+#define __smp_wmb()	wmb()
+
+#define dma_rmb()	mb()
+#define dma_wmb()	mb()
+
+#define __smp_store_mb(var, value) __atomic_store_n(&(var), (value), __ATOMIC_SEQ_CST)
+#define __smp_store_release(p, v) __atomic_store_n((p), (v), __ATOMIC_RELEASE)
+#define __smp_load_acquire(p) __atomic_load_n((p), __ATOMIC_ACQUIRE)
+
+#define __smp_mb__before_atomic()	mb()
+#define __smp_mb__after_atomic()	mb()
+
+#include <asm-generic/barrier.h>
+*/
+
+#endif /* _LINUX_ATOMIC_COMPILER_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index a83c822c35c2..1c6b1b925dd9 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -1,5 +1,11 @@
 #ifndef _LINUX_BITOPS_H
 #define _LINUX_BITOPS_H
+
+#if defined(CONFIG_KASAN)
+#define CONFIG_COMPILER_BITOPS
+#include <linux/bitops_compiler.h>
+#endif
+
 #include <asm/types.h>
 
 #ifdef	__KERNEL__
diff --git a/include/linux/bitops_compiler.h b/include/linux/bitops_compiler.h
new file mode 100644
index 000000000000..4d2a253776f2
--- /dev/null
+++ b/include/linux/bitops_compiler.h
@@ -0,0 +1,56 @@
+#ifndef _LINUX_BITOPS_COMPILER_H
+#define _LINUX_BITOPS_COMPILER_H
+
+#include <linux/types.h>
+
+static inline void
+set_bit(long nr, volatile unsigned long *addr)
+{
+	__atomic_fetch_or((char *)addr + (nr / 8), 1 << (nr % 8), __ATOMIC_RELAXED);
+}
+
+static inline void
+clear_bit(long nr, volatile unsigned long *addr)
+{
+	__atomic_fetch_and((char *)addr + (nr / 8), ~(1 << (nr % 8)), __ATOMIC_RELAXED);
+}
+
+static inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
+{
+	__atomic_fetch_and((char *)addr + (nr / 8), ~(1 << (nr % 8)), __ATOMIC_RELEASE);
+}
+
+static inline bool clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
+{
+	return __atomic_fetch_and((char *)addr + (nr / 8), ~(1 << (nr % 8)), __ATOMIC_RELEASE) < 0;
+}
+
+#define clear_bit_unlock_is_negative_byte clear_bit_unlock_is_negative_byte
+
+static inline void change_bit(long nr, volatile unsigned long *addr)
+{
+	__atomic_fetch_xor((char *)addr + (nr / 8), 1 << (nr % 8), __ATOMIC_RELAXED);
+}
+
+static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
+{
+	return __atomic_fetch_or((char *)addr + (nr / 8), 1 << (nr % 8), __ATOMIC_ACQ_REL) & (1 << (nr % 8));
+}
+
+static inline bool
+test_and_set_bit_lock(long nr, volatile unsigned long *addr)
+{
+	return __atomic_fetch_or((char *)addr + (nr / 8), 1 << (nr % 8), __ATOMIC_ACQUIRE) & (1 << (nr % 8));
+}
+
+static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
+{
+	return __atomic_fetch_and((char *)addr + (nr / 8), ~(1 << (nr % 8)), __ATOMIC_ACQ_REL) & (1 << (nr % 8));
+}
+
+static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
+{
+	return __atomic_fetch_xor((char *)addr + (nr / 8), 1 << (nr % 8), __ATOMIC_ACQ_REL) & (1 << (nr % 8));
+}
+
+#endif /* _LINUX_BITOPS_COMPILER_H */
diff --git a/net/sunrpc/xprtmultipath.c b/net/sunrpc/xprtmultipath.c
index ae92a9e9ba52..3b1e85619ce0 100644
--- a/net/sunrpc/xprtmultipath.c
+++ b/net/sunrpc/xprtmultipath.c
@@ -12,7 +12,7 @@
 #include <linux/rcupdate.h>
 #include <linux/rculist.h>
 #include <linux/slab.h>
-#include <asm/cmpxchg.h>
+#include <linux/atomic.h>
 #include <linux/spinlock.h>
 #include <linux/sunrpc/xprt.h>
 #include <linux/sunrpc/addr.h>

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 13:42             ` Dmitry Vyukov
@ 2017-03-08 15:20               ` Mark Rutland
  2017-03-08 15:27                 ` Dmitry Vyukov
  2017-03-08 17:43                 ` Will Deacon
  0 siblings, 2 replies; 25+ messages in thread
From: Mark Rutland @ 2017-03-08 15:20 UTC (permalink / raw)
  To: Dmitry Vyukov, Will Deacon
  Cc: Peter Zijlstra, Andrew Morton, Andrey Ryabinin, Ingo Molnar,
	kasan-dev, linux-mm, LKML, x86

Hi,

On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
> I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
> consequently x86/arm64) initially, it becomes more realistic. For the
> tools we don't care about absolute efficiency and this gets rid of
> Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
> Re (3) I think rmb/wmb can be reasonably replaced with
> atomic_thread_fence(acquire/release). Re (5) situation with
> correctness becomes better very quickly as more people use them in
> user-space. Since KASAN is not intended to be used in production (or
> at least such build is expected to crash), we can afford to shake out
> any remaining correctness issues in such build. (1) I don't fully
> understand, what exactly is the problem with seq_cst?

I'll have to leave it to Will to have the final word on these; I'm
certainly not familiar enough with the C11 memory model to comment on
(1).

However, w.r.t. (3), I don't think we can substitute rmb() and wmb()
with atomic_thread_fence_acquire() and atomic_thread_fence_release()
respectively on arm64.

The former use barriers with full system scope, whereas the latter may
be limited to the inner shareable domain. While I'm not sure of the
precise intended semantics of wmb() and rmb(), I believe this
substitution would break some cases (e.g. communicating with a
non-coherent master).

Note that regardless, we'd have to special-case __iowmb() to use a full
system barrier.
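
To make the scope difference concrete (a sketch; the exact encodings
depend on the toolchain, but this is what current compilers typically
emit for arm64):

/* C11-style fences: typically lowered to inner-shareable barriers */
void release_fence(void) { __atomic_thread_fence(__ATOMIC_RELEASE); }	/* dmb ish */
void acquire_fence(void) { __atomic_thread_fence(__ATOMIC_ACQUIRE); }	/* dmb ishld */

/*
 * ... whereas the kernel's wmb()/rmb() on arm64 are dsb(st)/dsb(ld):
 * full-system barriers that also order accesses against non-coherent
 * observers (e.g. a device doing DMA).
 */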

Also, w.r.t. (5), modulo the lack of atomic instrumentation, people use
KASAN today, with compilers that are known to have bugs in their atomics
(e.g. GCC bug 69875). Thus, we cannot rely on the compiler's
implementation of atomics without introducing a functional regression.

> I've sketched a patch that does it, and did some testing with/without
> KASAN on x86_64.
> 
> In short, it adds include/linux/atomic_compiler.h which is included
> from include/linux/atomic.h when CONFIG_COMPILER_ATOMIC is defined;
> and <asm/atomic.h> is not included when CONFIG_COMPILER_ATOMIC is
> defined.
> For bitops it is similar except that only parts of asm/bitops.h are
> selectively disabled when CONFIG_COMPILER_ATOMIC, because it also
> defines other stuff.
> asm/barriers.h is left intact for now. We don't need it for KASAN. But
> for KTSAN we can do similar thing -- selectively disable some of the
> barriers in asm/barriers.h (e.g. leaving dma_rmb/wmb per arch).
> 
> Such a change would allow us to support atomic ops for multiple arches
> for all of KASAN/KTSAN/KMSAN.
> 
> Thoughts?

As in my other reply, I'd prefer that we wrapped the (arch-specific)
atomic implementations such that we can instrument them explicitly in a
core header. That means that the implementation and semantics of the
atomics don't change at all.

Note that we could initially do this just for x86 and arm64, e.g. by
having those explicitly include an <asm-generic/atomic-instrumented.h>
at the end of their <asm/atomic.h>.

For architectures which can use the compiler's atomics, we can allow
them to do so, skipping the redundant explicit instrumentation.

Other than being potentially slower (which we've established we don't
care too much about above), is there a problem with that approach?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 15:20               ` Mark Rutland
@ 2017-03-08 15:27                 ` Dmitry Vyukov
  2017-03-08 15:43                   ` Mark Rutland
  2017-03-08 17:43                 ` Will Deacon
  1 sibling, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-08 15:27 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Peter Zijlstra, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
>> I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
>> consequently x86/arm64) initially, it becomes more realistic. For the
>> tools we don't care about absolute efficiency and this gets rid of
>> Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
>> Re (3) I think rmb/wmb can be reasonably replaced with
>> atomic_thread_fence(acquire/release). Re (5) situation with
>> correctness becomes better very quickly as more people use them in
>> user-space. Since KASAN is not intended to be used in production (or
>> at least such build is expected to crash), we can afford to shake out
>> any remaining correctness issues in such build. (1) I don't fully
>> understand, what exactly is the problem with seq_cst?
>
> I'll have to leave it to Will to have the final word on these; I'm
> certainly not familiar enough with the C11 memory model to comment on
> (1).
>
> However, w.r.t. (3), I don't think we can substitute rmb() and wmb()
> with atomic_thread_fence_acquire() and atomic_thread_fence_release()
> respectively on arm64.
>
> The former use barriers with full system scope, whereas the latter may
> be limited to the inner shareable domain. While I'm not sure of the
> precise intended semantics of wmb() and rmb(), I believe this
> substitution would break some cases (e.g. communicating with a
> non-coherent master).
>
> Note that regardless, we'd have to special-case __iowmb() to use a full
> system barrier.
>
> Also, w.r.t. (5), modulo the lack of atomic instrumentation, people use
> KASAN today, with compilers that are known to have bugs in their atomics
> (e.g. GCC bug 69875). Thus, we cannot rely on the compiler's
> implementation of atomics without introducing a functional regression.
>
> >> I've sketched a patch that does it, and did some testing with/without
>> KASAN on x86_64.
>>
>> In short, it adds include/linux/atomic_compiler.h which is included
>> from include/linux/atomic.h when CONFIG_COMPILER_ATOMIC is defined;
>> and <asm/atomic.h> is not included when CONFIG_COMPILER_ATOMIC is
>> defined.
>> For bitops it is similar except that only parts of asm/bitops.h are
>> selectively disabled when CONFIG_COMPILER_ATOMIC, because it also
>> defines other stuff.
>> asm/barriers.h is left intact for now. We don't need it for KASAN. But
>> for KTSAN we can do similar thing -- selectively disable some of the
>> barriers in asm/barriers.h (e.g. leaving dma_rmb/wmb per arch).
>>
>> Such a change would allow us to support atomic ops for multiple arches
>> for all of KASAN/KTSAN/KMSAN.
>>
>> Thoughts?
>
> As in my other reply, I'd prefer that we wrapped the (arch-specific)
> atomic implementations such that we can instrument them explicitly in a
> core header. That means that the implementation and semantics of the
> atomics don't change at all.
>
> Note that we could initially do this just for x86 and arm64, e.g. by
> having those explicitly include an <asm-generic/atomic-instrumented.h>
> at the end of their <asm/atomic.h>.

How exactly do you want to do this incrementally?
I don't feel ready to shuffle all archs, but doing x86 in one patch
and then arm64 in another looks tractable.


> For architectures which can use the compiler's atomics, we can allow
> them to do so, skipping the redundant explicit instrumentation.
>
> Other than being potentially slower (which we've established we don't
> care too much about above), is there a problem with that approach?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 15:27                 ` Dmitry Vyukov
@ 2017-03-08 15:43                   ` Mark Rutland
  2017-03-08 15:45                     ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Mark Rutland @ 2017-03-08 15:43 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Will Deacon, Peter Zijlstra, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Wed, Mar 08, 2017 at 04:27:11PM +0100, Dmitry Vyukov wrote:
> On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> > As in my other reply, I'd prefer that we wrapped the (arch-specific)
> > atomic implementations such that we can instrument them explicitly in a
> > core header. That means that the implementation and semantics of the
> > atomics don't change at all.
> >
> > Note that we could initially do this just for x86 and arm64, e.g. by
> > having those explicitly include an <asm-generic/atomic-instrumented.h>
> > at the end of their <asm/atomic.h>.
> 
> How exactly do you want to do this incrementally?
> I don't feel ready to shuffle all archs, but doing x86 in one patch
> and then arm64 in another looks tractable.

I guess we'd have three patches: one adding the header and any core
infrastructure, followed by separate patches migrating arm64 and x86
over.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 15:43                   ` Mark Rutland
@ 2017-03-08 15:45                     ` Dmitry Vyukov
  2017-03-08 15:48                       ` Mark Rutland
  0 siblings, 1 reply; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-08 15:45 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Peter Zijlstra, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Wed, Mar 8, 2017 at 4:43 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Wed, Mar 08, 2017 at 04:27:11PM +0100, Dmitry Vyukov wrote:
>> On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > As in my other reply, I'd prefer that we wrapped the (arch-specific)
>> > atomic implementations such that we can instrument them explicitly in a
>> > core header. That means that the implementation and semantics of the
>> > atomics don't change at all.
>> >
>> > Note that we could initially do this just for x86 and arm64, e.g. by
>> > having those explicitly include an <asm-generic/atomic-instrumented.h>
>> > at the end of their <asm/atomic.h>.
>>
>> How exactly do you want to do this incrementally?
>> I don't feel ready to shuffle all archs, but doing x86 in one patch
>> and then arm64 in another looks tractable.
>
> I guess we'd have three patches: one adding the header and any core
> infrastructure, followed by separate patches migrating arm64 and x86
> over.

But if we add e.g. atomic_read() which forwards to arch_atomic_read()
to <linux/atomic.h>, it will break all archs that don't rename their
atomic_read() to arch_atomic_read().

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 15:45                     ` Dmitry Vyukov
@ 2017-03-08 15:48                       ` Mark Rutland
  0 siblings, 0 replies; 25+ messages in thread
From: Mark Rutland @ 2017-03-08 15:48 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Will Deacon, Peter Zijlstra, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Wed, Mar 08, 2017 at 04:45:58PM +0100, Dmitry Vyukov wrote:
> On Wed, Mar 8, 2017 at 4:43 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Wed, Mar 08, 2017 at 04:27:11PM +0100, Dmitry Vyukov wrote:
> >> On Wed, Mar 8, 2017 at 4:20 PM, Mark Rutland <mark.rutland@arm.com> wrote:
> >> > As in my other reply, I'd prefer that we wrapped the (arch-specific)
> >> > atomic implementations such that we can instrument them explicitly in a
> >> > core header. That means that the implementation and semantics of the
> >> > atomics don't change at all.
> >> >
> >> > Note that we could initially do this just for x86 and arm64, e.g. by
> >> > having those explicitly include an <asm-generic/atomic-instrumented.h>
> >> > at the end of their <asm/atomic.h>.
> >>
> >> How exactly do you want to do this incrementally?
> >> I don't feel ready to shuffle all archs, but doing x86 in one patch
> >> and then arm64 in another looks tractable.
> >
> > I guess we'd have three patches: one adding the header and any core
> > infrastructure, followed by separate patches migrating arm64 and x86
> > over.
> 
> But if we add e.g. atomic_read() which forwards to arch_atomic_read()
> to <linux/atomic.h>, it will break all archs that don't rename their
> atomic_read() to arch_atomic_read().

... as above, that'd be handled by placing this in an
<asm-generic/atomic-instrumented.h> file that we only include at the
end of the arch implementation.

So we'd only include that on arm64 and x86, without needing to change
the names elsewhere.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 15:20               ` Mark Rutland
  2017-03-08 15:27                 ` Dmitry Vyukov
@ 2017-03-08 17:43                 ` Will Deacon
  2017-03-14 15:22                   ` Dmitry Vyukov
  1 sibling, 1 reply; 25+ messages in thread
From: Will Deacon @ 2017-03-08 17:43 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Dmitry Vyukov, Peter Zijlstra, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Wed, Mar 08, 2017 at 03:20:41PM +0000, Mark Rutland wrote:
> On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
> > I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
> > consequently x86/arm64) initially, it becomes more realistic. For the
> > tools we don't care about absolute efficiency and this gets rid of
> > Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
> > Re (3) I think rmb/wmb can be reasonably replaced with
> > atomic_thread_fence(acquire/release). Re (5) situation with
> > correctness becomes better very quickly as more people use them in
> > user-space. Since KASAN is not intended to be used in production (or
> > at least such build is expected to crash), we can afford to shake out
> > any remaining correctness issues in such build. (1) I don't fully
> > understand, what exactly is the problem with seq_cst?
> 
> I'll have to leave it to Will to have the final word on these; I'm
> certainly not familiar enough with the C11 memory model to comment on
> (1).

rmb()/wmb() are not remotely similar to
atomic_thread_fence_{acquire,release}, even if you restrict ordering to
coherent CPUs (i.e. the smp_* variants). Please don't do that :)

I'm also terrified of the optimisations that the compiler is theoretically
allowed to make to C11 atomics given the assumptions of the language
virtual machine, which are not necessarily valid in the kernel environment.
We would at least need well-supported compiler options to disable these
optimisations, and also to allow data races with things like READ_ONCE.

Will

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-08 17:43                 ` Will Deacon
@ 2017-03-14 15:22                   ` Dmitry Vyukov
  2017-03-14 15:31                     ` Peter Zijlstra
  2017-03-14 15:32                     ` Peter Zijlstra
  0 siblings, 2 replies; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-14 15:22 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Peter Zijlstra, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

[-- Attachment #1: Type: text/plain, Size: 2210 bytes --]

On Wed, Mar 8, 2017 at 6:43 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Wed, Mar 08, 2017 at 03:20:41PM +0000, Mark Rutland wrote:
>> On Wed, Mar 08, 2017 at 02:42:10PM +0100, Dmitry Vyukov wrote:
>> > I think if we scope compiler atomic builtins to KASAN/KTSAN/KMSAN (and
>> > consequently x86/arm64) initially, it becomes more realistic. For the
>> > tools we don't care about absolute efficiency and this gets rid of
>> > Will's points (2), (4) and (6) here https://lwn.net/Articles/691295/.
>> > Re (3) I think rmb/wmb can be reasonably replaced with
>> > atomic_thread_fence(acquire/release). Re (5) situation with
>> > correctness becomes better very quickly as more people use them in
>> > user-space. Since KASAN is not intended to be used in production (or
>> > at least such build is expected to crash), we can afford to shake out
>> > any remaining correctness issues in such build. (1) I don't fully
>> > understand, what exactly is the problem with seq_cst?
>>
>> I'll have to leave it to Will to have the final word on these; I'm
>> certainly not familiar enough with the C11 memory model to comment on
>> (1).
>
> rmb()/wmb() are not remotely similar to
> atomic_thread_fence_{acquire,release}, even if you restrict ordering to
> coherent CPUs (i.e. the smp_* variants). Please don't do that :)
>
> I'm also terrified of the optimisations that the compiler is theoretically
> allowed to make to C11 atomics given the assumptions of the language
> virtual machine, which are not necessarily valid in the kernel environment.
> We would at least need well-supported compiler options to disable these
> optimisations, and also to allow data races with things like READ_ONCE.

Hello,

I've prototyped what Mark suggested:
 - prefix arch atomics with arch_
 - add <asm-generic/atomic-instrumented.h> which defines the atomics and
forwards to the arch_ version

Patch attached. It boots with/without KASAN.

Does it look reasonable to you?

If so, I will split it into:
 - minor kasan patch that adds volatile to kasan_check_read/write
 - main patch that adds arch_ prefix and
<asm-generic/atomic-instrumented.h> header
 - kasan instrumentation in <asm-generic/atomic-instrumented.h>

Any other suggestions?

[-- Attachment #2: atomic.patch --]
[-- Type: text/x-patch, Size: 40193 bytes --]

diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index 14635c5ea025..605a29c9fe81 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -16,36 +16,36 @@
 #define ATOMIC_INIT(i)	{ (i) }
 
 /**
- * atomic_read - read atomic variable
+ * arch_atomic_read - read atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically reads the value of @v.
  */
-static __always_inline int atomic_read(const atomic_t *v)
+static __always_inline int arch_atomic_read(const atomic_t *v)
 {
-	return READ_ONCE((v)->counter);
+	return READ_ONCE_NOCHECK((v)->counter);
 }
 
 /**
- * atomic_set - set atomic variable
+ * arch_atomic_set - set atomic variable
  * @v: pointer of type atomic_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static __always_inline void atomic_set(atomic_t *v, int i)
+static __always_inline void arch_atomic_set(atomic_t *v, int i)
 {
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic_add - add integer to atomic variable
+ * arch_atomic_add - add integer to atomic variable
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic_add(int i, atomic_t *v)
+static __always_inline void arch_atomic_add(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		     : "+m" (v->counter)
@@ -53,13 +53,13 @@ static __always_inline void atomic_add(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub - subtract integer from atomic variable
+ * arch_atomic_sub - subtract integer from atomic variable
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
  * Atomically subtracts @i from @v.
  */
-static __always_inline void atomic_sub(int i, atomic_t *v)
+static __always_inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		     : "+m" (v->counter)
@@ -67,7 +67,7 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
 }
 
 /**
- * atomic_sub_and_test - subtract value from variable and test result
+ * arch_atomic_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer of type atomic_t
  *
@@ -75,63 +75,63 @@ static __always_inline void atomic_sub(int i, atomic_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static __always_inline bool atomic_sub_and_test(int i, atomic_t *v)
+static __always_inline bool arch_atomic_sub_and_test(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subl", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic_inc - increment atomic variable
+ * arch_atomic_inc - increment atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic_inc(atomic_t *v)
+static __always_inline void arch_atomic_inc(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "incl %0"
 		     : "+m" (v->counter));
 }
 
 /**
- * atomic_dec - decrement atomic variable
+ * arch_atomic_dec - decrement atomic variable
  * @v: pointer of type atomic_t
  *
  * Atomically decrements @v by 1.
  */
-static __always_inline void atomic_dec(atomic_t *v)
+static __always_inline void arch_atomic_dec(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "decl %0"
 		     : "+m" (v->counter));
 }
 
 /**
- * atomic_dec_and_test - decrement and test
+ * arch_atomic_dec_and_test - decrement and test
  * @v: pointer of type atomic_t
  *
  * Atomically decrements @v by 1 and
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static __always_inline bool atomic_dec_and_test(atomic_t *v)
+static __always_inline bool arch_atomic_dec_and_test(atomic_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decl", v->counter, "%0", e);
 }
 
 /**
- * atomic_inc_and_test - increment and test
+ * arch_atomic_inc_and_test - increment and test
  * @v: pointer of type atomic_t
  *
  * Atomically increments @v by 1
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static __always_inline bool atomic_inc_and_test(atomic_t *v)
+static __always_inline bool arch_atomic_inc_and_test(atomic_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incl", v->counter, "%0", e);
 }
 
 /**
- * atomic_add_negative - add and test if negative
+ * arch_atomic_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
@@ -139,60 +139,60 @@ static __always_inline bool atomic_inc_and_test(atomic_t *v)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static __always_inline bool atomic_add_negative(int i, atomic_t *v)
+static __always_inline bool arch_atomic_add_negative(int i, atomic_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addl", v->counter, "er", i, "%0", s);
 }
 
 /**
- * atomic_add_return - add integer and return
+ * arch_atomic_add_return - add integer and return
  * @i: integer value to add
  * @v: pointer of type atomic_t
  *
  * Atomically adds @i to @v and returns @i + @v
  */
-static __always_inline int atomic_add_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_add_return(int i, atomic_t *v)
 {
-	return i + xadd(&v->counter, i);
+	return i + arch_xadd(&v->counter, i);
 }
 
 /**
- * atomic_sub_return - subtract integer and return
+ * arch_atomic_sub_return - subtract integer and return
  * @v: pointer of type atomic_t
  * @i: integer value to subtract
  *
  * Atomically subtracts @i from @v and returns @v - @i
  */
-static __always_inline int atomic_sub_return(int i, atomic_t *v)
+static __always_inline int arch_atomic_sub_return(int i, atomic_t *v)
 {
-	return atomic_add_return(-i, v);
+	return arch_atomic_add_return(-i, v);
 }
 
-#define atomic_inc_return(v)  (atomic_add_return(1, v))
-#define atomic_dec_return(v)  (atomic_sub_return(1, v))
+#define arch_atomic_inc_return(v)  (arch_atomic_add_return(1, v))
+#define arch_atomic_dec_return(v)  (arch_atomic_sub_return(1, v))
 
-static __always_inline int atomic_fetch_add(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_add(int i, atomic_t *v)
 {
-	return xadd(&v->counter, i);
+	return arch_xadd(&v->counter, i);
 }
 
-static __always_inline int atomic_fetch_sub(int i, atomic_t *v)
+static __always_inline int arch_atomic_fetch_sub(int i, atomic_t *v)
 {
-	return xadd(&v->counter, -i);
+	return arch_xadd(&v->counter, -i);
 }
 
-static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+static __always_inline int arch_atomic_cmpxchg(atomic_t *v, int old, int new)
 {
-	return cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, old, new);
 }
 
-static inline int atomic_xchg(atomic_t *v, int new)
+static inline int arch_atomic_xchg(atomic_t *v, int new)
 {
-	return xchg(&v->counter, new);
+	return arch_xchg(&v->counter, new);
 }
 
 #define ATOMIC_OP(op)							\
-static inline void atomic_##op(int i, atomic_t *v)			\
+static inline void arch_atomic_##op(int i, atomic_t *v)			\
 {									\
 	asm volatile(LOCK_PREFIX #op"l %1,%0"				\
 			: "+m" (v->counter)				\
@@ -201,11 +201,11 @@ static inline void atomic_##op(int i, atomic_t *v)			\
 }
 
 #define ATOMIC_FETCH_OP(op, c_op)					\
-static inline int atomic_fetch_##op(int i, atomic_t *v)		\
+static inline int arch_atomic_fetch_##op(int i, atomic_t *v)		\
 {									\
-	int old, val = atomic_read(v);					\
+	int old, val = arch_atomic_read(v);				\
 	for (;;) {							\
-		old = atomic_cmpxchg(v, val, val c_op i);		\
+		old = arch_atomic_cmpxchg(v, val, val c_op i);		\
 		if (old == val)						\
 			break;						\
 		val = old;						\
@@ -226,7 +226,7 @@ ATOMIC_OPS(xor, ^)
 #undef ATOMIC_OP
 
 /**
- * __atomic_add_unless - add unless the number is already a given value
+ * __arch_atomic_add_unless - add unless the number is already a given value
  * @v: pointer of type atomic_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -234,14 +234,14 @@ ATOMIC_OPS(xor, ^)
  * Atomically adds @a to @v, so long as @v was not already @u.
  * Returns the old value of @v.
  */
-static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
+static __always_inline int __arch_atomic_add_unless(atomic_t *v, int a, int u)
 {
 	int c, old;
-	c = atomic_read(v);
+	c = arch_atomic_read(v);
 	for (;;) {
 		if (unlikely(c == (u)))
 			break;
-		old = atomic_cmpxchg((v), c, c + (a));
+		old = arch_atomic_cmpxchg((v), c, c + (a));
 		if (likely(old == c))
 			break;
 		c = old;
@@ -250,13 +250,13 @@ static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
 }
 
 /**
- * atomic_inc_short - increment of a short integer
+ * arch_atomic_inc_short - increment of a short integer
  * @v: pointer to type int
  *
  * Atomically adds 1 to @v
  * Returns the new value of @u
  */
-static __always_inline short int atomic_inc_short(short int *v)
+static __always_inline short int arch_atomic_inc_short(short int *v)
 {
 	asm(LOCK_PREFIX "addw $1, %0" : "+m" (*v));
 	return *v;
@@ -268,4 +268,6 @@ static __always_inline short int atomic_inc_short(short int *v)
 # include <asm/atomic64_64.h>
 #endif
 
+#include <asm-generic/atomic-instrumented.h>
+
 #endif /* _ASM_X86_ATOMIC_H */
diff --git a/arch/x86/include/asm/atomic64_32.h b/arch/x86/include/asm/atomic64_32.h
index 71d7705fb303..b45defaf94c1 100644
--- a/arch/x86/include/asm/atomic64_32.h
+++ b/arch/x86/include/asm/atomic64_32.h
@@ -61,7 +61,7 @@ ATOMIC64_DECL(add_unless);
 #undef ATOMIC64_EXPORT
 
 /**
- * atomic64_cmpxchg - cmpxchg atomic64 variable
+ * arch_atomic64_cmpxchg - cmpxchg atomic64 variable
  * @v: pointer to type atomic64_t
  * @o: expected value
  * @n: new value
@@ -70,20 +70,21 @@ ATOMIC64_DECL(add_unless);
  * the old value.
  */
 
-static inline long long atomic64_cmpxchg(atomic64_t *v, long long o, long long n)
+static inline long long arch_atomic64_cmpxchg(atomic64_t *v, long long o,
+					      long long n)
 {
-	return cmpxchg64(&v->counter, o, n);
+	return arch_cmpxchg64(&v->counter, o, n);
 }
 
 /**
- * atomic64_xchg - xchg atomic64 variable
+ * arch_atomic64_xchg - xchg atomic64 variable
  * @v: pointer to type atomic64_t
  * @n: value to assign
  *
  * Atomically xchgs the value of @v to @n and returns
  * the old value.
  */
-static inline long long atomic64_xchg(atomic64_t *v, long long n)
+static inline long long arch_atomic64_xchg(atomic64_t *v, long long n)
 {
 	long long o;
 	unsigned high = (unsigned)(n >> 32);
@@ -95,13 +96,13 @@ static inline long long atomic64_xchg(atomic64_t *v, long long n)
 }
 
 /**
- * atomic64_set - set atomic64 variable
+ * arch_atomic64_set - set atomic64 variable
  * @v: pointer to type atomic64_t
  * @i: value to assign
  *
  * Atomically sets the value of @v to @n.
  */
-static inline void atomic64_set(atomic64_t *v, long long i)
+static inline void arch_atomic64_set(atomic64_t *v, long long i)
 {
 	unsigned high = (unsigned)(i >> 32);
 	unsigned low = (unsigned)i;
@@ -111,12 +112,12 @@ static inline void atomic64_set(atomic64_t *v, long long i)
 }
 
 /**
- * atomic64_read - read atomic64 variable
+ * arch_atomic64_read - read atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically reads the value of @v and returns it.
  */
-static inline long long atomic64_read(const atomic64_t *v)
+static inline long long arch_atomic64_read(const atomic64_t *v)
 {
 	long long r;
 	alternative_atomic64(read, "=&A" (r), "c" (v) : "memory");
@@ -124,13 +125,13 @@ static inline long long atomic64_read(const atomic64_t *v)
  }
 
 /**
- * atomic64_add_return - add and return
+ * arch_atomic64_add_return - add and return
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v and returns @i + *@v
  */
-static inline long long atomic64_add_return(long long i, atomic64_t *v)
+static inline long long arch_atomic64_add_return(long long i, atomic64_t *v)
 {
 	alternative_atomic64(add_return,
 			     ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -141,7 +142,7 @@ static inline long long atomic64_add_return(long long i, atomic64_t *v)
 /*
  * Other variants with different arithmetic operators:
  */
-static inline long long atomic64_sub_return(long long i, atomic64_t *v)
+static inline long long arch_atomic64_sub_return(long long i, atomic64_t *v)
 {
 	alternative_atomic64(sub_return,
 			     ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -149,7 +150,7 @@ static inline long long atomic64_sub_return(long long i, atomic64_t *v)
 	return i;
 }
 
-static inline long long atomic64_inc_return(atomic64_t *v)
+static inline long long arch_atomic64_inc_return(atomic64_t *v)
 {
 	long long a;
 	alternative_atomic64(inc_return, "=&A" (a),
@@ -157,7 +158,7 @@ static inline long long atomic64_inc_return(atomic64_t *v)
 	return a;
 }
 
-static inline long long atomic64_dec_return(atomic64_t *v)
+static inline long long arch_atomic64_dec_return(atomic64_t *v)
 {
 	long long a;
 	alternative_atomic64(dec_return, "=&A" (a),
@@ -166,13 +167,13 @@ static inline long long atomic64_dec_return(atomic64_t *v)
 }
 
 /**
- * atomic64_add - add integer to atomic64 variable
+ * arch_atomic64_add - add integer to atomic64 variable
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v.
  */
-static inline long long atomic64_add(long long i, atomic64_t *v)
+static inline long long arch_atomic64_add(long long i, atomic64_t *v)
 {
 	__alternative_atomic64(add, add_return,
 			       ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -181,13 +182,13 @@ static inline long long atomic64_add(long long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub - subtract the atomic64 variable
+ * arch_atomic64_sub - subtract the atomic64 variable
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
  * Atomically subtracts @i from @v.
  */
-static inline long long atomic64_sub(long long i, atomic64_t *v)
+static inline long long arch_atomic64_sub(long long i, atomic64_t *v)
 {
 	__alternative_atomic64(sub, sub_return,
 			       ASM_OUTPUT2("+A" (i), "+c" (v)),
@@ -196,7 +197,7 @@ static inline long long atomic64_sub(long long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub_and_test - subtract value from variable and test result
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
@@ -204,46 +205,46 @@ static inline long long atomic64_sub(long long i, atomic64_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static inline int atomic64_sub_and_test(long long i, atomic64_t *v)
+static inline int arch_atomic64_sub_and_test(long long i, atomic64_t *v)
 {
-	return atomic64_sub_return(i, v) == 0;
+	return arch_atomic64_sub_return(i, v) == 0;
 }
 
 /**
- * atomic64_inc - increment atomic64 variable
+ * arch_atomic64_inc - increment atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1.
  */
-static inline void atomic64_inc(atomic64_t *v)
+static inline void arch_atomic64_inc(atomic64_t *v)
 {
 	__alternative_atomic64(inc, inc_return, /* no output */,
 			       "S" (v) : "memory", "eax", "ecx", "edx");
 }
 
 /**
- * atomic64_dec - decrement atomic64 variable
+ * arch_atomic64_dec - decrement atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1.
  */
-static inline void atomic64_dec(atomic64_t *v)
+static inline void arch_atomic64_dec(atomic64_t *v)
 {
 	__alternative_atomic64(dec, dec_return, /* no output */,
 			       "S" (v) : "memory", "eax", "ecx", "edx");
 }
 
 /**
- * atomic64_dec_and_test - decrement and test
+ * arch_atomic64_dec_and_test - decrement and test
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1 and
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static inline int atomic64_dec_and_test(atomic64_t *v)
+static inline int arch_atomic64_dec_and_test(atomic64_t *v)
 {
-	return atomic64_dec_return(v) == 0;
+	return arch_atomic64_dec_return(v) == 0;
 }
 
 /**
@@ -254,13 +255,13 @@ static inline int atomic64_dec_and_test(atomic64_t *v)
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static inline int atomic64_inc_and_test(atomic64_t *v)
+static inline int arch_atomic64_inc_and_test(atomic64_t *v)
 {
-	return atomic64_inc_return(v) == 0;
+	return arch_atomic64_inc_return(v) == 0;
 }
 
 /**
- * atomic64_add_negative - add and test if negative
+ * arch_atomic64_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
@@ -268,13 +269,13 @@ static inline int atomic64_inc_and_test(atomic64_t *v)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static inline int atomic64_add_negative(long long i, atomic64_t *v)
+static inline int arch_atomic64_add_negative(long long i, atomic64_t *v)
 {
-	return atomic64_add_return(i, v) < 0;
+	return arch_atomic64_add_return(i, v) < 0;
 }
 
 /**
- * atomic64_add_unless - add unless the number is a given value
+ * arch_atomic64_add_unless - add unless the number is a given value
  * @v: pointer of type atomic64_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -282,7 +283,7 @@ static inline int atomic64_add_negative(long long i, atomic64_t *v)
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns non-zero if the add was done, zero otherwise.
  */
-static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
+static inline int arch_atomic64_add_unless(atomic64_t *v, long long a, long long u)
 {
 	unsigned low = (unsigned)u;
 	unsigned high = (unsigned)(u >> 32);
@@ -293,7 +294,7 @@ static inline int atomic64_add_unless(atomic64_t *v, long long a, long long u)
 }
 
 
-static inline int atomic64_inc_not_zero(atomic64_t *v)
+static inline int arch_atomic64_inc_not_zero(atomic64_t *v)
 {
 	int r;
 	alternative_atomic64(inc_not_zero, "=&a" (r),
@@ -301,7 +302,7 @@ static inline int atomic64_inc_not_zero(atomic64_t *v)
 	return r;
 }
 
-static inline long long atomic64_dec_if_positive(atomic64_t *v)
+static inline long long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	long long r;
 	alternative_atomic64(dec_if_positive, "=&A" (r),
@@ -313,25 +314,25 @@ static inline long long atomic64_dec_if_positive(atomic64_t *v)
 #undef __alternative_atomic64
 
 #define ATOMIC64_OP(op, c_op)						\
-static inline void atomic64_##op(long long i, atomic64_t *v)		\
+static inline void arch_atomic64_##op(long long i, atomic64_t *v)	\
 {									\
 	long long old, c = 0;						\
-	while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c)		\
+	while ((old = arch_atomic64_cmpxchg(v, c, c c_op i)) != c)	\
 		c = old;						\
 }
 
 #define ATOMIC64_FETCH_OP(op, c_op)					\
-static inline long long atomic64_fetch_##op(long long i, atomic64_t *v)	\
+static inline long long arch_atomic64_fetch_##op(long long i, atomic64_t *v) \
 {									\
 	long long old, c = 0;						\
-	while ((old = atomic64_cmpxchg(v, c, c c_op i)) != c)		\
+	while ((old = arch_atomic64_cmpxchg(v, c, c c_op i)) != c)	\
 		c = old;						\
 	return old;							\
 }
 
 ATOMIC64_FETCH_OP(add, +)
 
-#define atomic64_fetch_sub(i, v)	atomic64_fetch_add(-(i), (v))
+#define arch_atomic64_fetch_sub(i, v)	arch_atomic64_fetch_add(-(i), (v))
 
 #define ATOMIC64_OPS(op, c_op)						\
 	ATOMIC64_OP(op, c_op)						\
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index 89ed2f6ae2f7..3c8aa7a3fd3e 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -16,31 +16,31 @@
  * Atomically reads the value of @v.
  * Doesn't imply a read memory barrier.
  */
-static inline long atomic64_read(const atomic64_t *v)
+static inline long arch_atomic64_read(const atomic64_t *v)
 {
-	return READ_ONCE((v)->counter);
+	return READ_ONCE_NOCHECK((v)->counter);
 }
 
 /**
- * atomic64_set - set atomic64 variable
+ * arch_atomic64_set - set atomic64 variable
  * @v: pointer to type atomic64_t
  * @i: required value
  *
  * Atomically sets the value of @v to @i.
  */
-static inline void atomic64_set(atomic64_t *v, long i)
+static inline void arch_atomic64_set(atomic64_t *v, long i)
 {
 	WRITE_ONCE(v->counter, i);
 }
 
 /**
- * atomic64_add - add integer to atomic64 variable
+ * arch_atomic64_add - add integer to atomic64 variable
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v.
  */
-static __always_inline void atomic64_add(long i, atomic64_t *v)
+static __always_inline void arch_atomic64_add(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "addq %1,%0"
 		     : "=m" (v->counter)
@@ -48,13 +48,13 @@ static __always_inline void atomic64_add(long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub - subtract the atomic64 variable
+ * arch_atomic64_sub - subtract the atomic64 variable
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
  * Atomically subtracts @i from @v.
  */
-static inline void atomic64_sub(long i, atomic64_t *v)
+static inline void arch_atomic64_sub(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "subq %1,%0"
 		     : "=m" (v->counter)
@@ -62,7 +62,7 @@ static inline void atomic64_sub(long i, atomic64_t *v)
 }
 
 /**
- * atomic64_sub_and_test - subtract value from variable and test result
+ * arch_atomic64_sub_and_test - subtract value from variable and test result
  * @i: integer value to subtract
  * @v: pointer to type atomic64_t
  *
@@ -70,18 +70,18 @@ static inline void atomic64_sub(long i, atomic64_t *v)
  * true if the result is zero, or false for all
  * other cases.
  */
-static inline bool atomic64_sub_and_test(long i, atomic64_t *v)
+static inline bool arch_atomic64_sub_and_test(long i, atomic64_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "subq", v->counter, "er", i, "%0", e);
 }
 
 /**
- * atomic64_inc - increment atomic64 variable
+ * arch_atomic64_inc - increment atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1.
  */
-static __always_inline void atomic64_inc(atomic64_t *v)
+static __always_inline void arch_atomic64_inc(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "incq %0"
 		     : "=m" (v->counter)
@@ -89,12 +89,12 @@ static __always_inline void atomic64_inc(atomic64_t *v)
 }
 
 /**
- * atomic64_dec - decrement atomic64 variable
+ * arch_atomic64_dec - decrement atomic64 variable
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1.
  */
-static __always_inline void atomic64_dec(atomic64_t *v)
+static __always_inline void arch_atomic64_dec(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "decq %0"
 		     : "=m" (v->counter)
@@ -102,33 +102,33 @@ static __always_inline void atomic64_dec(atomic64_t *v)
 }
 
 /**
- * atomic64_dec_and_test - decrement and test
+ * arch_atomic64_dec_and_test - decrement and test
  * @v: pointer to type atomic64_t
  *
  * Atomically decrements @v by 1 and
  * returns true if the result is 0, or false for all other
  * cases.
  */
-static inline bool atomic64_dec_and_test(atomic64_t *v)
+static inline bool arch_atomic64_dec_and_test(atomic64_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "decq", v->counter, "%0", e);
 }
 
 /**
- * atomic64_inc_and_test - increment and test
+ * arch_atomic64_inc_and_test - increment and test
  * @v: pointer to type atomic64_t
  *
  * Atomically increments @v by 1
  * and returns true if the result is zero, or false for all
  * other cases.
  */
-static inline bool atomic64_inc_and_test(atomic64_t *v)
+static inline bool arch_atomic64_inc_and_test(atomic64_t *v)
 {
 	GEN_UNARY_RMWcc(LOCK_PREFIX "incq", v->counter, "%0", e);
 }
 
 /**
- * atomic64_add_negative - add and test if negative
+ * arch_atomic64_add_negative - add and test if negative
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
@@ -136,53 +136,53 @@ static inline bool atomic64_inc_and_test(atomic64_t *v)
  * if the result is negative, or false when
  * result is greater than or equal to zero.
  */
-static inline bool atomic64_add_negative(long i, atomic64_t *v)
+static inline bool arch_atomic64_add_negative(long i, atomic64_t *v)
 {
 	GEN_BINARY_RMWcc(LOCK_PREFIX "addq", v->counter, "er", i, "%0", s);
 }
 
 /**
- * atomic64_add_return - add and return
+ * arch_atomic64_add_return - add and return
  * @i: integer value to add
  * @v: pointer to type atomic64_t
  *
  * Atomically adds @i to @v and returns @i + @v
  */
-static __always_inline long atomic64_add_return(long i, atomic64_t *v)
+static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v)
 {
-	return i + xadd(&v->counter, i);
+	return i + arch_xadd(&v->counter, i);
 }
 
-static inline long atomic64_sub_return(long i, atomic64_t *v)
+static inline long arch_atomic64_sub_return(long i, atomic64_t *v)
 {
-	return atomic64_add_return(-i, v);
+	return arch_atomic64_add_return(-i, v);
 }
 
-static inline long atomic64_fetch_add(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_add(long i, atomic64_t *v)
 {
-	return xadd(&v->counter, i);
+	return arch_xadd(&v->counter, i);
 }
 
-static inline long atomic64_fetch_sub(long i, atomic64_t *v)
+static inline long arch_atomic64_fetch_sub(long i, atomic64_t *v)
 {
-	return xadd(&v->counter, -i);
+	return arch_xadd(&v->counter, -i);
 }
 
-#define atomic64_inc_return(v)  (atomic64_add_return(1, (v)))
-#define atomic64_dec_return(v)  (atomic64_sub_return(1, (v)))
+#define arch_atomic64_inc_return(v)  (arch_atomic64_add_return(1, (v)))
+#define arch_atomic64_dec_return(v)  (arch_atomic64_sub_return(1, (v)))
 
-static inline long atomic64_cmpxchg(atomic64_t *v, long old, long new)
+static inline long arch_atomic64_cmpxchg(atomic64_t *v, long old, long new)
 {
-	return cmpxchg(&v->counter, old, new);
+	return arch_cmpxchg(&v->counter, old, new);
 }
 
-static inline long atomic64_xchg(atomic64_t *v, long new)
+static inline long arch_atomic64_xchg(atomic64_t *v, long new)
 {
-	return xchg(&v->counter, new);
+	return arch_xchg(&v->counter, new);
 }
 
 /**
- * atomic64_add_unless - add unless the number is a given value
+ * arch_atomic64_add_unless - add unless the number is a given value
  * @v: pointer of type atomic64_t
  * @a: the amount to add to v...
  * @u: ...unless v is equal to u.
@@ -190,14 +190,14 @@ static inline long atomic64_xchg(atomic64_t *v, long new)
  * Atomically adds @a to @v, so long as it was not @u.
  * Returns the old value of @v.
  */
-static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
+static inline bool arch_atomic64_add_unless(atomic64_t *v, long a, long u)
 {
 	long c, old;
-	c = atomic64_read(v);
+	c = arch_atomic64_read(v);
 	for (;;) {
 		if (unlikely(c == (u)))
 			break;
-		old = atomic64_cmpxchg((v), c, c + (a));
+		old = arch_atomic64_cmpxchg((v), c, c + (a));
 		if (likely(old == c))
 			break;
 		c = old;
@@ -205,24 +205,24 @@ static inline bool atomic64_add_unless(atomic64_t *v, long a, long u)
 	return c != (u);
 }
 
-#define atomic64_inc_not_zero(v) atomic64_add_unless((v), 1, 0)
+#define arch_atomic64_inc_not_zero(v) arch_atomic64_add_unless((v), 1, 0)
 
 /*
- * atomic64_dec_if_positive - decrement by 1 if old value positive
+ * arch_atomic64_dec_if_positive - decrement by 1 if old value positive
  * @v: pointer of type atomic_t
  *
  * The function returns the old value of *v minus 1, even if
  * the atomic variable, v, was not decremented.
  */
-static inline long atomic64_dec_if_positive(atomic64_t *v)
+static inline long arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	long c, old, dec;
-	c = atomic64_read(v);
+	c = arch_atomic64_read(v);
 	for (;;) {
 		dec = c - 1;
 		if (unlikely(dec < 0))
 			break;
-		old = atomic64_cmpxchg((v), c, dec);
+		old = arch_atomic64_cmpxchg((v), c, dec);
 		if (likely(old == c))
 			break;
 		c = old;
@@ -231,7 +231,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 }
 
 #define ATOMIC64_OP(op)							\
-static inline void atomic64_##op(long i, atomic64_t *v)			\
+static inline void arch_atomic64_##op(long i, atomic64_t *v)		\
 {									\
 	asm volatile(LOCK_PREFIX #op"q %1,%0"				\
 			: "+m" (v->counter)				\
@@ -240,11 +240,11 @@ static inline void atomic64_##op(long i, atomic64_t *v)			\
 }
 
 #define ATOMIC64_FETCH_OP(op, c_op)					\
-static inline long atomic64_fetch_##op(long i, atomic64_t *v)		\
+static inline long arch_atomic64_fetch_##op(long i, atomic64_t *v)	\
 {									\
-	long old, val = atomic64_read(v);				\
+	long old, val = arch_atomic64_read(v);				\
 	for (;;) {							\
-		old = atomic64_cmpxchg(v, val, val c_op i);		\
+		old = arch_atomic64_cmpxchg(v, val, val c_op i);	\
 		if (old == val)						\
 			break;						\
 		val = old;						\
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index 97848cdfcb1a..ca3968b5cd46 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -74,7 +74,7 @@ extern void __add_wrong_size(void)
  * use "asm volatile" and "memory" clobbers to prevent gcc from moving
  * information around.
  */
-#define xchg(ptr, v)	__xchg_op((ptr), (v), xchg, "")
+#define arch_xchg(ptr, v)	__xchg_op((ptr), (v), xchg, "")
 
 /*
  * Atomic compare and exchange.  Compare OLD with MEM, if identical,
@@ -144,23 +144,23 @@ extern void __add_wrong_size(void)
 # include <asm/cmpxchg_64.h>
 #endif
 
-#define cmpxchg(ptr, old, new)						\
+#define arch_cmpxchg(ptr, old, new)					\
 	__cmpxchg(ptr, old, new, sizeof(*(ptr)))
 
-#define sync_cmpxchg(ptr, old, new)					\
+#define arch_sync_cmpxchg(ptr, old, new)				\
 	__sync_cmpxchg(ptr, old, new, sizeof(*(ptr)))
 
-#define cmpxchg_local(ptr, old, new)					\
+#define arch_cmpxchg_local(ptr, old, new)				\
 	__cmpxchg_local(ptr, old, new, sizeof(*(ptr)))
 
 /*
- * xadd() adds "inc" to "*ptr" and atomically returns the previous
+ * arch_xadd() adds "inc" to "*ptr" and atomically returns the previous
  * value of "*ptr".
  *
- * xadd() is locked when multiple CPUs are online
+ * arch_xadd() is locked when multiple CPUs are online
  */
 #define __xadd(ptr, inc, lock)	__xchg_op((ptr), (inc), xadd, lock)
-#define xadd(ptr, inc)		__xadd((ptr), (inc), LOCK_PREFIX)
+#define arch_xadd(ptr, inc)		__xadd((ptr), (inc), LOCK_PREFIX)
 
 #define __cmpxchg_double(pfx, p1, p2, o1, o2, n1, n2)			\
 ({									\
@@ -179,10 +179,10 @@ extern void __add_wrong_size(void)
 	__ret;								\
 })
 
-#define cmpxchg_double(p1, p2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double(p1, p2, o1, o2, n1, n2) \
 	__cmpxchg_double(LOCK_PREFIX, p1, p2, o1, o2, n1, n2)
 
-#define cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
+#define arch_cmpxchg_double_local(p1, p2, o1, o2, n1, n2) \
 	__cmpxchg_double(, p1, p2, o1, o2, n1, n2)
 
 #endif	/* ASM_X86_CMPXCHG_H */
diff --git a/arch/x86/include/asm/cmpxchg_32.h b/arch/x86/include/asm/cmpxchg_32.h
index e4959d023af8..d897291d2bf9 100644
--- a/arch/x86/include/asm/cmpxchg_32.h
+++ b/arch/x86/include/asm/cmpxchg_32.h
@@ -35,10 +35,10 @@ static inline void set_64bit(volatile u64 *ptr, u64 value)
 }
 
 #ifdef CONFIG_X86_CMPXCHG64
-#define cmpxchg64(ptr, o, n)						\
+#define arch_cmpxchg64(ptr, o, n)					\
 	((__typeof__(*(ptr)))__cmpxchg64((ptr), (unsigned long long)(o), \
 					 (unsigned long long)(n)))
-#define cmpxchg64_local(ptr, o, n)					\
+#define arch_cmpxchg64_local(ptr, o, n)					\
 	((__typeof__(*(ptr)))__cmpxchg64_local((ptr), (unsigned long long)(o), \
 					       (unsigned long long)(n)))
 #endif
@@ -75,7 +75,7 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
  * to simulate the cmpxchg8b on the 80386 and 80486 CPU.
  */
 
-#define cmpxchg64(ptr, o, n)					\
+#define arch_cmpxchg64(ptr, o, n)				\
 ({								\
 	__typeof__(*(ptr)) __ret;				\
 	__typeof__(*(ptr)) __old = (o);				\
@@ -92,7 +92,7 @@ static inline u64 __cmpxchg64_local(volatile u64 *ptr, u64 old, u64 new)
 	__ret; })
 
 
-#define cmpxchg64_local(ptr, o, n)				\
+#define arch_cmpxchg64_local(ptr, o, n)				\
 ({								\
 	__typeof__(*(ptr)) __ret;				\
 	__typeof__(*(ptr)) __old = (o);				\
diff --git a/arch/x86/include/asm/cmpxchg_64.h b/arch/x86/include/asm/cmpxchg_64.h
index caa23a34c963..fafaebacca2d 100644
--- a/arch/x86/include/asm/cmpxchg_64.h
+++ b/arch/x86/include/asm/cmpxchg_64.h
@@ -6,13 +6,13 @@ static inline void set_64bit(volatile u64 *ptr, u64 val)
 	*ptr = val;
 }
 
-#define cmpxchg64(ptr, o, n)						\
+#define arch_cmpxchg64(ptr, o, n)					\
 ({									\
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg((ptr), (o), (n));					\
 })
 
-#define cmpxchg64_local(ptr, o, n)					\
+#define arch_cmpxchg64_local(ptr, o, n)					\
 ({									\
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
 	cmpxchg_local((ptr), (o), (n));					\
diff --git a/include/asm-generic/atomic-instrumented.h b/include/asm-generic/atomic-instrumented.h
new file mode 100644
index 000000000000..af817399bdfa
--- /dev/null
+++ b/include/asm-generic/atomic-instrumented.h
@@ -0,0 +1,233 @@
+#ifndef _LINUX_ATOMIC_INSTRUMENTED_H
+#define _LINUX_ATOMIC_INSTRUMENTED_H
+
+#include <linux/kasan-checks.h>
+
+static __always_inline int atomic_read(const atomic_t *v)
+{
+	kasan_check_read(v, sizeof(*v));
+	return arch_atomic_read(v);
+}
+
+static __always_inline long long atomic64_read(const atomic64_t *v)
+{
+	kasan_check_read(v, sizeof(*v));
+	return arch_atomic64_read(v);
+}
+
+
+static __always_inline void atomic_set(atomic_t *v, int i)
+{
+	kasan_check_write(v, sizeof(*v));
+	arch_atomic_set(v, i);
+}
+
+static __always_inline void atomic64_set(atomic64_t *v, long long i)
+{
+	kasan_check_write(v, sizeof(*v));
+	arch_atomic64_set(v, i);
+}
+
+static __always_inline int atomic_xchg(atomic_t *v, int i)
+{
+	kasan_check_write(v, sizeof(*v));
+	return arch_atomic_xchg(v, i);
+}
+
+static __always_inline long long atomic64_xchg(atomic64_t *v, long long i)
+{
+	kasan_check_write(v, sizeof(*v));
+	return arch_atomic64_xchg(v, i);
+}
+
+static __always_inline int atomic_cmpxchg(atomic_t *v, int old, int new)
+{
+	kasan_check_write(v, sizeof(*v));
+	return arch_atomic_cmpxchg(v, old, new);
+}
+
+static __always_inline long long atomic64_cmpxchg(atomic64_t *v, long long old,
+						  long long new)
+{
+	kasan_check_write(v, sizeof(*v));
+	return arch_atomic64_cmpxchg(v, old, new);
+}
+
+static __always_inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+	kasan_check_write(v, sizeof(*v));
+	return __arch_atomic_add_unless(v, a, u);
+}
+
+
+static __always_inline bool atomic64_add_unless(atomic64_t *v, long long a,
+						long long u)
+{
+	kasan_check_write(v, sizeof(*v));
+	return arch_atomic64_add_unless(v, a, u);
+}
+
+static __always_inline short int atomic_inc_short(short int *v)
+{
+	kasan_check_write(v, sizeof(*v));
+	return arch_atomic_inc_short(v);
+}
+
+#define __INSTR_VOID1(op, sz)						\
+static __always_inline void atomic##sz##_##op(atomic##sz##_t *v)	\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic##sz##_##op(v);					\
+}
+
+#define INSTR_VOID1(op)	\
+__INSTR_VOID1(op,);	\
+__INSTR_VOID1(op, 64);
+
+INSTR_VOID1(inc);
+INSTR_VOID1(dec);
+
+#undef __INSTR_VOID1
+#undef INSTR_VOID1
+
+#define __INSTR_VOID2(op, sz, type)					\
+static __always_inline void atomic##sz##_##op(type i, atomic##sz##_t *v)\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	arch_atomic##sz##_##op(i, v);					\
+}
+
+#define INSTR_VOID2(op)		\
+__INSTR_VOID2(op, , int);	\
+__INSTR_VOID2(op, 64, long long);
+
+INSTR_VOID2(add);
+INSTR_VOID2(sub);
+INSTR_VOID2(and);
+INSTR_VOID2(or);
+INSTR_VOID2(xor);
+
+#undef __INSTR_VOID2
+#undef INSTR_VOID2
+
+#define __INSTR_RET1(op, sz, type, rtype)				\
+static __always_inline rtype atomic##sz##_##op(atomic##sz##_t *v)	\
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic##sz##_##op(v);				\
+}
+
+#define INSTR_RET1(op)		\
+__INSTR_RET1(op, , int, int);	\
+__INSTR_RET1(op, 64, long long, long long);
+
+INSTR_RET1(inc_return);
+INSTR_RET1(dec_return);
+__INSTR_RET1(inc_not_zero, 64, long long, long long);
+__INSTR_RET1(dec_if_positive, 64, long long, long long);
+
+#define INSTR_RET_BOOL1(op)	\
+__INSTR_RET1(op, , int, bool);	\
+__INSTR_RET1(op, 64, long long, bool);
+
+INSTR_RET_BOOL1(dec_and_test);
+INSTR_RET_BOOL1(inc_and_test);
+
+#undef __INSTR_RET1
+#undef INSTR_RET1
+#undef INSTR_RET_BOOL1
+
+#define __INSTR_RET2(op, sz, type, rtype)				\
+static __always_inline rtype atomic##sz##_##op(type i, atomic##sz##_t *v) \
+{									\
+	kasan_check_write(v, sizeof(*v));				\
+	return arch_atomic##sz##_##op(i, v);				\
+}
+
+#define INSTR_RET2(op)		\
+__INSTR_RET2(op, , int, int);	\
+__INSTR_RET2(op, 64, long long, long long);
+
+INSTR_RET2(add_return);
+INSTR_RET2(sub_return);
+INSTR_RET2(fetch_add);
+INSTR_RET2(fetch_sub);
+INSTR_RET2(fetch_and);
+INSTR_RET2(fetch_or);
+INSTR_RET2(fetch_xor);
+
+#define INSTR_RET_BOOL2(op)		\
+__INSTR_RET2(op, , int, bool);		\
+__INSTR_RET2(op, 64, long long, bool);
+
+INSTR_RET_BOOL2(sub_and_test);
+INSTR_RET_BOOL2(add_negative);
+
+#undef __INSTR_RET2
+#undef INSTR_RET2
+#undef INSTR_RET_BOOL2
+
+#define xchg(ptr, v)					\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
+	arch_xchg(____ptr, (v));			\
+})
+
+#define xadd(ptr, v)					\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
+	arch_xadd(____ptr, (v));			\
+})
+
+#define cmpxchg(ptr, old, new)				\
+({							\
+	__typeof__(ptr) ___ptr = (ptr);			\
+	kasan_check_write(___ptr, sizeof(*___ptr));	\
+	arch_cmpxchg(___ptr, (old), (new));		\
+})
+
+#define sync_cmpxchg(ptr, old, new)			\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
+	arch_sync_cmpxchg(____ptr, (old), (new));	\
+})
+
+#define cmpxchg_local(ptr, old, new)			\
+({							\
+	__typeof__(ptr) ____ptr = (ptr);		\
+	kasan_check_write(____ptr, sizeof(*____ptr));	\
+	arch_cmpxchg_local(____ptr, (old), (new));	\
+})
+
+#define cmpxchg64(iptr, iold, inew)			\
+({							\
+	__typeof__(iptr) __iptr = (iptr);		\
+	kasan_check_write(__iptr, sizeof(*__iptr));	\
+	arch_cmpxchg64(__iptr, (iold), (inew));		\
+})
+
+#define cmpxchg64_local(iptr, iold, inew)		\
+({							\
+	__typeof__(iptr) __iptr = (iptr);		\
+	kasan_check_write(__iptr, sizeof(*__iptr));	\
+	arch_cmpxchg64_local(__iptr, (iold), (inew));	\
+})
+
+#define cmpxchg_double(p1, p2, o1, o2, n1, n2)				\
+({									\
+	__typeof__(p1) __p1 = (p1);					\
+	kasan_check_write(__p1, 2 * sizeof(*__p1));			\
+	arch_cmpxchg_double(__p1, (p2), (o1), (o2), (n1), (n2));	\
+})
+
+#define cmpxchg_double_local(p1, p2, o1, o2, n1, n2)			\
+({									\
+	__typeof__(p1) __p1 = (p1);					\
+	kasan_check_write(__p1, 2 * sizeof(*__p1));			\
+	arch_cmpxchg_double_local(__p1, (p2), (o1), (o2), (n1), (n2));	\
+})
+
+#endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
diff --git a/include/linux/kasan-checks.h b/include/linux/kasan-checks.h
index b7f8aced7870..41960fecf783 100644
--- a/include/linux/kasan-checks.h
+++ b/include/linux/kasan-checks.h
@@ -2,11 +2,13 @@
 #define _LINUX_KASAN_CHECKS_H
 
 #ifdef CONFIG_KASAN
-void kasan_check_read(const void *p, unsigned int size);
-void kasan_check_write(const void *p, unsigned int size);
+void kasan_check_read(const volatile void *p, unsigned int size);
+void kasan_check_write(const volatile void *p, unsigned int size);
 #else
-static inline void kasan_check_read(const void *p, unsigned int size) { }
-static inline void kasan_check_write(const void *p, unsigned int size) { }
+static inline void kasan_check_read(const volatile void *p, unsigned int size)
+{ }
+static inline void kasan_check_write(const volatile void *p, unsigned int size)
+{ }
 #endif
 
 #endif
diff --git a/mm/kasan/kasan.c b/mm/kasan/kasan.c
index 98b27195e38b..db46e66eb1d4 100644
--- a/mm/kasan/kasan.c
+++ b/mm/kasan/kasan.c
@@ -333,13 +333,13 @@ static void check_memory_region(unsigned long addr,
 	check_memory_region_inline(addr, size, write, ret_ip);
 }
 
-void kasan_check_read(const void *p, unsigned int size)
+void kasan_check_read(const volatile void *p, unsigned int size)
 {
 	check_memory_region((unsigned long)p, size, false, _RET_IP_);
 }
 EXPORT_SYMBOL(kasan_check_read);
 
-void kasan_check_write(const void *p, unsigned int size)
+void kasan_check_write(const volatile void *p, unsigned int size)
 {
 	check_memory_region((unsigned long)p, size, true, _RET_IP_);
 }
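
For illustration only (not part of the patch), this is the kind of
pattern the new checks cover; struct foo and foo_put() below are made up:

#include <linux/atomic.h>
#include <linux/slab.h>

struct foo {
	atomic_t refs;
};

static void foo_put(struct foo *f)
{
	/*
	 * atomic_dec_and_test() now goes through the instrumented wrapper,
	 * which calls kasan_check_write(&f->refs, sizeof(f->refs)) before
	 * invoking the arch asm, so a racy drop on an already freed object
	 * is reported by KASAN.
	 */
	if (atomic_dec_and_test(&f->refs))
		kfree(f);
}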

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-14 15:22                   ` Dmitry Vyukov
@ 2017-03-14 15:31                     ` Peter Zijlstra
  2017-03-14 15:32                     ` Peter Zijlstra
  1 sibling, 0 replies; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-14 15:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Will Deacon, Mark Rutland, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Tue, Mar 14, 2017 at 04:22:52PM +0100, Dmitry Vyukov wrote:
> Any other suggestions?

> -	return i + xadd(&v->counter, i);
> +	return i + arch_xadd(&v->counter, i);

> +#define xadd(ptr, v)					\
> +({							\
> +	__typeof__(ptr) ____ptr = (ptr);		\
> +	kasan_check_write(____ptr, sizeof(*____ptr));	\
> +	arch_xadd(____ptr, (v));			\
> +})

xadd() isn't a generic thing; it only exists inside x86 as a helper to
implement atomic bits.
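
IOW, something along these lines should be sufficient (sketch only, not
tested, using the naming from your patch): keep the KASAN check in the
generic wrapper and leave the x86 helper using plain xadd().

/* arch/x86/include/asm/atomic64_64.h */
static __always_inline long arch_atomic64_add_return(long i, atomic64_t *v)
{
	return i + xadd(&v->counter, i);
}

/* include/asm-generic/atomic-instrumented.h */
static __always_inline long long atomic64_add_return(long long i, atomic64_t *v)
{
	kasan_check_write(v, sizeof(*v));
	return arch_atomic64_add_return(i, v);
}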

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-14 15:22                   ` Dmitry Vyukov
  2017-03-14 15:31                     ` Peter Zijlstra
@ 2017-03-14 15:32                     ` Peter Zijlstra
  2017-03-14 15:44                       ` Mark Rutland
  1 sibling, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2017-03-14 15:32 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Will Deacon, Mark Rutland, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Tue, Mar 14, 2017 at 04:22:52PM +0100, Dmitry Vyukov wrote:
> -static __always_inline int atomic_read(const atomic_t *v)
> +static __always_inline int arch_atomic_read(const atomic_t *v)
>  {
> -	return READ_ONCE((v)->counter);
> +	return READ_ONCE_NOCHECK((v)->counter);

Should NOCHECK come with a comment? Because I've no idea why this is so.

>  }

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-14 15:32                     ` Peter Zijlstra
@ 2017-03-14 15:44                       ` Mark Rutland
  2017-03-14 19:25                         ` Dmitry Vyukov
  0 siblings, 1 reply; 25+ messages in thread
From: Mark Rutland @ 2017-03-14 15:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dmitry Vyukov, Will Deacon, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Tue, Mar 14, 2017 at 04:32:30PM +0100, Peter Zijlstra wrote:
> On Tue, Mar 14, 2017 at 04:22:52PM +0100, Dmitry Vyukov wrote:
> > -static __always_inline int atomic_read(const atomic_t *v)
> > +static __always_inline int arch_atomic_read(const atomic_t *v)
> >  {
> > -	return READ_ONCE((v)->counter);
> > +	return READ_ONCE_NOCHECK((v)->counter);
> 
> Should NOCHECK come with a comment? Because I've no idea why this is so.

I suspect the idea is that, given the wrapper will have done the KASAN
check, duplicating it here is either sub-optimal or results in
duplicate splats. READ_ONCE() has an implicit KASAN check;
READ_ONCE_NOCHECK() does not.

If this is to solve duplicate splats, it'd be worth having a
WRITE_ONCE_NOCHECK() for arch_atomic_set().

Agreed on the comment, regardless.
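
To spell out the layering I mean, roughly (sketch only; note that
WRITE_ONCE_NOCHECK() does not exist today, it would have to be added to
mirror READ_ONCE_NOCHECK()):

/* arch side: no checks here, the wrappers below already did them */
static __always_inline int arch_atomic_read(const atomic_t *v)
{
	return READ_ONCE_NOCHECK((v)->counter);
}

static __always_inline void arch_atomic_set(atomic_t *v, int i)
{
	WRITE_ONCE_NOCHECK(v->counter, i);	/* hypothetical helper */
}

/* asm-generic/atomic-instrumented.h: single KASAN check, no duplicate splat */
static __always_inline int atomic_read(const atomic_t *v)
{
	kasan_check_read(v, sizeof(*v));
	return arch_atomic_read(v);
}

static __always_inline void atomic_set(atomic_t *v, int i)
{
	kasan_check_write(v, sizeof(*v));
	arch_atomic_set(v, i);
}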

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] x86, kasan: add KASAN checks to atomic operations
  2017-03-14 15:44                       ` Mark Rutland
@ 2017-03-14 19:25                         ` Dmitry Vyukov
  0 siblings, 0 replies; 25+ messages in thread
From: Dmitry Vyukov @ 2017-03-14 19:25 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Peter Zijlstra, Will Deacon, Andrew Morton, Andrey Ryabinin,
	Ingo Molnar, kasan-dev, linux-mm, LKML, x86

On Tue, Mar 14, 2017 at 4:44 PM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > -static __always_inline int atomic_read(const atomic_t *v)
>> > +static __always_inline int arch_atomic_read(const atomic_t *v)
>> >  {
>> > -   return READ_ONCE((v)->counter);
>> > +   return READ_ONCE_NOCHECK((v)->counter);
>>
>> Should NOCHECK come with a comment? Because I've no idea why this is so.
>
> I suspect the idea is that, given the wrapper will have done the KASAN
> check, duplicating it here is either sub-optimal or results in
> duplicate splats. READ_ONCE() has an implicit KASAN check;
> READ_ONCE_NOCHECK() does not.
>
> If this is to solve duplicate splats, it'd be worth having a
> WRITE_ONCE_NOCHECK() for arch_atomic_set().
>
> Agreed on the comment, regardless.


Reverted xchg changes.
Added comments re READ_ONCE_NOCHECK() and WRITE_ONCE().
Added file comment.
Split into 3 patches and mailed.

Thanks!

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2017-03-14 19:25 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
2017-03-06 12:42 [PATCH] x86, kasan: add KASAN checks to atomic operations Dmitry Vyukov
2017-03-06 12:50 ` Dmitry Vyukov
2017-03-06 12:58   ` Peter Zijlstra
2017-03-06 13:01     ` Peter Zijlstra
2017-03-06 14:24       ` Dmitry Vyukov
2017-03-06 15:20         ` Peter Zijlstra
2017-03-06 16:04           ` Mark Rutland
2017-03-06 15:33         ` Peter Zijlstra
2017-03-06 16:20         ` Mark Rutland
2017-03-06 16:27           ` Dmitry Vyukov
2017-03-06 17:25             ` Mark Rutland
2017-03-06 20:35           ` Peter Zijlstra
2017-03-08 13:42             ` Dmitry Vyukov
2017-03-08 15:20               ` Mark Rutland
2017-03-08 15:27                 ` Dmitry Vyukov
2017-03-08 15:43                   ` Mark Rutland
2017-03-08 15:45                     ` Dmitry Vyukov
2017-03-08 15:48                       ` Mark Rutland
2017-03-08 17:43                 ` Will Deacon
2017-03-14 15:22                   ` Dmitry Vyukov
2017-03-14 15:31                     ` Peter Zijlstra
2017-03-14 15:32                     ` Peter Zijlstra
2017-03-14 15:44                       ` Mark Rutland
2017-03-14 19:25                         ` Dmitry Vyukov
2017-03-06 16:48         ` Andrey Ryabinin
