* [PATCH 00/10] locking: Introduce local{,64}_try_cmpxchg
@ 2023-03-05 20:56 ` Uros Bizjak
  0 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Huacai Chen, WANG Xuerui, Thomas Bogendoerfer, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Arnd Bergmann,
	Peter Zijlstra, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	Will Deacon, Boqun Feng, Jiaxun Yang, Jun Yi

Add generic and target-specific support for local{,64}_try_cmpxchg
and wire it up for all targets that use the local_t infrastructure.

The series enables x86 targets to emit a dedicated instruction
sequence for local_try_cmpxchg, and for local64_try_cmpxchg on x86_64
as well.

The last patch changes __perf_output_begin in kernel/events/ring_buffer.c
to use the new locking primitive, improving the generated code from

     4b3:	48 8b 82 e8 00 00 00 	mov    0xe8(%rdx),%rax
     4ba:	48 8b b8 08 04 00 00 	mov    0x408(%rax),%rdi
     4c1:	8b 42 1c             	mov    0x1c(%rdx),%eax
     4c4:	48 8b 4a 28          	mov    0x28(%rdx),%rcx
     4c8:	85 c0                	test   %eax,%eax
     ...
     4ef:	48 89 c8             	mov    %rcx,%rax
     4f2:	48 0f b1 7a 28       	cmpxchg %rdi,0x28(%rdx)
     4f7:	48 39 c1             	cmp    %rax,%rcx
     4fa:	75 b7                	jne    4b3 <...>

to

     4b2:	48 8b 4a 28          	mov    0x28(%rdx),%rcx
     4b6:	48 8b 82 e8 00 00 00 	mov    0xe8(%rdx),%rax
     4bd:	48 8b b0 08 04 00 00 	mov    0x408(%rax),%rsi
     4c4:	8b 42 1c             	mov    0x1c(%rdx),%eax
     4c7:	85 c0                	test   %eax,%eax
     ...
     4d4:	48 89 c8             	mov    %rcx,%rax
     4d7:	48 0f b1 72 28       	cmpxchg %rsi,0x28(%rdx)
     4dc:	0f 85 d0 00 00 00    	jne    5b2 <...>
     ...
     5b2:	48 89 c1             	mov    %rax,%rcx
     5b5:	e9 fc fe ff ff       	jmp    4b6 <...>

Please note that, in addition to the removed compare, the load from
0x28(%rdx) is hoisted out of the loop, and the code is rearranged
according to the likely/unlikely annotations in the source.
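
For reference, a minimal C sketch of the source-level transformation
behind the assembly above (compute_new_head() is a hypothetical
placeholder; the actual change to __perf_output_begin is in patch 10):

	/* Before: local_cmpxchg() returns the old value, so the caller
	 * compares it against "offset" itself and re-reads the head on
	 * every failed iteration. */
	do {
		offset = head = local_read(&rb->head);
		head = compute_new_head(head);
	} while (local_cmpxchg(&rb->head, offset, head) != offset);

	/* After: local_try_cmpxchg() returns success as a boolean (via
	 * ZF on x86) and updates "offset" with the current value on
	 * failure, so both the extra compare and the re-read disappear. */
	offset = local_read(&rb->head);
	do {
		head = compute_new_head(offset);
	} while (!local_try_cmpxchg(&rb->head, &offset, head));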

Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Jun Yi <yijun@loongson.cn>

Uros Bizjak (10):
  locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  locking/atomic: Add generic try_cmpxchg{,64}_local support
  locking/alpha: Wire up local_try_cmpxchg
  locking/loongarch: Wire up local_try_cmpxchg
  locking/mips: Wire up local_try_cmpxchg
  locking/powerpc: Wire up local_try_cmpxchg
  locking/x86: Wire up local_try_cmpxchg
  locking/generic: Wire up local{,64}_try_cmpxchg
  locking/x86: Enable local{,64}_try_cmpxchg
  perf/ring_buffer: use local_try_cmpxchg in __perf_output_begin

 arch/alpha/include/asm/local.h              |  2 ++
 arch/loongarch/include/asm/local.h          |  2 ++
 arch/mips/include/asm/local.h               |  2 ++
 arch/powerpc/include/asm/local.h            | 11 ++++++
 arch/x86/include/asm/cmpxchg.h              |  6 ++++
 arch/x86/include/asm/local.h                |  2 ++
 include/asm-generic/local.h                 |  1 +
 include/asm-generic/local64.h               |  2 ++
 include/linux/atomic/atomic-arch-fallback.h | 40 ++++++++++++++++-----
 include/linux/atomic/atomic-instrumented.h  | 20 ++++++++++-
 kernel/events/ring_buffer.c                 |  5 +--
 scripts/atomic/gen-atomic-fallback.sh       |  6 +++-
 scripts/atomic/gen-atomic-instrumented.sh   |  2 +-
 13 files changed, 87 insertions(+), 14 deletions(-)

-- 
2.39.2


* [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland

Cast _oldp to the type of _ptr to avoid an incompatible-pointer-types warning.
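
As an illustration (a hypothetical caller, not taken from this series),
the warning triggers whenever the old-value pointer's type is distinct
from "typeof(*(_ptr)) *", even when the pointed-to sizes match:

	u64 counter = 0;
	long old = 0;

	/*
	 * In the fallback, ___op gets the type "typeof(*(&counter)) *",
	 * i.e. "u64 *", while &old is "long *".  Without the
	 * (typeof(_ptr)) cast, the initialization of ___op emits
	 * -Wincompatible-pointer-types.
	 */
	arch_try_cmpxchg(&counter, &old, 42);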

Fixes: 29f006fdefe6 ("asm-generic/atomic: Add try_cmpxchg() fallbacks")
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 include/linux/atomic/atomic-arch-fallback.h | 18 +++++++++---------
 scripts/atomic/gen-atomic-fallback.sh       |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
index 77bc5522e61c..19debd501ee7 100644
--- a/include/linux/atomic/atomic-arch-fallback.h
+++ b/include/linux/atomic/atomic-arch-fallback.h
@@ -87,7 +87,7 @@
 #ifndef arch_try_cmpxchg
 #define arch_try_cmpxchg(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -98,7 +98,7 @@
 #ifndef arch_try_cmpxchg_acquire
 #define arch_try_cmpxchg_acquire(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg_acquire((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -109,7 +109,7 @@
 #ifndef arch_try_cmpxchg_release
 #define arch_try_cmpxchg_release(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg_release((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -120,7 +120,7 @@
 #ifndef arch_try_cmpxchg_relaxed
 #define arch_try_cmpxchg_relaxed(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg_relaxed((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -157,7 +157,7 @@
 #ifndef arch_try_cmpxchg64
 #define arch_try_cmpxchg64(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg64((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -168,7 +168,7 @@
 #ifndef arch_try_cmpxchg64_acquire
 #define arch_try_cmpxchg64_acquire(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg64_acquire((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -179,7 +179,7 @@
 #ifndef arch_try_cmpxchg64_release
 #define arch_try_cmpxchg64_release(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg64_release((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -190,7 +190,7 @@
 #ifndef arch_try_cmpxchg64_relaxed
 #define arch_try_cmpxchg64_relaxed(_ptr, _oldp, _new) \
 ({ \
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
 	___r = arch_cmpxchg64_relaxed((_ptr), ___o, (_new)); \
 	if (unlikely(___r != ___o)) \
 		*___op = ___r; \
@@ -2456,4 +2456,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
 #endif
 
 #endif /* _LINUX_ATOMIC_FALLBACK_H */
-// b5e87bdd5ede61470c29f7a7e4de781af3770f09
+// 1b4d4c82ae653389cd1538d5b07170267d9b3837
diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
index 3a07695e3c89..39f447161108 100755
--- a/scripts/atomic/gen-atomic-fallback.sh
+++ b/scripts/atomic/gen-atomic-fallback.sh
@@ -171,7 +171,7 @@ cat <<EOF
 #ifndef arch_try_${cmpxchg}${order}
 #define arch_try_${cmpxchg}${order}(_ptr, _oldp, _new) \\
 ({ \\
-	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \\
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \\
 	___r = arch_${cmpxchg}${order}((_ptr), ___o, (_new)); \\
 	if (unlikely(___r != ___o)) \\
 		*___op = ___r; \\
-- 
2.39.2


* [PATCH 02/10] locking/atomic: Add generic try_cmpxchg{,64}_local support
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Will Deacon, Peter Zijlstra, Boqun Feng, Mark Rutland

Add generic support for try_cmpxchg{,64}_local and their fallbacks.
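
For illustration, a sketch of the intended usage, with a hypothetical
"counter" standing in for some CPU-local location; like cmpxchg_local,
these primitives are only guaranteed to be atomic with respect to the
current CPU:

	long counter = 0;
	long old = READ_ONCE(counter), new;

	do {
		new = old + 1;	/* any update computed from "old" */
	} while (!try_cmpxchg_local(&counter, &old, new));

	/*
	 * On failure, "old" has been updated to the current value of
	 * "counter", so the loop never has to re-read it.
	 */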

Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 include/linux/atomic/atomic-arch-fallback.h | 24 ++++++++++++++++++++-
 include/linux/atomic/atomic-instrumented.h  | 20 ++++++++++++++++-
 scripts/atomic/gen-atomic-fallback.sh       |  4 ++++
 scripts/atomic/gen-atomic-instrumented.sh   |  2 +-
 4 files changed, 47 insertions(+), 3 deletions(-)

diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
index 19debd501ee7..a7d2e5f4e548 100644
--- a/include/linux/atomic/atomic-arch-fallback.h
+++ b/include/linux/atomic/atomic-arch-fallback.h
@@ -217,6 +217,28 @@
 
 #endif /* arch_try_cmpxchg64_relaxed */
 
+#ifndef arch_try_cmpxchg_local
+#define arch_try_cmpxchg_local(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg_local((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_local */
+
+#ifndef arch_try_cmpxchg64_local
+#define arch_try_cmpxchg64_local(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg64_local((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg64_local */
+
 #ifndef arch_atomic_read_acquire
 static __always_inline int
 arch_atomic_read_acquire(const atomic_t *v)
@@ -2456,4 +2478,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
 #endif
 
 #endif /* _LINUX_ATOMIC_FALLBACK_H */
-// 1b4d4c82ae653389cd1538d5b07170267d9b3837
+// 9bb8cca3d4cbc000e7068eb7cb4481cb3e48c45a
diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h
index 7a139ec030b0..14a9212cc987 100644
--- a/include/linux/atomic/atomic-instrumented.h
+++ b/include/linux/atomic/atomic-instrumented.h
@@ -2066,6 +2066,24 @@ atomic_long_dec_if_positive(atomic_long_t *v)
 	arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
 })
 
+#define try_cmpxchg_local(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg64_local(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
 #define cmpxchg_double(ptr, ...) \
 ({ \
 	typeof(ptr) __ai_ptr = (ptr); \
@@ -2083,4 +2101,4 @@ atomic_long_dec_if_positive(atomic_long_t *v)
 })
 
 #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
-// 764f741eb77a7ad565dc8d99ce2837d5542e8aee
+// 456e206c7e4e681126c482e4edcc6f46921ac731
diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
index 39f447161108..26829a75b644 100755
--- a/scripts/atomic/gen-atomic-fallback.sh
+++ b/scripts/atomic/gen-atomic-fallback.sh
@@ -225,6 +225,10 @@ for cmpxchg in "cmpxchg" "cmpxchg64"; do
 	gen_try_cmpxchg_fallbacks "${cmpxchg}"
 done
 
+for cmpxchg in "cmpxchg_local" "cmpxchg64_local"; do
+	gen_try_cmpxchg_fallback "${cmpxchg}" ""
+done
+
 grep '^[a-z]' "$1" | while read name meta args; do
 	gen_proto "${meta}" "${name}" "atomic" "int" ${args}
 done
diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh
index 77c06526a574..c8165e9431bf 100755
--- a/scripts/atomic/gen-atomic-instrumented.sh
+++ b/scripts/atomic/gen-atomic-instrumented.sh
@@ -173,7 +173,7 @@ for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do
 	done
 done
 
-for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg"; do
+for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do
 	gen_xchg "${xchg}" "" ""
 	printf "\n"
 done
-- 
2.39.2


* [PATCH 03/10] locking/alpha: Wire up local_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Richard Henderson, Ivan Kokshaysky, Matt Turner

Implement target-specific support for local_try_cmpxchg.

Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/alpha/include/asm/local.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/alpha/include/asm/local.h b/arch/alpha/include/asm/local.h
index fab26a1c93d5..7eef027e0dde 100644
--- a/arch/alpha/include/asm/local.h
+++ b/arch/alpha/include/asm/local.h
@@ -54,6 +54,8 @@ static __inline__ long local_sub_return(long i, local_t * l)
 
 #define local_cmpxchg(l, o, n) \
 	(cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_try_cmpxchg(l, po, n) \
+	(try_cmpxchg_local(&((l)->a.counter), (po), (n)))
 #define local_xchg(l, n) (xchg_local(&((l)->a.counter), (n)))
 
 /**
-- 
2.39.2


* [PATCH 04/10] locking/loongarch: Wire up local_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Huacai Chen, WANG Xuerui, Jiaxun Yang, Jun Yi

Implement target-specific support for local_try_cmpxchg.

Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Jun Yi <yijun@loongson.cn>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/loongarch/include/asm/local.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/loongarch/include/asm/local.h b/arch/loongarch/include/asm/local.h
index 65fbbae9fc4d..dff6bcbe4821 100644
--- a/arch/loongarch/include/asm/local.h
+++ b/arch/loongarch/include/asm/local.h
@@ -58,6 +58,8 @@ static inline long local_sub_return(long i, local_t *l)
 
 #define local_cmpxchg(l, o, n) \
 	((long)cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_try_cmpxchg(l, po, n) \
+	(try_cmpxchg_local(&((l)->a.counter), (po), (n)))
 #define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
 
 /**
-- 
2.39.2


* [PATCH 05/10] locking/mips: Wire up local_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Thomas Bogendoerfer

Implement target-specific support for local_try_cmpxchg.

Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/mips/include/asm/local.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/local.h b/arch/mips/include/asm/local.h
index 08366b1fd273..b611361a760b 100644
--- a/arch/mips/include/asm/local.h
+++ b/arch/mips/include/asm/local.h
@@ -96,6 +96,8 @@ static __inline__ long local_sub_return(long i, local_t * l)
 
 #define local_cmpxchg(l, o, n) \
 	((long)cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_try_cmpxchg(l, po, n) \
+	(try_cmpxchg_local(&((l)->a.counter), (po), (n)))
 #define local_xchg(l, n) (atomic_long_xchg((&(l)->a), (n)))
 
 /**
-- 
2.39.2


* [PATCH 06/10] locking/powerpc: Wire up local_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Michael Ellerman, Nicholas Piggin, Christophe Leroy

Implement target-specific support for local_try_cmpxchg.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/powerpc/include/asm/local.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
index bc4bd19b7fc2..45492fb5bf22 100644
--- a/arch/powerpc/include/asm/local.h
+++ b/arch/powerpc/include/asm/local.h
@@ -90,6 +90,17 @@ static __inline__ long local_cmpxchg(local_t *l, long o, long n)
 	return t;
 }
 
+static __inline__ bool local_try_cmpxchg(local_t *l, long *po, long n)
+{
+	long o = *po, r;
+
+	r = local_cmpxchg(l, o, n);
+	if (unlikely(r != o))
+		*po = r;
+
+	return likely(r == o);
+}
+
 static __inline__ long local_xchg(local_t *l, long n)
 {
 	long t;
-- 
2.39.2


* [PATCH 07/10] locking/x86: Wire up local_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin

Implement target-specific support for local_try_cmpxchg.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/include/asm/local.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
index 349a47acaa4a..4bcc72906524 100644
--- a/arch/x86/include/asm/local.h
+++ b/arch/x86/include/asm/local.h
@@ -122,6 +122,8 @@ static inline long local_sub_return(long i, local_t *l)
 
 #define local_cmpxchg(l, o, n) \
 	(cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_try_cmpxchg(l, po, n) \
+	(try_cmpxchg_local(&((l)->a.counter), (po), (n)))
 /* Always has a lock prefix */
 #define local_xchg(l, n) (xchg(&((l)->a.counter), (n)))
 
-- 
2.39.2


* [PATCH 08/10] locking/generic: Wire up local{,64}_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56 ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Arnd Bergmann

Implement generic support for local{,64}_try_cmpxchg.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 include/asm-generic/local.h   | 1 +
 include/asm-generic/local64.h | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/asm-generic/local.h b/include/asm-generic/local.h
index fca7f1d84818..7f97018df66f 100644
--- a/include/asm-generic/local.h
+++ b/include/asm-generic/local.h
@@ -42,6 +42,7 @@ typedef struct
 #define local_inc_return(l) atomic_long_inc_return(&(l)->a)
 
 #define local_cmpxchg(l, o, n) atomic_long_cmpxchg((&(l)->a), (o), (n))
+#define local_try_cmpxchg(l, po, n) atomic_long_try_cmpxchg((&(l)->a), (po), (n))
 #define local_xchg(l, n) atomic_long_xchg((&(l)->a), (n))
 #define local_add_unless(l, _a, u) atomic_long_add_unless((&(l)->a), (_a), (u))
 #define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
diff --git a/include/asm-generic/local64.h b/include/asm-generic/local64.h
index 765be0b7d883..54b91e93ae76 100644
--- a/include/asm-generic/local64.h
+++ b/include/asm-generic/local64.h
@@ -43,6 +43,7 @@ typedef struct {
 #define local64_inc_return(l)	local_inc_return(&(l)->a)
 
 #define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n))
+#define local64_try_cmpxchg(l, po, n) local_try_cmpxchg((&(l)->a), (po), (n))
 #define local64_xchg(l, n)	local_xchg((&(l)->a), (n))
 #define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u))
 #define local64_inc_not_zero(l)	local_inc_not_zero(&(l)->a)
@@ -81,6 +82,7 @@ typedef struct {
 #define local64_inc_return(l)	atomic64_inc_return(&(l)->a)
 
 #define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n))
+#define local64_try_cmpxchg(l, po, n) atomic64_try_cmpxchg((&(l)->a), (po), (n))
 #define local64_xchg(l, n)	atomic64_xchg((&(l)->a), (n))
 #define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u))
 #define local64_inc_not_zero(l)	atomic64_inc_not_zero(&(l)->a)
-- 
2.39.2


* [PATCH 09/10] locking/x86: Enable local{,64}_try_cmpxchg
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin

Enable local_try_cmpxchg, and local64_try_cmpxchg on x86_64.  Unlike
arch_try_cmpxchg, the _local variant expands __raw_try_cmpxchg with an
empty lock prefix, since local_t operations only need to be atomic with
respect to the current CPU.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/include/asm/cmpxchg.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index 94fbe6ae7431..e8f939e8432e 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -221,9 +221,15 @@ extern void __add_wrong_size(void)
 #define __try_cmpxchg(ptr, pold, new, size)				\
 	__raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
 
+#define __try_cmpxchg_local(ptr, pold, new, size)			\
+	__raw_try_cmpxchg((ptr), (pold), (new), (size), "")
+
 #define arch_try_cmpxchg(ptr, pold, new) 				\
 	__try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
 
+#define arch_try_cmpxchg_local(ptr, pold, new) 				\
+	__try_cmpxchg_local((ptr), (pold), (new), sizeof(*(ptr)))
+
 /*
  * xadd() adds "inc" to "*ptr" and atomically returns the previous
  * value of "*ptr".
-- 
2.39.2


* [PATCH 10/10] perf/ring_buffer: use local_try_cmpxchg in __perf_output_begin
  2023-03-05 20:56 ` Uros Bizjak
@ 2023-03-05 20:56   ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-05 20:56 UTC (permalink / raw)
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Ian Rogers

Use local_try_cmpxchg instead of the local_cmpxchg(*ptr, old, new) == old
pattern in __perf_output_begin.  The x86 CMPXCHG instruction returns
success in the ZF flag, so this change saves a compare after the cmpxchg.

Also, local_try_cmpxchg implicitly assigns the current *ptr value to
"old" when the cmpxchg fails, so there is no need to re-read the value
in the loop.

No functional change intended.

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 kernel/events/ring_buffer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 273a0fe7910a..e07c10f4d141 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -191,9 +191,10 @@ __perf_output_begin(struct perf_output_handle *handle,
 
 	perf_output_get_handle(handle);
 
+	offset = local_read(&rb->head);
 	do {
+		head = offset;
 		tail = READ_ONCE(rb->user_page->data_tail);
-		offset = head = local_read(&rb->head);
 		if (!rb->overwrite) {
 			if (unlikely(!ring_buffer_has_space(head, tail,
 							    perf_data_size(rb),
@@ -217,7 +218,7 @@ __perf_output_begin(struct perf_output_handle *handle,
 			head += size;
 		else
 			head -= size;
-	} while (local_cmpxchg(&rb->head, offset, head) != offset);
+	} while (!local_try_cmpxchg(&rb->head, &offset, head));
 
 	if (backward) {
 		offset = head;
-- 
2.39.2


* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-05 20:56   ` Uros Bizjak
@ 2023-03-24 14:13     ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-03-24 14:13 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.

Can you give an example of where we are passing an incompatible pointer?

That sounds indicative of a bug in the caller, but maybe I'm missing some
reason this is necessary due to some indirection.

> Fixes: 29f006fdefe6 ("asm-generic/atomic: Add try_cmpxchg() fallbacks")

I'm not sure that this needs a fixes tag. Does anything go wrong today, or only
later in this series?

Thanks,
Mark.

> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> ---
>  include/linux/atomic/atomic-arch-fallback.h | 18 +++++++++---------
>  scripts/atomic/gen-atomic-fallback.sh       |  2 +-
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
> index 77bc5522e61c..19debd501ee7 100644
> --- a/include/linux/atomic/atomic-arch-fallback.h
> +++ b/include/linux/atomic/atomic-arch-fallback.h
> @@ -87,7 +87,7 @@
>  #ifndef arch_try_cmpxchg
>  #define arch_try_cmpxchg(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -98,7 +98,7 @@
>  #ifndef arch_try_cmpxchg_acquire
>  #define arch_try_cmpxchg_acquire(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg_acquire((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -109,7 +109,7 @@
>  #ifndef arch_try_cmpxchg_release
>  #define arch_try_cmpxchg_release(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg_release((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -120,7 +120,7 @@
>  #ifndef arch_try_cmpxchg_relaxed
>  #define arch_try_cmpxchg_relaxed(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg_relaxed((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -157,7 +157,7 @@
>  #ifndef arch_try_cmpxchg64
>  #define arch_try_cmpxchg64(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -168,7 +168,7 @@
>  #ifndef arch_try_cmpxchg64_acquire
>  #define arch_try_cmpxchg64_acquire(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64_acquire((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -179,7 +179,7 @@
>  #ifndef arch_try_cmpxchg64_release
>  #define arch_try_cmpxchg64_release(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64_release((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -190,7 +190,7 @@
>  #ifndef arch_try_cmpxchg64_relaxed
>  #define arch_try_cmpxchg64_relaxed(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64_relaxed((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -2456,4 +2456,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
>  #endif
>  
>  #endif /* _LINUX_ATOMIC_FALLBACK_H */
> -// b5e87bdd5ede61470c29f7a7e4de781af3770f09
> +// 1b4d4c82ae653389cd1538d5b07170267d9b3837
> diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
> index 3a07695e3c89..39f447161108 100755
> --- a/scripts/atomic/gen-atomic-fallback.sh
> +++ b/scripts/atomic/gen-atomic-fallback.sh
> @@ -171,7 +171,7 @@ cat <<EOF
>  #ifndef arch_try_${cmpxchg}${order}
>  #define arch_try_${cmpxchg}${order}(_ptr, _oldp, _new) \\
>  ({ \\
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \\
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \\
>  	___r = arch_${cmpxchg}${order}((_ptr), ___o, (_new)); \\
>  	if (unlikely(___r != ___o)) \\
>  		*___op = ___r; \\
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
@ 2023-03-24 14:13     ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-03-24 14:13 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-arch, Peter Zijlstra, Will Deacon, Boqun Feng, linux-mips,
	linux-kernel, linux-perf-users, loongarch, linux-alpha,
	linuxppc-dev

On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.

Can you give an example of where we are passing an incompatible pointer?

That sounds indicative of a bug in the caller, but maybe I'm missing some
reason this is necessary due to some indirection.

> Fixes: 29f006fdefe6 ("asm-generic/atomic: Add try_cmpxchg() fallbacks")

I'm not sure that this needs a fixes tag. Does anything go wrong today, or only
later in this series?

Thanks,
Mark.

> Cc: Will Deacon <will@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Boqun Feng <boqun.feng@gmail.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
> ---
>  include/linux/atomic/atomic-arch-fallback.h | 18 +++++++++---------
>  scripts/atomic/gen-atomic-fallback.sh       |  2 +-
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
> index 77bc5522e61c..19debd501ee7 100644
> --- a/include/linux/atomic/atomic-arch-fallback.h
> +++ b/include/linux/atomic/atomic-arch-fallback.h
> @@ -87,7 +87,7 @@
>  #ifndef arch_try_cmpxchg
>  #define arch_try_cmpxchg(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -98,7 +98,7 @@
>  #ifndef arch_try_cmpxchg_acquire
>  #define arch_try_cmpxchg_acquire(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg_acquire((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -109,7 +109,7 @@
>  #ifndef arch_try_cmpxchg_release
>  #define arch_try_cmpxchg_release(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg_release((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -120,7 +120,7 @@
>  #ifndef arch_try_cmpxchg_relaxed
>  #define arch_try_cmpxchg_relaxed(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg_relaxed((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -157,7 +157,7 @@
>  #ifndef arch_try_cmpxchg64
>  #define arch_try_cmpxchg64(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -168,7 +168,7 @@
>  #ifndef arch_try_cmpxchg64_acquire
>  #define arch_try_cmpxchg64_acquire(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64_acquire((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -179,7 +179,7 @@
>  #ifndef arch_try_cmpxchg64_release
>  #define arch_try_cmpxchg64_release(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64_release((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -190,7 +190,7 @@
>  #ifndef arch_try_cmpxchg64_relaxed
>  #define arch_try_cmpxchg64_relaxed(_ptr, _oldp, _new) \
>  ({ \
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
>  	___r = arch_cmpxchg64_relaxed((_ptr), ___o, (_new)); \
>  	if (unlikely(___r != ___o)) \
>  		*___op = ___r; \
> @@ -2456,4 +2456,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
>  #endif
>  
>  #endif /* _LINUX_ATOMIC_FALLBACK_H */
> -// b5e87bdd5ede61470c29f7a7e4de781af3770f09
> +// 1b4d4c82ae653389cd1538d5b07170267d9b3837
> diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
> index 3a07695e3c89..39f447161108 100755
> --- a/scripts/atomic/gen-atomic-fallback.sh
> +++ b/scripts/atomic/gen-atomic-fallback.sh
> @@ -171,7 +171,7 @@ cat <<EOF
>  #ifndef arch_try_${cmpxchg}${order}
>  #define arch_try_${cmpxchg}${order}(_ptr, _oldp, _new) \\
>  ({ \\
> -	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \\
> +	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \\
>  	___r = arch_${cmpxchg}${order}((_ptr), ___o, (_new)); \\
>  	if (unlikely(___r != ___o)) \\
>  		*___op = ___r; \\
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-24 14:13     ` Mark Rutland
@ 2023-03-24 15:43       ` Uros Bizjak
  0 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-24 15:43 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
>
> Can you give an example of where we are passing an incompatible pointer?

An example is patch 10/10 from the series, which will fail without
this fix when fallback code is used. We have:

-       } while (local_cmpxchg(&rb->head, offset, head) != offset);
+       } while (!local_try_cmpxchg(&rb->head, &offset, head));

where rb->head is defined as:

typedef struct {
   atomic_long_t a;
} local_t;

while offset is defined as 'unsigned long'.

The assignment in the existing try_cmpxchg template:

typeof(*(_ptr)) *___op = (_oldp)

will trigger an "initialization from incompatible pointer type" error.

Please note that x86 avoids this issue by a cast in its
target-dependent definition:

#define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock)                \
({                                                                      \
       bool success;                                                   \
       __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);              \
       __typeof__(*(_ptr)) __old = *_old;                              \
       __typeof__(*(_ptr)) __new = (_new);                             \

so, the warning/error will trigger only in the fallback code.
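
A stand-alone way to see this (illustrative macros modelled on the
fallback template above; compile with gcc and only the first use trips
-Wincompatible-pointer-types, an error under -Werror):

#define TRY_CMPXCHG_NOCAST(_ptr, _oldp) \
	({ typeof(*(_ptr)) *___op = (_oldp); (void)___op; })

#define TRY_CMPXCHG_CAST(_ptr, _oldp) \
	({ typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp); (void)___op; })

void demo(void)
{
	long head;		/* the local_t counter is a signed long */
	unsigned long offset;	/* the caller's variable */

	TRY_CMPXCHG_NOCAST(&head, &offset);	/* warns: unsigned long * -> long * */
	TRY_CMPXCHG_CAST(&head, &offset);	/* silent: _oldp cast to long * */
}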

> That sounds indicative of a bug in the caller, but maybe I'm missing some
> reason this is necessary due to some indirection.
>
> > Fixes: 29f006fdefe6 ("asm-generic/atomic: Add try_cmpxchg() fallbacks")
>
> I'm not sure that this needs a fixes tag. Does anything go wrong today, or only
> later in this series?

The patch at [1] triggered a build error in posix_acl.c's __get_acl() due
to the same problem. The compilation for the x86 target was OK, because
x86 defines a target-specific arch_try_cmpxchg, but the compilation
broke for targets that fall back to the generic support. Please note that
this specific problem was recently fixed in a different way [2], but
the issue with the fallback remains.

[1] https://lore.kernel.org/lkml/20220714173819.13312-1-ubizjak@gmail.com/
[2] https://lore.kernel.org/lkml/20221201160103.76012-1-ubizjak@gmail.com/

Uros.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-24 15:43       ` Uros Bizjak
@ 2023-03-24 16:14         ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-03-24 16:14 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> >
> > Can you give an example of where we are passing an incompatible pointer?
> 
> An example is patch 10/10 from the series, which will fail without
> this fix when fallback code is used. We have:
> 
> -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> 
> where rb->head is defined as:
> 
> typedef struct {
>    atomic_long_t a;
> } local_t;
> 
> while offset is defined as 'unsigned long'.

Ok, but that's because we're doing the wrong thing to start with.

Since local_t is defined in terms of atomic_long_t, we should define the
generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
have a mismatch between 'long *' and 'unsigned long *', but then we can fix
that in the callsite:

	while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))

... which then won't silently mask issues elsewhere, and will be consistent
with all the other atomic APIs.
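
Concretely, the sketch I have in mind (assuming local_t keeps wrapping
atomic_long_t, as it does on every current target; illustrative, not a
tested patch):

static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
{
	return atomic_long_try_cmpxchg(&l->a, old, new);
}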

Thanks,
Mark.

> 
> The assignment in existing try_cmpxchg template:
> 
> typeof(*(_ptr)) *___op = (_oldp)
> 
> will trigger an initialization from an incompatible pointer type error.
> 
> Please note that x86 avoids this issue by a cast in its
> target-dependent definition:
> 
> #define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock)                \
> ({                                                                      \
>        bool success;                                                   \
>        __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold);              \
>        __typeof__(*(_ptr)) __old = *_old;                              \
>        __typeof__(*(_ptr)) __new = (_new);                             \
> 
> so, the warning/error will trigger only in the fallback code.
> 
> > That sounds indicative of a bug in the caller, but maybe I'm missing some
> > reason this is necessary due to some indirection.
> >
> > > Fixes: 29f006fdefe6 ("asm-generic/atomic: Add try_cmpxchg() fallbacks")
> >
> > I'm not sure that this needs a fixes tag. Does anything go wrong today, or only
> > later in this series?
> 
> The patch at [1] triggered a build error in posix_acl.c/__get.acl due
> to the same problem. The compilation for x86 target was OK, because
> x86 defines target-specific arch_try_cmpxchg, but the compilation
> broke for targets that revert to generic support. Please note that
> this specific problem was recently fixed in a different way [2], but
> the issue with the fallback remains.
> 
> [1] https://lore.kernel.org/lkml/20220714173819.13312-1-ubizjak@gmail.com/
> [2] https://lore.kernel.org/lkml/20221201160103.76012-1-ubizjak@gmail.com/
> 
> Uros.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-24 16:14         ` Mark Rutland
@ 2023-03-24 16:32           ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-03-24 16:32 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > >
> > > Can you give an example of where we are passing an incompatible pointer?
> > 
> > An example is patch 10/10 from the series, which will fail without
> > this fix when fallback code is used. We have:
> > 
> > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > 
> > where rb->head is defined as:
> > 
> > typedef struct {
> >    atomic_long_t a;
> > } local_t;
> > 
> > while offset is defined as 'unsigned long'.
> 
> Ok, but that's because we're doing the wrong thing to start with.
> 
> Since local_t is defined in terms of atomic_long_t, we should define the
> generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> that in the callsite:
> 
> 	while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))

Sorry, that should be:
	
	while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))

The fundamental thing I'm trying to say is that the
atomic/atomic64/atomic_long/local/local64 APIs should be type-safe, and for
their try_cmpxchg() implementations, the type signature should be:

	${atomictype}_try_cmpxchg(${atomictype} *ptr, ${inttype} *old, ${inttype} new)
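
Read as concrete prototypes for the types in this thread (my reading of
the template above, with long as the integer type backing both local_t
and atomic_long_t):

	bool atomic_long_try_cmpxchg(atomic_long_t *ptr, long *old, long new);
	bool local_try_cmpxchg(local_t *ptr, long *old, long new);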

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-24 16:32           ` Mark Rutland
@ 2023-03-26 19:28             ` Uros Bizjak
  0 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-03-26 19:28 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > > >
> > > > Can you give an example of where we are passing an incompatible pointer?
> > >
> > > An example is patch 10/10 from the series, which will fail without
> > > this fix when fallback code is used. We have:
> > >
> > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > >
> > > where rb->head is defined as:
> > >
> > > typedef struct {
> > >    atomic_long_t a;
> > > } local_t;
> > >
> > > while offset is defined as 'unsigned long'.
> >
> > Ok, but that's because we're doing the wrong thing to start with.
> >
> > Since local_t is defined in terms of atomic_long_t, we should define the
> > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > that in the callsite:
> >
> >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
>
> Sorry, that should be:
>
>         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))

The fallbacks are a bit more complicated than above, and are different
from atomic_try_cmpxchg.

Please note in patch 2/10 that the fallbacks, used when
arch_try_cmpxchg_local is not defined, call arch_cmpxchg_local. Also in
patch 2/10, try_cmpxchg_local is introduced, which calls
arch_try_cmpxchg_local. Targets (and generic code) then simply define,
e.g.:

#define local_cmpxchg(l, o, n) \
       (cmpxchg_local(&((l)->a.counter), (o), (n)))
+#define local_try_cmpxchg(l, po, n) \
+       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))

which is part of the local_t API. Targets should either define all
these #defines, or none. There are no partial fallbacks as is the case
with atomic_t.
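
For reference, the patch 2/10 fallback has the same shape as the patch
1/10 hunks quoted earlier in this thread (a sketch from memory, not the
exact hunk):

#ifndef arch_try_cmpxchg_local
#define arch_try_cmpxchg_local(_ptr, _oldp, _new) \
({ \
	typeof(*(_ptr)) *___op = (typeof(_ptr))(_oldp), ___o = *___op, ___r; \
	___r = arch_cmpxchg_local((_ptr), ___o, (_new)); \
	if (unlikely(___r != ___o)) \
		*___op = ___r; \
	likely(___r == ___o); \
})
#endif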

The core of the local_t API is in the local.h header. If the target
doesn't define its own local.h header, then asm-generic/local.h is
used, which does exactly what you propose above regarding the usage of
atomic functions.
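
A sketch of that asm-generic/local.h shape, by analogy with its
existing local_cmpxchg define (illustrative, not the exact line):

#define local_try_cmpxchg(l, po, n) \
	(atomic_long_try_cmpxchg((&(l)->a), (po), (n)))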

OTOH, when the target defines its own local.h, then the above
target-dependent #define path applies. The target should define its
own arch_try_cmpxchg_local, otherwise a "generic" target-dependent
fallback that calls target arch_cmpxchg_local applies. In the case of
x86, patch 9/10 enables the new instruction by defining
arch_try_cmpxchg_local.

FYI, the patch sequence is carefully chosen so that x86 also exercises
fallback code between different patches in the series.

Targets are free to define local_t to whatever they like, but for some
reason they all define it to:

typedef struct {
    atomic_long_t a;
} local_t;

so they have to dig the variable out of the struct like:

#define local_cmpxchg(l, o, n) \
     (cmpxchg_local(&((l)->a.counter), (o), (n)))

Regarding the mismatch of 'long *' vs 'unsigned long *': the x86
target-specific code does the following for try_cmpxchg:

#define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock) \
({ \
bool success; \
__typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
__typeof__(*(_ptr)) __old = *_old; \
__typeof__(*(_ptr)) __new = (_new); \

so, it *does* cast the "old" pointer to the type of "ptr". The generic
code does *not*. This difference is dangerous, since some code
involving try_cmpxchg will compile OK for x86 but will break for other
targets that use the try_cmpxchg fallback templates (I was the unlucky
one who tripped over this in the past). Please note that this problem
is not specific to the proposed local_try_cmpxchg series, but affects
the existing try_cmpxchg API.

Also, I don't think that "fixing" callsites is the right thing to do.
The generic code should follow x86 and cast the "old" pointer to the
type of "ptr" inside the fallback.

> The fundamental thing I'm trying to say is that the
> atomic/atomic64/atomic_long/local/local64 APIs should be type-safe, and for
> their try_cmpxchg() implementations, the type signature should be:
>
>         ${atomictype}_try_cmpxchg(${atomictype} *ptr, ${inttype} *old, ${inttype} new)

This conversion should also be performed for the cmpxchg family of
functions, if it is desired at all. The try_cmpxchg fallback is just
cmpxchg with some extra code around it.

Thanks,
Uros.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-03-26 19:28             ` Uros Bizjak
@ 2023-04-03 10:19               ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-04-03 10:19 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Sun, Mar 26, 2023 at 09:28:38PM +0200, Uros Bizjak wrote:
> On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > >
> > > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > > > >
> > > > > Can you give an example of where we are passing an incompatible pointer?
> > > >
> > > > An example is patch 10/10 from the series, which will fail without
> > > > this fix when fallback code is used. We have:
> > > >
> > > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > > >
> > > > where rb->head is defined as:
> > > >
> > > > typedef struct {
> > > >    atomic_long_t a;
> > > > } local_t;
> > > >
> > > > while offset is defined as 'unsigned long'.
> > >
> > > Ok, but that's because we're doing the wrong thing to start with.
> > >
> > > Since local_t is defined in terms of atomic_long_t, we should define the
> > > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > > that in the callsite:
> > >
> > >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
> >
> > Sorry, that should be:
> >
> >         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))
> 
> The fallbacks are a bit more complicated than above, and are different
> from atomic_try_cmpxchg.
> 
> Please note in patch 2/10, the falbacks when arch_try_cmpxchg_local
> are not defined call arch_cmpxchg_local. Also in patch 2/10,
> try_cmpxchg_local is introduced, where it calls
> arch_try_cmpxchg_local. Targets (and generic code) simply define (e.g.
> :
> 
> #define local_cmpxchg(l, o, n) \
>        (cmpxchg_local(&((l)->a.counter), (o), (n)))
> +#define local_try_cmpxchg(l, po, n) \
> +       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))
> 
> which is part of the local_t API. Targets should either define all
> these #defines, or none. There are no partial fallbacks as is the case
> with atomic_t.

Whether or not there are fallbacks is immaterial.

In those cases, architectures can just as easily write C wrappers, e.g.

long local_cmpxchg(local_t *l, long old, long new)
{
	return cmpxchg_local(&l->a.counter, old, new);
}

/* try_cmpxchg_local() returns bool, so the wrapper should too */
bool local_try_cmpxchg(local_t *l, long *old, long new)
{
	return try_cmpxchg_local(&l->a.counter, old, new);
}

> The core of the local_h API is in the local.h header. If the target
> doesn't define its own local.h header, then asm-generic/local.h is
> used that does exactly what you propose above regarding the usage of
> atomic functions.
> 
> OTOH, when the target defines its own local.h, then the above
> target-dependent #define path applies. The target should define its
> own arch_try_cmpxchg_local, otherwise a "generic" target-dependent
> fallback that calls target arch_cmpxchg_local applies. In the case of
> x86, patch 9/10 enables new instruction by defining
> arch_try_cmpxchg_local.
> 
> FYI, the patch sequence is carefully chosen so that x86 also exercises
> fallback code between different patches in the series.
> 
> Targets are free to define local_t to whatever they like, but for some
> reason they all define it to:
> 
> typedef struct {
>     atomic_long_t a;
> } local_t;

Yes, which is why I used atomic_long() above.

> so they have to dig the variable out of the struct like:
> 
> #define local_cmpxchg(l, o, n) \
>      (cmpxchg_local(&((l)->a.counter), (o), (n)))
> 
> Regarding the mismatch of 'long *' vs 'unsigned long *': x86
> target-specific code does for try_cmpxchg:
> 
> #define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock) \
> ({ \
> bool success; \
> __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
> __typeof__(*(_ptr)) __old = *_old; \
> __typeof__(*(_ptr)) __new = (_new); \
> 
> so, it *does* cast the "old" pointer to the type of "ptr". The generic
> code does *not*. This difference is dangerous, since the compilation
> of some code involving try_cmpxchg will compile OK for x86 but will
> break for other targets that use try_cmpxchg fallback templates (I was
> the unlucky one that tripped on this in the past). Please note that
> this problem is not specific to the proposed local_try_cmpxchg series,
> but affects the existing try_cmpxchg API.

I understand the problem of arch code differing from generic code, and that we
want to have *a* consistent behaviour for the API.

What I'm saying is that the behaviour we should aim for is where the 'old'
pointer has a specific type (long), and we always require that, as we do for
the various atomic_*() APIs of which local_*() is a cousin.

> Also, I don't think that "fixing" callsites is the right thing to do.

Why? What's wrong with doing that?

The documentation in Documentation/core-api/local_ops.rst says:

    The ``local_t`` type is defined as an opaque ``signed long``

So the obvious and least surprising thing is for the local_*() functions to use
'long' for values and 'long *' for pointers to values.

Requiring a cast in a few places is not the end of the world.

> The generic code should follow x86 and cast the "old" pointer to the
> type of "ptr" inside the fallback.

Why?

I disagree, and think it's far better to be strict by default. That way,
accidental usage of the wrong type will be caught by the compiler, and if
someone *really* wants to use a different type they can use a cast at the
callsite, which makes it really obvious when that is happening.

I appreciate that may require some preparatory cleanup, but I think that's a
small price to pay for having this in a clearer and more maintainable state.

> > The fundamental thing I'm trying to say is that the
> > atomic/atomic64/atomic_long/local/local64 APIs should be type-safe, and for
> > their try_cmpxchg() implementations, the type signature should be:
> >
> >         ${atomictype}_try_cmpxchg(${atomictype} *ptr, ${inttype} *old, ${inttype} new)
> 
> This conversion should be performed also for the cmpxchg family of
> functions, if desired at all. try_cmpxchg fallback is just cmpxchg
> with some extra code around.

FWIW, I agree that we *should* make try_cmpxchg() check that the ptr and old
pointers are of the same type.

However, I don't think that's a prerequisite for doing so for
local_try_cmpxchg().

Please make local_try_cmpxchg() have a proper type-safe C prototype, as we do
with the atomic*_try_cmpxchg() APIs.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
@ 2023-04-03 10:19               ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-04-03 10:19 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-arch, Peter Zijlstra, Will Deacon, Boqun Feng, linux-mips,
	linux-kernel, linux-perf-users, loongarch, linux-alpha,
	linuxppc-dev

On Sun, Mar 26, 2023 at 09:28:38PM +0200, Uros Bizjak wrote:
> On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > >
> > > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > > > >
> > > > > Can you give an example of where we are passing an incompatible pointer?
> > > >
> > > > An example is patch 10/10 from the series, which will fail without
> > > > this fix when fallback code is used. We have:
> > > >
> > > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > > >
> > > > where rb->head is defined as:
> > > >
> > > > typedef struct {
> > > >    atomic_long_t a;
> > > > } local_t;
> > > >
> > > > while offset is defined as 'unsigned long'.
> > >
> > > Ok, but that's because we're doing the wrong thing to start with.
> > >
> > > Since local_t is defined in terms of atomic_long_t, we should define the
> > > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > > that in the callsite:
> > >
> > >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
> >
> > Sorry, that should be:
> >
> >         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))
> 
> The fallbacks are a bit more complicated than above, and are different
> from atomic_try_cmpxchg.
> 
> Please note in patch 2/10, the falbacks when arch_try_cmpxchg_local
> are not defined call arch_cmpxchg_local. Also in patch 2/10,
> try_cmpxchg_local is introduced, where it calls
> arch_try_cmpxchg_local. Targets (and generic code) simply define (e.g.
> :
> 
> #define local_cmpxchg(l, o, n) \
>        (cmpxchg_local(&((l)->a.counter), (o), (n)))
> +#define local_try_cmpxchg(l, po, n) \
> +       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))
> 
> which is part of the local_t API. Targets should either define all
> these #defines, or none. There are no partial fallbacks as is the case
> with atomic_t.

Whether or not there are fallbacks is immaterial.

In those cases, architectures can just as easily write C wrappers, e.g.

long local_cmpxchg(local_t *l, long old, long new)
{
	return cmpxchg_local(&l->a.counter, old, new);
}

long local_try_cmpxchg(local_t *l, long *old, long new)
{
	return try_cmpxchg_local(&l->a.counter, old, new);
}

> The core of the local_h API is in the local.h header. If the target
> doesn't define its own local.h header, then asm-generic/local.h is
> used that does exactly what you propose above regarding the usage of
> atomic functions.
> 
> OTOH, when the target defines its own local.h, then the above
> target-dependent #define path applies. The target should define its
> own arch_try_cmpxchg_local, otherwise a "generic" target-dependent
> fallback that calls target arch_cmpxchg_local applies. In the case of
> x86, patch 9/10 enables new instruction by defining
> arch_try_cmpxchg_local.
> 
> FYI, the patch sequence is carefully chosen so that x86 also exercises
> fallback code between different patches in the series.
> 
> Targets are free to define local_t to whatever they like, but for some
> reason they all define it to:
> 
> typedef struct {
>     atomic_long_t a;
> } local_t;

Yes, which is why I used atomic_long() above.

> so they have to dig the variable out of the struct like:
> 
> #define local_cmpxchg(l, o, n) \
>      (cmpxchg_local(&((l)->a.counter), (o), (n)))
> 
> Regarding the mismatch of 'long *' vs 'unsigned long *': x86
> target-specific code does for try_cmpxchg:
> 
> #define __raw_try_cmpxchg(_ptr, _pold, _new, size, lock) \
> ({ \
> bool success; \
> __typeof__(_ptr) _old = (__typeof__(_ptr))(_pold); \
> __typeof__(*(_ptr)) __old = *_old; \
> __typeof__(*(_ptr)) __new = (_new); \
> 
> so, it *does* cast the "old" pointer to the type of "ptr". The generic
> code does *not*. This difference is dangerous, since the compilation
> of some code involving try_cmpxchg will compile OK for x86 but will
> break for other targets that use try_cmpxchg fallback templates (I was
> the unlucky one that tripped on this in the past). Please note that
> this problem is not specific to the proposed local_try_cmpxchg series,
> but affects the existing try_cmpxchg API.

I understand the problem of arch code differing from generic code, and that we
want to have *a* consistent behaviour for the API.

What I'm saying is that the behaviour we should aim for is where the 'old'
pointer has a specific type (long), and we always require that, as we do for
the various atomic_*() APIs of which local_*() is a cousin.
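
Concretely, the shape I mean is something like this (just a sketch;
compute_new() and update_example() are stand-ins, not real kernel code):

bool local_try_cmpxchg(local_t *l, long *old, long new);

long update_example(local_t *l)
{
	long old = local_read(l);
	long new;

	do {
		new = compute_new(old);	/* stand-in for the real update */
	} while (!local_try_cmpxchg(l, &old, new));

	return new;
}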

> Also, I don't think that "fixing" callsites is the right thing to do.

Why? What's wrong with doing that?

The documentation in Documentation/core-api/local_ops.rst says:

    The ``local_t`` type is defined as an opaque ``signed long``

So the obvious and least surprising thing is for the local_*() functions to use
'long' for values and 'long *' for pointers to values.

Requiring a cast in a few places is not the end of the world.

> The generic code should follow x86 and cast the "old" pointer to the
> type of "ptr" inside the fallback.

Why?

I disagree, and think it's far better to be strict by default. That way,
accidental usage of the wrong type will be caught by the compiler, and if
someone *really* wants to use a differently typed value they can use a cast
at the callsite, which makes it really obvious when that is happening.
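
e.g. with a strict 'long *' prototype, reusing the ring_buffer code from
above (a sketch):

	unsigned long offset;

	local_try_cmpxchg(&rb->head, &offset, head);		/* caught: 'unsigned long *' vs 'long *' */
	local_try_cmpxchg(&rb->head, (long *)&offset, head);	/* allowed, and the cast stands out */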

I appreciate that may require some preparatory cleanup, but I think that's a
small price to pay for having this in a clearer and more maintainable state.

> > The fundamental thing I'm trying to say is that the
> > atomic/atomic64/atomic_long/local/local64 APIs should be type-safe, and for
> > their try_cmpxchg() implementations, the type signature should be:
> >
> >         ${atomictype}_try_cmpxchg(${atomictype} *ptr, ${inttype} *old, ${inttype} new)
> 
> This conversion should be performed also for the cmpxchg family of
> functions, if desired at all. try_cmpxchg fallback is just cmpxchg
> with some extra code around.

FWIW, I agree that we *should* make try_cmpxchg() check that ptr and old
pointer are the same type.

However, I don't think that's a prerequisite for doing so for
local_try_cmpxchg().

Please make local_try_cmpxchg() have a proper type-safe C prototype, as we do
with the atomic*_try_cmpxchg() APIs.
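
For comparison, the atomic prototypes already read:

bool atomic_try_cmpxchg(atomic_t *v, int *old, int new);
bool atomic64_try_cmpxchg(atomic64_t *v, s64 *old, s64 new);
bool atomic_long_try_cmpxchg(atomic_long_t *v, long *old, long new);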

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-04-03 10:19               ` Mark Rutland
@ 2023-04-04 12:24                 ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-04-04 12:24 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

[-- Attachment #1: Type: text/plain, Size: 3856 bytes --]

On Mon, Apr 3, 2023 at 12:19 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Sun, Mar 26, 2023 at 09:28:38PM +0200, Uros Bizjak wrote:
> > On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > > > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > >
> > > > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > > > > >
> > > > > > Can you give an example of where we are passing an incompatible pointer?
> > > > >
> > > > > An example is patch 10/10 from the series, which will fail without
> > > > > this fix when fallback code is used. We have:
> > > > >
> > > > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > > > >
> > > > > where rb->head is defined as:
> > > > >
> > > > > typedef struct {
> > > > >    atomic_long_t a;
> > > > > } local_t;
> > > > >
> > > > > while offset is defined as 'unsigned long'.
> > > >
> > > > Ok, but that's because we're doing the wrong thing to start with.
> > > >
> > > > Since local_t is defined in terms of atomic_long_t, we should define the
> > > > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > > > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > > > that in the callsite:
> > > >
> > > >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
> > >
> > > Sorry, that should be:
> > >
> > >         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))
> >
> > The fallbacks are a bit more complicated than above, and are different
> > from atomic_try_cmpxchg.
> >
> > Please note in patch 2/10, the fallbacks used when arch_try_cmpxchg_local
> > is not defined call arch_cmpxchg_local. Also in patch 2/10,
> > try_cmpxchg_local is introduced, where it calls
> > arch_try_cmpxchg_local. Targets (and generic code) simply define, e.g.:
> >
> > #define local_cmpxchg(l, o, n) \
> >        (cmpxchg_local(&((l)->a.counter), (o), (n)))
> > +#define local_try_cmpxchg(l, po, n) \
> > +       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))
> >
> > which is part of the local_t API. Targets should either define all
> > these #defines, or none. There are no partial fallbacks, unlike the case
> > with atomic_t.
>
> Whether or not there are fallbacks is immaterial.
>
> In those cases, architectures can just as easily write C wrappers, e.g.
>
> long local_cmpxchg(local_t *l, long old, long new)
> {
>         return cmpxchg_local(&l->a.counter, old, new);
> }
>
> long local_try_cmpxchg(local_t *l, long *old, long new)
> {
>         return try_cmpxchg_local(&l->a.counter, old, new);
> }

Please find attached the complete prototype patch that implements the
above suggestion.

The patch includes:
- implementation of instrumented try_cmpxchg{,64}_local definitions
- corresponding arch_try_cmpxchg{,64}_local fallback definitions
- generic local{,64}_try_cmpxchg (and local{,64}_cmpxchg) C wrappers

- x86 specific local_try_cmpxchg (and local_cmpxchg) C wrappers
- x86 specific arch_try_cmpxchg_local definition

- kernel/events/ring_buffer.c change to test local_try_cmpxchg
implementation and illustrate the transition
- arch/x86/events/core.c change to test local64_try_cmpxchg
implementation and illustrate the transition

The definition of atomic_long_t is different for 64-bit and 32-bit
targets (s64 vs int), so target-specific C wrappers have to use
different casts to account for this difference.
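
For reference, the underlying typedefs look roughly like this, which is
why a single cast cannot cover both configurations:

#ifdef CONFIG_64BIT
typedef atomic64_t atomic_long_t;	/* ->counter is s64 */
#else
typedef atomic_t atomic_long_t;		/* ->counter is int */
#endif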

Uros.

[-- Attachment #2: try_cmpxchg_local-04042023.diff.txt --]
[-- Type: text/plain, Size: 8903 bytes --]

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index d096b04bf80e..d9310e9363f1 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -129,13 +129,12 @@ u64 x86_perf_event_update(struct perf_event *event)
 	 * exchange a new raw count - then add that new-prev delta
 	 * count to the generic event atomically:
 	 */
-again:
 	prev_raw_count = local64_read(&hwc->prev_count);
-	rdpmcl(hwc->event_base_rdpmc, new_raw_count);
 
-	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-					new_raw_count) != prev_raw_count)
-		goto again;
+	do {
+		rdpmcl(hwc->event_base_rdpmc, new_raw_count);
+	} while (!local64_try_cmpxchg(&hwc->prev_count, &prev_raw_count,
+				      new_raw_count));
 
 	/*
 	 * Now we have the new raw value and have updated the prev
diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
index 94fbe6ae7431..540573f515b7 100644
--- a/arch/x86/include/asm/cmpxchg.h
+++ b/arch/x86/include/asm/cmpxchg.h
@@ -221,9 +221,15 @@ extern void __add_wrong_size(void)
 #define __try_cmpxchg(ptr, pold, new, size)				\
 	__raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
 
+#define __try_cmpxchg_local(ptr, pold, new, size)			\
+	__raw_try_cmpxchg((ptr), (pold), (new), (size), "")
+
 #define arch_try_cmpxchg(ptr, pold, new) 				\
 	__try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
 
+#define arch_try_cmpxchg_local(ptr, pold, new)				\
+	__try_cmpxchg_local((ptr), (pold), (new), sizeof(*(ptr)))
+
 /*
  * xadd() adds "inc" to "*ptr" and atomically returns the previous
  * value of "*ptr".
diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
index 349a47acaa4a..d286a6c7c0b7 100644
--- a/arch/x86/include/asm/local.h
+++ b/arch/x86/include/asm/local.h
@@ -120,8 +120,20 @@ static inline long local_sub_return(long i, local_t *l)
 #define local_inc_return(l)  (local_add_return(1, l))
 #define local_dec_return(l)  (local_sub_return(1, l))
 
-#define local_cmpxchg(l, o, n) \
-	(cmpxchg_local(&((l)->a.counter), (o), (n)))
+static inline long local_cmpxchg(local_t *l, long old, long new)
+{
+	return cmpxchg_local(&l->a.counter, old, new);
+}
+
+static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
+{
+#ifdef CONFIG_64BIT
+	return try_cmpxchg_local(&l->a.counter, (s64 *)old, new);
+#else
+	return try_cmpxchg_local(&l->a.counter, (int *)old, new);
+#endif
+}
+
 /* Always has a lock prefix */
 #define local_xchg(l, n) (xchg(&((l)->a.counter), (n)))
 
diff --git a/include/asm-generic/local.h b/include/asm-generic/local.h
index fca7f1d84818..7f97018df66f 100644
--- a/include/asm-generic/local.h
+++ b/include/asm-generic/local.h
@@ -42,6 +42,7 @@ typedef struct
 #define local_inc_return(l) atomic_long_inc_return(&(l)->a)
 
 #define local_cmpxchg(l, o, n) atomic_long_cmpxchg((&(l)->a), (o), (n))
+#define local_try_cmpxchg(l, po, n) atomic_long_try_cmpxchg((&(l)->a), (po), (n))
 #define local_xchg(l, n) atomic_long_xchg((&(l)->a), (n))
 #define local_add_unless(l, _a, u) atomic_long_add_unless((&(l)->a), (_a), (u))
 #define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
diff --git a/include/asm-generic/local64.h b/include/asm-generic/local64.h
index 765be0b7d883..14963a7a6253 100644
--- a/include/asm-generic/local64.h
+++ b/include/asm-generic/local64.h
@@ -42,7 +42,16 @@ typedef struct {
 #define local64_sub_return(i, l) local_sub_return((i), (&(l)->a))
 #define local64_inc_return(l)	local_inc_return(&(l)->a)
 
-#define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n))
+static inline s64 local64_cmpxchg(local64_t *l, s64 old, s64 new)
+{
+	return local_cmpxchg(&l->a, old, new);
+}
+
+static inline bool local64_try_cmpxchg(local64_t *l, s64 *old, s64 new)
+{
+	return local_try_cmpxchg(&l->a, (long *)old, new);
+}
+
 #define local64_xchg(l, n)	local_xchg((&(l)->a), (n))
 #define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u))
 #define local64_inc_not_zero(l)	local_inc_not_zero(&(l)->a)
@@ -81,6 +90,7 @@ typedef struct {
 #define local64_inc_return(l)	atomic64_inc_return(&(l)->a)
 
 #define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n))
+#define local64_try_cmpxchg(l, po, n) atomic64_try_cmpxchg((&(l)->a), (po), (n))
 #define local64_xchg(l, n)	atomic64_xchg((&(l)->a), (n))
 #define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u))
 #define local64_inc_not_zero(l)	atomic64_inc_not_zero(&(l)->a)
diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
index 77bc5522e61c..36c92851cdee 100644
--- a/include/linux/atomic/atomic-arch-fallback.h
+++ b/include/linux/atomic/atomic-arch-fallback.h
@@ -217,6 +217,28 @@
 
 #endif /* arch_try_cmpxchg64_relaxed */
 
+#ifndef arch_try_cmpxchg_local
+#define arch_try_cmpxchg_local(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg_local((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg_local */
+
+#ifndef arch_try_cmpxchg64_local
+#define arch_try_cmpxchg64_local(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg64_local((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg64_local */
+
 #ifndef arch_atomic_read_acquire
 static __always_inline int
 arch_atomic_read_acquire(const atomic_t *v)
@@ -2456,4 +2478,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
 #endif
 
 #endif /* _LINUX_ATOMIC_FALLBACK_H */
-// b5e87bdd5ede61470c29f7a7e4de781af3770f09
+// 1f49bd4895a4b7a5383906649027205c52ec80ab
diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h
index 7a139ec030b0..14a9212cc987 100644
--- a/include/linux/atomic/atomic-instrumented.h
+++ b/include/linux/atomic/atomic-instrumented.h
@@ -2066,6 +2066,24 @@ atomic_long_dec_if_positive(atomic_long_t *v)
 	arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
 })
 
+#define try_cmpxchg_local(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg64_local(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
 #define cmpxchg_double(ptr, ...) \
 ({ \
 	typeof(ptr) __ai_ptr = (ptr); \
@@ -2083,4 +2101,4 @@ atomic_long_dec_if_positive(atomic_long_t *v)
 })
 
 #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
-// 764f741eb77a7ad565dc8d99ce2837d5542e8aee
+// 456e206c7e4e681126c482e4edcc6f46921ac731
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
index 273a0fe7910a..111ab85ee97d 100644
--- a/kernel/events/ring_buffer.c
+++ b/kernel/events/ring_buffer.c
@@ -191,9 +191,10 @@ __perf_output_begin(struct perf_output_handle *handle,
 
 	perf_output_get_handle(handle);
 
+	offset = local_read(&rb->head);
 	do {
 		tail = READ_ONCE(rb->user_page->data_tail);
-		offset = head = local_read(&rb->head);
+		head = offset;
 		if (!rb->overwrite) {
 			if (unlikely(!ring_buffer_has_space(head, tail,
 							    perf_data_size(rb),
@@ -217,7 +218,7 @@ __perf_output_begin(struct perf_output_handle *handle,
 			head += size;
 		else
 			head -= size;
-	} while (local_cmpxchg(&rb->head, offset, head) != offset);
+	} while (!local_try_cmpxchg(&rb->head, &offset, head));
 
 	if (backward) {
 		offset = head;
diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
index 3a07695e3c89..6e853f0dad8d 100755
--- a/scripts/atomic/gen-atomic-fallback.sh
+++ b/scripts/atomic/gen-atomic-fallback.sh
@@ -225,6 +225,10 @@ for cmpxchg in "cmpxchg" "cmpxchg64"; do
 	gen_try_cmpxchg_fallbacks "${cmpxchg}"
 done
 
+for cmpxchg in "cmpxchg_local" "cmpxchg64_local"; do
+	gen_try_cmpxchg_fallback "${cmpxchg}" ""
+done
+
 grep '^[a-z]' "$1" | while read name meta args; do
 	gen_proto "${meta}" "${name}" "atomic" "int" ${args}
 done
diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh
index 77c06526a574..c8165e9431bf 100755
--- a/scripts/atomic/gen-atomic-instrumented.sh
+++ b/scripts/atomic/gen-atomic-instrumented.sh
@@ -173,7 +173,7 @@ for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do
 	done
 done
 
-for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg"; do
+for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do
 	gen_xchg "${xchg}" "" ""
 	printf "\n"
 done

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-04-04 12:24                 ` Uros Bizjak
@ 2023-04-04 13:19                   ` Mark Rutland
  -1 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-04-04 13:19 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Tue, Apr 04, 2023 at 02:24:38PM +0200, Uros Bizjak wrote:
> On Mon, Apr 3, 2023 at 12:19 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Sun, Mar 26, 2023 at 09:28:38PM +0200, Uros Bizjak wrote:
> > > On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > > > > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > > > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > >
> > > > > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > > > > > >
> > > > > > > Can you give an example of where we are passing an incompatible pointer?
> > > > > >
> > > > > > An example is patch 10/10 from the series, which will fail without
> > > > > > this fix when fallback code is used. We have:
> > > > > >
> > > > > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > > > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > > > > >
> > > > > > where rb->head is defined as:
> > > > > >
> > > > > > typedef struct {
> > > > > >    atomic_long_t a;
> > > > > > } local_t;
> > > > > >
> > > > > > while offset is defined as 'unsigned long'.
> > > > >
> > > > > Ok, but that's because we're doing the wrong thing to start with.
> > > > >
> > > > > Since local_t is defined in terms of atomic_long_t, we should define the
> > > > > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > > > > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > > > > that in the callsite:
> > > > >
> > > > >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
> > > >
> > > > Sorry, that should be:
> > > >
> > > >         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))
> > >
> > > The fallbacks are a bit more complicated than above, and are different
> > > from atomic_try_cmpxchg.
> > >
> > > Please note in patch 2/10, the fallbacks used when arch_try_cmpxchg_local
> > > is not defined call arch_cmpxchg_local. Also in patch 2/10,
> > > try_cmpxchg_local is introduced, where it calls
> > > arch_try_cmpxchg_local. Targets (and generic code) simply define, e.g.:
> > >
> > > #define local_cmpxchg(l, o, n) \
> > >        (cmpxchg_local(&((l)->a.counter), (o), (n)))
> > > +#define local_try_cmpxchg(l, po, n) \
> > > +       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))
> > >
> > > which is part of the local_t API. Targets should either define all
> > > these #defines, or none. There are no partial fallbacks, unlike the case
> > > with atomic_t.
> >
> > Whether or not there are fallbacks is immaterial.
> >
> > In those cases, architectures can just as easily write C wrappers, e.g.
> >
> > long local_cmpxchg(local_t *l, long old, long new)
> > {
> >         return cmpxchg_local(&l->a.counter, old, new);
> > }
> >
> > long local_try_cmpxchg(local_t *l, long *old, long new)
> > {
> >         return try_cmpxchg_local(&l->a.counter, old, new);
> > }
> 
> Please find attached the complete prototype patch that implements the
> above suggestion.
> 
> The patch includes:
> - implementation of instrumented try_cmpxchg{,64}_local definitions
> - corresponding arch_try_cmpxchg{,64}_local fallback definitions
> - generic local{,64}_try_cmpxchg (and local{,64}_cmpxchg) C wrappers
> 
> - x86 specific local_try_cmpxchg (and local_cmpxchg) C wrappers
> - x86 specific arch_try_cmpxchg_local definition
> 
> - kernel/events/ring_buffer.c change to test local_try_cmpxchg
> implementation and illustrate the transition
> - arch/x86/events/core.c change to test local64_try_cmpxchg
> implementation and illustrate the transition
> 
> The definition of atomic_long_t is different for 64-bit and 32-bit
> targets (s64 vs int), so target-specific C wrappers have to use
> different casts to account for this difference.
> 
> Uros.

Thanks for this!

FWIW, the patch (inline below) looks good to me.

Mark.

> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index d096b04bf80e..d9310e9363f1 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -129,13 +129,12 @@ u64 x86_perf_event_update(struct perf_event *event)
>  	 * exchange a new raw count - then add that new-prev delta
>  	 * count to the generic event atomically:
>  	 */
> -again:
>  	prev_raw_count = local64_read(&hwc->prev_count);
> -	rdpmcl(hwc->event_base_rdpmc, new_raw_count);
>  
> -	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> -					new_raw_count) != prev_raw_count)
> -		goto again;
> +	do {
> +		rdpmcl(hwc->event_base_rdpmc, new_raw_count);
> +	} while (!local64_try_cmpxchg(&hwc->prev_count, &prev_raw_count,
> +				      new_raw_count));
>  
>  	/*
>  	 * Now we have the new raw value and have updated the prev
> diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
> index 94fbe6ae7431..540573f515b7 100644
> --- a/arch/x86/include/asm/cmpxchg.h
> +++ b/arch/x86/include/asm/cmpxchg.h
> @@ -221,9 +221,15 @@ extern void __add_wrong_size(void)
>  #define __try_cmpxchg(ptr, pold, new, size)				\
>  	__raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
>  
> +#define __try_cmpxchg_local(ptr, pold, new, size)			\
> +	__raw_try_cmpxchg((ptr), (pold), (new), (size), "")
> +
>  #define arch_try_cmpxchg(ptr, pold, new) 				\
>  	__try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
>  
> +#define arch_try_cmpxchg_local(ptr, pold, new)				\
> +	__try_cmpxchg_local((ptr), (pold), (new), sizeof(*(ptr)))
> +
>  /*
>   * xadd() adds "inc" to "*ptr" and atomically returns the previous
>   * value of "*ptr".
> diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
> index 349a47acaa4a..d286a6c7c0b7 100644
> --- a/arch/x86/include/asm/local.h
> +++ b/arch/x86/include/asm/local.h
> @@ -120,8 +120,20 @@ static inline long local_sub_return(long i, local_t *l)
>  #define local_inc_return(l)  (local_add_return(1, l))
>  #define local_dec_return(l)  (local_sub_return(1, l))
>  
> -#define local_cmpxchg(l, o, n) \
> -	(cmpxchg_local(&((l)->a.counter), (o), (n)))
> +static inline long local_cmpxchg(local_t *l, long old, long new)
> +{
> +	return cmpxchg_local(&l->a.counter, old, new);
> +}
> +
> +static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
> +{
> +#ifdef CONFIG_64BIT
> +	return try_cmpxchg_local(&l->a.counter, (s64 *)old, new);
> +#else
> +	return try_cmpxchg_local(&l->a.counter, (int *)old, new);
> +#endif
> +}
> +
>  /* Always has a lock prefix */
>  #define local_xchg(l, n) (xchg(&((l)->a.counter), (n)))
>  
> diff --git a/include/asm-generic/local.h b/include/asm-generic/local.h
> index fca7f1d84818..7f97018df66f 100644
> --- a/include/asm-generic/local.h
> +++ b/include/asm-generic/local.h
> @@ -42,6 +42,7 @@ typedef struct
>  #define local_inc_return(l) atomic_long_inc_return(&(l)->a)
>  
>  #define local_cmpxchg(l, o, n) atomic_long_cmpxchg((&(l)->a), (o), (n))
> +#define local_try_cmpxchg(l, po, n) atomic_long_try_cmpxchg((&(l)->a), (po), (n))
>  #define local_xchg(l, n) atomic_long_xchg((&(l)->a), (n))
>  #define local_add_unless(l, _a, u) atomic_long_add_unless((&(l)->a), (_a), (u))
>  #define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
> diff --git a/include/asm-generic/local64.h b/include/asm-generic/local64.h
> index 765be0b7d883..14963a7a6253 100644
> --- a/include/asm-generic/local64.h
> +++ b/include/asm-generic/local64.h
> @@ -42,7 +42,16 @@ typedef struct {
>  #define local64_sub_return(i, l) local_sub_return((i), (&(l)->a))
>  #define local64_inc_return(l)	local_inc_return(&(l)->a)
>  
> -#define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n))
> +static inline s64 local64_cmpxchg(local64_t *l, s64 old, s64 new)
> +{
> +	return local_cmpxchg(&l->a, old, new);
> +}
> +
> +static inline bool local64_try_cmpxchg(local64_t *l, s64 *old, s64 new)
> +{
> +	return local_try_cmpxchg(&l->a, (long *)old, new);
> +}
> +
>  #define local64_xchg(l, n)	local_xchg((&(l)->a), (n))
>  #define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u))
>  #define local64_inc_not_zero(l)	local_inc_not_zero(&(l)->a)
> @@ -81,6 +90,7 @@ typedef struct {
>  #define local64_inc_return(l)	atomic64_inc_return(&(l)->a)
>  
>  #define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n))
> +#define local64_try_cmpxchg(l, po, n) atomic64_try_cmpxchg((&(l)->a), (po), (n))
>  #define local64_xchg(l, n)	atomic64_xchg((&(l)->a), (n))
>  #define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u))
>  #define local64_inc_not_zero(l)	atomic64_inc_not_zero(&(l)->a)
> diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
> index 77bc5522e61c..36c92851cdee 100644
> --- a/include/linux/atomic/atomic-arch-fallback.h
> +++ b/include/linux/atomic/atomic-arch-fallback.h
> @@ -217,6 +217,28 @@
>  
>  #endif /* arch_try_cmpxchg64_relaxed */
>  
> +#ifndef arch_try_cmpxchg_local
> +#define arch_try_cmpxchg_local(_ptr, _oldp, _new) \
> +({ \
> +	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	___r = arch_cmpxchg_local((_ptr), ___o, (_new)); \
> +	if (unlikely(___r != ___o)) \
> +		*___op = ___r; \
> +	likely(___r == ___o); \
> +})
> +#endif /* arch_try_cmpxchg_local */
> +
> +#ifndef arch_try_cmpxchg64_local
> +#define arch_try_cmpxchg64_local(_ptr, _oldp, _new) \
> +({ \
> +	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	___r = arch_cmpxchg64_local((_ptr), ___o, (_new)); \
> +	if (unlikely(___r != ___o)) \
> +		*___op = ___r; \
> +	likely(___r == ___o); \
> +})
> +#endif /* arch_try_cmpxchg64_local */
> +
>  #ifndef arch_atomic_read_acquire
>  static __always_inline int
>  arch_atomic_read_acquire(const atomic_t *v)
> @@ -2456,4 +2478,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
>  #endif
>  
>  #endif /* _LINUX_ATOMIC_FALLBACK_H */
> -// b5e87bdd5ede61470c29f7a7e4de781af3770f09
> +// 1f49bd4895a4b7a5383906649027205c52ec80ab
> diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h
> index 7a139ec030b0..14a9212cc987 100644
> --- a/include/linux/atomic/atomic-instrumented.h
> +++ b/include/linux/atomic/atomic-instrumented.h
> @@ -2066,6 +2066,24 @@ atomic_long_dec_if_positive(atomic_long_t *v)
>  	arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
>  })
>  
> +#define try_cmpxchg_local(ptr, oldp, ...) \
> +({ \
> +	typeof(ptr) __ai_ptr = (ptr); \
> +	typeof(oldp) __ai_oldp = (oldp); \
> +	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
> +	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
> +	arch_try_cmpxchg_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
> +})
> +
> +#define try_cmpxchg64_local(ptr, oldp, ...) \
> +({ \
> +	typeof(ptr) __ai_ptr = (ptr); \
> +	typeof(oldp) __ai_oldp = (oldp); \
> +	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
> +	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
> +	arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
> +})
> +
>  #define cmpxchg_double(ptr, ...) \
>  ({ \
>  	typeof(ptr) __ai_ptr = (ptr); \
> @@ -2083,4 +2101,4 @@ atomic_long_dec_if_positive(atomic_long_t *v)
>  })
>  
>  #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
> -// 764f741eb77a7ad565dc8d99ce2837d5542e8aee
> +// 456e206c7e4e681126c482e4edcc6f46921ac731
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 273a0fe7910a..111ab85ee97d 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -191,9 +191,10 @@ __perf_output_begin(struct perf_output_handle *handle,
>  
>  	perf_output_get_handle(handle);
>  
> +	offset = local_read(&rb->head);
>  	do {
>  		tail = READ_ONCE(rb->user_page->data_tail);
> -		offset = head = local_read(&rb->head);
> +		head = offset;
>  		if (!rb->overwrite) {
>  			if (unlikely(!ring_buffer_has_space(head, tail,
>  							    perf_data_size(rb),
> @@ -217,7 +218,7 @@ __perf_output_begin(struct perf_output_handle *handle,
>  			head += size;
>  		else
>  			head -= size;
> -	} while (local_cmpxchg(&rb->head, offset, head) != offset);
> +	} while (!local_try_cmpxchg(&rb->head, &offset, head));
>  
>  	if (backward) {
>  		offset = head;
> diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
> index 3a07695e3c89..6e853f0dad8d 100755
> --- a/scripts/atomic/gen-atomic-fallback.sh
> +++ b/scripts/atomic/gen-atomic-fallback.sh
> @@ -225,6 +225,10 @@ for cmpxchg in "cmpxchg" "cmpxchg64"; do
>  	gen_try_cmpxchg_fallbacks "${cmpxchg}"
>  done
>  
> +for cmpxchg in "cmpxchg_local" "cmpxchg64_local"; do
> +	gen_try_cmpxchg_fallback "${cmpxchg}" ""
> +done
> +
>  grep '^[a-z]' "$1" | while read name meta args; do
>  	gen_proto "${meta}" "${name}" "atomic" "int" ${args}
>  done
> diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh
> index 77c06526a574..c8165e9431bf 100755
> --- a/scripts/atomic/gen-atomic-instrumented.sh
> +++ b/scripts/atomic/gen-atomic-instrumented.sh
> @@ -173,7 +173,7 @@ for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do
>  	done
>  done
>  
> -for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg"; do
> +for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do
>  	gen_xchg "${xchg}" "" ""
>  	printf "\n"
>  done


^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
@ 2023-04-04 13:19                   ` Mark Rutland
  0 siblings, 0 replies; 43+ messages in thread
From: Mark Rutland @ 2023-04-04 13:19 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: linux-arch, Peter Zijlstra, Will Deacon, Boqun Feng, linux-mips,
	linux-kernel, linux-perf-users, loongarch, linux-alpha,
	linuxppc-dev

On Tue, Apr 04, 2023 at 02:24:38PM +0200, Uros Bizjak wrote:
> On Mon, Apr 3, 2023 at 12:19 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Sun, Mar 26, 2023 at 09:28:38PM +0200, Uros Bizjak wrote:
> > > On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > > > > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > > > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > >
> > > > > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > > > > Cast _oldp to the type of _ptr to avoid incompatible-pointer-types warning.
> > > > > > >
> > > > > > > Can you give an example of where we are passing an incompatible pointer?
> > > > > >
> > > > > > An example is patch 10/10 from the series, which will fail without
> > > > > > this fix when fallback code is used. We have:
> > > > > >
> > > > > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > > > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > > > > >
> > > > > > where rb->head is defined as:
> > > > > >
> > > > > > typedef struct {
> > > > > >    atomic_long_t a;
> > > > > > } local_t;
> > > > > >
> > > > > > while offset is defined as 'unsigned long'.
> > > > >
> > > > > Ok, but that's because we're doing the wrong thing to start with.
> > > > >
> > > > > Since local_t is defined in terms of atomic_long_t, we should define the
> > > > > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > > > > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > > > > that in the callsite:
> > > > >
> > > > >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
> > > >
> > > > Sorry, that should be:
> > > >
> > > >         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))
> > >
> > > The fallbacks are a bit more complicated than above, and are different
> > > from atomic_try_cmpxchg.
> > >
> > > Please note in patch 2/10, the falbacks when arch_try_cmpxchg_local
> > > are not defined call arch_cmpxchg_local. Also in patch 2/10,
> > > try_cmpxchg_local is introduced, where it calls
> > > arch_try_cmpxchg_local. Targets (and generic code) simply define (e.g.
> > > :
> > >
> > > #define local_cmpxchg(l, o, n) \
> > >        (cmpxchg_local(&((l)->a.counter), (o), (n)))
> > > +#define local_try_cmpxchg(l, po, n) \
> > > +       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))
> > >
> > > which is part of the local_t API. Targets should either define all
> > > these #defines, or none. There are no partial fallbacks as is the case
> > > with atomic_t.
> >
> > Whether or not there are fallbacks is immaterial.
> >
> > In those cases, architectures can just as easily write C wrappers, e.g.
> >
> > long local_cmpxchg(local_t *l, long old, long new)
> > {
> >         return cmpxchg_local(&l->a.counter, old, new);
> > }
> >
> > long local_try_cmpxchg(local_t *l, long *old, long new)
> > {
> >         return try_cmpxchg_local(&l->a.counter, old, new);
> > }
> 
> Please find attached the complete prototype patch that implements the
> above suggestion.
> 
> The patch includes:
> - implementation of instrumented try_cmpxchg{,64}_local definitions
> - corresponding arch_try_cmpxchg{,64}_local fallback definitions
> - generic local{,64}_try_cmpxchg (and local{,64}_cmpxchg) C wrappers
> 
> - x86 specific local_try_cmpxchg (and local_cmpxchg) C wrappers
> - x86 specific arch_try_cmpxchg_local definition
> 
> - kernel/events/ring_buffer.c change to test local_try_cmpxchg
> implementation and illustrate the transition
> - arch/x86/events/core.c change to test local64_try_cmpxchg
> implementation and illustrate the transition
> 
> The definition of atomic_long_t is different for 64-bit and 32-bit
> targets (s64 vs int), so target specific C wrappers have to use
> different casts to account for this difference.
> 
> Uros.

Thanks for this!

FWIW, the patch (inline below) looks good to me.

Mark.

> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index d096b04bf80e..d9310e9363f1 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -129,13 +129,12 @@ u64 x86_perf_event_update(struct perf_event *event)
>  	 * exchange a new raw count - then add that new-prev delta
>  	 * count to the generic event atomically:
>  	 */
> -again:
>  	prev_raw_count = local64_read(&hwc->prev_count);
> -	rdpmcl(hwc->event_base_rdpmc, new_raw_count);
>  
> -	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
> -					new_raw_count) != prev_raw_count)
> -		goto again;
> +	do {
> +		rdpmcl(hwc->event_base_rdpmc, new_raw_count);
> +	} while (!local64_try_cmpxchg(&hwc->prev_count, &prev_raw_count,
> +				      new_raw_count));
>  
>  	/*
>  	 * Now we have the new raw value and have updated the prev
> diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
> index 94fbe6ae7431..540573f515b7 100644
> --- a/arch/x86/include/asm/cmpxchg.h
> +++ b/arch/x86/include/asm/cmpxchg.h
> @@ -221,9 +221,15 @@ extern void __add_wrong_size(void)
>  #define __try_cmpxchg(ptr, pold, new, size)				\
>  	__raw_try_cmpxchg((ptr), (pold), (new), (size), LOCK_PREFIX)
>  
> +#define __try_cmpxchg_local(ptr, pold, new, size)			\
> +	__raw_try_cmpxchg((ptr), (pold), (new), (size), "")
> +
>  #define arch_try_cmpxchg(ptr, pold, new) 				\
>  	__try_cmpxchg((ptr), (pold), (new), sizeof(*(ptr)))
>  
> +#define arch_try_cmpxchg_local(ptr, pold, new)				\
> +	__try_cmpxchg_local((ptr), (pold), (new), sizeof(*(ptr)))
> +
>  /*
>   * xadd() adds "inc" to "*ptr" and atomically returns the previous
>   * value of "*ptr".
> diff --git a/arch/x86/include/asm/local.h b/arch/x86/include/asm/local.h
> index 349a47acaa4a..d286a6c7c0b7 100644
> --- a/arch/x86/include/asm/local.h
> +++ b/arch/x86/include/asm/local.h
> @@ -120,8 +120,20 @@ static inline long local_sub_return(long i, local_t *l)
>  #define local_inc_return(l)  (local_add_return(1, l))
>  #define local_dec_return(l)  (local_sub_return(1, l))
>  
> -#define local_cmpxchg(l, o, n) \
> -	(cmpxchg_local(&((l)->a.counter), (o), (n)))
> +static inline long local_cmpxchg(local_t *l, long old, long new)
> +{
> +	return cmpxchg_local(&l->a.counter, old, new);
> +}
> +
> +static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
> +{
> +#ifdef CONFIG_64BIT
> +	return try_cmpxchg_local(&l->a.counter, (s64 *)old, new);
> +#else
> +	return try_cmpxchg_local(&l->a.counter, (int *)old, new);
> +#endif
> +}
> +
>  /* Always has a lock prefix */
>  #define local_xchg(l, n) (xchg(&((l)->a.counter), (n)))
>  
> diff --git a/include/asm-generic/local.h b/include/asm-generic/local.h
> index fca7f1d84818..7f97018df66f 100644
> --- a/include/asm-generic/local.h
> +++ b/include/asm-generic/local.h
> @@ -42,6 +42,7 @@ typedef struct
>  #define local_inc_return(l) atomic_long_inc_return(&(l)->a)
>  
>  #define local_cmpxchg(l, o, n) atomic_long_cmpxchg((&(l)->a), (o), (n))
> +#define local_try_cmpxchg(l, po, n) atomic_long_try_cmpxchg((&(l)->a), (po), (n))
>  #define local_xchg(l, n) atomic_long_xchg((&(l)->a), (n))
>  #define local_add_unless(l, _a, u) atomic_long_add_unless((&(l)->a), (_a), (u))
>  #define local_inc_not_zero(l) atomic_long_inc_not_zero(&(l)->a)
> diff --git a/include/asm-generic/local64.h b/include/asm-generic/local64.h
> index 765be0b7d883..14963a7a6253 100644
> --- a/include/asm-generic/local64.h
> +++ b/include/asm-generic/local64.h
> @@ -42,7 +42,16 @@ typedef struct {
>  #define local64_sub_return(i, l) local_sub_return((i), (&(l)->a))
>  #define local64_inc_return(l)	local_inc_return(&(l)->a)
>  
> -#define local64_cmpxchg(l, o, n) local_cmpxchg((&(l)->a), (o), (n))
> +static inline s64 local64_cmpxchg(local64_t *l, s64 old, s64 new)
> +{
> +	return local_cmpxchg(&l->a, old, new);
> +}
> +
> +static inline bool local64_try_cmpxchg(local64_t *l, s64 *old, s64 new)
> +{
> +	return local_try_cmpxchg(&l->a, (long *)old, new);
> +}
> +
>  #define local64_xchg(l, n)	local_xchg((&(l)->a), (n))
>  #define local64_add_unless(l, _a, u) local_add_unless((&(l)->a), (_a), (u))
>  #define local64_inc_not_zero(l)	local_inc_not_zero(&(l)->a)
> @@ -81,6 +90,7 @@ typedef struct {
>  #define local64_inc_return(l)	atomic64_inc_return(&(l)->a)
>  
>  #define local64_cmpxchg(l, o, n) atomic64_cmpxchg((&(l)->a), (o), (n))
> +#define local64_try_cmpxchg(l, po, n) atomic64_try_cmpxchg((&(l)->a), (po), (n))
>  #define local64_xchg(l, n)	atomic64_xchg((&(l)->a), (n))
>  #define local64_add_unless(l, _a, u) atomic64_add_unless((&(l)->a), (_a), (u))
>  #define local64_inc_not_zero(l)	atomic64_inc_not_zero(&(l)->a)
> diff --git a/include/linux/atomic/atomic-arch-fallback.h b/include/linux/atomic/atomic-arch-fallback.h
> index 77bc5522e61c..36c92851cdee 100644
> --- a/include/linux/atomic/atomic-arch-fallback.h
> +++ b/include/linux/atomic/atomic-arch-fallback.h
> @@ -217,6 +217,28 @@
>  
>  #endif /* arch_try_cmpxchg64_relaxed */
>  
> +#ifndef arch_try_cmpxchg_local
> +#define arch_try_cmpxchg_local(_ptr, _oldp, _new) \
> +({ \
> +	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	___r = arch_cmpxchg_local((_ptr), ___o, (_new)); \
> +	if (unlikely(___r != ___o)) \
> +		*___op = ___r; \
> +	likely(___r == ___o); \
> +})
> +#endif /* arch_try_cmpxchg_local */
> +
> +#ifndef arch_try_cmpxchg64_local
> +#define arch_try_cmpxchg64_local(_ptr, _oldp, _new) \
> +({ \
> +	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
> +	___r = arch_cmpxchg64_local((_ptr), ___o, (_new)); \
> +	if (unlikely(___r != ___o)) \
> +		*___op = ___r; \
> +	likely(___r == ___o); \
> +})
> +#endif /* arch_try_cmpxchg64_local */
> +
>  #ifndef arch_atomic_read_acquire
>  static __always_inline int
>  arch_atomic_read_acquire(const atomic_t *v)
> @@ -2456,4 +2478,4 @@ arch_atomic64_dec_if_positive(atomic64_t *v)
>  #endif
>  
>  #endif /* _LINUX_ATOMIC_FALLBACK_H */
> -// b5e87bdd5ede61470c29f7a7e4de781af3770f09
> +// 1f49bd4895a4b7a5383906649027205c52ec80ab
> diff --git a/include/linux/atomic/atomic-instrumented.h b/include/linux/atomic/atomic-instrumented.h
> index 7a139ec030b0..14a9212cc987 100644
> --- a/include/linux/atomic/atomic-instrumented.h
> +++ b/include/linux/atomic/atomic-instrumented.h
> @@ -2066,6 +2066,24 @@ atomic_long_dec_if_positive(atomic_long_t *v)
>  	arch_sync_cmpxchg(__ai_ptr, __VA_ARGS__); \
>  })
>  
> +#define try_cmpxchg_local(ptr, oldp, ...) \
> +({ \
> +	typeof(ptr) __ai_ptr = (ptr); \
> +	typeof(oldp) __ai_oldp = (oldp); \
> +	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
> +	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
> +	arch_try_cmpxchg_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
> +})
> +
> +#define try_cmpxchg64_local(ptr, oldp, ...) \
> +({ \
> +	typeof(ptr) __ai_ptr = (ptr); \
> +	typeof(oldp) __ai_oldp = (oldp); \
> +	instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
> +	instrument_atomic_write(__ai_oldp, sizeof(*__ai_oldp)); \
> +	arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
> +})
> +
>  #define cmpxchg_double(ptr, ...) \
>  ({ \
>  	typeof(ptr) __ai_ptr = (ptr); \
> @@ -2083,4 +2101,4 @@ atomic_long_dec_if_positive(atomic_long_t *v)
>  })
>  
>  #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
> -// 764f741eb77a7ad565dc8d99ce2837d5542e8aee
> +// 456e206c7e4e681126c482e4edcc6f46921ac731
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 273a0fe7910a..111ab85ee97d 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -191,9 +191,10 @@ __perf_output_begin(struct perf_output_handle *handle,
>  
>  	perf_output_get_handle(handle);
>  
> +	offset = local_read(&rb->head);
>  	do {
>  		tail = READ_ONCE(rb->user_page->data_tail);
> -		offset = head = local_read(&rb->head);
> +		head = offset;
>  		if (!rb->overwrite) {
>  			if (unlikely(!ring_buffer_has_space(head, tail,
>  							    perf_data_size(rb),
> @@ -217,7 +218,7 @@ __perf_output_begin(struct perf_output_handle *handle,
>  			head += size;
>  		else
>  			head -= size;
> -	} while (local_cmpxchg(&rb->head, offset, head) != offset);
> +	} while (!local_try_cmpxchg(&rb->head, &offset, head));
>  
>  	if (backward) {
>  		offset = head;
> diff --git a/scripts/atomic/gen-atomic-fallback.sh b/scripts/atomic/gen-atomic-fallback.sh
> index 3a07695e3c89..6e853f0dad8d 100755
> --- a/scripts/atomic/gen-atomic-fallback.sh
> +++ b/scripts/atomic/gen-atomic-fallback.sh
> @@ -225,6 +225,10 @@ for cmpxchg in "cmpxchg" "cmpxchg64"; do
>  	gen_try_cmpxchg_fallbacks "${cmpxchg}"
>  done
>  
> +for cmpxchg in "cmpxchg_local" "cmpxchg64_local"; do
> +	gen_try_cmpxchg_fallback "${cmpxchg}" ""
> +done
> +
>  grep '^[a-z]' "$1" | while read name meta args; do
>  	gen_proto "${meta}" "${name}" "atomic" "int" ${args}
>  done
> diff --git a/scripts/atomic/gen-atomic-instrumented.sh b/scripts/atomic/gen-atomic-instrumented.sh
> index 77c06526a574..c8165e9431bf 100755
> --- a/scripts/atomic/gen-atomic-instrumented.sh
> +++ b/scripts/atomic/gen-atomic-instrumented.sh
> @@ -173,7 +173,7 @@ for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do
>  	done
>  done
>  
> -for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg"; do
> +for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do
>  	gen_xchg "${xchg}" "" ""
>  	printf "\n"
>  done
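
For illustration, the conversion pattern used in the ring_buffer hunk
looks like this in isolation (a minimal sketch with hypothetical
counter_inc helpers, assuming the local_try_cmpxchg() that this series
wires up):

	#include <asm/local.h>

	/* Old style: reload and compare the return value on every
	 * iteration of the loop. */
	static inline void counter_inc_old(local_t *l)
	{
		long old, new;

		do {
			old = local_read(l);
			new = old + 1;
		} while (local_cmpxchg(l, old, new) != old);
	}

	/* New style: on failure, local_try_cmpxchg() writes the value
	 * it found back into 'old', so the initial load stays out of
	 * the loop and no separate compare is needed. */
	static inline void counter_inc_new(local_t *l)
	{
		long old = local_read(l);

		while (!local_try_cmpxchg(l, &old, old + 1))
			;
	}

The second form is what lets the compiler drop the redundant cmp and
hoist the load, as shown in the asm comparison in the cover letter.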



* Re: [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  2023-04-04 13:19                   ` Mark Rutland
@ 2023-04-04 13:23                     ` Uros Bizjak
  -1 siblings, 0 replies; 43+ messages in thread
From: Uros Bizjak @ 2023-04-04 13:23 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users, Will Deacon, Peter Zijlstra,
	Boqun Feng

On Tue, Apr 4, 2023 at 3:19 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Tue, Apr 04, 2023 at 02:24:38PM +0200, Uros Bizjak wrote:
> > On Mon, Apr 3, 2023 at 12:19 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Sun, Mar 26, 2023 at 09:28:38PM +0200, Uros Bizjak wrote:
> > > > On Fri, Mar 24, 2023 at 5:33 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > >
> > > > > On Fri, Mar 24, 2023 at 04:14:22PM +0000, Mark Rutland wrote:
> > > > > > On Fri, Mar 24, 2023 at 04:43:32PM +0100, Uros Bizjak wrote:
> > > > > > > On Fri, Mar 24, 2023 at 3:13 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > > > > > >
> > > > > > > > On Sun, Mar 05, 2023 at 09:56:19PM +0100, Uros Bizjak wrote:
> > > > > > > > > Cast _oldp to the type of _ptr to avoid an incompatible-pointer-types warning.
> > > > > > > >
> > > > > > > > Can you give an example of where we are passing an incompatible pointer?
> > > > > > >
> > > > > > > An example is patch 10/10 from the series, which will fail without
> > > > > > > this fix when fallback code is used. We have:
> > > > > > >
> > > > > > > -       } while (local_cmpxchg(&rb->head, offset, head) != offset);
> > > > > > > +       } while (!local_try_cmpxchg(&rb->head, &offset, head));
> > > > > > >
> > > > > > > where rb->head is defined as:
> > > > > > >
> > > > > > > typedef struct {
> > > > > > >    atomic_long_t a;
> > > > > > > } local_t;
> > > > > > >
> > > > > > > while offset is defined as 'unsigned long'.
> > > > > >
> > > > > > Ok, but that's because we're doing the wrong thing to start with.
> > > > > >
> > > > > > Since local_t is defined in terms of atomic_long_t, we should define the
> > > > > > generic local_try_cmpxchg() in terms of atomic_long_try_cmpxchg(). We'll still
> > > > > > have a mismatch between 'long *' and 'unsigned long *', but then we can fix
> > > > > > that in the callsite:
> > > > > >
> > > > > >       while (!local_try_cmpxchg(&rb->head, &(long *)offset, head))
> > > > >
> > > > > Sorry, that should be:
> > > > >
> > > > >         while (!local_try_cmpxchg(&rb->head, (long *)&offset, head))
> > > >
> > > > The fallbacks are a bit more complicated than above, and are different
> > > > from atomic_try_cmpxchg.
> > > >
> > > > Please note that in patch 2/10, the fallbacks call arch_cmpxchg_local
> > > > when arch_try_cmpxchg_local is not defined. Also in patch 2/10,
> > > > try_cmpxchg_local is introduced, and it calls
> > > > arch_try_cmpxchg_local. Targets (and generic code) simply define, e.g.:
> > > >
> > > > #define local_cmpxchg(l, o, n) \
> > > >        (cmpxchg_local(&((l)->a.counter), (o), (n)))
> > > > +#define local_try_cmpxchg(l, po, n) \
> > > > +       (try_cmpxchg_local(&((l)->a.counter), (po), (n)))
> > > >
> > > > which is part of the local_t API. Targets should either define all of
> > > > these #defines or none; there are no partial fallbacks of the kind
> > > > atomic_t has.
> > >
> > > Whether or not there are fallbacks is immaterial.
> > >
> > > In those cases, architectures can just as easily write C wrappers, e.g.
> > >
> > > long local_cmpxchg(local_t *l, long old, long new)
> > > {
> > >         return cmpxchg_local(&l->a.counter, old, new);
> > > }
> > >
> > > long local_try_cmpxchg(local_t *l, long *old, long new)
> > > {
> > >         return try_cmpxchg_local(&l->a.counter, old, new);
> > > }
> >
> > Please find attached the complete prototype patch that implements the
> > above suggestion.
> >
> > The patch includes:
> > - implementation of instrumented try_cmpxchg{,64}_local definitions
> > - corresponding arch_try_cmpxchg{,64}_local fallback definitions
> > - generic local{,64}_try_cmpxchg (and local{,64}_cmpxchg) C wrappers
> >
> > - x86 specific local_try_cmpxchg (and local_cmpxchg) C wrappers
> > - x86 specific arch_try_cmpxchg_local definition
> >
> > - kernel/events/ring_buffer.c change to test local_try_cmpxchg
> > implementation and illustrate the transition
> > - arch/x86/events/core.c change to test local64_try_cmpxchg
> > implementation and illustrate the transition
> >
> > The definition of atomic_long_t is different for 64-bit and 32-bit
> > targets (s64 vs int), so target-specific C wrappers have to use
> > different casts to account for this difference.
> >
> > Uros.
>
> Thanks for this!
>
> FWIW, the patch (inline below) looks good to me.

Thanks, I will prepare a patch series for submission later today.
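
To make the cast point above concrete, the wrapper shape under
discussion looks roughly like this (a sketch, assuming x86's local_t
layout; not necessarily the exact code from the attached patch):

	/* The local_t API takes 'long *', while l->a.counter is s64 on
	 * 64-bit targets and int on 32-bit targets; the cast bridges
	 * the two without needing separate 32-bit and 64-bit wrappers. */
	static inline bool local_try_cmpxchg(local_t *l, long *old, long new)
	{
		typeof(l->a.counter) *__old = (typeof(l->a.counter) *)old;

		return try_cmpxchg_local(&l->a.counter, __old, new);
	}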

Uros.


Thread overview: 43+ messages
2023-03-05 20:56 [PATCH 00/10] locking: Introduce local{,64}_try_cmpxchg Uros Bizjak
2023-03-05 20:56 ` [PATCH 01/10] locking/atomic: Add missing cast to try_cmpxchg() fallbacks Uros Bizjak
2023-03-24 14:13   ` Mark Rutland
2023-03-24 15:43     ` Uros Bizjak
2023-03-24 16:14       ` Mark Rutland
2023-03-24 16:32         ` Mark Rutland
2023-03-26 19:28           ` Uros Bizjak
2023-04-03 10:19             ` Mark Rutland
2023-04-04 12:24               ` Uros Bizjak
2023-04-04 13:19                 ` Mark Rutland
2023-04-04 13:23                   ` Uros Bizjak
2023-03-05 20:56 ` [PATCH 02/10] locking/atomic: Add generic try_cmpxchg{,64}_local support Uros Bizjak
2023-03-05 20:56 ` [PATCH 03/10] locking/alpha: Wire up local_try_cmpxchg Uros Bizjak
2023-03-05 20:56 ` [PATCH 04/10] locking/loongarch: " Uros Bizjak
2023-03-05 20:56 ` [PATCH 05/10] locking/mips: " Uros Bizjak
2023-03-05 20:56 ` [PATCH 06/10] locking/powerpc: " Uros Bizjak
2023-03-05 20:56 ` [PATCH 07/10] locking/x86: " Uros Bizjak
2023-03-05 20:56 ` [PATCH 08/10] locking/generic: Wire up local{,64}_try_cmpxchg Uros Bizjak
2023-03-05 20:56 ` [PATCH 09/10] locking/x86: Enable local{,64}_try_cmpxchg Uros Bizjak
2023-03-05 20:56 ` [PATCH 10/10] perf/ring_buffer: use local_try_cmpxchg in __perf_output_begin Uros Bizjak
