* [PATCH 00/10] locking: Introduce local{,64}_try_cmpxchg
From: Uros Bizjak @ 2023-03-05 20:56 UTC
  To: linux-alpha, linux-kernel, loongarch, linux-mips, linuxppc-dev,
	linux-arch, linux-perf-users
  Cc: Uros Bizjak, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Huacai Chen, WANG Xuerui, Thomas Bogendoerfer, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Arnd Bergmann,
	Peter Zijlstra, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Ian Rogers,
	Will Deacon, Boqun Feng, Jiaxun Yang, Jun Yi

Add generic and target-specific support for local{,64}_try_cmpxchg
and wire up support for all targets that use the local_t infrastructure.

The patch series enables x86 targets to emit a special instruction for
local_try_cmpxchg, and also for local64_try_cmpxchg on x86_64.
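
For readers unfamiliar with the try_cmpxchg calling convention, here is
a minimal userspace sketch of how a try_cmpxchg-style operation can be
layered on a value-returning cmpxchg. It is an illustration only, not
the kernel implementation: sketch_cmpxchg and sketch_try_cmpxchg are
hypothetical names, and the GCC/Clang builtin used here is a fully
ordered SMP atomic, unlike the CPU-local local_t operations.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for a value-returning cmpxchg(): stores new_val
 * if *ptr equals old_val, and returns the value that was found in *ptr.
 */
static inline long sketch_cmpxchg(long *ptr, long old_val, long new_val)
{
	return __sync_val_compare_and_swap(ptr, old_val, new_val);
}

/*
 * try_cmpxchg-style wrapper (assumed semantics): returns true when the
 * exchange happened; on failure, writes the value actually found back
 * into *oldp so the caller does not need to reload it.
 */
static inline bool sketch_try_cmpxchg(long *ptr, long *oldp, long new_val)
{
	long expected = *oldp;
	long found = sketch_cmpxchg(ptr, expected, new_val);

	if (found != expected)
		*oldp = found;
	return found == expected;
}
```

On x86, the CMPXCHG instruction sets ZF on success, so a native
implementation can return that flag directly and avoid the explicit
comparison that this fallback-style wrapper performs.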

The last patch changes __perf_output_begin in kernel/events/ring_buffer.c
to use the new locking primitive and improves the generated code from

     4b3:	48 8b 82 e8 00 00 00 	mov    0xe8(%rdx),%rax
     4ba:	48 8b b8 08 04 00 00 	mov    0x408(%rax),%rdi
     4c1:	8b 42 1c             	mov    0x1c(%rdx),%eax
     4c4:	48 8b 4a 28          	mov    0x28(%rdx),%rcx
     4c8:	85 c0                	test   %eax,%eax
     ...
     4ef:	48 89 c8             	mov    %rcx,%rax
     4f2:	48 0f b1 7a 28       	cmpxchg %rdi,0x28(%rdx)
     4f7:	48 39 c1             	cmp    %rax,%rcx
     4fa:	75 b7                	jne    4b3 <...>

to

     4b2:	48 8b 4a 28          	mov    0x28(%rdx),%rcx
     4b6:	48 8b 82 e8 00 00 00 	mov    0xe8(%rdx),%rax
     4bd:	48 8b b0 08 04 00 00 	mov    0x408(%rax),%rsi
     4c4:	8b 42 1c             	mov    0x1c(%rdx),%eax
     4c7:	85 c0                	test   %eax,%eax
     ...
     4d4:	48 89 c8             	mov    %rcx,%rax
     4d7:	48 0f b1 72 28       	cmpxchg %rsi,0x28(%rdx)
     4dc:	0f 85 d0 00 00 00    	jne    5b2 <...>
     ...
     5b2:	48 89 c1             	mov    %rax,%rcx
     5b5:	e9 fc fe ff ff       	jmp    4b6 <...>

Please note that, in addition to the removed compare, the load from
0x28(%rdx) is moved out of the loop and the code is rearranged
according to the likely/unlikely annotations in the source.
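
The source-level pattern behind this kind of codegen change can be
sketched in userspace C as follows. This is not the
__perf_output_begin code: reserve_cmpxchg and reserve_try_cmpxchg are
hypothetical helpers that advance a counter by size, and the GCC/Clang
atomic builtins stand in for the local_t operations (they are SMP
atomics, which local_t operations need not be).

```c
#include <stdbool.h>

/*
 * Old pattern: reload the counter on every iteration and compare the
 * value returned by the compare-and-swap against the expected one.
 */
static long reserve_cmpxchg(long *head, long size)
{
	long offset, new_head;

	do {
		offset = __atomic_load_n(head, __ATOMIC_RELAXED);
		new_head = offset + size;
	} while (__sync_val_compare_and_swap(head, offset, new_head) != offset);

	return offset;
}

/*
 * New pattern: load once before the loop. A failed exchange writes the
 * current value into offset, so no reload and no separate compare are
 * needed in the source.
 */
static long reserve_try_cmpxchg(long *head, long size)
{
	long offset = __atomic_load_n(head, __ATOMIC_RELAXED);
	long new_head;

	do {
		new_head = offset + size;
	} while (!__atomic_compare_exchange_n(head, &offset, new_head, false,
					      __ATOMIC_SEQ_CST,
					      __ATOMIC_SEQ_CST));

	return offset;
}
```

Both helpers return the counter value from before the update; the
second form gives the compiler the opportunity to branch directly on
the result of the exchange.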

Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Jun Yi <yijun@loongson.cn>

Uros Bizjak (10):
  locking/atomic: Add missing cast to try_cmpxchg() fallbacks
  locking/atomic: Add generic try_cmpxchg{,64}_local support
  locking/alpha: Wire up local_try_cmpxchg
  locking/loongarch: Wire up local_try_cmpxchg
  locking/mips: Wire up local_try_cmpxchg
  locking/powerpc: Wire up local_try_cmpxchg
  locking/x86: Wire up local_try_cmpxchg
  locking/generic: Wire up local{,64}_try_cmpxchg
  locking/x86: Enable local{,64}_try_cmpxchg
  perf/ring_buffer: use local_try_cmpxchg in __perf_output_begin

 arch/alpha/include/asm/local.h              |  2 ++
 arch/loongarch/include/asm/local.h          |  2 ++
 arch/mips/include/asm/local.h               |  2 ++
 arch/powerpc/include/asm/local.h            | 11 ++++++
 arch/x86/include/asm/cmpxchg.h              |  6 ++++
 arch/x86/include/asm/local.h                |  2 ++
 include/asm-generic/local.h                 |  1 +
 include/asm-generic/local64.h               |  2 ++
 include/linux/atomic/atomic-arch-fallback.h | 40 ++++++++++++++++-----
 include/linux/atomic/atomic-instrumented.h  | 20 ++++++++++-
 kernel/events/ring_buffer.c                 |  5 +--
 scripts/atomic/gen-atomic-fallback.sh       |  6 +++-
 scripts/atomic/gen-atomic-instrumented.sh   |  2 +-
 13 files changed, 87 insertions(+), 14 deletions(-)

-- 
2.39.2


