linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/11] Introduce cmpxchg128() -- aka. the demise of cmpxchg_double()
@ 2023-05-15  7:56 Peter Zijlstra
  2023-05-15  7:57 ` [PATCH v3 01/11] cyrpto/b128ops: Remove struct u128 Peter Zijlstra
                   ` (11 more replies)
  0 siblings, 12 replies; 46+ messages in thread
From: Peter Zijlstra @ 2023-05-15  7:56 UTC (permalink / raw)
  To: torvalds
  Cc: corbet, will, peterz, boqun.feng, mark.rutland, catalin.marinas,
	dennis, tj, cl, hca, gor, agordeev, borntraeger, svens, tglx,
	mingo, bp, dave.hansen, x86, hpa, joro, suravee.suthikulpanit,
	robin.murphy, dwmw2, baolu.lu, Arnd Bergmann, Herbert Xu, davem,
	penberg, rientjes, iamjoonsoo.kim, Andrew Morton, vbabka,
	roman.gushchin, 42.hyeyoo, linux-doc, linux-kernel, linux-mm,
	linux-s390, iommu, linux-arch, linux-crypto

Hi!

I seem to have forgotten to post this series last release; so here goes. I'm
really hoping to merge it and forget about it.


Since Linus hated on cmpxchg_double(), a few patches to get rid of it, as
proposed here:

  https://lkml.kernel.org/r/Y2U3WdU61FvYlpUh@hirez.programming.kicks-ass.net


These patches are based on 6.4.0-rc2.

Available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git core/wip-u128

Since v2:

 - reworked this_cpu_cmpxchg() to not implicity do u128 but provide explicit
   this_cpu_cmpxchg128() (arnd)
 - added try_cmpxchg12_local() (per the addition of the try_cmpxchg*_local()
   family of functions)
 - slight cleanup of the SLUB conversion (due to rebase and having to touch it)
 - added a 'cleanup' patch for SLUB, since I was staring at that anyway

Since v1:

 - rebaed on Eric's ghash cleanups (hence the cryptodev-2.6 dependency)
 - rebased on Heiko's s390/cpum_sf CDSG patch
 - fixed up a bunch of arch code
 - fixed up the inline asm to use 'u128 *' mem argument so the compiler knows
   how wide the modification is.
 - reworked the percpu thing to use union based type-punning instead of
   _Generic() based casts.

---
 Documentation/core-api/this_cpu_ops.rst     |   2 -
 arch/arm64/include/asm/atomic_ll_sc.h       |  56 ++++----
 arch/arm64/include/asm/atomic_lse.h         |  39 +++---
 arch/arm64/include/asm/cmpxchg.h            |  48 ++-----
 arch/arm64/include/asm/percpu.h             |  30 +++--
 arch/s390/include/asm/cmpxchg.h             |  32 +----
 arch/s390/include/asm/cpu_mf.h              |   2 +-
 arch/s390/include/asm/percpu.h              |  34 +++--
 arch/s390/kernel/perf_cpum_sf.c             |  16 +--
 arch/x86/include/asm/cmpxchg.h              |  25 ----
 arch/x86/include/asm/cmpxchg_32.h           |   2 +-
 arch/x86/include/asm/cmpxchg_64.h           |  63 ++++++++-
 arch/x86/include/asm/percpu.h               | 100 +++++++++------
 drivers/iommu/amd/amd_iommu_types.h         |   9 +-
 drivers/iommu/amd/iommu.c                   |  10 +-
 drivers/iommu/intel/irq_remapping.c         |   8 +-
 include/asm-generic/percpu.h                |  66 ++--------
 include/crypto/b128ops.h                    |  14 +-
 include/linux/atomic/atomic-arch-fallback.h |  95 +++++++++++++-
 include/linux/atomic/atomic-instrumented.h  |  93 ++++++++++++--
 include/linux/dmar.h                        | 125 +++++++++---------
 include/linux/percpu-defs.h                 |  38 ------
 include/linux/slub_def.h                    |  12 +-
 include/linux/types.h                       |   5 +
 include/uapi/linux/types.h                  |   4 +
 lib/crypto/curve25519-hacl64.c              |   2 -
 lib/crypto/poly1305-donna64.c               |   2 -
 mm/slab.h                                   |  49 ++++++-
 mm/slub.c                                   | 191 ++++++++++++++--------------
 scripts/atomic/gen-atomic-fallback.sh       |   4 +-
 scripts/atomic/gen-atomic-instrumented.sh   |  19 +--
 31 files changed, 667 insertions(+), 528 deletions(-)


^ permalink raw reply	[flat|nested] 46+ messages in thread
* [PATCH 04/12] instrumentation: Wire up cmpxchg128()
@ 2023-05-31 13:08 Peter Zijlstra
  2023-06-05  7:42 ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Zijlstra @ 2023-05-31 13:08 UTC (permalink / raw)
  To: torvalds
  Cc: corbet, will, peterz, boqun.feng, mark.rutland, catalin.marinas,
	dennis, tj, cl, hca, gor, agordeev, borntraeger, svens, tglx,
	mingo, bp, dave.hansen, x86, hpa, joro, suravee.suthikulpanit,
	robin.murphy, dwmw2, baolu.lu, Arnd Bergmann, Herbert Xu, davem,
	penberg, rientjes, iamjoonsoo.kim, Andrew Morton, vbabka,
	roman.gushchin, 42.hyeyoo, linux-doc, linux-kernel, linux-mm,
	linux-s390, iommu, linux-arch, linux-crypto, sfr, mpe,
	James.Bottomley, deller, linux-parisc

Wire up the cmpxchg128 family in the atomic wrapper scripts.

These provide the generic cmpxchg128 family of functions from the
arch_ prefixed version, adding explicit instrumentation where needed.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 include/linux/atomic/atomic-arch-fallback.h |   95 +++++++++++++++++++++++++++-
 include/linux/atomic/atomic-instrumented.h  |   86 +++++++++++++++++++++++++
 scripts/atomic/gen-atomic-fallback.sh       |    4 -
 scripts/atomic/gen-atomic-instrumented.sh   |    4 -
 4 files changed, 183 insertions(+), 6 deletions(-)

--- a/include/linux/atomic/atomic-arch-fallback.h
+++ b/include/linux/atomic/atomic-arch-fallback.h
@@ -77,6 +77,29 @@
 
 #endif /* arch_cmpxchg64_relaxed */
 
+#ifndef arch_cmpxchg128_relaxed
+#define arch_cmpxchg128_acquire arch_cmpxchg128
+#define arch_cmpxchg128_release arch_cmpxchg128
+#define arch_cmpxchg128_relaxed arch_cmpxchg128
+#else /* arch_cmpxchg128_relaxed */
+
+#ifndef arch_cmpxchg128_acquire
+#define arch_cmpxchg128_acquire(...) \
+	__atomic_op_acquire(arch_cmpxchg128, __VA_ARGS__)
+#endif
+
+#ifndef arch_cmpxchg128_release
+#define arch_cmpxchg128_release(...) \
+	__atomic_op_release(arch_cmpxchg128, __VA_ARGS__)
+#endif
+
+#ifndef arch_cmpxchg128
+#define arch_cmpxchg128(...) \
+	__atomic_op_fence(arch_cmpxchg128, __VA_ARGS__)
+#endif
+
+#endif /* arch_cmpxchg128_relaxed */
+
 #ifndef arch_try_cmpxchg_relaxed
 #ifdef arch_try_cmpxchg
 #define arch_try_cmpxchg_acquire arch_try_cmpxchg
@@ -217,6 +240,76 @@
 
 #endif /* arch_try_cmpxchg64_relaxed */
 
+#ifndef arch_try_cmpxchg128_relaxed
+#ifdef arch_try_cmpxchg128
+#define arch_try_cmpxchg128_acquire arch_try_cmpxchg128
+#define arch_try_cmpxchg128_release arch_try_cmpxchg128
+#define arch_try_cmpxchg128_relaxed arch_try_cmpxchg128
+#endif /* arch_try_cmpxchg128 */
+
+#ifndef arch_try_cmpxchg128
+#define arch_try_cmpxchg128(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg128((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg128 */
+
+#ifndef arch_try_cmpxchg128_acquire
+#define arch_try_cmpxchg128_acquire(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg128_acquire((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg128_acquire */
+
+#ifndef arch_try_cmpxchg128_release
+#define arch_try_cmpxchg128_release(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg128_release((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg128_release */
+
+#ifndef arch_try_cmpxchg128_relaxed
+#define arch_try_cmpxchg128_relaxed(_ptr, _oldp, _new) \
+({ \
+	typeof(*(_ptr)) *___op = (_oldp), ___o = *___op, ___r; \
+	___r = arch_cmpxchg128_relaxed((_ptr), ___o, (_new)); \
+	if (unlikely(___r != ___o)) \
+		*___op = ___r; \
+	likely(___r == ___o); \
+})
+#endif /* arch_try_cmpxchg128_relaxed */
+
+#else /* arch_try_cmpxchg128_relaxed */
+
+#ifndef arch_try_cmpxchg128_acquire
+#define arch_try_cmpxchg128_acquire(...) \
+	__atomic_op_acquire(arch_try_cmpxchg128, __VA_ARGS__)
+#endif
+
+#ifndef arch_try_cmpxchg128_release
+#define arch_try_cmpxchg128_release(...) \
+	__atomic_op_release(arch_try_cmpxchg128, __VA_ARGS__)
+#endif
+
+#ifndef arch_try_cmpxchg128
+#define arch_try_cmpxchg128(...) \
+	__atomic_op_fence(arch_try_cmpxchg128, __VA_ARGS__)
+#endif
+
+#endif /* arch_try_cmpxchg128_relaxed */
+
 #ifndef arch_try_cmpxchg_local
 #define arch_try_cmpxchg_local(_ptr, _oldp, _new) \
 ({ \
@@ -2668,4 +2761,4 @@ arch_atomic64_dec_if_positive(atomic64_t
 #endif
 
 #endif /* _LINUX_ATOMIC_FALLBACK_H */
-// ad2e2b4d168dbc60a73922616047a9bfa446af36
+// 52dfc6fe4a2e7234bbd2aa3e16a377c1db793a53
--- a/include/linux/atomic/atomic-instrumented.h
+++ b/include/linux/atomic/atomic-instrumented.h
@@ -2034,6 +2034,36 @@ atomic_long_dec_if_positive(atomic_long_
 	arch_cmpxchg64_relaxed(__ai_ptr, __VA_ARGS__); \
 })
 
+#define cmpxchg128(ptr, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	kcsan_mb(); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	arch_cmpxchg128(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg128_acquire(ptr, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	arch_cmpxchg128_acquire(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg128_release(ptr, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	kcsan_release(); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	arch_cmpxchg128_release(__ai_ptr, __VA_ARGS__); \
+})
+
+#define cmpxchg128_relaxed(ptr, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	arch_cmpxchg128_relaxed(__ai_ptr, __VA_ARGS__); \
+})
+
 #define try_cmpxchg(ptr, oldp, ...) \
 ({ \
 	typeof(ptr) __ai_ptr = (ptr); \
@@ -2110,6 +2140,44 @@ atomic_long_dec_if_positive(atomic_long_
 	arch_try_cmpxchg64_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \
 })
 
+#define try_cmpxchg128(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	kcsan_mb(); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg128(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg128_acquire(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg128_acquire(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg128_release(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	kcsan_release(); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg128_release(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
+#define try_cmpxchg128_relaxed(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg128_relaxed(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
 #define cmpxchg_local(ptr, ...) \
 ({ \
 	typeof(ptr) __ai_ptr = (ptr); \
@@ -2124,6 +2192,13 @@ atomic_long_dec_if_positive(atomic_long_
 	arch_cmpxchg64_local(__ai_ptr, __VA_ARGS__); \
 })
 
+#define cmpxchg128_local(ptr, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	arch_cmpxchg128_local(__ai_ptr, __VA_ARGS__); \
+})
+
 #define sync_cmpxchg(ptr, ...) \
 ({ \
 	typeof(ptr) __ai_ptr = (ptr); \
@@ -2150,6 +2225,15 @@ atomic_long_dec_if_positive(atomic_long_
 	arch_try_cmpxchg64_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
 })
 
+#define try_cmpxchg128_local(ptr, oldp, ...) \
+({ \
+	typeof(ptr) __ai_ptr = (ptr); \
+	typeof(oldp) __ai_oldp = (oldp); \
+	instrument_atomic_read_write(__ai_ptr, sizeof(*__ai_ptr)); \
+	instrument_read_write(__ai_oldp, sizeof(*__ai_oldp)); \
+	arch_try_cmpxchg128_local(__ai_ptr, __ai_oldp, __VA_ARGS__); \
+})
+
 #define cmpxchg_double(ptr, ...) \
 ({ \
 	typeof(ptr) __ai_ptr = (ptr); \
@@ -2167,4 +2251,4 @@ atomic_long_dec_if_positive(atomic_long_
 })
 
 #endif /* _LINUX_ATOMIC_INSTRUMENTED_H */
-// 6b513a42e1a1b5962532a019b7fc91eaa044ad5e
+// 82d1be694fab30414527d0877c29fa75ed5a0b74
--- a/scripts/atomic/gen-atomic-fallback.sh
+++ b/scripts/atomic/gen-atomic-fallback.sh
@@ -217,11 +217,11 @@ cat << EOF
 
 EOF
 
-for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64"; do
+for xchg in "arch_xchg" "arch_cmpxchg" "arch_cmpxchg64" "arch_cmpxchg128"; do
 	gen_xchg_fallbacks "${xchg}"
 done
 
-for cmpxchg in "cmpxchg" "cmpxchg64"; do
+for cmpxchg in "cmpxchg" "cmpxchg64" "cmpxchg128"; do
 	gen_try_cmpxchg_fallbacks "${cmpxchg}"
 done
 
--- a/scripts/atomic/gen-atomic-instrumented.sh
+++ b/scripts/atomic/gen-atomic-instrumented.sh
@@ -166,14 +166,14 @@ grep '^[a-z]' "$1" | while read name met
 done
 
 
-for xchg in "xchg" "cmpxchg" "cmpxchg64" "try_cmpxchg" "try_cmpxchg64"; do
+for xchg in "xchg" "cmpxchg" "cmpxchg64" "cmpxchg128" "try_cmpxchg" "try_cmpxchg64" "try_cmpxchg128"; do
 	for order in "" "_acquire" "_release" "_relaxed"; do
 		gen_xchg "${xchg}" "${order}" ""
 		printf "\n"
 	done
 done
 
-for xchg in "cmpxchg_local" "cmpxchg64_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" ; do
+for xchg in "cmpxchg_local" "cmpxchg64_local" "cmpxchg128_local" "sync_cmpxchg" "try_cmpxchg_local" "try_cmpxchg64_local" "try_cmpxchg128_local"; do
 	gen_xchg "${xchg}" "" ""
 	printf "\n"
 done



^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2023-06-05  7:43 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-15  7:56 [PATCH v3 00/11] Introduce cmpxchg128() -- aka. the demise of cmpxchg_double() Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 01/11] cyrpto/b128ops: Remove struct u128 Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 02/11] types: Introduce [us]128 Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 03/11] arch: Introduce arch_{,try_}_cmpxchg128{,_local}() Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 04/11] instrumentation: Wire up cmpxchg128() Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 05/11] percpu: Wire up cmpxchg128 Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-25 12:49   ` [PATCH v3 05/11] " Peter Zijlstra
2023-05-25 22:59     ` Petr Tesařík
2023-05-15  7:57 ` [PATCH v3 06/11] x86,amd_iommu: Replace cmpxchg_double() Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 07/11] x86,intel_iommu: " Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 08/11] slub: " Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-24  9:32   ` [PATCH v3 08/11] " Peter Zijlstra
2023-05-24 10:13     ` Vlastimil Babka
2023-05-25 10:29     ` Peter Zijlstra
2023-05-25 10:52       ` Arnd Bergmann
2023-05-25 13:10         ` Peter Zijlstra
2023-05-30 14:22     ` Peter Zijlstra
2023-05-30 19:32       ` Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 09/11] mm/slub: Fold slab_update_freelist() Peter Zijlstra
2023-05-24 11:58   ` Vlastimil Babka
2023-05-15  7:57 ` [PATCH v3 10/11] arch: Remove cmpxchg_double Peter Zijlstra
2023-05-15  8:52   ` Heiko Carstens
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  7:57 ` [PATCH v3 11/11] s390/cpum_sf: Convert to cmpxchg128() Peter Zijlstra
2023-05-20 10:49   ` [tip: locking/core] " tip-bot2 for Peter Zijlstra
2023-05-22 10:27   ` tip-bot2 for Peter Zijlstra
2023-05-15  9:42 ` [PATCH v3 00/11] Introduce cmpxchg128() -- aka. the demise of cmpxchg_double() Arnd Bergmann
2023-05-24  9:39   ` Peter Zijlstra
2023-05-31 13:08 [PATCH 04/12] instrumentation: Wire up cmpxchg128() Peter Zijlstra
2023-06-05  7:42 ` [tip: locking/core] " tip-bot2 for Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).