linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants
@ 2022-06-24 12:13 Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 1/9] ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr() Alexander Lobakin
                   ` (10 more replies)
  0 siblings, 11 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

While I was working on converting some structure fields from a fixed
type to a bitmap, I started observing code size increase not only in
places where the code works with the converted structure fields, but
also where the converted vars were on the stack. That said, the
following code:

	DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1];
	unsigned long bar = BIT(BAR_BIT);
	unsigned long baz = 0;

	__set_bit(FOO_BIT, foo);
	baz |= BIT(BAZ_BIT);

	BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo));
	BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT));
	BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));

triggers the first assertion on x86_64, which means that the
compiler is unable to evaluate it to a compile-time initializer
when the architecture-specific bitop is used even if it's obvious.
I found that this is due to that many architecture-specific
non-atomic bitop implementations use inline asm or other hacks which
are faster or more robust when working with "real" variables (i.e.
fields from the structures etc.), but the compilers have no clue how
to optimize them out when called on compile-time constants.

So, in order to let the compiler optimize out such cases, expand the
test_bit() and __*_bit() definitions with a compile-time condition
check, so that they will pick the generic C non-atomic bitop
implementations when all of the arguments passed are compile-time
constants, which means that the result will be a compile-time
constant as well and the compiler will produce more efficient and
simple code in 100% cases (no changes when there's at least one
non-compile-time-constant argument).
The condition itself:

if (
__builtin_constant_p(nr) &&	/* <- bit position is constant */
__builtin_constant_p(!!addr) &&	/* <- compiler knows bitmap addr is
				      always either NULL or not */
addr &&				/* <- bitmap addr is not NULL */
__builtin_constant_p(*addr)	/* <- compiler knows the value of
				      the target bitmap */
)
	/* then pick the generic C variant
else
	/* old code path, arch-specific

I also tried __is_constexpr() as suggested by Andy, but it was
always returning 0 ('not a constant') for the 2,3 and 4th
conditions.

The object code size changes are architecture, compiler and compiler
flags dependent, there could be 80 Kb saved in one case and then 5
Kb added in another, but what most important is that the bitops are
now often transparent for the compilers, e.g. the following:

	DECLARE_BITMAP(flags, __IP_TUNNEL_FLAG_NUM) = { };
	__be16 flags;

	__set_bit(IP_TUNNEL_CSUM_BIT, flags);

	tun_flags = cpu_to_be16(*flags & U16_MAX);

	if (test_bit(IP_TUNNEL_VTI_BIT, flags))
		tun_flags |= VTI_ISVTI;

	BUILD_BUG_ON(!__builtin_constant_p(tun_flags));

doesn't blow up anymore (which is being checked now at build time),
so that we can now e.g. use fixed bitmaps in compile-time assertions
etc.

Runtime-tested on x86_64 built with LLVM 14 and GCC 10-12,
compile-time tested on ARC (GCC 11), S390 (GCC 11-12, Clang 14,
Clang 15-git) (+ some tests on m68k and ARM64 from the reviewers).

From v4[0]:
* add a workaround to the bitmap compile-time tests code for Clang
  for s390: it is able to optimize the operations, but at the same
  time triggers the assertions (in a weird way) (lkp);
* added missed bits for unification the prototypes on s390 as well;
* fixed the issue in the ice driver code revealed by one of the
  optimizations (lkp);
* there also was one report that GCC 11 on Debian was unable to
  perform optimizations on x86 with v3 applied, but I couldn't
  reproduce it on generic GCC 10-12. Hope it was some random
  unrelated one-time issue.

From v3[1]:
* fix a typo in the comment in 0006 (Andy);
* pick more Reviewed-bys (Andy, Marco);
* don't assume compiler expands small mem*() builtins in bitmap_*()
  (me, with a small hint from lkp).

From v2[2]:
* collect several Reviewed-bys (Andy, Yury);
* add a comment to generic_test_bit() that it is atomic-safe and
  must always stay like that (the first version of this series
  erroneously tried to change this) (Andy, Marco);
* unify the way how architectures define platform-specific bitops,
  both supporting instrumentation and not: now they define only
  'arch_' versions and asm-generic includes take care of the rest;
* micro-optimize the diffstat of 0004/0007 (__check_bitop_pr())
  (Andy);
* add compile-time tests to lib/test_bitmap to make sure everything
  works as expected on any setup (Yury).

From v1[3]:
* change 'gen_' prefixes to '_generic' to disambiguate from
  'generated' etc. (Mark);
* define a separate 'const_' set to use in the optimization to keep
  the generic test_bit() atomic-safe (Marco);
* unify arch_{test,__*}_bit() as well and include them in the type
  check;
* add more relevant and up-to-date bloat-o-meter results, including
  ARM64 (me, Mark);
* pick a couple '*-by' tags (Mark, Yury).

Also available on my open GH[4].

[0] https://lore.kernel.org/linux-arch/20220621191553.69455-1-alexandr.lobakin@intel.com
[1] https://lore.kernel.org/linux-arch/20220617144031.2549432-1-alexandr.lobakin@intel.com
[2] https://lore.kernel.org/linux-arch/20220610113427.908751-1-alexandr.lobakin@intel.com
[3] https://lore.kernel.org/linux-arch/20220606114908.962562-1-alexandr.lobakin@intel.com
[4] https://github.com/alobakin/linux/commits/bitops

Alexander Lobakin (9):
  ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr()
  bitops: always define asm-generic non-atomic bitops
  bitops: unify non-atomic bitops prototypes across architectures
  bitops: define const_*() versions of the non-atomics
  bitops: wrap non-atomic bitops with a transparent macro
  bitops: let optimize out non-atomic bitops on compile-time constants
  net/ice: fix initializing the bitmap in the switch code
  bitmap: don't assume compiler evaluates small mem*() builtins calls
  lib: test_bitmap: add compile-time optimization/evaluations assertions

 arch/alpha/include/asm/bitops.h               |  32 ++--
 arch/hexagon/include/asm/bitops.h             |  24 ++-
 arch/ia64/include/asm/bitops.h                |  42 ++---
 arch/ia64/include/asm/processor.h             |   2 +-
 arch/m68k/include/asm/bitops.h                |  49 ++++--
 arch/s390/include/asm/bitops.h                |  61 +++----
 arch/sh/include/asm/bitops-op32.h             |  34 ++--
 arch/sparc/include/asm/bitops_32.h            |  18 +-
 arch/sparc/lib/atomic32.c                     |  12 +-
 arch/x86/include/asm/bitops.h                 |  22 +--
 drivers/net/ethernet/intel/ice/ice_switch.c   |   2 +-
 .../asm-generic/bitops/generic-non-atomic.h   | 161 ++++++++++++++++++
 .../bitops/instrumented-non-atomic.h          |  35 ++--
 include/asm-generic/bitops/non-atomic.h       | 121 +------------
 .../bitops/non-instrumented-non-atomic.h      |  16 ++
 include/linux/bitmap.h                        |  22 ++-
 include/linux/bitops.h                        |  50 ++++++
 lib/test_bitmap.c                             |  62 +++++++
 tools/include/asm-generic/bitops/non-atomic.h |  34 ++--
 tools/include/linux/bitops.h                  |  16 ++
 20 files changed, 544 insertions(+), 271 deletions(-)
 create mode 100644 include/asm-generic/bitops/generic-non-atomic.h
 create mode 100644 include/asm-generic/bitops/non-instrumented-non-atomic.h

-- 
2.36.1


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v5 1/9] ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr()
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops Alexander Lobakin
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel, stable

test_bit(), as any other bitmap op, takes `unsigned long *` as a
second argument (pointer to the actual bitmap), as any bitmap
itself is an array of unsigned longs. However, the ia64_get_irr()
code passes a ref to `u64` as a second argument.
This works with the ia64 bitops implementation due to that they
have `void *` as the second argument and then cast it later on.
This works with the bitmap API itself due to that `unsigned long`
has the same size on ia64 as `u64` (`unsigned long long`), but
from the compiler PoV those two are different.
Define @irr as `unsigned long` to fix that. That implies no
functional changes. Has been hidden for 16 years!

Fixes: a58786917ce2 ("[IA64] avoid broken SAL_CACHE_FLUSH implementations")
Cc: stable@vger.kernel.org # 2.6.16+
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
---
 arch/ia64/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
index 7cbce290f4e5..757c2f6d8d4b 100644
--- a/arch/ia64/include/asm/processor.h
+++ b/arch/ia64/include/asm/processor.h
@@ -538,7 +538,7 @@ ia64_get_irr(unsigned int vector)
 {
 	unsigned int reg = vector / 64;
 	unsigned int bit = vector % 64;
-	u64 irr;
+	unsigned long irr;
 
 	switch (reg) {
 	case 0: irr = ia64_getreg(_IA64_REG_CR_IRR0); break;
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 1/9] ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr() Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2023-01-02 16:14   ` Maciej Fijalkowski
  2022-06-24 12:13 ` [PATCH v5 3/9] bitops: unify non-atomic bitops prototypes across architectures Alexander Lobakin
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Move generic non-atomic bitops from the asm-generic header which
gets included only when there are no architecture-specific
alternatives, to a separate independent file to make them always
available.
Almost no actual code changes, only one comment added to
generic_test_bit() saying that it's an atomic operation itself
and thus `volatile` must always stay there with no cast-aways.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> # comment
Suggested-by: Marco Elver <elver@google.com> # reference to kernel-doc
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Marco Elver <elver@google.com>
---
 .../asm-generic/bitops/generic-non-atomic.h   | 130 ++++++++++++++++++
 include/asm-generic/bitops/non-atomic.h       | 110 ++-------------
 2 files changed, 138 insertions(+), 102 deletions(-)
 create mode 100644 include/asm-generic/bitops/generic-non-atomic.h

diff --git a/include/asm-generic/bitops/generic-non-atomic.h b/include/asm-generic/bitops/generic-non-atomic.h
new file mode 100644
index 000000000000..7226488810e5
--- /dev/null
+++ b/include/asm-generic/bitops/generic-non-atomic.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+#define __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H
+
+#include <linux/bits.h>
+
+#ifndef _LINUX_BITOPS_H
+#error only <linux/bitops.h> can be included directly
+#endif
+
+/*
+ * Generic definitions for bit operations, should not be used in regular code
+ * directly.
+ */
+
+/**
+ * generic___set_bit - Set a bit in memory
+ * @nr: the bit to set
+ * @addr: the address to start counting from
+ *
+ * Unlike set_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static __always_inline void
+generic___set_bit(unsigned int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p  |= mask;
+}
+
+static __always_inline void
+generic___clear_bit(unsigned int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p &= ~mask;
+}
+
+/**
+ * generic___change_bit - Toggle a bit in memory
+ * @nr: the bit to change
+ * @addr: the address to start counting from
+ *
+ * Unlike change_bit(), this function is non-atomic and may be reordered.
+ * If it's called on the same region of memory simultaneously, the effect
+ * may be that only one operation succeeds.
+ */
+static __always_inline
+void generic___change_bit(unsigned int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+
+	*p ^= mask;
+}
+
+/**
+ * generic___test_and_set_bit - Set a bit and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static __always_inline int
+generic___test_and_set_bit(unsigned int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old | mask;
+	return (old & mask) != 0;
+}
+
+/**
+ * generic___test_and_clear_bit - Clear a bit and return its old value
+ * @nr: Bit to clear
+ * @addr: Address to count from
+ *
+ * This operation is non-atomic and can be reordered.
+ * If two examples of this operation race, one can appear to succeed
+ * but actually fail.  You must protect multiple accesses with a lock.
+ */
+static __always_inline int
+generic___test_and_clear_bit(unsigned int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old & ~mask;
+	return (old & mask) != 0;
+}
+
+/* WARNING: non atomic and it can be reordered! */
+static __always_inline int
+generic___test_and_change_bit(unsigned int nr, volatile unsigned long *addr)
+{
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
+	unsigned long old = *p;
+
+	*p = old ^ mask;
+	return (old & mask) != 0;
+}
+
+/**
+ * generic_test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static __always_inline int
+generic_test_bit(unsigned int nr, const volatile unsigned long *addr)
+{
+	/*
+	 * Unlike the bitops with the '__' prefix above, this one *is* atomic,
+	 * so `volatile` must always stay here with no cast-aways. See
+	 * `Documentation/atomic_bitops.txt` for the details.
+	 */
+	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
+}
+
+#endif /* __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H */
diff --git a/include/asm-generic/bitops/non-atomic.h b/include/asm-generic/bitops/non-atomic.h
index 078cc68be2f1..23d3abc1e10d 100644
--- a/include/asm-generic/bitops/non-atomic.h
+++ b/include/asm-generic/bitops/non-atomic.h
@@ -2,121 +2,27 @@
 #ifndef _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
 #define _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
 
-#include <asm/types.h>
+#include <asm-generic/bitops/generic-non-atomic.h>
 
-/**
- * arch___set_bit - Set a bit in memory
- * @nr: the bit to set
- * @addr: the address to start counting from
- *
- * Unlike set_bit(), this function is non-atomic and may be reordered.
- * If it's called on the same region of memory simultaneously, the effect
- * may be that only one operation succeeds.
- */
-static __always_inline void
-arch___set_bit(unsigned int nr, volatile unsigned long *addr)
-{
-	unsigned long mask = BIT_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-
-	*p  |= mask;
-}
+#define arch___set_bit generic___set_bit
 #define __set_bit arch___set_bit
 
-static __always_inline void
-arch___clear_bit(unsigned int nr, volatile unsigned long *addr)
-{
-	unsigned long mask = BIT_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-
-	*p &= ~mask;
-}
+#define arch___clear_bit generic___clear_bit
 #define __clear_bit arch___clear_bit
 
-/**
- * arch___change_bit - Toggle a bit in memory
- * @nr: the bit to change
- * @addr: the address to start counting from
- *
- * Unlike change_bit(), this function is non-atomic and may be reordered.
- * If it's called on the same region of memory simultaneously, the effect
- * may be that only one operation succeeds.
- */
-static __always_inline
-void arch___change_bit(unsigned int nr, volatile unsigned long *addr)
-{
-	unsigned long mask = BIT_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-
-	*p ^= mask;
-}
+#define arch___change_bit generic___change_bit
 #define __change_bit arch___change_bit
 
-/**
- * arch___test_and_set_bit - Set a bit and return its old value
- * @nr: Bit to set
- * @addr: Address to count from
- *
- * This operation is non-atomic and can be reordered.
- * If two examples of this operation race, one can appear to succeed
- * but actually fail.  You must protect multiple accesses with a lock.
- */
-static __always_inline int
-arch___test_and_set_bit(unsigned int nr, volatile unsigned long *addr)
-{
-	unsigned long mask = BIT_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-	unsigned long old = *p;
-
-	*p = old | mask;
-	return (old & mask) != 0;
-}
+#define arch___test_and_set_bit generic___test_and_set_bit
 #define __test_and_set_bit arch___test_and_set_bit
 
-/**
- * arch___test_and_clear_bit - Clear a bit and return its old value
- * @nr: Bit to clear
- * @addr: Address to count from
- *
- * This operation is non-atomic and can be reordered.
- * If two examples of this operation race, one can appear to succeed
- * but actually fail.  You must protect multiple accesses with a lock.
- */
-static __always_inline int
-arch___test_and_clear_bit(unsigned int nr, volatile unsigned long *addr)
-{
-	unsigned long mask = BIT_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-	unsigned long old = *p;
-
-	*p = old & ~mask;
-	return (old & mask) != 0;
-}
+#define arch___test_and_clear_bit generic___test_and_clear_bit
 #define __test_and_clear_bit arch___test_and_clear_bit
 
-/* WARNING: non atomic and it can be reordered! */
-static __always_inline int
-arch___test_and_change_bit(unsigned int nr, volatile unsigned long *addr)
-{
-	unsigned long mask = BIT_MASK(nr);
-	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
-	unsigned long old = *p;
-
-	*p = old ^ mask;
-	return (old & mask) != 0;
-}
+#define arch___test_and_change_bit generic___test_and_change_bit
 #define __test_and_change_bit arch___test_and_change_bit
 
-/**
- * arch_test_bit - Determine whether a bit is set
- * @nr: bit number to test
- * @addr: Address to start counting from
- */
-static __always_inline int
-arch_test_bit(unsigned int nr, const volatile unsigned long *addr)
-{
-	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
-}
+#define arch_test_bit generic_test_bit
 #define test_bit arch_test_bit
 
 #endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 3/9] bitops: unify non-atomic bitops prototypes across architectures
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 1/9] ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr() Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-07-06 10:09   ` Geert Uytterhoeven
  2022-06-24 12:13 ` [PATCH v5 4/9] bitops: define const_*() versions of the non-atomics Alexander Lobakin
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Currently, there is a mess with the prototypes of the non-atomic
bitops across the different architectures:

ret	bool, int, unsigned long
nr	int, long, unsigned int, unsigned long
addr	volatile unsigned long *, volatile void *

Thankfully, it doesn't provoke any bugs, but can sometimes make
the compiler angry when it's not handy at all.
Adjust all the prototypes to the following standard:

ret	bool				retval can be only 0 or 1
nr	unsigned long			native; signed makes no sense
addr	volatile unsigned long *	bitmaps are arrays of ulongs

Next, some architectures don't define 'arch_' versions as they don't
support instrumentation, others do. To make sure there is always the
same set of callables present and to ease any potential future
changes, make them all follow the rule:
 * architecture-specific files define only 'arch_' versions;
 * non-prefixed versions can be defined only in asm-generic files;
and place the non-prefixed definitions into a new file in
asm-generic to be included by non-instrumented architectures.

Finally, add some static assertions in order to prevent people from
making a mess in this room again.
I also used the %__always_inline attribute consistently, so that
they always get resolved to the actual operations.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 arch/alpha/include/asm/bitops.h               | 32 +++++-----
 arch/hexagon/include/asm/bitops.h             | 24 +++++---
 arch/ia64/include/asm/bitops.h                | 42 +++++++------
 arch/m68k/include/asm/bitops.h                | 49 ++++++++++-----
 arch/s390/include/asm/bitops.h                | 61 ++++++++++---------
 arch/sh/include/asm/bitops-op32.h             | 34 +++++++----
 arch/x86/include/asm/bitops.h                 | 22 ++++---
 .../asm-generic/bitops/generic-non-atomic.h   | 24 ++++----
 .../bitops/instrumented-non-atomic.h          | 21 ++++---
 include/asm-generic/bitops/non-atomic.h       | 13 +---
 .../bitops/non-instrumented-non-atomic.h      | 16 +++++
 include/linux/bitops.h                        | 17 ++++++
 tools/include/asm-generic/bitops/non-atomic.h | 24 +++++---
 13 files changed, 229 insertions(+), 150 deletions(-)
 create mode 100644 include/asm-generic/bitops/non-instrumented-non-atomic.h

diff --git a/arch/alpha/include/asm/bitops.h b/arch/alpha/include/asm/bitops.h
index e1d8483a45f2..492c7713ddae 100644
--- a/arch/alpha/include/asm/bitops.h
+++ b/arch/alpha/include/asm/bitops.h
@@ -46,8 +46,8 @@ set_bit(unsigned long nr, volatile void * addr)
 /*
  * WARNING: non atomic version.
  */
-static inline void
-__set_bit(unsigned long nr, volatile void * addr)
+static __always_inline void
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	int *m = ((int *) addr) + (nr >> 5);
 
@@ -82,8 +82,8 @@ clear_bit_unlock(unsigned long nr, volatile void * addr)
 /*
  * WARNING: non atomic version.
  */
-static __inline__ void
-__clear_bit(unsigned long nr, volatile void * addr)
+static __always_inline void
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	int *m = ((int *) addr) + (nr >> 5);
 
@@ -94,7 +94,7 @@ static inline void
 __clear_bit_unlock(unsigned long nr, volatile void * addr)
 {
 	smp_mb();
-	__clear_bit(nr, addr);
+	arch___clear_bit(nr, addr);
 }
 
 static inline void
@@ -118,8 +118,8 @@ change_bit(unsigned long nr, volatile void * addr)
 /*
  * WARNING: non atomic version.
  */
-static __inline__ void
-__change_bit(unsigned long nr, volatile void * addr)
+static __always_inline void
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	int *m = ((int *) addr) + (nr >> 5);
 
@@ -186,8 +186,8 @@ test_and_set_bit_lock(unsigned long nr, volatile void *addr)
 /*
  * WARNING: non atomic version.
  */
-static inline int
-__test_and_set_bit(unsigned long nr, volatile void * addr)
+static __always_inline bool
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = 1 << (nr & 0x1f);
 	int *m = ((int *) addr) + (nr >> 5);
@@ -230,8 +230,8 @@ test_and_clear_bit(unsigned long nr, volatile void * addr)
 /*
  * WARNING: non atomic version.
  */
-static inline int
-__test_and_clear_bit(unsigned long nr, volatile void * addr)
+static __always_inline bool
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = 1 << (nr & 0x1f);
 	int *m = ((int *) addr) + (nr >> 5);
@@ -272,8 +272,8 @@ test_and_change_bit(unsigned long nr, volatile void * addr)
 /*
  * WARNING: non atomic version.
  */
-static __inline__ int
-__test_and_change_bit(unsigned long nr, volatile void * addr)
+static __always_inline bool
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = 1 << (nr & 0x1f);
 	int *m = ((int *) addr) + (nr >> 5);
@@ -283,8 +283,8 @@ __test_and_change_bit(unsigned long nr, volatile void * addr)
 	return (old & mask) != 0;
 }
 
-static inline int
-test_bit(int nr, const volatile void * addr)
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	return (1UL & (((const int *) addr)[nr >> 5] >> (nr & 31))) != 0UL;
 }
@@ -450,6 +450,8 @@ sched_find_first_bit(const unsigned long b[2])
 	return __ffs(tmp) + ofs;
 }
 
+#include <asm-generic/bitops/non-instrumented-non-atomic.h>
+
 #include <asm-generic/bitops/le.h>
 
 #include <asm-generic/bitops/ext2-atomic-setbit.h>
diff --git a/arch/hexagon/include/asm/bitops.h b/arch/hexagon/include/asm/bitops.h
index 75d6ba3643b8..da500471ac73 100644
--- a/arch/hexagon/include/asm/bitops.h
+++ b/arch/hexagon/include/asm/bitops.h
@@ -127,38 +127,45 @@ static inline void change_bit(int nr, volatile void *addr)
  * be atomic, particularly for things like slab_lock and slab_unlock.
  *
  */
-static inline void __clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	test_and_clear_bit(nr, addr);
 }
 
-static inline void __set_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	test_and_set_bit(nr, addr);
 }
 
-static inline void __change_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	test_and_change_bit(nr, addr);
 }
 
 /*  Apparently, at least some of these are allowed to be non-atomic  */
-static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	return test_and_clear_bit(nr, addr);
 }
 
-static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	return test_and_set_bit(nr, addr);
 }
 
-static inline int __test_and_change_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	return test_and_change_bit(nr, addr);
 }
 
-static inline int __test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	int retval;
 
@@ -172,8 +179,6 @@ static inline int __test_bit(int nr, const volatile unsigned long *addr)
 	return retval;
 }
 
-#define test_bit(nr, addr) __test_bit(nr, addr)
-
 /*
  * ffz - find first zero in word.
  * @word: The word to search
@@ -271,6 +276,7 @@ static inline unsigned long __fls(unsigned long word)
 }
 
 #include <asm-generic/bitops/lock.h>
+#include <asm-generic/bitops/non-instrumented-non-atomic.h>
 
 #include <asm-generic/bitops/fls64.h>
 #include <asm-generic/bitops/sched.h>
diff --git a/arch/ia64/include/asm/bitops.h b/arch/ia64/include/asm/bitops.h
index 577be93c0818..9f62af7fd7c4 100644
--- a/arch/ia64/include/asm/bitops.h
+++ b/arch/ia64/include/asm/bitops.h
@@ -53,7 +53,7 @@ set_bit (int nr, volatile void *addr)
 }
 
 /**
- * __set_bit - Set a bit in memory
+ * arch___set_bit - Set a bit in memory
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
@@ -61,8 +61,8 @@ set_bit (int nr, volatile void *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static __inline__ void
-__set_bit (int nr, volatile void *addr)
+static __always_inline void
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	*((__u32 *) addr + (nr >> 5)) |= (1 << (nr & 31));
 }
@@ -135,7 +135,7 @@ __clear_bit_unlock(int nr, void *addr)
 }
 
 /**
- * __clear_bit - Clears a bit in memory (non-atomic version)
+ * arch___clear_bit - Clears a bit in memory (non-atomic version)
  * @nr: the bit to clear
  * @addr: the address to start counting from
  *
@@ -143,8 +143,8 @@ __clear_bit_unlock(int nr, void *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static __inline__ void
-__clear_bit (int nr, volatile void *addr)
+static __always_inline void
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	*((__u32 *) addr + (nr >> 5)) &= ~(1 << (nr & 31));
 }
@@ -175,7 +175,7 @@ change_bit (int nr, volatile void *addr)
 }
 
 /**
- * __change_bit - Toggle a bit in memory
+ * arch___change_bit - Toggle a bit in memory
  * @nr: the bit to toggle
  * @addr: the address to start counting from
  *
@@ -183,8 +183,8 @@ change_bit (int nr, volatile void *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static __inline__ void
-__change_bit (int nr, volatile void *addr)
+static __always_inline void
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	*((__u32 *) addr + (nr >> 5)) ^= (1 << (nr & 31));
 }
@@ -224,7 +224,7 @@ test_and_set_bit (int nr, volatile void *addr)
 #define test_and_set_bit_lock test_and_set_bit
 
 /**
- * __test_and_set_bit - Set a bit and return its old value
+ * arch___test_and_set_bit - Set a bit and return its old value
  * @nr: Bit to set
  * @addr: Address to count from
  *
@@ -232,8 +232,8 @@ test_and_set_bit (int nr, volatile void *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static __inline__ int
-__test_and_set_bit (int nr, volatile void *addr)
+static __always_inline bool
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__u32 *p = (__u32 *) addr + (nr >> 5);
 	__u32 m = 1 << (nr & 31);
@@ -269,7 +269,7 @@ test_and_clear_bit (int nr, volatile void *addr)
 }
 
 /**
- * __test_and_clear_bit - Clear a bit and return its old value
+ * arch___test_and_clear_bit - Clear a bit and return its old value
  * @nr: Bit to clear
  * @addr: Address to count from
  *
@@ -277,8 +277,8 @@ test_and_clear_bit (int nr, volatile void *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static __inline__ int
-__test_and_clear_bit(int nr, volatile void * addr)
+static __always_inline bool
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__u32 *p = (__u32 *) addr + (nr >> 5);
 	__u32 m = 1 << (nr & 31);
@@ -314,14 +314,14 @@ test_and_change_bit (int nr, volatile void *addr)
 }
 
 /**
- * __test_and_change_bit - Change a bit and return its old value
+ * arch___test_and_change_bit - Change a bit and return its old value
  * @nr: Bit to change
  * @addr: Address to count from
  *
  * This operation is non-atomic and can be reordered.
  */
-static __inline__ int
-__test_and_change_bit (int nr, void *addr)
+static __always_inline bool
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__u32 old, bit = (1 << (nr & 31));
 	__u32 *m = (__u32 *) addr + (nr >> 5);
@@ -331,8 +331,8 @@ __test_and_change_bit (int nr, void *addr)
 	return (old & bit) != 0;
 }
 
-static __inline__ int
-test_bit (int nr, const volatile void *addr)
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	return 1 & (((const volatile __u32 *) addr)[nr >> 5] >> (nr & 31));
 }
@@ -443,6 +443,8 @@ static __inline__ unsigned long __arch_hweight64(unsigned long x)
 
 #ifdef __KERNEL__
 
+#include <asm-generic/bitops/non-instrumented-non-atomic.h>
+
 #include <asm-generic/bitops/le.h>
 
 #include <asm-generic/bitops/ext2-atomic-setbit.h>
diff --git a/arch/m68k/include/asm/bitops.h b/arch/m68k/include/asm/bitops.h
index 51283db53667..71495faf2a90 100644
--- a/arch/m68k/include/asm/bitops.h
+++ b/arch/m68k/include/asm/bitops.h
@@ -65,8 +65,11 @@ static inline void bfset_mem_set_bit(int nr, volatile unsigned long *vaddr)
 				bfset_mem_set_bit(nr, vaddr))
 #endif
 
-#define __set_bit(nr, vaddr)	set_bit(nr, vaddr)
-
+static __always_inline void
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
+{
+	set_bit(nr, addr);
+}
 
 static inline void bclr_reg_clear_bit(int nr, volatile unsigned long *vaddr)
 {
@@ -105,8 +108,11 @@ static inline void bfclr_mem_clear_bit(int nr, volatile unsigned long *vaddr)
 				bfclr_mem_clear_bit(nr, vaddr))
 #endif
 
-#define __clear_bit(nr, vaddr)	clear_bit(nr, vaddr)
-
+static __always_inline void
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
+{
+	clear_bit(nr, addr);
+}
 
 static inline void bchg_reg_change_bit(int nr, volatile unsigned long *vaddr)
 {
@@ -145,14 +151,17 @@ static inline void bfchg_mem_change_bit(int nr, volatile unsigned long *vaddr)
 				bfchg_mem_change_bit(nr, vaddr))
 #endif
 
-#define __change_bit(nr, vaddr)	change_bit(nr, vaddr)
-
-
-static inline int test_bit(int nr, const volatile unsigned long *vaddr)
+static __always_inline void
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	return (vaddr[nr >> 5] & (1UL << (nr & 31))) != 0;
+	change_bit(nr, addr);
 }
 
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
+{
+	return (addr[nr >> 5] & (1UL << (nr & 31))) != 0;
+}
 
 static inline int bset_reg_test_and_set_bit(int nr,
 					    volatile unsigned long *vaddr)
@@ -201,8 +210,11 @@ static inline int bfset_mem_test_and_set_bit(int nr,
 					bfset_mem_test_and_set_bit(nr, vaddr))
 #endif
 
-#define __test_and_set_bit(nr, vaddr)	test_and_set_bit(nr, vaddr)
-
+static __always_inline bool
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
+{
+	return test_and_set_bit(nr, addr);
+}
 
 static inline int bclr_reg_test_and_clear_bit(int nr,
 					      volatile unsigned long *vaddr)
@@ -251,8 +263,11 @@ static inline int bfclr_mem_test_and_clear_bit(int nr,
 					bfclr_mem_test_and_clear_bit(nr, vaddr))
 #endif
 
-#define __test_and_clear_bit(nr, vaddr)	test_and_clear_bit(nr, vaddr)
-
+static __always_inline bool
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
+{
+	return test_and_clear_bit(nr, addr);
+}
 
 static inline int bchg_reg_test_and_change_bit(int nr,
 					       volatile unsigned long *vaddr)
@@ -301,8 +316,11 @@ static inline int bfchg_mem_test_and_change_bit(int nr,
 					bfchg_mem_test_and_change_bit(nr, vaddr))
 #endif
 
-#define __test_and_change_bit(nr, vaddr) test_and_change_bit(nr, vaddr)
-
+static __always_inline bool
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
+{
+	return test_and_change_bit(nr, addr);
+}
 
 /*
  *	The true 68020 and more advanced processors support the "bfffo"
@@ -522,6 +540,7 @@ static inline int __fls(int x)
 #define clear_bit_unlock	clear_bit
 #define __clear_bit_unlock	clear_bit_unlock
 
+#include <asm-generic/bitops/non-instrumented-non-atomic.h>
 #include <asm-generic/bitops/ext2-atomic.h>
 #include <asm-generic/bitops/fls64.h>
 #include <asm-generic/bitops/sched.h>
diff --git a/arch/s390/include/asm/bitops.h b/arch/s390/include/asm/bitops.h
index 191dc7898b0f..9a7d15da966e 100644
--- a/arch/s390/include/asm/bitops.h
+++ b/arch/s390/include/asm/bitops.h
@@ -113,75 +113,76 @@ static inline bool arch_test_and_change_bit(unsigned long nr,
 	return old & mask;
 }
 
-static inline void arch___set_bit(unsigned long nr, volatile unsigned long *ptr)
+static __always_inline void
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	unsigned long *addr = __bitops_word(nr, ptr);
+	unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 
-	*addr |= mask;
+	*p |= mask;
 }
 
-static inline void arch___clear_bit(unsigned long nr,
-				    volatile unsigned long *ptr)
+static __always_inline void
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	unsigned long *addr = __bitops_word(nr, ptr);
+	unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 
-	*addr &= ~mask;
+	*p &= ~mask;
 }
 
-static inline void arch___change_bit(unsigned long nr,
-				     volatile unsigned long *ptr)
+static __always_inline void
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	unsigned long *addr = __bitops_word(nr, ptr);
+	unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 
-	*addr ^= mask;
+	*p ^= mask;
 }
 
-static inline bool arch___test_and_set_bit(unsigned long nr,
-					   volatile unsigned long *ptr)
+static __always_inline bool
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	unsigned long *addr = __bitops_word(nr, ptr);
+	unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 	unsigned long old;
 
-	old = *addr;
-	*addr |= mask;
+	old = *p;
+	*p |= mask;
 	return old & mask;
 }
 
-static inline bool arch___test_and_clear_bit(unsigned long nr,
-					     volatile unsigned long *ptr)
+static __always_inline bool
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	unsigned long *addr = __bitops_word(nr, ptr);
+	unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 	unsigned long old;
 
-	old = *addr;
-	*addr &= ~mask;
+	old = *p;
+	*p &= ~mask;
 	return old & mask;
 }
 
-static inline bool arch___test_and_change_bit(unsigned long nr,
-					      volatile unsigned long *ptr)
+static __always_inline bool
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
-	unsigned long *addr = __bitops_word(nr, ptr);
+	unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 	unsigned long old;
 
-	old = *addr;
-	*addr ^= mask;
+	old = *p;
+	*p ^= mask;
 	return old & mask;
 }
 
-static inline bool arch_test_bit(unsigned long nr,
-				 const volatile unsigned long *ptr)
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
-	const volatile unsigned long *addr = __bitops_word(nr, ptr);
+	const volatile unsigned long *p = __bitops_word(nr, addr);
 	unsigned long mask = __bitops_mask(nr);
 
-	return *addr & mask;
+	return *p & mask;
 }
 
 static inline bool arch_test_and_set_bit_lock(unsigned long nr,
diff --git a/arch/sh/include/asm/bitops-op32.h b/arch/sh/include/asm/bitops-op32.h
index cfe5465acce7..565a85d8b7fb 100644
--- a/arch/sh/include/asm/bitops-op32.h
+++ b/arch/sh/include/asm/bitops-op32.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_SH_BITOPS_OP32_H
 #define __ASM_SH_BITOPS_OP32_H
 
+#include <linux/bits.h>
+
 /*
  * The bit modifying instructions on SH-2A are only capable of working
  * with a 3-bit immediate, which signifies the shift position for the bit
@@ -16,7 +18,8 @@
 #define BYTE_OFFSET(nr)		((nr) % BITS_PER_BYTE)
 #endif
 
-static inline void __set_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
 		__asm__ __volatile__ (
@@ -33,7 +36,8 @@ static inline void __set_bit(int nr, volatile unsigned long *addr)
 	}
 }
 
-static inline void __clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
 		__asm__ __volatile__ (
@@ -52,7 +56,7 @@ static inline void __clear_bit(int nr, volatile unsigned long *addr)
 }
 
 /**
- * __change_bit - Toggle a bit in memory
+ * arch___change_bit - Toggle a bit in memory
  * @nr: the bit to change
  * @addr: the address to start counting from
  *
@@ -60,7 +64,8 @@ static inline void __clear_bit(int nr, volatile unsigned long *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __change_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	if (__builtin_constant_p(nr)) {
 		__asm__ __volatile__ (
@@ -79,7 +84,7 @@ static inline void __change_bit(int nr, volatile unsigned long *addr)
 }
 
 /**
- * __test_and_set_bit - Set a bit and return its old value
+ * arch___test_and_set_bit - Set a bit and return its old value
  * @nr: Bit to set
  * @addr: Address to count from
  *
@@ -87,7 +92,8 @@ static inline void __change_bit(int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -98,7 +104,7 @@ static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
 }
 
 /**
- * __test_and_clear_bit - Clear a bit and return its old value
+ * arch___test_and_clear_bit - Clear a bit and return its old value
  * @nr: Bit to clear
  * @addr: Address to count from
  *
@@ -106,7 +112,8 @@ static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -117,8 +124,8 @@ static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
 }
 
 /* WARNING: non atomic and it can be reordered! */
-static inline int __test_and_change_bit(int nr,
-					    volatile unsigned long *addr)
+static __always_inline bool
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -129,13 +136,16 @@ static inline int __test_and_change_bit(int nr,
 }
 
 /**
- * test_bit - Determine whether a bit is set
+ * arch_test_bit - Determine whether a bit is set
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
-static inline int test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
 
+#include <asm-generic/bitops/non-instrumented-non-atomic.h>
+
 #endif /* __ASM_SH_BITOPS_OP32_H */
diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index a288ecd230ab..973c6bd17f98 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -63,7 +63,7 @@ arch_set_bit(long nr, volatile unsigned long *addr)
 }
 
 static __always_inline void
-arch___set_bit(long nr, volatile unsigned long *addr)
+arch___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	asm volatile(__ASM_SIZE(bts) " %1,%0" : : ADDR, "Ir" (nr) : "memory");
 }
@@ -89,7 +89,7 @@ arch_clear_bit_unlock(long nr, volatile unsigned long *addr)
 }
 
 static __always_inline void
-arch___clear_bit(long nr, volatile unsigned long *addr)
+arch___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	asm volatile(__ASM_SIZE(btr) " %1,%0" : : ADDR, "Ir" (nr) : "memory");
 }
@@ -114,7 +114,7 @@ arch___clear_bit_unlock(long nr, volatile unsigned long *addr)
 }
 
 static __always_inline void
-arch___change_bit(long nr, volatile unsigned long *addr)
+arch___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	asm volatile(__ASM_SIZE(btc) " %1,%0" : : ADDR, "Ir" (nr) : "memory");
 }
@@ -145,7 +145,7 @@ arch_test_and_set_bit_lock(long nr, volatile unsigned long *addr)
 }
 
 static __always_inline bool
-arch___test_and_set_bit(long nr, volatile unsigned long *addr)
+arch___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	bool oldbit;
 
@@ -171,7 +171,7 @@ arch_test_and_clear_bit(long nr, volatile unsigned long *addr)
  * this without also updating arch/x86/kernel/kvm.c
  */
 static __always_inline bool
-arch___test_and_clear_bit(long nr, volatile unsigned long *addr)
+arch___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	bool oldbit;
 
@@ -183,7 +183,7 @@ arch___test_and_clear_bit(long nr, volatile unsigned long *addr)
 }
 
 static __always_inline bool
-arch___test_and_change_bit(long nr, volatile unsigned long *addr)
+arch___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	bool oldbit;
 
@@ -219,10 +219,12 @@ static __always_inline bool variable_test_bit(long nr, volatile const unsigned l
 	return oldbit;
 }
 
-#define arch_test_bit(nr, addr)			\
-	(__builtin_constant_p((nr))		\
-	 ? constant_test_bit((nr), (addr))	\
-	 : variable_test_bit((nr), (addr)))
+static __always_inline bool
+arch_test_bit(unsigned long nr, const volatile unsigned long *addr)
+{
+	return __builtin_constant_p(nr) ? constant_test_bit(nr, addr) :
+					  variable_test_bit(nr, addr);
+}
 
 /**
  * __ffs - find first set bit in word
diff --git a/include/asm-generic/bitops/generic-non-atomic.h b/include/asm-generic/bitops/generic-non-atomic.h
index 7226488810e5..b85b8a2ac239 100644
--- a/include/asm-generic/bitops/generic-non-atomic.h
+++ b/include/asm-generic/bitops/generic-non-atomic.h
@@ -24,7 +24,7 @@
  * may be that only one operation succeeds.
  */
 static __always_inline void
-generic___set_bit(unsigned int nr, volatile unsigned long *addr)
+generic___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -33,7 +33,7 @@ generic___set_bit(unsigned int nr, volatile unsigned long *addr)
 }
 
 static __always_inline void
-generic___clear_bit(unsigned int nr, volatile unsigned long *addr)
+generic___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -50,8 +50,8 @@ generic___clear_bit(unsigned int nr, volatile unsigned long *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static __always_inline
-void generic___change_bit(unsigned int nr, volatile unsigned long *addr)
+static __always_inline void
+generic___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -68,8 +68,8 @@ void generic___change_bit(unsigned int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static __always_inline int
-generic___test_and_set_bit(unsigned int nr, volatile unsigned long *addr)
+static __always_inline bool
+generic___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -88,8 +88,8 @@ generic___test_and_set_bit(unsigned int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static __always_inline int
-generic___test_and_clear_bit(unsigned int nr, volatile unsigned long *addr)
+static __always_inline bool
+generic___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -100,8 +100,8 @@ generic___test_and_clear_bit(unsigned int nr, volatile unsigned long *addr)
 }
 
 /* WARNING: non atomic and it can be reordered! */
-static __always_inline int
-generic___test_and_change_bit(unsigned int nr, volatile unsigned long *addr)
+static __always_inline bool
+generic___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -116,8 +116,8 @@ generic___test_and_change_bit(unsigned int nr, volatile unsigned long *addr)
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
-static __always_inline int
-generic_test_bit(unsigned int nr, const volatile unsigned long *addr)
+static __always_inline bool
+generic_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	/*
 	 * Unlike the bitops with the '__' prefix above, this one *is* atomic,
diff --git a/include/asm-generic/bitops/instrumented-non-atomic.h b/include/asm-generic/bitops/instrumented-non-atomic.h
index 7ab1ecc37782..b019f77ef21c 100644
--- a/include/asm-generic/bitops/instrumented-non-atomic.h
+++ b/include/asm-generic/bitops/instrumented-non-atomic.h
@@ -22,7 +22,8 @@
  * region of memory concurrently, the effect may be that only one operation
  * succeeds.
  */
-static __always_inline void __set_bit(long nr, volatile unsigned long *addr)
+static __always_inline void
+__set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___set_bit(nr, addr);
@@ -37,7 +38,8 @@ static __always_inline void __set_bit(long nr, volatile unsigned long *addr)
  * region of memory concurrently, the effect may be that only one operation
  * succeeds.
  */
-static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
+static __always_inline void
+__clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___clear_bit(nr, addr);
@@ -52,7 +54,8 @@ static __always_inline void __clear_bit(long nr, volatile unsigned long *addr)
  * region of memory concurrently, the effect may be that only one operation
  * succeeds.
  */
-static __always_inline void __change_bit(long nr, volatile unsigned long *addr)
+static __always_inline void
+__change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___change_bit(nr, addr);
@@ -90,7 +93,8 @@ static __always_inline void __instrument_read_write_bitop(long nr, volatile unsi
  * This operation is non-atomic. If two instances of this operation race, one
  * can appear to succeed but actually fail.
  */
-static __always_inline bool __test_and_set_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool
+__test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_set_bit(nr, addr);
@@ -104,7 +108,8 @@ static __always_inline bool __test_and_set_bit(long nr, volatile unsigned long *
  * This operation is non-atomic. If two instances of this operation race, one
  * can appear to succeed but actually fail.
  */
-static __always_inline bool __test_and_clear_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool
+__test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_clear_bit(nr, addr);
@@ -118,7 +123,8 @@ static __always_inline bool __test_and_clear_bit(long nr, volatile unsigned long
  * This operation is non-atomic. If two instances of this operation race, one
  * can appear to succeed but actually fail.
  */
-static __always_inline bool __test_and_change_bit(long nr, volatile unsigned long *addr)
+static __always_inline bool
+__test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_change_bit(nr, addr);
@@ -129,7 +135,8 @@ static __always_inline bool __test_and_change_bit(long nr, volatile unsigned lon
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
-static __always_inline bool test_bit(long nr, const volatile unsigned long *addr)
+static __always_inline bool
+test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	instrument_atomic_read(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_bit(nr, addr);
diff --git a/include/asm-generic/bitops/non-atomic.h b/include/asm-generic/bitops/non-atomic.h
index 23d3abc1e10d..5c37ced343ae 100644
--- a/include/asm-generic/bitops/non-atomic.h
+++ b/include/asm-generic/bitops/non-atomic.h
@@ -5,24 +5,15 @@
 #include <asm-generic/bitops/generic-non-atomic.h>
 
 #define arch___set_bit generic___set_bit
-#define __set_bit arch___set_bit
-
 #define arch___clear_bit generic___clear_bit
-#define __clear_bit arch___clear_bit
-
 #define arch___change_bit generic___change_bit
-#define __change_bit arch___change_bit
 
 #define arch___test_and_set_bit generic___test_and_set_bit
-#define __test_and_set_bit arch___test_and_set_bit
-
 #define arch___test_and_clear_bit generic___test_and_clear_bit
-#define __test_and_clear_bit arch___test_and_clear_bit
-
 #define arch___test_and_change_bit generic___test_and_change_bit
-#define __test_and_change_bit arch___test_and_change_bit
 
 #define arch_test_bit generic_test_bit
-#define test_bit arch_test_bit
+
+#include <asm-generic/bitops/non-instrumented-non-atomic.h>
 
 #endif /* _ASM_GENERIC_BITOPS_NON_ATOMIC_H_ */
diff --git a/include/asm-generic/bitops/non-instrumented-non-atomic.h b/include/asm-generic/bitops/non-instrumented-non-atomic.h
new file mode 100644
index 000000000000..e0fd7bf72a56
--- /dev/null
+++ b/include/asm-generic/bitops/non-instrumented-non-atomic.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_GENERIC_BITOPS_NON_INSTRUMENTED_NON_ATOMIC_H
+#define __ASM_GENERIC_BITOPS_NON_INSTRUMENTED_NON_ATOMIC_H
+
+#define __set_bit		arch___set_bit
+#define __clear_bit		arch___clear_bit
+#define __change_bit		arch___change_bit
+
+#define __test_and_set_bit	arch___test_and_set_bit
+#define __test_and_clear_bit	arch___test_and_clear_bit
+#define __test_and_change_bit	arch___test_and_change_bit
+
+#define test_bit		arch_test_bit
+
+#endif /* __ASM_GENERIC_BITOPS_NON_INSTRUMENTED_NON_ATOMIC_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 7aaed501f768..87087454a288 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -26,12 +26,29 @@ extern unsigned int __sw_hweight16(unsigned int w);
 extern unsigned int __sw_hweight32(unsigned int w);
 extern unsigned long __sw_hweight64(__u64 w);
 
+#include <asm-generic/bitops/generic-non-atomic.h>
+
 /*
  * Include this here because some architectures need generic_ffs/fls in
  * scope
  */
 #include <asm/bitops.h>
 
+/* Check that the bitops prototypes are sane */
+#define __check_bitop_pr(name)						\
+	static_assert(__same_type(arch_##name, generic_##name) &&	\
+		      __same_type(name, generic_##name))
+
+__check_bitop_pr(__set_bit);
+__check_bitop_pr(__clear_bit);
+__check_bitop_pr(__change_bit);
+__check_bitop_pr(__test_and_set_bit);
+__check_bitop_pr(__test_and_clear_bit);
+__check_bitop_pr(__test_and_change_bit);
+__check_bitop_pr(test_bit);
+
+#undef __check_bitop_pr
+
 static inline int get_bitmask_order(unsigned int count)
 {
 	int order;
diff --git a/tools/include/asm-generic/bitops/non-atomic.h b/tools/include/asm-generic/bitops/non-atomic.h
index 7e10c4b50c5d..e5e78e42e57b 100644
--- a/tools/include/asm-generic/bitops/non-atomic.h
+++ b/tools/include/asm-generic/bitops/non-atomic.h
@@ -2,7 +2,7 @@
 #ifndef _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
 #define _ASM_GENERIC_BITOPS_NON_ATOMIC_H_
 
-#include <asm/types.h>
+#include <linux/bits.h>
 
 /**
  * __set_bit - Set a bit in memory
@@ -13,7 +13,8 @@
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __set_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+__set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -21,7 +22,8 @@ static inline void __set_bit(int nr, volatile unsigned long *addr)
 	*p  |= mask;
 }
 
-static inline void __clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+__clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -38,7 +40,8 @@ static inline void __clear_bit(int nr, volatile unsigned long *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __change_bit(int nr, volatile unsigned long *addr)
+static __always_inline void
+__change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -55,7 +58,8 @@ static inline void __change_bit(int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+__test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -74,7 +78,8 @@ static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
+static __always_inline bool
+__test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -85,8 +90,8 @@ static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
 }
 
 /* WARNING: non atomic and it can be reordered! */
-static inline int __test_and_change_bit(int nr,
-					    volatile unsigned long *addr)
+static __always_inline bool
+__test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -101,7 +106,8 @@ static inline int __test_and_change_bit(int nr,
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
-static inline int test_bit(int nr, const volatile unsigned long *addr)
+static __always_inline bool
+test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 4/9] bitops: define const_*() versions of the non-atomics
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (2 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 3/9] bitops: unify non-atomic bitops prototypes across architectures Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 5/9] bitops: wrap non-atomic bitops with a transparent macro Alexander Lobakin
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Define const_*() variants of the non-atomic bitops to be used when
the input arguments are compile-time constants, so that the compiler
will be always able to resolve those to compile-time constants as
well. Those are mostly direct aliases for generic_*() with one
exception for const_test_bit(): the original one is declared
atomic-safe and thus doesn't discard the `volatile` qualifier, so
in order to let optimize code, define it separately disregarding
the qualifier.
Add them to the compile-time type checks as well just in case.

Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 .../asm-generic/bitops/generic-non-atomic.h   | 31 +++++++++++++++++++
 include/linux/bitops.h                        |  1 +
 2 files changed, 32 insertions(+)

diff --git a/include/asm-generic/bitops/generic-non-atomic.h b/include/asm-generic/bitops/generic-non-atomic.h
index b85b8a2ac239..3d5ebd24652b 100644
--- a/include/asm-generic/bitops/generic-non-atomic.h
+++ b/include/asm-generic/bitops/generic-non-atomic.h
@@ -127,4 +127,35 @@ generic_test_bit(unsigned long nr, const volatile unsigned long *addr)
 	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
 
+/*
+ * const_*() definitions provide good compile-time optimizations when
+ * the passed arguments can be resolved at compile time.
+ */
+#define const___set_bit			generic___set_bit
+#define const___clear_bit		generic___clear_bit
+#define const___change_bit		generic___change_bit
+#define const___test_and_set_bit	generic___test_and_set_bit
+#define const___test_and_clear_bit	generic___test_and_clear_bit
+#define const___test_and_change_bit	generic___test_and_change_bit
+
+/**
+ * const_test_bit - Determine whether a bit is set
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ *
+ * A version of generic_test_bit() which discards the `volatile` qualifier to
+ * allow a compiler to optimize code harder. Non-atomic and to be called only
+ * for testing compile-time constants, e.g. by the corresponding macros, not
+ * directly from "regular" code.
+ */
+static __always_inline bool
+const_test_bit(unsigned long nr, const volatile unsigned long *addr)
+{
+	const unsigned long *p = (const unsigned long *)addr + BIT_WORD(nr);
+	unsigned long mask = BIT_MASK(nr);
+	unsigned long val = *p;
+
+	return !!(val & mask);
+}
+
 #endif /* __ASM_GENERIC_BITOPS_GENERIC_NON_ATOMIC_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 87087454a288..d393297287d5 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -37,6 +37,7 @@ extern unsigned long __sw_hweight64(__u64 w);
 /* Check that the bitops prototypes are sane */
 #define __check_bitop_pr(name)						\
 	static_assert(__same_type(arch_##name, generic_##name) &&	\
+		      __same_type(const_##name, generic_##name) &&	\
 		      __same_type(name, generic_##name))
 
 __check_bitop_pr(__set_bit);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 5/9] bitops: wrap non-atomic bitops with a transparent macro
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (3 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 4/9] bitops: define const_*() versions of the non-atomics Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

In preparation for altering the non-atomic bitops with a macro, wrap
them in a transparent definition. This requires prepending one more
'_' to their names in order to be able to do that seamlessly. It is
a simple change, given that all the non-prefixed definitions are now
in asm-generic.
sparc32 already has several triple-underscored functions, so I had
to rename them ('___' -> 'sp32_').

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Marco Elver <elver@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 arch/sparc/include/asm/bitops_32.h            | 18 ++++++------
 arch/sparc/lib/atomic32.c                     | 12 ++++----
 .../bitops/instrumented-non-atomic.h          | 28 +++++++++----------
 .../bitops/non-instrumented-non-atomic.h      | 14 +++++-----
 include/linux/bitops.h                        | 18 +++++++++++-
 tools/include/asm-generic/bitops/non-atomic.h | 24 ++++++++--------
 tools/include/linux/bitops.h                  | 16 +++++++++++
 7 files changed, 81 insertions(+), 49 deletions(-)

diff --git a/arch/sparc/include/asm/bitops_32.h b/arch/sparc/include/asm/bitops_32.h
index 889afa9f990f..3448c191b484 100644
--- a/arch/sparc/include/asm/bitops_32.h
+++ b/arch/sparc/include/asm/bitops_32.h
@@ -19,9 +19,9 @@
 #error only <linux/bitops.h> can be included directly
 #endif
 
-unsigned long ___set_bit(unsigned long *addr, unsigned long mask);
-unsigned long ___clear_bit(unsigned long *addr, unsigned long mask);
-unsigned long ___change_bit(unsigned long *addr, unsigned long mask);
+unsigned long sp32___set_bit(unsigned long *addr, unsigned long mask);
+unsigned long sp32___clear_bit(unsigned long *addr, unsigned long mask);
+unsigned long sp32___change_bit(unsigned long *addr, unsigned long mask);
 
 /*
  * Set bit 'nr' in 32-bit quantity at address 'addr' where bit '0'
@@ -36,7 +36,7 @@ static inline int test_and_set_bit(unsigned long nr, volatile unsigned long *add
 	ADDR = ((unsigned long *) addr) + (nr >> 5);
 	mask = 1 << (nr & 31);
 
-	return ___set_bit(ADDR, mask) != 0;
+	return sp32___set_bit(ADDR, mask) != 0;
 }
 
 static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
@@ -46,7 +46,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
 	ADDR = ((unsigned long *) addr) + (nr >> 5);
 	mask = 1 << (nr & 31);
 
-	(void) ___set_bit(ADDR, mask);
+	(void) sp32___set_bit(ADDR, mask);
 }
 
 static inline int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
@@ -56,7 +56,7 @@ static inline int test_and_clear_bit(unsigned long nr, volatile unsigned long *a
 	ADDR = ((unsigned long *) addr) + (nr >> 5);
 	mask = 1 << (nr & 31);
 
-	return ___clear_bit(ADDR, mask) != 0;
+	return sp32___clear_bit(ADDR, mask) != 0;
 }
 
 static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
@@ -66,7 +66,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
 	ADDR = ((unsigned long *) addr) + (nr >> 5);
 	mask = 1 << (nr & 31);
 
-	(void) ___clear_bit(ADDR, mask);
+	(void) sp32___clear_bit(ADDR, mask);
 }
 
 static inline int test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
@@ -76,7 +76,7 @@ static inline int test_and_change_bit(unsigned long nr, volatile unsigned long *
 	ADDR = ((unsigned long *) addr) + (nr >> 5);
 	mask = 1 << (nr & 31);
 
-	return ___change_bit(ADDR, mask) != 0;
+	return sp32___change_bit(ADDR, mask) != 0;
 }
 
 static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
@@ -86,7 +86,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
 	ADDR = ((unsigned long *) addr) + (nr >> 5);
 	mask = 1 << (nr & 31);
 
-	(void) ___change_bit(ADDR, mask);
+	(void) sp32___change_bit(ADDR, mask);
 }
 
 #include <asm-generic/bitops/non-atomic.h>
diff --git a/arch/sparc/lib/atomic32.c b/arch/sparc/lib/atomic32.c
index 8b81d0f00c97..cf80d1ae352b 100644
--- a/arch/sparc/lib/atomic32.c
+++ b/arch/sparc/lib/atomic32.c
@@ -120,7 +120,7 @@ void arch_atomic_set(atomic_t *v, int i)
 }
 EXPORT_SYMBOL(arch_atomic_set);
 
-unsigned long ___set_bit(unsigned long *addr, unsigned long mask)
+unsigned long sp32___set_bit(unsigned long *addr, unsigned long mask)
 {
 	unsigned long old, flags;
 
@@ -131,9 +131,9 @@ unsigned long ___set_bit(unsigned long *addr, unsigned long mask)
 
 	return old & mask;
 }
-EXPORT_SYMBOL(___set_bit);
+EXPORT_SYMBOL(sp32___set_bit);
 
-unsigned long ___clear_bit(unsigned long *addr, unsigned long mask)
+unsigned long sp32___clear_bit(unsigned long *addr, unsigned long mask)
 {
 	unsigned long old, flags;
 
@@ -144,9 +144,9 @@ unsigned long ___clear_bit(unsigned long *addr, unsigned long mask)
 
 	return old & mask;
 }
-EXPORT_SYMBOL(___clear_bit);
+EXPORT_SYMBOL(sp32___clear_bit);
 
-unsigned long ___change_bit(unsigned long *addr, unsigned long mask)
+unsigned long sp32___change_bit(unsigned long *addr, unsigned long mask)
 {
 	unsigned long old, flags;
 
@@ -157,7 +157,7 @@ unsigned long ___change_bit(unsigned long *addr, unsigned long mask)
 
 	return old & mask;
 }
-EXPORT_SYMBOL(___change_bit);
+EXPORT_SYMBOL(sp32___change_bit);
 
 unsigned long __cmpxchg_u32(volatile u32 *ptr, u32 old, u32 new)
 {
diff --git a/include/asm-generic/bitops/instrumented-non-atomic.h b/include/asm-generic/bitops/instrumented-non-atomic.h
index b019f77ef21c..988a3bbfba34 100644
--- a/include/asm-generic/bitops/instrumented-non-atomic.h
+++ b/include/asm-generic/bitops/instrumented-non-atomic.h
@@ -14,7 +14,7 @@
 #include <linux/instrumented.h>
 
 /**
- * __set_bit - Set a bit in memory
+ * ___set_bit - Set a bit in memory
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
@@ -23,14 +23,14 @@
  * succeeds.
  */
 static __always_inline void
-__set_bit(unsigned long nr, volatile unsigned long *addr)
+___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___set_bit(nr, addr);
 }
 
 /**
- * __clear_bit - Clears a bit in memory
+ * ___clear_bit - Clears a bit in memory
  * @nr: the bit to clear
  * @addr: the address to start counting from
  *
@@ -39,14 +39,14 @@ __set_bit(unsigned long nr, volatile unsigned long *addr)
  * succeeds.
  */
 static __always_inline void
-__clear_bit(unsigned long nr, volatile unsigned long *addr)
+___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___clear_bit(nr, addr);
 }
 
 /**
- * __change_bit - Toggle a bit in memory
+ * ___change_bit - Toggle a bit in memory
  * @nr: the bit to change
  * @addr: the address to start counting from
  *
@@ -55,7 +55,7 @@ __clear_bit(unsigned long nr, volatile unsigned long *addr)
  * succeeds.
  */
 static __always_inline void
-__change_bit(unsigned long nr, volatile unsigned long *addr)
+___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___change_bit(nr, addr);
@@ -86,7 +86,7 @@ static __always_inline void __instrument_read_write_bitop(long nr, volatile unsi
 }
 
 /**
- * __test_and_set_bit - Set a bit and return its old value
+ * ___test_and_set_bit - Set a bit and return its old value
  * @nr: Bit to set
  * @addr: Address to count from
  *
@@ -94,14 +94,14 @@ static __always_inline void __instrument_read_write_bitop(long nr, volatile unsi
  * can appear to succeed but actually fail.
  */
 static __always_inline bool
-__test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
+___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_set_bit(nr, addr);
 }
 
 /**
- * __test_and_clear_bit - Clear a bit and return its old value
+ * ___test_and_clear_bit - Clear a bit and return its old value
  * @nr: Bit to clear
  * @addr: Address to count from
  *
@@ -109,14 +109,14 @@ __test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
  * can appear to succeed but actually fail.
  */
 static __always_inline bool
-__test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
+___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_clear_bit(nr, addr);
 }
 
 /**
- * __test_and_change_bit - Change a bit and return its old value
+ * ___test_and_change_bit - Change a bit and return its old value
  * @nr: Bit to change
  * @addr: Address to count from
  *
@@ -124,19 +124,19 @@ __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
  * can appear to succeed but actually fail.
  */
 static __always_inline bool
-__test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
+___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	__instrument_read_write_bitop(nr, addr);
 	return arch___test_and_change_bit(nr, addr);
 }
 
 /**
- * test_bit - Determine whether a bit is set
+ * _test_bit - Determine whether a bit is set
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
 static __always_inline bool
-test_bit(unsigned long nr, const volatile unsigned long *addr)
+_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	instrument_atomic_read(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_bit(nr, addr);
diff --git a/include/asm-generic/bitops/non-instrumented-non-atomic.h b/include/asm-generic/bitops/non-instrumented-non-atomic.h
index e0fd7bf72a56..bdb9b1ffaee9 100644
--- a/include/asm-generic/bitops/non-instrumented-non-atomic.h
+++ b/include/asm-generic/bitops/non-instrumented-non-atomic.h
@@ -3,14 +3,14 @@
 #ifndef __ASM_GENERIC_BITOPS_NON_INSTRUMENTED_NON_ATOMIC_H
 #define __ASM_GENERIC_BITOPS_NON_INSTRUMENTED_NON_ATOMIC_H
 
-#define __set_bit		arch___set_bit
-#define __clear_bit		arch___clear_bit
-#define __change_bit		arch___change_bit
+#define ___set_bit		arch___set_bit
+#define ___clear_bit		arch___clear_bit
+#define ___change_bit		arch___change_bit
 
-#define __test_and_set_bit	arch___test_and_set_bit
-#define __test_and_clear_bit	arch___test_and_clear_bit
-#define __test_and_change_bit	arch___test_and_change_bit
+#define ___test_and_set_bit	arch___test_and_set_bit
+#define ___test_and_clear_bit	arch___test_and_clear_bit
+#define ___test_and_change_bit	arch___test_and_change_bit
 
-#define test_bit		arch_test_bit
+#define _test_bit		arch_test_bit
 
 #endif /* __ASM_GENERIC_BITOPS_NON_INSTRUMENTED_NON_ATOMIC_H */
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index d393297287d5..3c3afbae1533 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -26,8 +26,24 @@ extern unsigned int __sw_hweight16(unsigned int w);
 extern unsigned int __sw_hweight32(unsigned int w);
 extern unsigned long __sw_hweight64(__u64 w);
 
+/*
+ * Defined here because those may be needed by architecture-specific static
+ * inlines.
+ */
+
 #include <asm-generic/bitops/generic-non-atomic.h>
 
+#define bitop(op, nr, addr)						\
+	op(nr, addr)
+
+#define __set_bit(nr, addr)		bitop(___set_bit, nr, addr)
+#define __clear_bit(nr, addr)		bitop(___clear_bit, nr, addr)
+#define __change_bit(nr, addr)		bitop(___change_bit, nr, addr)
+#define __test_and_set_bit(nr, addr)	bitop(___test_and_set_bit, nr, addr)
+#define __test_and_clear_bit(nr, addr)	bitop(___test_and_clear_bit, nr, addr)
+#define __test_and_change_bit(nr, addr)	bitop(___test_and_change_bit, nr, addr)
+#define test_bit(nr, addr)		bitop(_test_bit, nr, addr)
+
 /*
  * Include this here because some architectures need generic_ffs/fls in
  * scope
@@ -38,7 +54,7 @@ extern unsigned long __sw_hweight64(__u64 w);
 #define __check_bitop_pr(name)						\
 	static_assert(__same_type(arch_##name, generic_##name) &&	\
 		      __same_type(const_##name, generic_##name) &&	\
-		      __same_type(name, generic_##name))
+		      __same_type(_##name, generic_##name))
 
 __check_bitop_pr(__set_bit);
 __check_bitop_pr(__clear_bit);
diff --git a/tools/include/asm-generic/bitops/non-atomic.h b/tools/include/asm-generic/bitops/non-atomic.h
index e5e78e42e57b..0c472a833408 100644
--- a/tools/include/asm-generic/bitops/non-atomic.h
+++ b/tools/include/asm-generic/bitops/non-atomic.h
@@ -5,7 +5,7 @@
 #include <linux/bits.h>
 
 /**
- * __set_bit - Set a bit in memory
+ * ___set_bit - Set a bit in memory
  * @nr: the bit to set
  * @addr: the address to start counting from
  *
@@ -14,7 +14,7 @@
  * may be that only one operation succeeds.
  */
 static __always_inline void
-__set_bit(unsigned long nr, volatile unsigned long *addr)
+___set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -23,7 +23,7 @@ __set_bit(unsigned long nr, volatile unsigned long *addr)
 }
 
 static __always_inline void
-__clear_bit(unsigned long nr, volatile unsigned long *addr)
+___clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -32,7 +32,7 @@ __clear_bit(unsigned long nr, volatile unsigned long *addr)
 }
 
 /**
- * __change_bit - Toggle a bit in memory
+ * ___change_bit - Toggle a bit in memory
  * @nr: the bit to change
  * @addr: the address to start counting from
  *
@@ -41,7 +41,7 @@ __clear_bit(unsigned long nr, volatile unsigned long *addr)
  * may be that only one operation succeeds.
  */
 static __always_inline void
-__change_bit(unsigned long nr, volatile unsigned long *addr)
+___change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -50,7 +50,7 @@ __change_bit(unsigned long nr, volatile unsigned long *addr)
 }
 
 /**
- * __test_and_set_bit - Set a bit and return its old value
+ * ___test_and_set_bit - Set a bit and return its old value
  * @nr: Bit to set
  * @addr: Address to count from
  *
@@ -59,7 +59,7 @@ __change_bit(unsigned long nr, volatile unsigned long *addr)
  * but actually fail.  You must protect multiple accesses with a lock.
  */
 static __always_inline bool
-__test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
+___test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -70,7 +70,7 @@ __test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
 }
 
 /**
- * __test_and_clear_bit - Clear a bit and return its old value
+ * ___test_and_clear_bit - Clear a bit and return its old value
  * @nr: Bit to clear
  * @addr: Address to count from
  *
@@ -79,7 +79,7 @@ __test_and_set_bit(unsigned long nr, volatile unsigned long *addr)
  * but actually fail.  You must protect multiple accesses with a lock.
  */
 static __always_inline bool
-__test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
+___test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -91,7 +91,7 @@ __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
 
 /* WARNING: non atomic and it can be reordered! */
 static __always_inline bool
-__test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
+___test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -102,12 +102,12 @@ __test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
 }
 
 /**
- * test_bit - Determine whether a bit is set
+ * _test_bit - Determine whether a bit is set
  * @nr: bit number to test
  * @addr: Address to start counting from
  */
 static __always_inline bool
-test_bit(unsigned long nr, const volatile unsigned long *addr)
+_test_bit(unsigned long nr, const volatile unsigned long *addr)
 {
 	return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
diff --git a/tools/include/linux/bitops.h b/tools/include/linux/bitops.h
index 5fca38fe1ba8..f18683b95ea6 100644
--- a/tools/include/linux/bitops.h
+++ b/tools/include/linux/bitops.h
@@ -25,6 +25,22 @@ extern unsigned int __sw_hweight16(unsigned int w);
 extern unsigned int __sw_hweight32(unsigned int w);
 extern unsigned long __sw_hweight64(__u64 w);
 
+/*
+ * Defined here because those may be needed by architecture-specific static
+ * inlines.
+ */
+
+#define bitop(op, nr, addr)						\
+	op(nr, addr)
+
+#define __set_bit(nr, addr)		bitop(___set_bit, nr, addr)
+#define __clear_bit(nr, addr)		bitop(___clear_bit, nr, addr)
+#define __change_bit(nr, addr)		bitop(___change_bit, nr, addr)
+#define __test_and_set_bit(nr, addr)	bitop(___test_and_set_bit, nr, addr)
+#define __test_and_clear_bit(nr, addr)	bitop(___test_and_clear_bit, nr, addr)
+#define __test_and_change_bit(nr, addr)	bitop(___test_and_change_bit, nr, addr)
+#define test_bit(nr, addr)		bitop(_test_bit, nr, addr)
+
 /*
  * Include this here because some architectures need generic_ffs/fls in
  * scope
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (4 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 5/9] bitops: wrap non-atomic bitops with a transparent macro Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-07-15  0:04   ` Guenter Roeck
  2022-06-24 12:13 ` [PATCH v5 7/9] net/ice: fix initializing the bitmap in the switch code Alexander Lobakin
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Currently, many architecture-specific non-atomic bitop
implementations use inline asm or other hacks which are faster or
more robust when working with "real" variables (i.e. fields from
the structures etc.), but the compilers have no clue how to optimize
them out when called on compile-time constants. That said, the
following code:

	DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1];
	unsigned long bar = BIT(BAR_BIT);
	unsigned long baz = 0;

	__set_bit(FOO_BIT, foo);
	baz |= BIT(BAZ_BIT);

	BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo));
	BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT));
	BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));

triggers the first assertion on x86_64, which means that the
compiler is unable to evaluate it to a compile-time initializer
when the architecture-specific bitop is used even if it's obvious.
In order to let the compiler optimize out such cases, expand the
bitop() macro to use the "constant" C non-atomic bitop
implementations when all of the arguments passed are compile-time
constants, which means that the result will be a compile-time
constant as well, so that it produces more efficient and simple
code in 100% cases, comparing to the architecture-specific
counterparts.

The savings are architecture, compiler and compiler flags dependent,
for example, on x86_64 -O2:

GCC 12: add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235)
LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)

and ARM64 (courtesy of Mark):

GCC 11: add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240)
LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)

Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Marco Elver <elver@google.com>
---
 include/linux/bitops.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 3c3afbae1533..cf9bf65039f2 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -33,8 +33,24 @@ extern unsigned long __sw_hweight64(__u64 w);
 
 #include <asm-generic/bitops/generic-non-atomic.h>
 
+/*
+ * Many architecture-specific non-atomic bitops contain inline asm code and due
+ * to that the compiler can't optimize them to compile-time expressions or
+ * constants. In contrary, generic_*() helpers are defined in pure C and
+ * compilers optimize them just well.
+ * Therefore, to make `unsigned long foo = 0; __set_bit(BAR, &foo)` effectively
+ * equal to `unsigned long foo = BIT(BAR)`, pick the generic C alternative when
+ * the arguments can be resolved at compile time. That expression itself is a
+ * constant and doesn't bring any functional changes to the rest of cases.
+ * The casts to `uintptr_t` are needed to mitigate `-Waddress` warnings when
+ * passing a bitmap from .bss or .data (-> `!!addr` is always true).
+ */
 #define bitop(op, nr, addr)						\
-	op(nr, addr)
+	((__builtin_constant_p(nr) &&					\
+	  __builtin_constant_p((uintptr_t)(addr) != (uintptr_t)NULL) &&	\
+	  (uintptr_t)(addr) != (uintptr_t)NULL &&			\
+	  __builtin_constant_p(*(const unsigned long *)(addr))) ?	\
+	 const##op(nr, addr) : op(nr, addr))
 
 #define __set_bit(nr, addr)		bitop(___set_bit, nr, addr)
 #define __clear_bit(nr, addr)		bitop(___clear_bit, nr, addr)
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 7/9] net/ice: fix initializing the bitmap in the switch code
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (5 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 8/9] bitmap: don't assume compiler evaluates small mem*() builtins calls Alexander Lobakin
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Kbuild spotted the following bug during the testing of one of
the optimizations:

In file included from include/linux/cpumask.h:12,
[...]
                from drivers/net/ethernet/intel/ice/ice_switch.c:4:
drivers/net/ethernet/intel/ice/ice_switch.c: In function 'ice_find_free_recp_res_idx.constprop':
include/linux/bitmap.h:447:22: warning: 'possible_idx[0]' is used uninitialized [-Wuninitialized]
  447 |                 *map |= GENMASK(start + nbits - 1, start);
      |                      ^~
In file included from drivers/net/ethernet/intel/ice/ice.h:7,
                 from drivers/net/ethernet/intel/ice/ice_lib.h:7,
                 from drivers/net/ethernet/intel/ice/ice_switch.c:4:
drivers/net/ethernet/intel/ice/ice_switch.c:4929:24: note: 'possible_idx[0]' was declared here
 4929 |         DECLARE_BITMAP(possible_idx, ICE_MAX_FV_WORDS);
      |                        ^~~~~~~~~~~~
include/linux/types.h:11:23: note: in definition of macro 'DECLARE_BITMAP'
   11 |         unsigned long name[BITS_TO_LONGS(bits)]
      |                       ^~~~

%ICE_MAX_FV_WORDS is 48, so bitmap_set() here was initializing only
48 bits, leaving a junk in the rest 16.
It was previously hidden due to that filling 48 bits makes
bitmap_set() call external __bitmap_set(), but after making it use
plain bit arithmetics on small bitmaps, compilers started seeing
the issue. It was still working because those 16 weren't used
anywhere anyhow.
bitmap_{clear,set}() are not really intended to initialize bitmaps,
rather to modify already initialized ones, as they don't do anything
past the passed number of bits. The correct function to do this in
that particular case is bitmap_fill(), so use it here. It will do
`*possible_idx = ~0UL` instead of `*possible_idx |= GENMASK(47, 0)`,
not leaving anything in an undefined state.

Fixes: fd2a6b71e300 ("ice: create advanced switch recipe")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_switch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
index 8d8f3eec79ee..9b2872e89151 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.c
+++ b/drivers/net/ethernet/intel/ice/ice_switch.c
@@ -4934,7 +4934,7 @@ ice_find_free_recp_res_idx(struct ice_hw *hw, const unsigned long *profiles,
 	bitmap_zero(recipes, ICE_MAX_NUM_RECIPES);
 	bitmap_zero(used_idx, ICE_MAX_FV_WORDS);
 
-	bitmap_set(possible_idx, 0, ICE_MAX_FV_WORDS);
+	bitmap_fill(possible_idx, ICE_MAX_FV_WORDS);
 
 	/* For each profile we are going to associate the recipe with, add the
 	 * recipes that are associated with that profile. This will give us
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 8/9] bitmap: don't assume compiler evaluates small mem*() builtins calls
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (6 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 7/9] net/ice: fix initializing the bitmap in the switch code Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-06-24 12:13 ` [PATCH v5 9/9] lib: test_bitmap: add compile-time optimization/evaluations assertions Alexander Lobakin
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Intel kernel bot triggered the build bug on ARC architecture that
in fact is as follows:

	DECLARE_BITMAP(bitmap, BITS_PER_LONG);

	bitmap_clear(bitmap, 0, BITS_PER_LONG);
	BUILD_BUG_ON(!__builtin_constant_p(*bitmap));

which can be expanded to:

	unsigned long bitmap[1];

	memset(bitmap, 0, sizeof(*bitmap));
	BUILD_BUG_ON(!__builtin_constant_p(*bitmap));

In most cases, a compiler is able to expand small/simple mem*()
calls to simple assignments or bitops, in this case that would mean:

	unsigned long bitmap[1] = { 0 };

	BUILD_BUG_ON(!__builtin_constant_p(*bitmap));

and on most architectures this works, but not on ARC, despite having
-O3 for every build.
So, to make this work, in case when the last bit to modify is still
within the first long (small_const_nbits()), just use plain
assignments for the rest of bitmap_*() functions which still use
mem*(), but didn't receive such compile-time optimizations yet.
This doesn't have the same coverage as compilers provide, but at
least something to start:

text: add/remove: 3/7 grow/shrink: 43/78 up/down: 1848/-3370 (-1546)
data: add/remove: 1/11 grow/shrink: 0/8 up/down: 4/-356 (-352)

notably cpumask_*() family when NR_CPUS <= BITS_PER_LONG:

netif_get_num_default_rss_queues              38       4     -34
cpumask_copy                                  90       -     -90
cpumask_clear                                146       -    -146

and the abovementioned assertion started passing.

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 include/linux/bitmap.h | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 2e6cd5681040..a0f4f3af8d30 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -238,20 +238,32 @@ extern int bitmap_print_list_to_buf(char *buf, const unsigned long *maskp,
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
-	memset(dst, 0, len);
+
+	if (small_const_nbits(nbits))
+		*dst = 0;
+	else
+		memset(dst, 0, len);
 }
 
 static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
 {
 	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
-	memset(dst, 0xff, len);
+
+	if (small_const_nbits(nbits))
+		*dst = ~0UL;
+	else
+		memset(dst, 0xff, len);
 }
 
 static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
 			unsigned int nbits)
 {
 	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
-	memcpy(dst, src, len);
+
+	if (small_const_nbits(nbits))
+		*dst = *src;
+	else
+		memcpy(dst, src, len);
 }
 
 /*
@@ -431,6 +443,8 @@ static __always_inline void bitmap_set(unsigned long *map, unsigned int start,
 {
 	if (__builtin_constant_p(nbits) && nbits == 1)
 		__set_bit(start, map);
+	else if (small_const_nbits(start + nbits))
+		*map |= GENMASK(start + nbits - 1, start);
 	else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
 		 IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
 		 __builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
@@ -445,6 +459,8 @@ static __always_inline void bitmap_clear(unsigned long *map, unsigned int start,
 {
 	if (__builtin_constant_p(nbits) && nbits == 1)
 		__clear_bit(start, map);
+	else if (small_const_nbits(start + nbits))
+		*map &= ~GENMASK(start + nbits - 1, start);
 	else if (__builtin_constant_p(start & BITMAP_MEM_MASK) &&
 		 IS_ALIGNED(start, BITMAP_MEM_ALIGNMENT) &&
 		 __builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 9/9] lib: test_bitmap: add compile-time optimization/evaluations assertions
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (7 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 8/9] bitmap: don't assume compiler evaluates small mem*() builtins calls Alexander Lobakin
@ 2022-06-24 12:13 ` Alexander Lobakin
  2022-06-24 12:51 ` [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Borislav Petkov
  2022-06-30 16:56 ` Alexander Lobakin
  10 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-24 12:13 UTC (permalink / raw)
  To: Arnd Bergmann, Yury Norov
  Cc: Alexander Lobakin, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

Add a function to the bitmap test suite, which will ensure that
compilers are able to evaluate operations performed by the
bitops/bitmap helpers to compile-time constants when all of the
arguments are compile-time constants as well, or trigger a build
bug otherwise. This should work on all architectures and all the
optimization levels supported by Kbuild.
The function doesn't perform any runtime tests and gets optimized
out to nothing after passing the build assertions.
Unfortunately, Clang for s390 is currently broken (up to the latest
Git snapshots) -- see the comment in the code -- so for now there's
a small workaround for it which doesn't alter the logics. Hope we'll
be able to remove it one day (bugreport is on its way).

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 lib/test_bitmap.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index d5923a640457..25967cfa4ab2 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -869,6 +869,67 @@ static void __init test_bitmap_print_buf(void)
 	}
 }
 
+static void __init test_bitmap_const_eval(void)
+{
+	DECLARE_BITMAP(bitmap, BITS_PER_LONG);
+	unsigned long initvar = BIT(2);
+	unsigned long bitopvar = 0;
+	unsigned long var = 0;
+	int res;
+
+	/*
+	 * Compilers must be able to optimize all of those to compile-time
+	 * constants on any supported optimization level (-O2, -Os) and any
+	 * architecture. Otherwise, trigger a build bug.
+	 * The whole function gets optimized out then, there's nothing to do
+	 * in runtime.
+	 */
+
+	/*
+	 * Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }`.
+	 * Clang on s390 optimizes bitops at compile-time as intended, but at
+	 * the same time stops treating @bitmap and @bitopvar as compile-time
+	 * constants after regular test_bit() is executed, thus triggering the
+	 * build bugs below. So, call const_test_bit() there directly until
+	 * the compiler is fixed.
+	 */
+	bitmap_clear(bitmap, 0, BITS_PER_LONG);
+#if defined(__s390__) && defined(__clang__)
+	if (!const_test_bit(7, bitmap))
+#else
+	if (!test_bit(7, bitmap))
+#endif
+		bitmap_set(bitmap, 5, 2);
+
+	/* Equals to `unsigned long bitopvar = BIT(20)` */
+	__change_bit(31, &bitopvar);
+	bitmap_shift_right(&bitopvar, &bitopvar, 11, BITS_PER_LONG);
+
+	/* Equals to `unsigned long var = BIT(25)` */
+	var |= BIT(25);
+	if (var & BIT(0))
+		var ^= GENMASK(9, 6);
+
+	/* __const_hweight<32|64>(GENMASK(6, 5)) == 2 */
+	res = bitmap_weight(bitmap, 20);
+	BUILD_BUG_ON(!__builtin_constant_p(res));
+	BUILD_BUG_ON(res != 2);
+
+	/* !(BIT(31) & BIT(18)) == 1 */
+	res = !test_bit(18, &bitopvar);
+	BUILD_BUG_ON(!__builtin_constant_p(res));
+	BUILD_BUG_ON(!res);
+
+	/* BIT(2) & GENMASK(14, 8) == 0 */
+	res = initvar & GENMASK(14, 8);
+	BUILD_BUG_ON(!__builtin_constant_p(res));
+	BUILD_BUG_ON(res);
+
+	/* ~BIT(25) */
+	BUILD_BUG_ON(!__builtin_constant_p(~var));
+	BUILD_BUG_ON(~var != ~BIT(25));
+}
+
 static void __init selftest(void)
 {
 	test_zero_clear();
@@ -884,6 +945,7 @@ static void __init selftest(void)
 	test_for_each_set_clump8();
 	test_bitmap_cut();
 	test_bitmap_print_buf();
+	test_bitmap_const_eval();
 }
 
 KSTM_MODULE_LOADERS(test_bitmap);
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (8 preceding siblings ...)
  2022-06-24 12:13 ` [PATCH v5 9/9] lib: test_bitmap: add compile-time optimization/evaluations assertions Alexander Lobakin
@ 2022-06-24 12:51 ` Borislav Petkov
  2022-06-30 16:56 ` Alexander Lobakin
  10 siblings, 0 replies; 22+ messages in thread
From: Borislav Petkov @ 2022-06-24 12:51 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Arnd Bergmann, Yury Norov, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Tony Luck, Maciej Fijalkowski, Jesse Brandeburg,
	Greg Kroah-Hartman, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	kernel test robot, linux-alpha, linux-hexagon, linux-ia64,
	linux-m68k, linux-sh, sparclinux, linux-arch, llvm, linux-kernel

On Fri, Jun 24, 2022 at 02:13:04PM +0200, Alexander Lobakin wrote:
> While I was working on converting some structure fields from a fixed

Can you please send your patchset once a week instead of spamming people
with it every three days?

Thank you.

-- 
Regards/Gruss,
    Boris.

SUSE Software Solutions Germany GmbH
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Martje Boudien Moerman
(HRB 36809, AG Nürnberg)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
                   ` (9 preceding siblings ...)
  2022-06-24 12:51 ` [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Borislav Petkov
@ 2022-06-30 16:56 ` Alexander Lobakin
  2022-07-01  2:58   ` Yury Norov
  10 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2022-06-30 16:56 UTC (permalink / raw)
  To: Yury Norov
  Cc: Alexander Lobakin, Arnd Bergmann, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

From: Alexander Lobakin <alexandr.lobakin@intel.com>
Date: Fri, 24 Jun 2022 14:13:04 +0200

> While I was working on converting some structure fields from a fixed
> type to a bitmap, I started observing code size increase not only in
> places where the code works with the converted structure fields, but
> also where the converted vars were on the stack. That said, the
> following code:

Hey,

Seems like everything is fine this time. I got some reports, but
those aren't caused by any of the changes from the series.
Maybe we can take it to -next and see how it goes?

[...]

>  arch/alpha/include/asm/bitops.h               |  32 ++--
>  arch/hexagon/include/asm/bitops.h             |  24 ++-
>  arch/ia64/include/asm/bitops.h                |  42 ++---
>  arch/ia64/include/asm/processor.h             |   2 +-
>  arch/m68k/include/asm/bitops.h                |  49 ++++--
>  arch/s390/include/asm/bitops.h                |  61 +++----
>  arch/sh/include/asm/bitops-op32.h             |  34 ++--
>  arch/sparc/include/asm/bitops_32.h            |  18 +-
>  arch/sparc/lib/atomic32.c                     |  12 +-
>  arch/x86/include/asm/bitops.h                 |  22 +--
>  drivers/net/ethernet/intel/ice/ice_switch.c   |   2 +-
>  .../asm-generic/bitops/generic-non-atomic.h   | 161 ++++++++++++++++++
>  .../bitops/instrumented-non-atomic.h          |  35 ++--
>  include/asm-generic/bitops/non-atomic.h       | 121 +------------
>  .../bitops/non-instrumented-non-atomic.h      |  16 ++
>  include/linux/bitmap.h                        |  22 ++-
>  include/linux/bitops.h                        |  50 ++++++
>  lib/test_bitmap.c                             |  62 +++++++
>  tools/include/asm-generic/bitops/non-atomic.h |  34 ++--
>  tools/include/linux/bitops.h                  |  16 ++
>  20 files changed, 544 insertions(+), 271 deletions(-)
>  create mode 100644 include/asm-generic/bitops/generic-non-atomic.h
>  create mode 100644 include/asm-generic/bitops/non-instrumented-non-atomic.h
> 
> -- 
> 2.36.1

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-06-30 16:56 ` Alexander Lobakin
@ 2022-07-01  2:58   ` Yury Norov
  0 siblings, 0 replies; 22+ messages in thread
From: Yury Norov @ 2022-07-01  2:58 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Arnd Bergmann, Andy Shevchenko, Mark Rutland, Matt Turner,
	Brian Cain, Geert Uytterhoeven, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

On Thu, Jun 30, 2022 at 06:56:11PM +0200, Alexander Lobakin wrote:
> From: Alexander Lobakin <alexandr.lobakin@intel.com>
> Date: Fri, 24 Jun 2022 14:13:04 +0200
> 
> > While I was working on converting some structure fields from a fixed
> > type to a bitmap, I started observing code size increase not only in
> > places where the code works with the converted structure fields, but
> > also where the converted vars were on the stack. That said, the
> > following code:
> 
> Hey,
> 
> Seems like everything is fine this time. I got some reports, but
> those aren't caused by any of the changes from the series.
> Maybe we can take it to -next and see how it goes?

Applied on github.com:/norov/linux.git branch bitmap-for-next

Thanks!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 3/9] bitops: unify non-atomic bitops prototypes across architectures
  2022-06-24 12:13 ` [PATCH v5 3/9] bitops: unify non-atomic bitops prototypes across architectures Alexander Lobakin
@ 2022-07-06 10:09   ` Geert Uytterhoeven
  0 siblings, 0 replies; 22+ messages in thread
From: Geert Uytterhoeven @ 2022-07-06 10:09 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Arnd Bergmann, Yury Norov, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Yoshinori Sato, Rich Felker,
	David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, alpha,
	open list:QUALCOMM HEXAGON...,
	linux-ia64, linux-m68k, Linux-sh list, sparclinux, Linux-Arch,
	llvm, Linux Kernel Mailing List

On Fri, Jun 24, 2022 at 2:13 PM Alexander Lobakin
<alexandr.lobakin@intel.com> wrote:
> Currently, there is a mess with the prototypes of the non-atomic
> bitops across the different architectures:
>
> ret     bool, int, unsigned long
> nr      int, long, unsigned int, unsigned long
> addr    volatile unsigned long *, volatile void *
>
> Thankfully, it doesn't provoke any bugs, but can sometimes make
> the compiler angry when it's not handy at all.
> Adjust all the prototypes to the following standard:
>
> ret     bool                            retval can be only 0 or 1
> nr      unsigned long                   native; signed makes no sense
> addr    volatile unsigned long *        bitmaps are arrays of ulongs
>
> Next, some architectures don't define 'arch_' versions as they don't
> support instrumentation, others do. To make sure there is always the
> same set of callables present and to ease any potential future
> changes, make them all follow the rule:
>  * architecture-specific files define only 'arch_' versions;
>  * non-prefixed versions can be defined only in asm-generic files;
> and place the non-prefixed definitions into a new file in
> asm-generic to be included by non-instrumented architectures.
>
> Finally, add some static assertions in order to prevent people from
> making a mess in this room again.
> I also used the %__always_inline attribute consistently, so that
> they always get resolved to the actual operations.
>
> Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> Reviewed-by: Yury Norov <yury.norov@gmail.com>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>

>  arch/m68k/include/asm/bitops.h                | 49 ++++++++++-----

Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-06-24 12:13 ` [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
@ 2022-07-15  0:04   ` Guenter Roeck
  2022-07-15 13:26     ` Alexander Lobakin
  0 siblings, 1 reply; 22+ messages in thread
From: Guenter Roeck @ 2022-07-15  0:04 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Arnd Bergmann, Yury Norov, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

On Fri, Jun 24, 2022 at 02:13:10PM +0200, Alexander Lobakin wrote:
> Currently, many architecture-specific non-atomic bitop
> implementations use inline asm or other hacks which are faster or
> more robust when working with "real" variables (i.e. fields from
> the structures etc.), but the compilers have no clue how to optimize
> them out when called on compile-time constants. That said, the
> following code:
> 
> 	DECLARE_BITMAP(foo, BITS_PER_LONG) = { }; // -> unsigned long foo[1];
> 	unsigned long bar = BIT(BAR_BIT);
> 	unsigned long baz = 0;
> 
> 	__set_bit(FOO_BIT, foo);
> 	baz |= BIT(BAZ_BIT);
> 
> 	BUILD_BUG_ON(!__builtin_constant_p(test_bit(FOO_BIT, foo));
> 	BUILD_BUG_ON(!__builtin_constant_p(bar & BAR_BIT));
> 	BUILD_BUG_ON(!__builtin_constant_p(baz & BAZ_BIT));
> 
> triggers the first assertion on x86_64, which means that the
> compiler is unable to evaluate it to a compile-time initializer
> when the architecture-specific bitop is used even if it's obvious.
> In order to let the compiler optimize out such cases, expand the
> bitop() macro to use the "constant" C non-atomic bitop
> implementations when all of the arguments passed are compile-time
> constants, which means that the result will be a compile-time
> constant as well, so that it produces more efficient and simple
> code in 100% cases, comparing to the architecture-specific
> counterparts.
> 
> The savings are architecture, compiler and compiler flags dependent,
> for example, on x86_64 -O2:
> 
> GCC 12: add/remove: 78/29 grow/shrink: 332/525 up/down: 31325/-61560 (-30235)
> LLVM 13: add/remove: 79/76 grow/shrink: 184/537 up/down: 55076/-141892 (-86816)
> LLVM 14: add/remove: 10/3 grow/shrink: 93/138 up/down: 3705/-6992 (-3287)
> 
> and ARM64 (courtesy of Mark):
> 
> GCC 11: add/remove: 92/29 grow/shrink: 933/2766 up/down: 39340/-82580 (-43240)
> LLVM 14: add/remove: 21/11 grow/shrink: 620/651 up/down: 12060/-15824 (-3764)
> 
> Cc: Mark Rutland <mark.rutland@arm.com>
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Marco Elver <elver@google.com>

Building i386:allyesconfig ... failed
--------------
Error log:
arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: error: logical not is only applied to the left hand side of comparison

Bisect log attached.

Guenter

---
# bad: [4662b7adea50bb62e993a67f611f3be625d3df0d] Add linux-next specific files for 20220713
# good: [32346491ddf24599decca06190ebca03ff9de7f8] Linux 5.19-rc6
git bisect start 'HEAD' 'v5.19-rc6'
# good: [8b7e002d8bc6e17c94092d25e7261db4e6e5f2cc] Merge branch 'drm-next' of git://git.freedesktop.org/git/drm/drm.git
git bisect good 8b7e002d8bc6e17c94092d25e7261db4e6e5f2cc
# good: [07f6d21d6e33c1e28e24ae84e9d26e4e7d4853f5] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
git bisect good 07f6d21d6e33c1e28e24ae84e9d26e4e7d4853f5
# good: [5ff085e5d4f6700e03635d5e700f52163a6dc2a7] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
git bisect good 5ff085e5d4f6700e03635d5e700f52163a6dc2a7
# good: [eb9e3fdbdd8b61ef0f4bee23259fe6ab69e463ab] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git
git bisect good eb9e3fdbdd8b61ef0f4bee23259fe6ab69e463ab
# good: [9f2183cd961e5ddb7954eafb6bb01a495c6a9c7b] hexagon/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
git bisect good 9f2183cd961e5ddb7954eafb6bb01a495c6a9c7b
# bad: [e878aa5faf9ac8c0b5d0c3f293389c194c250fff] Merge branch 'mm-nonmm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
git bisect bad e878aa5faf9ac8c0b5d0c3f293389c194c250fff
# good: [cf95d50205f62c4f5f538676def847292cf39fa9] fs: don't call ->writepage from __mpage_writepage
git bisect good cf95d50205f62c4f5f538676def847292cf39fa9
# good: [5103cbfd92d3587713476f94f9485b96e02f0146] Merge branch 'for-next/execve' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git
git bisect good 5103cbfd92d3587713476f94f9485b96e02f0146
# good: [ee56c3e8eec166f4e4a2ca842b7804d14f3a0208] Merge branch 'master' into mm-nonmm-stable
git bisect good ee56c3e8eec166f4e4a2ca842b7804d14f3a0208
# bad: [dc34d5036692c614eef23c1130ee42a201c316bf] lib: test_bitmap: add compile-time optimization/evaluations assertions
git bisect bad dc34d5036692c614eef23c1130ee42a201c316bf
# good: [bb7379bfa680bd48b468e856475778db2ad866c1] bitops: define const_*() versions of the non-atomics
git bisect good bb7379bfa680bd48b468e856475778db2ad866c1
# bad: [b03fc1173c0c2bb8fad61902a862985cecdc4b1b] bitops: let optimize out non-atomic bitops on compile-time constants
git bisect bad b03fc1173c0c2bb8fad61902a862985cecdc4b1b
# good: [e69eb9c460f128b71c6b995d75a05244e4b6cc3e] bitops: wrap non-atomic bitops with a transparent macro
git bisect good e69eb9c460f128b71c6b995d75a05244e4b6cc3e
# first bad commit: [b03fc1173c0c2bb8fad61902a862985cecdc4b1b] bitops: let optimize out non-atomic bitops on compile-time constants

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-07-15  0:04   ` Guenter Roeck
@ 2022-07-15 13:26     ` Alexander Lobakin
  2022-07-15 13:49       ` Guenter Roeck
  0 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2022-07-15 13:26 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alexander Lobakin, Arnd Bergmann, Yury Norov, Andy Shevchenko,
	Mark Rutland, Matt Turner, Brian Cain, Geert Uytterhoeven,
	Yoshinori Sato, Rich Felker, David S. Miller, Kees Cook,
	Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

From: Guenter Roeck <linux@roeck-us.net>
Date: Thu, 14 Jul 2022 17:04:02 -0700

> On Fri, Jun 24, 2022 at 02:13:10PM +0200, Alexander Lobakin wrote:
> > Currently, many architecture-specific non-atomic bitop
> > implementations use inline asm or other hacks which are faster or

[...]

> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> > Reviewed-by: Marco Elver <elver@google.com>
> 
> Building i386:allyesconfig ... failed
> --------------
> Error log:
> arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
> arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: error: logical not is only applied to the left hand side of comparison

Looks like a trigger, not a cause... Anyway, this construct:

	unsigned char state;

	[...]

	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)

doesn't look legit enough.
That redundant double-negation [of boolean value], together with
comparing boolean to char, provokes compilers to think the author
made logical mistakes here, although it works as expected.
Could you please try (if it's not automated build which you can't
modify) the following:

--- a/arch/x86/platform/olpc/olpc-xo1-sci.c
+++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
@@ -80,7 +80,7 @@ static void send_ebook_state(void)
 		return;
 	}
 
-	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
+	if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
 		return; /* Nothing new to report. */
 
 	input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
---

We'd take it into the bitmap tree then. The series revealed
a fistful of existing code issues already :)

> 
> Bisect log attached.
> 
> Guenter
> 
> ---
> # bad: [4662b7adea50bb62e993a67f611f3be625d3df0d] Add linux-next specific files for 20220713
> # good: [32346491ddf24599decca06190ebca03ff9de7f8] Linux 5.19-rc6
> git bisect start 'HEAD' 'v5.19-rc6'
> # good: [8b7e002d8bc6e17c94092d25e7261db4e6e5f2cc] Merge branch 'drm-next' of git://git.freedesktop.org/git/drm/drm.git
> git bisect good 8b7e002d8bc6e17c94092d25e7261db4e6e5f2cc
> # good: [07f6d21d6e33c1e28e24ae84e9d26e4e7d4853f5] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
> git bisect good 07f6d21d6e33c1e28e24ae84e9d26e4e7d4853f5
> # good: [5ff085e5d4f6700e03635d5e700f52163a6dc2a7] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
> git bisect good 5ff085e5d4f6700e03635d5e700f52163a6dc2a7
> # good: [eb9e3fdbdd8b61ef0f4bee23259fe6ab69e463ab] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git
> git bisect good eb9e3fdbdd8b61ef0f4bee23259fe6ab69e463ab
> # good: [9f2183cd961e5ddb7954eafb6bb01a495c6a9c7b] hexagon/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
> git bisect good 9f2183cd961e5ddb7954eafb6bb01a495c6a9c7b
> # bad: [e878aa5faf9ac8c0b5d0c3f293389c194c250fff] Merge branch 'mm-nonmm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> git bisect bad e878aa5faf9ac8c0b5d0c3f293389c194c250fff
> # good: [cf95d50205f62c4f5f538676def847292cf39fa9] fs: don't call ->writepage from __mpage_writepage
> git bisect good cf95d50205f62c4f5f538676def847292cf39fa9
> # good: [5103cbfd92d3587713476f94f9485b96e02f0146] Merge branch 'for-next/execve' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git
> git bisect good 5103cbfd92d3587713476f94f9485b96e02f0146
> # good: [ee56c3e8eec166f4e4a2ca842b7804d14f3a0208] Merge branch 'master' into mm-nonmm-stable
> git bisect good ee56c3e8eec166f4e4a2ca842b7804d14f3a0208
> # bad: [dc34d5036692c614eef23c1130ee42a201c316bf] lib: test_bitmap: add compile-time optimization/evaluations assertions
> git bisect bad dc34d5036692c614eef23c1130ee42a201c316bf
> # good: [bb7379bfa680bd48b468e856475778db2ad866c1] bitops: define const_*() versions of the non-atomics
> git bisect good bb7379bfa680bd48b468e856475778db2ad866c1
> # bad: [b03fc1173c0c2bb8fad61902a862985cecdc4b1b] bitops: let optimize out non-atomic bitops on compile-time constants
> git bisect bad b03fc1173c0c2bb8fad61902a862985cecdc4b1b
> # good: [e69eb9c460f128b71c6b995d75a05244e4b6cc3e] bitops: wrap non-atomic bitops with a transparent macro
> git bisect good e69eb9c460f128b71c6b995d75a05244e4b6cc3e
> # first bad commit: [b03fc1173c0c2bb8fad61902a862985cecdc4b1b] bitops: let optimize out non-atomic bitops on compile-time constants

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-07-15 13:26     ` Alexander Lobakin
@ 2022-07-15 13:49       ` Guenter Roeck
  2022-07-15 14:19         ` Yury Norov
  0 siblings, 1 reply; 22+ messages in thread
From: Guenter Roeck @ 2022-07-15 13:49 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Arnd Bergmann, Yury Norov, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

On 7/15/22 06:26, Alexander Lobakin wrote:
> From: Guenter Roeck <linux@roeck-us.net>
> Date: Thu, 14 Jul 2022 17:04:02 -0700
> 
>> On Fri, Jun 24, 2022 at 02:13:10PM +0200, Alexander Lobakin wrote:
>>> Currently, many architecture-specific non-atomic bitop
>>> implementations use inline asm or other hacks which are faster or
> 
> [...]
> 
>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
>>> Reviewed-by: Marco Elver <elver@google.com>
>>
>> Building i386:allyesconfig ... failed
>> --------------
>> Error log:
>> arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
>> arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: error: logical not is only applied to the left hand side of comparison
> 
> Looks like a trigger, not a cause... Anyway, this construct:
> 
> 	unsigned char state;
> 
> 	[...]
> 
> 	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
> 
> doesn't look legit enough.
> That redundant double-negation [of boolean value], together with
> comparing boolean to char, provokes compilers to think the author
> made logical mistakes here, although it works as expected.
> Could you please try (if it's not automated build which you can't
> modify) the following:
> 

Agreed, the existing code seems wrong. The change below looks correct
and fixes the problem. Feel free to add

Reviewed-and-tested-by: Guenter Roeck <linux@roeck-us.net>

to the real patch.

Thanks,
Guenter

> --- a/arch/x86/platform/olpc/olpc-xo1-sci.c
> +++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
> @@ -80,7 +80,7 @@ static void send_ebook_state(void)
>   		return;
>   	}
>   
> -	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
> +	if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
>   		return; /* Nothing new to report. */
>   
>   	input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
> ---
> 
> We'd take it into the bitmap tree then. The series revealed
> a fistful of existing code issues already :)
> 
>>
>> Bisect log attached.
>>
>> Guenter
>>
>> ---
>> # bad: [4662b7adea50bb62e993a67f611f3be625d3df0d] Add linux-next specific files for 20220713
>> # good: [32346491ddf24599decca06190ebca03ff9de7f8] Linux 5.19-rc6
>> git bisect start 'HEAD' 'v5.19-rc6'
>> # good: [8b7e002d8bc6e17c94092d25e7261db4e6e5f2cc] Merge branch 'drm-next' of git://git.freedesktop.org/git/drm/drm.git
>> git bisect good 8b7e002d8bc6e17c94092d25e7261db4e6e5f2cc
>> # good: [07f6d21d6e33c1e28e24ae84e9d26e4e7d4853f5] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
>> git bisect good 07f6d21d6e33c1e28e24ae84e9d26e4e7d4853f5
>> # good: [5ff085e5d4f6700e03635d5e700f52163a6dc2a7] Merge branch 'staging-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
>> git bisect good 5ff085e5d4f6700e03635d5e700f52163a6dc2a7
>> # good: [eb9e3fdbdd8b61ef0f4bee23259fe6ab69e463ab] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random.git
>> git bisect good eb9e3fdbdd8b61ef0f4bee23259fe6ab69e463ab
>> # good: [9f2183cd961e5ddb7954eafb6bb01a495c6a9c7b] hexagon/mm: enable ARCH_HAS_VM_GET_PAGE_PROT
>> git bisect good 9f2183cd961e5ddb7954eafb6bb01a495c6a9c7b
>> # bad: [e878aa5faf9ac8c0b5d0c3f293389c194c250fff] Merge branch 'mm-nonmm-stable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>> git bisect bad e878aa5faf9ac8c0b5d0c3f293389c194c250fff
>> # good: [cf95d50205f62c4f5f538676def847292cf39fa9] fs: don't call ->writepage from __mpage_writepage
>> git bisect good cf95d50205f62c4f5f538676def847292cf39fa9
>> # good: [5103cbfd92d3587713476f94f9485b96e02f0146] Merge branch 'for-next/execve' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git
>> git bisect good 5103cbfd92d3587713476f94f9485b96e02f0146
>> # good: [ee56c3e8eec166f4e4a2ca842b7804d14f3a0208] Merge branch 'master' into mm-nonmm-stable
>> git bisect good ee56c3e8eec166f4e4a2ca842b7804d14f3a0208
>> # bad: [dc34d5036692c614eef23c1130ee42a201c316bf] lib: test_bitmap: add compile-time optimization/evaluations assertions
>> git bisect bad dc34d5036692c614eef23c1130ee42a201c316bf
>> # good: [bb7379bfa680bd48b468e856475778db2ad866c1] bitops: define const_*() versions of the non-atomics
>> git bisect good bb7379bfa680bd48b468e856475778db2ad866c1
>> # bad: [b03fc1173c0c2bb8fad61902a862985cecdc4b1b] bitops: let optimize out non-atomic bitops on compile-time constants
>> git bisect bad b03fc1173c0c2bb8fad61902a862985cecdc4b1b
>> # good: [e69eb9c460f128b71c6b995d75a05244e4b6cc3e] bitops: wrap non-atomic bitops with a transparent macro
>> git bisect good e69eb9c460f128b71c6b995d75a05244e4b6cc3e
>> # first bad commit: [b03fc1173c0c2bb8fad61902a862985cecdc4b1b] bitops: let optimize out non-atomic bitops on compile-time constants
> 
> Thanks,
> Olek


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-07-15 13:49       ` Guenter Roeck
@ 2022-07-15 14:19         ` Yury Norov
  2022-07-15 14:50           ` Alexander Lobakin
  0 siblings, 1 reply; 22+ messages in thread
From: Yury Norov @ 2022-07-15 14:19 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Alexander Lobakin, Arnd Bergmann, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

On Fri, Jul 15, 2022 at 06:49:46AM -0700, Guenter Roeck wrote:
> On 7/15/22 06:26, Alexander Lobakin wrote:
> > From: Guenter Roeck <linux@roeck-us.net>
> > Date: Thu, 14 Jul 2022 17:04:02 -0700
> > 
> > > On Fri, Jun 24, 2022 at 02:13:10PM +0200, Alexander Lobakin wrote:
> > > > Currently, many architecture-specific non-atomic bitop
> > > > implementations use inline asm or other hacks which are faster or
> > 
> > [...]
> > 
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> > > > Reviewed-by: Marco Elver <elver@google.com>
> > > 
> > > Building i386:allyesconfig ... failed
> > > --------------
> > > Error log:
> > > arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
> > > arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: error: logical not is only applied to the left hand side of comparison
> > 
> > Looks like a trigger, not a cause... Anyway, this construct:
> > 
> > 	unsigned char state;
> > 
> > 	[...]
> > 
> > 	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
> > 
> > doesn't look legit enough.
> > That redundant double-negation [of boolean value], together with
> > comparing boolean to char, provokes compilers to think the author
> > made logical mistakes here, although it works as expected.
> > Could you please try (if it's not automated build which you can't
> > modify) the following:
> > 
> 
> Agreed, the existing code seems wrong. The change below looks correct
> and fixes the problem. Feel free to add
> 
> Reviewed-and-tested-by: Guenter Roeck <linux@roeck-us.net>
> 
> to the real patch.
> 
> Thanks,
> Guenter
> 
> > --- a/arch/x86/platform/olpc/olpc-xo1-sci.c
> > +++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
> > @@ -80,7 +80,7 @@ static void send_ebook_state(void)
> >   		return;
> >   	}
> > -	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
> > +	if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
> >   		return; /* Nothing new to report. */
> >   	input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
> > ---
> > 
> > We'd take it into the bitmap tree then. The series revealed
> > a fistful of existing code issues already :)

Would you like me to add your signed-off-by and apply, or you prefer
to send it out as a patch?

Thanks,
Yury

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants
  2022-07-15 14:19         ` Yury Norov
@ 2022-07-15 14:50           ` Alexander Lobakin
  0 siblings, 0 replies; 22+ messages in thread
From: Alexander Lobakin @ 2022-07-15 14:50 UTC (permalink / raw)
  To: Yury Norov
  Cc: Alexander Lobakin, Guenter Roeck, Arnd Bergmann, Andy Shevchenko,
	Mark Rutland, Matt Turner, Brian Cain, Geert Uytterhoeven,
	Yoshinori Sato, Rich Felker, David S. Miller, Kees Cook,
	Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Maciej Fijalkowski,
	Jesse Brandeburg, Greg Kroah-Hartman, Nathan Chancellor,
	Nick Desaulniers, Tom Rix, kernel test robot, linux-alpha,
	linux-hexagon, linux-ia64, linux-m68k, linux-sh, sparclinux,
	linux-arch, llvm, linux-kernel

From: Yury Norov <yury.norov@gmail.com>
Date: Fri, 15 Jul 2022 07:19:12 -0700

> On Fri, Jul 15, 2022 at 06:49:46AM -0700, Guenter Roeck wrote:
> > On 7/15/22 06:26, Alexander Lobakin wrote:
> > > From: Guenter Roeck <linux@roeck-us.net>
> > > Date: Thu, 14 Jul 2022 17:04:02 -0700
> > > 
> > > > On Fri, Jun 24, 2022 at 02:13:10PM +0200, Alexander Lobakin wrote:
> > > > > Currently, many architecture-specific non-atomic bitop
> > > > > implementations use inline asm or other hacks which are faster or
> > > 
> > > [...]
> > > 
> > > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > > Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> > > > > Reviewed-by: Marco Elver <elver@google.com>
> > > > 
> > > > Building i386:allyesconfig ... failed
> > > > --------------
> > > > Error log:
> > > > arch/x86/platform/olpc/olpc-xo1-sci.c: In function 'send_ebook_state':
> > > > arch/x86/platform/olpc/olpc-xo1-sci.c:83:63: error: logical not is only applied to the left hand side of comparison
> > > 
> > > Looks like a trigger, not a cause... Anyway, this construct:
> > > 
> > > 	unsigned char state;
> > > 
> > > 	[...]
> > > 
> > > 	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
> > > 
> > > doesn't look legit enough.
> > > That redundant double-negation [of boolean value], together with
> > > comparing boolean to char, provokes compilers to think the author
> > > made logical mistakes here, although it works as expected.
> > > Could you please try (if it's not automated build which you can't
> > > modify) the following:
> > > 
> > 
> > Agreed, the existing code seems wrong. The change below looks correct
> > and fixes the problem. Feel free to add
> > 
> > Reviewed-and-tested-by: Guenter Roeck <linux@roeck-us.net>
> > 
> > to the real patch.
> > 
> > Thanks,
> > Guenter
> > 
> > > --- a/arch/x86/platform/olpc/olpc-xo1-sci.c
> > > +++ b/arch/x86/platform/olpc/olpc-xo1-sci.c
> > > @@ -80,7 +80,7 @@ static void send_ebook_state(void)
> > >   		return;
> > >   	}
> > > -	if (!!test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == state)
> > > +	if (test_bit(SW_TABLET_MODE, ebook_switch_idev->sw) == !!state)
> > >   		return; /* Nothing new to report. */
> > >   	input_report_switch(ebook_switch_idev, SW_TABLET_MODE, state);
> > > ---
> > > 
> > > We'd take it into the bitmap tree then. The series revealed
> > > a fistful of existing code issues already :)
> 
> Would you like me to add your signed-off-by and apply, or you prefer
> to send it out as a patch?

I'm sending it in a couple minutes :)

> 
> Thanks,
> Yury

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops
  2022-06-24 12:13 ` [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops Alexander Lobakin
@ 2023-01-02 16:14   ` Maciej Fijalkowski
  2023-01-02 16:30     ` Alexander Lobakin
  0 siblings, 1 reply; 22+ messages in thread
From: Maciej Fijalkowski @ 2023-01-02 16:14 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Arnd Bergmann, Yury Norov, Andy Shevchenko, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Jesse Brandeburg,
	Greg Kroah-Hartman, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	kernel test robot, linux-alpha, linux-hexagon, linux-ia64,
	linux-m68k, linux-sh, sparclinux, linux-arch, llvm, linux-kernel

On Fri, Jun 24, 2022 at 02:13:06PM +0200, Alexander Lobakin wrote:
> Move generic non-atomic bitops from the asm-generic header which
> gets included only when there are no architecture-specific
> alternatives, to a separate independent file to make them always
> available.
> Almost no actual code changes, only one comment added to
> generic_test_bit() saying that it's an atomic operation itself
> and thus `volatile` must always stay there with no cast-aways.
> 
> Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> # comment
> Suggested-by: Marco Elver <elver@google.com> # reference to kernel-doc
> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Reviewed-by: Marco Elver <elver@google.com>
> ---
>  .../asm-generic/bitops/generic-non-atomic.h   | 130 ++++++++++++++++++
>  include/asm-generic/bitops/non-atomic.h       | 110 ++-------------
>  2 files changed, 138 insertions(+), 102 deletions(-)
>  create mode 100644 include/asm-generic/bitops/generic-non-atomic.h
> 

Hi,

this patch gives me a headache when trying to run sparse against a module.

Olek please help :D

$ sudo make C=2 -C . M=drivers/net/ethernet/intel/ice/
make: Entering directory '/home/mfijalko/bpf-next'
  CHECK   drivers/net/ethernet/intel/ice/ice_main.c
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./arch/x86/include/asm/bitops.h:66:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:30:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:31:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:33:10: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:33:16: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:28:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:26:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./arch/x86/include/asm/bitops.h:92:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:39:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:40:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:42:10: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:42:16: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:37:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:42:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./arch/x86/include/asm/bitops.h:117:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:57:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:58:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:60:10: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:60:15: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:55:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:58:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./arch/x86/include/asm/bitops.h:150:9: warning: unreplaced symbol 'oldbit'
./arch/x86/include/asm/bitops.h:154:26: warning: unreplaced symbol 'oldbit'
./arch/x86/include/asm/bitops.h:156:16: warning: unreplaced symbol 'oldbit'
./arch/x86/include/asm/bitops.h:156:9: warning: unreplaced symbol 'return'
./arch/x86/include/asm/bitops.h:148:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:75:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:76:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:77:9: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:79:10: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:79:14: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:79:20: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:80:17: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:80:23: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:80:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/generic-non-atomic.h:73:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:100:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/instrumented-non-atomic.h:97:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:95:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:96:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:97:9: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:99:10: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:99:14: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:99:21: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:100:17: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:100:23: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:100:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/generic-non-atomic.h:93:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:115:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/instrumented-non-atomic.h:112:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./arch/x86/include/asm/bitops.h:188:9: warning: unreplaced symbol 'oldbit'
./arch/x86/include/asm/bitops.h:192:35: warning: unreplaced symbol 'oldbit'
./arch/x86/include/asm/bitops.h:195:16: warning: unreplaced symbol 'oldbit'
./arch/x86/include/asm/bitops.h:195:9: warning: unreplaced symbol 'return'
./arch/x86/include/asm/bitops.h:186:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:107:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:108:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:109:9: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:111:10: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:111:14: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:111:20: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:112:17: warning: unreplaced symbol 'old'
./include/asm-generic/bitops/generic-non-atomic.h:112:23: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:112:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/generic-non-atomic.h:105:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:130:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/instrumented-non-atomic.h:127:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./arch/x86/include/asm/bitops.h:239:9: warning: unreplaced symbol 'return'
./arch/x86/include/asm/bitops.h:237:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/generic-non-atomic.h:128:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/generic-non-atomic.h:121:1: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/generic-non-atomic.h:168:9: warning: unreplaced symbol 'p'
./include/asm-generic/bitops/generic-non-atomic.h:169:9: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:170:9: warning: unreplaced symbol 'val'
./include/asm-generic/bitops/generic-non-atomic.h:172:19: warning: unreplaced symbol 'val'
./include/asm-generic/bitops/generic-non-atomic.h:172:25: warning: unreplaced symbol 'mask'
./include/asm-generic/bitops/generic-non-atomic.h:172:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/generic-non-atomic.h:166:1: warning: unreplaced symbol 'return'
drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
./include/asm-generic/bitops/instrumented-non-atomic.h:142:9: warning: unreplaced symbol 'return'
./include/asm-generic/bitops/instrumented-non-atomic.h:139:1: warning: unreplaced symbol 'return'

that's for a single file, there's no point in including same output for
every other file being checked.

Thanks,
Maciej

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops
  2023-01-02 16:14   ` Maciej Fijalkowski
@ 2023-01-02 16:30     ` Alexander Lobakin
  2023-01-02 17:24       ` Andy Shevchenko
  0 siblings, 1 reply; 22+ messages in thread
From: Alexander Lobakin @ 2023-01-02 16:30 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: Alexander Lobakin, Arnd Bergmann, Yury Norov, Andy Shevchenko,
	Mark Rutland, Matt Turner, Brian Cain, Geert Uytterhoeven,
	Yoshinori Sato, Rich Felker, David S. Miller, Kees Cook,
	Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Jesse Brandeburg,
	Greg Kroah-Hartman, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	kernel test robot, linux-alpha, linux-hexagon, linux-ia64,
	linux-m68k, linux-sh, sparclinux, linux-arch, llvm, linux-kernel

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Mon, 2 Jan 2023 17:14:31 +0100

> On Fri, Jun 24, 2022 at 02:13:06PM +0200, Alexander Lobakin wrote:
> > Move generic non-atomic bitops from the asm-generic header which
> > gets included only when there are no architecture-specific
> > alternatives, to a separate independent file to make them always
> > available.
> > Almost no actual code changes, only one comment added to
> > generic_test_bit() saying that it's an atomic operation itself
> > and thus `volatile` must always stay there with no cast-aways.
> > 
> > Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> # comment
> > Suggested-by: Marco Elver <elver@google.com> # reference to kernel-doc
> > Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
> > Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > Reviewed-by: Marco Elver <elver@google.com>
> > ---
> >  .../asm-generic/bitops/generic-non-atomic.h   | 130 ++++++++++++++++++
> >  include/asm-generic/bitops/non-atomic.h       | 110 ++-------------
> >  2 files changed, 138 insertions(+), 102 deletions(-)
> >  create mode 100644 include/asm-generic/bitops/generic-non-atomic.h
> > 
> 
> Hi,
> 
> this patch gives me a headache when trying to run sparse against a module.
> 
> Olek please help :D

It was fixed shortly after the build bots turned on on the original
series with [0]. Hovewer, no release tag's been made after the fix.
There's also a short discussion regarding packaging Sparse 0.6.4 for
Debian with that fix cherry-picked[1], not sure if it led anywhere.

> 
> $ sudo make C=2 -C . M=drivers/net/ethernet/intel/ice/
> make: Entering directory '/home/mfijalko/bpf-next'
>   CHECK   drivers/net/ethernet/intel/ice/ice_main.c
> drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
> ./arch/x86/include/asm/bitops.h:66:1: warning: unreplaced symbol 'return'

[...]

> drivers/net/ethernet/intel/ice/ice_main.c: note: in included file (through arch/x86/include/asm/bitops.h, include/linux/bitops.h, include/linux/kernel.h, drivers/net/ethernet/intel/ice/ice.h):
> ./include/asm-generic/bitops/instrumented-non-atomic.h:142:9: warning: unreplaced symbol 'return'
> ./include/asm-generic/bitops/instrumented-non-atomic.h:139:1: warning: unreplaced symbol 'return'
> 
> that's for a single file, there's no point in including same output for
> every other file being checked.
> 
> Thanks,
> Maciej

[0] https://git.kernel.org/pub/scm/devel/sparse/sparse.git/commit/?id=0e1aae55e49cad7ea43848af5b58ff0f57e7af99
[1] https://lore.kernel.org/all/Yr7kPM1wLZnOqxOA@smile.fi.intel.com

Thanks,
Olek

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops
  2023-01-02 16:30     ` Alexander Lobakin
@ 2023-01-02 17:24       ` Andy Shevchenko
  0 siblings, 0 replies; 22+ messages in thread
From: Andy Shevchenko @ 2023-01-02 17:24 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: Maciej Fijalkowski, Arnd Bergmann, Yury Norov, Mark Rutland,
	Matt Turner, Brian Cain, Geert Uytterhoeven, Yoshinori Sato,
	Rich Felker, David S. Miller, Kees Cook, Peter Zijlstra (Intel),
	Marco Elver, Borislav Petkov, Tony Luck, Jesse Brandeburg,
	Greg Kroah-Hartman, Nathan Chancellor, Nick Desaulniers, Tom Rix,
	kernel test robot, linux-alpha, linux-hexagon, linux-ia64,
	linux-m68k, linux-sh, sparclinux, linux-arch, llvm, linux-kernel

On Mon, Jan 02, 2023 at 05:30:59PM +0100, Alexander Lobakin wrote:
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date: Mon, 2 Jan 2023 17:14:31 +0100
> > On Fri, Jun 24, 2022 at 02:13:06PM +0200, Alexander Lobakin wrote:

> > this patch gives me a headache when trying to run sparse against a module.

No, it's not related to this patch.

> > Olek please help :D
> 
> It was fixed shortly after the build bots turned on on the original
> series with [0]. Hovewer, no release tag's been made after the fix.
> There's also a short discussion regarding packaging Sparse 0.6.4 for
> Debian with that fix cherry-picked[1], not sure if it led anywhere.

Debian already fixed that for a few weeks at least.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-01-02 17:24 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-24 12:13 [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 1/9] ia64, processor: fix -Wincompatible-pointer-types in ia64_get_irr() Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 2/9] bitops: always define asm-generic non-atomic bitops Alexander Lobakin
2023-01-02 16:14   ` Maciej Fijalkowski
2023-01-02 16:30     ` Alexander Lobakin
2023-01-02 17:24       ` Andy Shevchenko
2022-06-24 12:13 ` [PATCH v5 3/9] bitops: unify non-atomic bitops prototypes across architectures Alexander Lobakin
2022-07-06 10:09   ` Geert Uytterhoeven
2022-06-24 12:13 ` [PATCH v5 4/9] bitops: define const_*() versions of the non-atomics Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 5/9] bitops: wrap non-atomic bitops with a transparent macro Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 6/9] bitops: let optimize out non-atomic bitops on compile-time constants Alexander Lobakin
2022-07-15  0:04   ` Guenter Roeck
2022-07-15 13:26     ` Alexander Lobakin
2022-07-15 13:49       ` Guenter Roeck
2022-07-15 14:19         ` Yury Norov
2022-07-15 14:50           ` Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 7/9] net/ice: fix initializing the bitmap in the switch code Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 8/9] bitmap: don't assume compiler evaluates small mem*() builtins calls Alexander Lobakin
2022-06-24 12:13 ` [PATCH v5 9/9] lib: test_bitmap: add compile-time optimization/evaluations assertions Alexander Lobakin
2022-06-24 12:51 ` [PATCH v5 0/9] bitops: let optimize out non-atomic bitops on compile-time constants Borislav Petkov
2022-06-30 16:56 ` Alexander Lobakin
2022-07-01  2:58   ` Yury Norov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).