linux-sh.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps
@ 2021-03-16  1:54 Yury Norov
  2021-03-16  1:54 ` [PATCH 01/13] tools: disable -Wno-type-limits Yury Norov
                   ` (12 more replies)
  0 siblings, 13 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Bitmap operations are much simpler and faster in case of small bitmaps
which fit into a single word. In linux/bitmap.c we have a machinery that
allows compiler to replace actual function call with a few instructions
if bitmaps passed into the function are small and their size is known at
compile time.

find_*_bit() API lacks this functionality; but users will benefit from it
a lot. One important example is cpumask subsystem when
NR_CPUS <= BITS_PER_LONG.

v1: https://www.spinics.net/lists/kernel/msg3804727.html
v2: https://www.spinics.net/lists/linux-m68k/msg16945.html
v3: https://www.spinics.net/lists/kernel/msg3837020.html
v4: - move le.h header together with find.h for m68 and sh;
    - preserve small_const_nbits() macro;
    - drop FAST_PATH config option as this series doesn't increase .text,
      instead, it compacts it;
    - add Andy and Rasmus as reviewers of BITMAP API.

Yury Norov (13):
  tools: disable -Wno-type-limits
  tools: bitmap: sync function declarations with the kernel
  arch: rearrange headers inclusion order in asm/bitops for m68k and sh
  lib: introduce BITS_{FIRST,LAST} macro
  tools: sync BITS_MASK macros with the kernel
  lib: extend the scope of small_const_nbits() macro
  tools: sync small_const_nbits() macro with the kernel
  lib: inline _find_next_bit() wrappers
  tools: sync find_next_bit implementation
  lib: add fast path for find_next_*_bit()
  lib: add fast path for find_first_*_bit() and find_last_bit()
  tools: sync lib/find_bit implementation
  MAINTAINERS: Add entry for the bitmap API

 MAINTAINERS                             |  16 ++++
 arch/m68k/include/asm/bitops.h          |   6 +-
 arch/sh/include/asm/bitops.h            |   5 +-
 include/asm-generic/bitops/find.h       | 108 +++++++++++++++++++++---
 include/asm-generic/bitops/le.h         |  38 ++++++++-
 include/asm-generic/bitsperlong.h       |   9 ++
 include/linux/bitmap.h                  |  30 +++----
 include/linux/bitops.h                  |  12 ---
 include/linux/bits.h                    |   6 ++
 include/linux/cpumask.h                 |   8 +-
 include/linux/netdev_features.h         |   2 +-
 include/linux/nodemask.h                |   2 +-
 lib/bitmap.c                            |  26 +++---
 lib/find_bit.c                          |  72 +++-------------
 lib/genalloc.c                          |   8 +-
 tools/include/asm-generic/bitops/find.h |  85 +++++++++++++++++--
 tools/include/asm-generic/bitsperlong.h |   3 +
 tools/include/linux/bitmap.h            |  31 +++----
 tools/include/linux/bits.h              |   6 ++
 tools/lib/bitmap.c                      |  10 +--
 tools/lib/find_bit.c                    |  56 +++++-------
 tools/scripts/Makefile.include          |   1 +
 tools/testing/radix-tree/bitmap.c       |   4 +-
 23 files changed, 340 insertions(+), 204 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 01/13] tools: disable -Wno-type-limits
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  8:17   ` Rasmus Villemoes
  2021-03-16  1:54 ` [PATCH 02/13] tools: bitmap: sync function declarations with the kernel Yury Norov
                   ` (11 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

GENMASK(h, l) may be passed with unsigned types. In such case, type-limits
warning is generated for example in case of GENMASK(h, 0).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/scripts/Makefile.include | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
index 84dbf61a7eca..15e99905cb7d 100644
--- a/tools/scripts/Makefile.include
+++ b/tools/scripts/Makefile.include
@@ -38,6 +38,7 @@ EXTRA_WARNINGS += -Wswitch-enum
 EXTRA_WARNINGS += -Wundef
 EXTRA_WARNINGS += -Wwrite-strings
 EXTRA_WARNINGS += -Wformat
+EXTRA_WARNINGS += -Wno-type-limits
 
 CC_NO_CLANG := $(shell $(CC) -dM -E -x c /dev/null | grep -Fq "__clang__"; echo $$?)
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 02/13] tools: bitmap: sync function declarations with the kernel
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
  2021-03-16  1:54 ` [PATCH 01/13] tools: disable -Wno-type-limits Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  8:18   ` Rasmus Villemoes
  2021-03-16  1:54 ` [PATCH 03/13] arch: rearrange headers inclusion order in asm/bitops for m68k and sh Yury Norov
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Some functions in tools/include/linux/bitmap.h declare nbits as int. In the
kernel nbits is declared as unsigned int.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/linux/bitmap.h | 8 ++++----
 tools/lib/bitmap.c           | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index 477a1cae513f..7cbd23e56d48 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -30,7 +30,7 @@ void bitmap_clear(unsigned long *map, unsigned int start, int len);
 #define small_const_nbits(nbits) \
 	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
 
-static inline void bitmap_zero(unsigned long *dst, int nbits)
+static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		*dst = 0UL;
@@ -66,7 +66,7 @@ static inline int bitmap_full(const unsigned long *src, unsigned int nbits)
 	return find_first_zero_bit(src, nbits) == nbits;
 }
 
-static inline int bitmap_weight(const unsigned long *src, int nbits)
+static inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
@@ -74,7 +74,7 @@ static inline int bitmap_weight(const unsigned long *src, int nbits)
 }
 
 static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
-			     const unsigned long *src2, int nbits)
+			     const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
 		*dst = *src1 | *src2;
@@ -141,7 +141,7 @@ static inline void bitmap_free(unsigned long *bitmap)
  * @buf: buffer to store output
  * @size: size of @buf
  */
-size_t bitmap_scnprintf(unsigned long *bitmap, int nbits,
+size_t bitmap_scnprintf(unsigned long *bitmap, unsigned int nbits,
 			char *buf, size_t size);
 
 /**
diff --git a/tools/lib/bitmap.c b/tools/lib/bitmap.c
index 5043747ef6c5..f4e914712b6f 100644
--- a/tools/lib/bitmap.c
+++ b/tools/lib/bitmap.c
@@ -28,11 +28,11 @@ void __bitmap_or(unsigned long *dst, const unsigned long *bitmap1,
 		dst[k] = bitmap1[k] | bitmap2[k];
 }
 
-size_t bitmap_scnprintf(unsigned long *bitmap, int nbits,
+size_t bitmap_scnprintf(unsigned long *bitmap, unsigned int nbits,
 			char *buf, size_t size)
 {
 	/* current bit is 'cur', most recently seen range is [rbot, rtop] */
-	int cur, rbot, rtop;
+	unsigned int cur, rbot, rtop;
 	bool first = true;
 	size_t ret = 0;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 03/13] arch: rearrange headers inclusion order in asm/bitops for m68k and sh
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
  2021-03-16  1:54 ` [PATCH 01/13] tools: disable -Wno-type-limits Yury Norov
  2021-03-16  1:54 ` [PATCH 02/13] tools: bitmap: sync function declarations with the kernel Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro Yury Norov
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

m68k and sh include bitmap/{find,le}.h prior to ffs/fls headers. New
fast-path implementation in find.h requires ffs/fls. Reordering the
headers inclusion sequence helps to prevent compile-time implicit
function declaration error.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
 arch/m68k/include/asm/bitops.h | 6 +++---
 arch/sh/include/asm/bitops.h   | 5 +++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/m68k/include/asm/bitops.h b/arch/m68k/include/asm/bitops.h
index 10133a968c8e..7b414099e5fc 100644
--- a/arch/m68k/include/asm/bitops.h
+++ b/arch/m68k/include/asm/bitops.h
@@ -440,8 +440,6 @@ static inline unsigned long ffz(unsigned long word)
 
 #endif
 
-#include <asm-generic/bitops/find.h>
-
 #ifdef __KERNEL__
 
 #if defined(CONFIG_CPU_HAS_NO_BITFIELDS)
@@ -525,10 +523,12 @@ static inline int __fls(int x)
 #define __clear_bit_unlock	clear_bit_unlock
 
 #include <asm-generic/bitops/ext2-atomic.h>
-#include <asm-generic/bitops/le.h>
 #include <asm-generic/bitops/fls64.h>
 #include <asm-generic/bitops/sched.h>
 #include <asm-generic/bitops/hweight.h>
+#include <asm-generic/bitops/le.h>
 #endif /* __KERNEL__ */
 
+#include <asm-generic/bitops/find.h>
+
 #endif /* _M68K_BITOPS_H */
diff --git a/arch/sh/include/asm/bitops.h b/arch/sh/include/asm/bitops.h
index 450b5854d7c6..3b6c7b5b7ec9 100644
--- a/arch/sh/include/asm/bitops.h
+++ b/arch/sh/include/asm/bitops.h
@@ -58,15 +58,16 @@ static inline unsigned long __ffs(unsigned long word)
 	return result;
 }
 
-#include <asm-generic/bitops/find.h>
 #include <asm-generic/bitops/ffs.h>
 #include <asm-generic/bitops/hweight.h>
 #include <asm-generic/bitops/lock.h>
 #include <asm-generic/bitops/sched.h>
-#include <asm-generic/bitops/le.h>
 #include <asm-generic/bitops/ext2-atomic.h>
 #include <asm-generic/bitops/fls.h>
 #include <asm-generic/bitops/__fls.h>
 #include <asm-generic/bitops/fls64.h>
 
+#include <asm-generic/bitops/le.h>
+#include <asm-generic/bitops/find.h>
+
 #endif /* __ASM_SH_BITOPS_H */
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (2 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 03/13] arch: rearrange headers inclusion order in asm/bitops for m68k and sh Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  8:35   ` Rasmus Villemoes
  2021-03-16  1:54 ` [PATCH 05/13] tools: sync BITS_MASK macros with the kernel Yury Norov
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

BITMAP_{LAST,FIRST}_WORD_MASK() in linux/bitmap.h duplicates the
functionality of GENMASK(). The scope of BITMAP* macros is wider
than just bitmaps. This patch defines 4 new macros: BITS_FIRST(),
BITS_LAST(), BITS_FIRST_MASK() and BITS_LAST_MASK() in linux/bits.h
on top of GENMASK() and replaces BITMAP_{LAST,FIRST}_WORD_MASK()
to avoid duplication and increase the scope of the macros.

This change doesn't affect code generation. On ARM64:
scripts/bloat-o-meter vmlinux.before vmlinux
add/remove: 1/2 grow/shrink: 2/0 up/down: 17/-16 (1)
Function                                     old     new   delta
ethtool_get_drvinfo                          900     908      +8
e843419@0cf2_0001309d_7f0                      -       8      +8
vermagic                                      48      49      +1
e843419@0d45_000138bb_f68                      8       -      -8
e843419@0cc9_00012bce_198c                     8       -      -8
Total: Before=26092016, After=26092017, chg +0.00%

The compilerr is:
	aarch64-linux-gnu-gcc (Linaro GCC 7.3-2018.05) 7.3.1 20180425

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/bitmap.h          | 27 ++++++++++++---------------
 include/linux/bits.h            |  6 ++++++
 include/linux/cpumask.h         |  8 ++++----
 include/linux/netdev_features.h |  2 +-
 include/linux/nodemask.h        |  2 +-
 lib/bitmap.c                    | 26 +++++++++++++-------------
 lib/find_bit.c                  |  4 ++--
 lib/genalloc.c                  |  8 ++++----
 8 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 70a932470b2d..adf7bd9f0467 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -219,9 +219,6 @@ extern unsigned int bitmap_ord_to_pos(const unsigned long *bitmap, unsigned int
 extern int bitmap_print_to_pagebuf(bool list, char *buf,
 				   const unsigned long *maskp, int nmaskbits);
 
-#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
-#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
-
 /*
  * The static inlines below do not handle constant nbits==0 correctly,
  * so make such users (should any ever turn up) call the out-of-line
@@ -257,7 +254,7 @@ static inline void bitmap_copy_clear_tail(unsigned long *dst,
 {
 	bitmap_copy(dst, src, nbits);
 	if (nbits % BITS_PER_LONG)
-		dst[nbits / BITS_PER_LONG] &= BITMAP_LAST_WORD_MASK(nbits);
+		dst[nbits / BITS_PER_LONG] &= BITS_FIRST_MASK(nbits - 1);
 }
 
 /*
@@ -282,7 +279,7 @@ static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return (*dst = *src1 & *src2 & BITMAP_LAST_WORD_MASK(nbits)) != 0;
+		return (*dst = *src1 & *src2 & BITS_FIRST(nbits - 1)) != 0;
 	return __bitmap_and(dst, src1, src2, nbits);
 }
 
@@ -308,7 +305,7 @@ static inline int bitmap_andnot(unsigned long *dst, const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return (*dst = *src1 & ~(*src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
+		return (*dst = *src1 & ~(*src2) & BITS_FIRST(nbits - 1)) != 0;
 	return __bitmap_andnot(dst, src1, src2, nbits);
 }
 
@@ -332,7 +329,7 @@ static inline int bitmap_equal(const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+		return !((*src1 ^ *src2) & BITS_FIRST(nbits - 1));
 	if (__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
 	    IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
 		return !memcmp(src1, src2, nbits / 8);
@@ -356,14 +353,14 @@ static inline bool bitmap_or_equal(const unsigned long *src1,
 	if (!small_const_nbits(nbits))
 		return __bitmap_or_equal(src1, src2, src3, nbits);
 
-	return !(((*src1 | *src2) ^ *src3) & BITMAP_LAST_WORD_MASK(nbits));
+	return !(((*src1 | *src2) ^ *src3) & BITS_FIRST(nbits - 1));
 }
 
 static inline int bitmap_intersects(const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return ((*src1 & *src2) & BITMAP_LAST_WORD_MASK(nbits)) != 0;
+		return ((*src1 & *src2) & BITS_FIRST(nbits - 1)) != 0;
 	else
 		return __bitmap_intersects(src1, src2, nbits);
 }
@@ -372,7 +369,7 @@ static inline int bitmap_subset(const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return ! ((*src1 & ~(*src2)) & BITMAP_LAST_WORD_MASK(nbits));
+		return !((*src1 & ~(*src2)) & BITS_FIRST(nbits - 1));
 	else
 		return __bitmap_subset(src1, src2, nbits);
 }
@@ -380,7 +377,7 @@ static inline int bitmap_subset(const unsigned long *src1,
 static inline bool bitmap_empty(const unsigned long *src, unsigned nbits)
 {
 	if (small_const_nbits(nbits))
-		return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
+		return !(*src & BITS_FIRST(nbits - 1));
 
 	return find_first_bit(src, nbits) == nbits;
 }
@@ -388,7 +385,7 @@ static inline bool bitmap_empty(const unsigned long *src, unsigned nbits)
 static inline bool bitmap_full(const unsigned long *src, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
+		return !(~(*src) & BITS_FIRST(nbits - 1));
 
 	return find_first_zero_bit(src, nbits) == nbits;
 }
@@ -396,7 +393,7 @@ static inline bool bitmap_full(const unsigned long *src, unsigned int nbits)
 static __always_inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
+		return hweight_long(*src & BITS_FIRST(nbits - 1));
 	return __bitmap_weight(src, nbits);
 }
 
@@ -432,7 +429,7 @@ static inline void bitmap_shift_right(unsigned long *dst, const unsigned long *s
 				unsigned int shift, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		*dst = (*src & BITMAP_LAST_WORD_MASK(nbits)) >> shift;
+		*dst = (*src & BITS_FIRST(nbits - 1)) >> shift;
 	else
 		__bitmap_shift_right(dst, src, shift, nbits);
 }
@@ -441,7 +438,7 @@ static inline void bitmap_shift_left(unsigned long *dst, const unsigned long *sr
 				unsigned int shift, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		*dst = (*src << shift) & BITMAP_LAST_WORD_MASK(nbits);
+		*dst = (*src << shift) & BITS_FIRST(nbits - 1);
 	else
 		__bitmap_shift_left(dst, src, shift, nbits);
 }
diff --git a/include/linux/bits.h b/include/linux/bits.h
index 7f475d59a097..8c191c29506e 100644
--- a/include/linux/bits.h
+++ b/include/linux/bits.h
@@ -37,6 +37,12 @@
 #define GENMASK(h, l) \
 	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 
+#define BITS_FIRST(nr)		GENMASK((nr), 0)
+#define BITS_LAST(nr)		GENMASK(BITS_PER_LONG - 1, (nr))
+
+#define BITS_FIRST_MASK(nr)	BITS_FIRST((nr) % BITS_PER_LONG)
+#define BITS_LAST_MASK(nr)	BITS_LAST((nr) % BITS_PER_LONG)
+
 #define __GENMASK_ULL(h, l) \
 	(((~ULL(0)) - (ULL(1) << (l)) + 1) & \
 	 (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index c53364c4296d..57d7cdc22eca 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -899,7 +899,7 @@ static inline const struct cpumask *get_cpu_mask(unsigned int cpu)
 #if NR_CPUS <= BITS_PER_LONG
 #define CPU_BITS_ALL						\
 {								\
-	[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS)	\
+	[BITS_TO_LONGS(NR_CPUS)-1] = BITS_FIRST_MASK(NR_CPUS - 1)	\
 }
 
 #else /* NR_CPUS > BITS_PER_LONG */
@@ -907,7 +907,7 @@ static inline const struct cpumask *get_cpu_mask(unsigned int cpu)
 #define CPU_BITS_ALL						\
 {								\
 	[0 ... BITS_TO_LONGS(NR_CPUS)-2] = ~0UL,		\
-	[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS)	\
+	[BITS_TO_LONGS(NR_CPUS)-1] = BITS_FIRST_MASK(NR_CPUS - 1)	\
 }
 #endif /* NR_CPUS > BITS_PER_LONG */
 
@@ -931,13 +931,13 @@ cpumap_print_to_pagebuf(bool list, char *buf, const struct cpumask *mask)
 #if NR_CPUS <= BITS_PER_LONG
 #define CPU_MASK_ALL							\
 (cpumask_t) { {								\
-	[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS)	\
+	[BITS_TO_LONGS(NR_CPUS)-1] = BITS_FIRST_MASK(NR_CPUS - 1)	\
 } }
 #else
 #define CPU_MASK_ALL							\
 (cpumask_t) { {								\
 	[0 ... BITS_TO_LONGS(NR_CPUS)-2] = ~0UL,			\
-	[BITS_TO_LONGS(NR_CPUS)-1] = BITMAP_LAST_WORD_MASK(NR_CPUS)	\
+	[BITS_TO_LONGS(NR_CPUS)-1] = BITS_FIRST_MASK(NR_CPUS - 1)	\
 } }
 #endif /* NR_CPUS > BITS_PER_LONG */
 
diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 3de38d6a0aea..4cef7fe02f09 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -173,7 +173,7 @@ enum {
  */
 static inline int find_next_netdev_feature(u64 feature, unsigned long start)
 {
-	/* like BITMAP_LAST_WORD_MASK() for u64
+	/* like BITS_FIRST_MASK() for u64
 	 * this sets the most significant 64 - start to 0.
 	 */
 	feature &= ~0ULL >> (-start & ((sizeof(feature) * 8) - 1));
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index ac398e143c9a..2df0787c9155 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -302,7 +302,7 @@ static inline int __first_unset_node(const nodemask_t *maskp)
 			find_first_zero_bit(maskp->bits, MAX_NUMNODES));
 }
 
-#define NODE_MASK_LAST_WORD BITMAP_LAST_WORD_MASK(MAX_NUMNODES)
+#define NODE_MASK_LAST_WORD BITS_FIRST_MASK(MAX_NUMNODES - 1)
 
 #if MAX_NUMNODES <= BITS_PER_LONG
 
diff --git a/lib/bitmap.c b/lib/bitmap.c
index 9f4626a4c95f..549b184a3bae 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -52,7 +52,7 @@ int __bitmap_equal(const unsigned long *bitmap1,
 			return 0;
 
 	if (bits % BITS_PER_LONG)
-		if ((bitmap1[k] ^ bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+		if ((bitmap1[k] ^ bitmap2[k]) & BITS_FIRST_MASK(bits - 1))
 			return 0;
 
 	return 1;
@@ -76,7 +76,7 @@ bool __bitmap_or_equal(const unsigned long *bitmap1,
 		return true;
 
 	tmp = (bitmap1[k] | bitmap2[k]) ^ bitmap3[k];
-	return (tmp & BITMAP_LAST_WORD_MASK(bits)) == 0;
+	return (tmp & BITS_FIRST_MASK(bits - 1)) == 0;
 }
 
 void __bitmap_complement(unsigned long *dst, const unsigned long *src, unsigned int bits)
@@ -103,7 +103,7 @@ void __bitmap_shift_right(unsigned long *dst, const unsigned long *src,
 {
 	unsigned k, lim = BITS_TO_LONGS(nbits);
 	unsigned off = shift/BITS_PER_LONG, rem = shift % BITS_PER_LONG;
-	unsigned long mask = BITMAP_LAST_WORD_MASK(nbits);
+	unsigned long mask = BITS_FIRST_MASK(nbits - 1);
 	for (k = 0; off + k < lim; ++k) {
 		unsigned long upper, lower;
 
@@ -246,7 +246,7 @@ int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
 		result |= (dst[k] = bitmap1[k] & bitmap2[k]);
 	if (bits % BITS_PER_LONG)
 		result |= (dst[k] = bitmap1[k] & bitmap2[k] &
-			   BITMAP_LAST_WORD_MASK(bits));
+			   BITS_FIRST_MASK(bits - 1));
 	return result != 0;
 }
 EXPORT_SYMBOL(__bitmap_and);
@@ -284,7 +284,7 @@ int __bitmap_andnot(unsigned long *dst, const unsigned long *bitmap1,
 		result |= (dst[k] = bitmap1[k] & ~bitmap2[k]);
 	if (bits % BITS_PER_LONG)
 		result |= (dst[k] = bitmap1[k] & ~bitmap2[k] &
-			   BITMAP_LAST_WORD_MASK(bits));
+			   BITS_FIRST_MASK(bits - 1));
 	return result != 0;
 }
 EXPORT_SYMBOL(__bitmap_andnot);
@@ -310,7 +310,7 @@ int __bitmap_intersects(const unsigned long *bitmap1,
 			return 1;
 
 	if (bits % BITS_PER_LONG)
-		if ((bitmap1[k] & bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+		if ((bitmap1[k] & bitmap2[k]) & BITS_FIRST_MASK(bits - 1))
 			return 1;
 	return 0;
 }
@@ -325,7 +325,7 @@ int __bitmap_subset(const unsigned long *bitmap1,
 			return 0;
 
 	if (bits % BITS_PER_LONG)
-		if ((bitmap1[k] & ~bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+		if ((bitmap1[k] & ~bitmap2[k]) & BITS_FIRST_MASK(bits - 1))
 			return 0;
 	return 1;
 }
@@ -340,7 +340,7 @@ int __bitmap_weight(const unsigned long *bitmap, unsigned int bits)
 		w += hweight_long(bitmap[k]);
 
 	if (bits % BITS_PER_LONG)
-		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+		w += hweight_long(bitmap[k] & BITS_FIRST_MASK(bits - 1));
 
 	return w;
 }
@@ -351,7 +351,7 @@ void __bitmap_set(unsigned long *map, unsigned int start, int len)
 	unsigned long *p = map + BIT_WORD(start);
 	const unsigned int size = start + len;
 	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
-	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+	unsigned long mask_to_set = BITS_LAST_MASK(start);
 
 	while (len - bits_to_set >= 0) {
 		*p |= mask_to_set;
@@ -361,7 +361,7 @@ void __bitmap_set(unsigned long *map, unsigned int start, int len)
 		p++;
 	}
 	if (len) {
-		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		mask_to_set &= BITS_FIRST_MASK(size - 1);
 		*p |= mask_to_set;
 	}
 }
@@ -372,7 +372,7 @@ void __bitmap_clear(unsigned long *map, unsigned int start, int len)
 	unsigned long *p = map + BIT_WORD(start);
 	const unsigned int size = start + len;
 	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
-	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+	unsigned long mask_to_clear = BITS_LAST_MASK(start);
 
 	while (len - bits_to_clear >= 0) {
 		*p &= ~mask_to_clear;
@@ -382,7 +382,7 @@ void __bitmap_clear(unsigned long *map, unsigned int start, int len)
 		p++;
 	}
 	if (len) {
-		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		mask_to_clear &= BITS_FIRST_MASK(size - 1);
 		*p &= ~mask_to_clear;
 	}
 }
@@ -1291,7 +1291,7 @@ void bitmap_from_arr32(unsigned long *bitmap, const u32 *buf, unsigned int nbits
 
 	/* Clear tail bits in last word beyond nbits. */
 	if (nbits % BITS_PER_LONG)
-		bitmap[(halfwords - 1) / 2] &= BITMAP_LAST_WORD_MASK(nbits);
+		bitmap[(halfwords - 1) / 2] &= BITS_FIRST_MASK(nbits - 1);
 }
 EXPORT_SYMBOL(bitmap_from_arr32);
 
diff --git a/lib/find_bit.c b/lib/find_bit.c
index f67f86fd2f62..8c2a71a18793 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -44,7 +44,7 @@ static unsigned long _find_next_bit(const unsigned long *addr1,
 	tmp ^= invert;
 
 	/* Handle 1st word. */
-	mask = BITMAP_FIRST_WORD_MASK(start);
+	mask = BITS_LAST_MASK(start);
 	if (le)
 		mask = swab(mask);
 
@@ -141,7 +141,7 @@ EXPORT_SYMBOL(find_first_zero_bit);
 unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
 {
 	if (size) {
-		unsigned long val = BITMAP_LAST_WORD_MASK(size);
+		unsigned long val = BITS_FIRST_MASK(size - 1);
 		unsigned long idx = (size-1) / BITS_PER_LONG;
 
 		do {
diff --git a/lib/genalloc.c b/lib/genalloc.c
index 5dcf9cdcbc46..0af7275517ff 100644
--- a/lib/genalloc.c
+++ b/lib/genalloc.c
@@ -87,7 +87,7 @@ bitmap_set_ll(unsigned long *map, unsigned long start, unsigned long nr)
 	unsigned long *p = map + BIT_WORD(start);
 	const unsigned long size = start + nr;
 	int bits_to_set = BITS_PER_LONG - (start % BITS_PER_LONG);
-	unsigned long mask_to_set = BITMAP_FIRST_WORD_MASK(start);
+	unsigned long mask_to_set = BITS_LAST_MASK(start);
 
 	while (nr >= bits_to_set) {
 		if (set_bits_ll(p, mask_to_set))
@@ -98,7 +98,7 @@ bitmap_set_ll(unsigned long *map, unsigned long start, unsigned long nr)
 		p++;
 	}
 	if (nr) {
-		mask_to_set &= BITMAP_LAST_WORD_MASK(size);
+		mask_to_set &= BITS_FIRST_MASK(size - 1);
 		if (set_bits_ll(p, mask_to_set))
 			return nr;
 	}
@@ -123,7 +123,7 @@ bitmap_clear_ll(unsigned long *map, unsigned long start, unsigned long nr)
 	unsigned long *p = map + BIT_WORD(start);
 	const unsigned long size = start + nr;
 	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
-	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+	unsigned long mask_to_clear = BITS_LAST_MASK(start);
 
 	while (nr >= bits_to_clear) {
 		if (clear_bits_ll(p, mask_to_clear))
@@ -134,7 +134,7 @@ bitmap_clear_ll(unsigned long *map, unsigned long start, unsigned long nr)
 		p++;
 	}
 	if (nr) {
-		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		mask_to_clear &= BITS_FIRST_MASK(size - 1);
 		if (clear_bits_ll(p, mask_to_clear))
 			return nr;
 	}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 05/13] tools: sync BITS_MASK macros with the kernel
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (3 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 06/13] lib: extend the scope of small_const_nbits() macro Yury Norov
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Remove BITMAP_{FIRST,LAST}_WORD_MASK and introduce
BITS_{FIRST,LAST}{,_MASK} as per kernel implementation.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/linux/bitmap.h      | 20 ++++++--------------
 tools/include/linux/bits.h        |  6 ++++++
 tools/lib/bitmap.c                |  6 +++---
 tools/lib/find_bit.c              |  2 +-
 tools/testing/radix-tree/bitmap.c |  4 ++--
 5 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index 7cbd23e56d48..b6e8430c8bc9 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -19,14 +19,6 @@ int __bitmap_equal(const unsigned long *bitmap1,
 		   const unsigned long *bitmap2, unsigned int bits);
 void bitmap_clear(unsigned long *map, unsigned int start, int len);
 
-#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
-
-#define BITMAP_LAST_WORD_MASK(nbits)					\
-(									\
-	((nbits) % BITS_PER_LONG) ?					\
-		(1UL<<((nbits) % BITS_PER_LONG))-1 : ~0UL		\
-)
-
 #define small_const_nbits(nbits) \
 	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
 
@@ -47,13 +39,13 @@ static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
 		unsigned int len = (nlongs - 1) * sizeof(unsigned long);
 		memset(dst, 0xff,  len);
 	}
-	dst[nlongs - 1] = BITMAP_LAST_WORD_MASK(nbits);
+	dst[nlongs - 1] = BITS_FIRST(nbits - 1);
 }
 
 static inline int bitmap_empty(const unsigned long *src, unsigned nbits)
 {
 	if (small_const_nbits(nbits))
-		return ! (*src & BITMAP_LAST_WORD_MASK(nbits));
+		return !(*src & BITS_FIRST(nbits - 1));
 
 	return find_first_bit(src, nbits) == nbits;
 }
@@ -61,7 +53,7 @@ static inline int bitmap_empty(const unsigned long *src, unsigned nbits)
 static inline int bitmap_full(const unsigned long *src, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return ! (~(*src) & BITMAP_LAST_WORD_MASK(nbits));
+		return !(~(*src) & BITS_FIRST(nbits - 1));
 
 	return find_first_zero_bit(src, nbits) == nbits;
 }
@@ -69,7 +61,7 @@ static inline int bitmap_full(const unsigned long *src, unsigned int nbits)
 static inline int bitmap_weight(const unsigned long *src, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
+		return hweight_long(*src & BITS_FIRST(nbits - 1));
 	return __bitmap_weight(src, nbits);
 }
 
@@ -155,7 +147,7 @@ static inline int bitmap_and(unsigned long *dst, const unsigned long *src1,
 			     const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return (*dst = *src1 & *src2 & BITMAP_LAST_WORD_MASK(nbits)) != 0;
+		return (*dst = *src1 & *src2 & BITS_FIRST(nbits - 1)) != 0;
 	return __bitmap_and(dst, src1, src2, nbits);
 }
 
@@ -171,7 +163,7 @@ static inline int bitmap_equal(const unsigned long *src1,
 			const unsigned long *src2, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-		return !((*src1 ^ *src2) & BITMAP_LAST_WORD_MASK(nbits));
+		return !((*src1 ^ *src2) & BITS_FIRST(nbits - 1));
 	if (__builtin_constant_p(nbits & BITMAP_MEM_MASK) &&
 	    IS_ALIGNED(nbits, BITMAP_MEM_ALIGNMENT))
 		return !memcmp(src1, src2, nbits / 8);
diff --git a/tools/include/linux/bits.h b/tools/include/linux/bits.h
index 7f475d59a097..8c191c29506e 100644
--- a/tools/include/linux/bits.h
+++ b/tools/include/linux/bits.h
@@ -37,6 +37,12 @@
 #define GENMASK(h, l) \
 	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
 
+#define BITS_FIRST(nr)		GENMASK((nr), 0)
+#define BITS_LAST(nr)		GENMASK(BITS_PER_LONG - 1, (nr))
+
+#define BITS_FIRST_MASK(nr)	BITS_FIRST((nr) % BITS_PER_LONG)
+#define BITS_LAST_MASK(nr)	BITS_LAST((nr) % BITS_PER_LONG)
+
 #define __GENMASK_ULL(h, l) \
 	(((~ULL(0)) - (ULL(1) << (l)) + 1) & \
 	 (~ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
diff --git a/tools/lib/bitmap.c b/tools/lib/bitmap.c
index f4e914712b6f..8cffad2d1f77 100644
--- a/tools/lib/bitmap.c
+++ b/tools/lib/bitmap.c
@@ -13,7 +13,7 @@ int __bitmap_weight(const unsigned long *bitmap, int bits)
 		w += hweight_long(bitmap[k]);
 
 	if (bits % BITS_PER_LONG)
-		w += hweight_long(bitmap[k] & BITMAP_LAST_WORD_MASK(bits));
+		w += hweight_long(bitmap[k] & BITS_FIRST_MASK(bits - 1));
 
 	return w;
 }
@@ -68,7 +68,7 @@ int __bitmap_and(unsigned long *dst, const unsigned long *bitmap1,
 		result |= (dst[k] = bitmap1[k] & bitmap2[k]);
 	if (bits % BITS_PER_LONG)
 		result |= (dst[k] = bitmap1[k] & bitmap2[k] &
-			   BITMAP_LAST_WORD_MASK(bits));
+			   BITS_FIRST_MASK(bits - 1));
 	return result != 0;
 }
 
@@ -81,7 +81,7 @@ int __bitmap_equal(const unsigned long *bitmap1,
 			return 0;
 
 	if (bits % BITS_PER_LONG)
-		if ((bitmap1[k] ^ bitmap2[k]) & BITMAP_LAST_WORD_MASK(bits))
+		if ((bitmap1[k] ^ bitmap2[k]) & BITS_FIRST_MASK(bits - 1))
 			return 0;
 
 	return 1;
diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
index ac37022e9486..49abb18549cc 100644
--- a/tools/lib/find_bit.c
+++ b/tools/lib/find_bit.c
@@ -43,7 +43,7 @@ static inline unsigned long _find_next_bit(const unsigned long *addr1,
 	tmp ^= invert;
 
 	/* Handle 1st word. */
-	tmp &= BITMAP_FIRST_WORD_MASK(start);
+	tmp &= BITS_LAST_MASK(start);
 	start = round_down(start, BITS_PER_LONG);
 
 	while (!tmp) {
diff --git a/tools/testing/radix-tree/bitmap.c b/tools/testing/radix-tree/bitmap.c
index 66ec4a24a203..aedc15461f78 100644
--- a/tools/testing/radix-tree/bitmap.c
+++ b/tools/testing/radix-tree/bitmap.c
@@ -7,7 +7,7 @@ void bitmap_clear(unsigned long *map, unsigned int start, int len)
 	unsigned long *p = map + BIT_WORD(start);
 	const unsigned int size = start + len;
 	int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
-	unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+	unsigned long mask_to_clear = BITS_LAST_MASK(start);
 
 	while (len - bits_to_clear >= 0) {
 		*p &= ~mask_to_clear;
@@ -17,7 +17,7 @@ void bitmap_clear(unsigned long *map, unsigned int start, int len)
 		p++;
 	}
 	if (len) {
-		mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+		mask_to_clear &= BITS_FIRST_MASK(size - 1);
 		*p &= ~mask_to_clear;
 	}
 }
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 06/13] lib: extend the scope of small_const_nbits() macro
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (4 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 05/13] tools: sync BITS_MASK macros with the kernel Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  8:56   ` Rasmus Villemoes
  2021-03-16  1:54 ` [PATCH 07/13] tools: sync small_const_nbits() macro with the kernel Yury Norov
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

find_bit would also benefit from small_const_nbits() optimizations.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/asm-generic/bitsperlong.h | 9 +++++++++
 include/linux/bitmap.h            | 3 ---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/bitsperlong.h b/include/asm-generic/bitsperlong.h
index 3905c1c93dc2..96032f4f908f 100644
--- a/include/asm-generic/bitsperlong.h
+++ b/include/asm-generic/bitsperlong.h
@@ -23,4 +23,13 @@
 #define BITS_PER_LONG_LONG 64
 #endif
 
+#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < BITS_PER_LONG)
+
+/*
+ * The valid number of bits for a bitmap to be small enough, or in other words,
+ * fit into a single machine word is 1 to BITS_PER_LONG inclusively. 0 is not a
+ * valid number for size, and most probably a sing of error.
+ */
+#define small_const_nbits(n) SMALL_CONST((n) - 1)
+
 #endif /* __ASM_GENERIC_BITS_PER_LONG */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index adf7bd9f0467..bc13a890ecc1 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
  * so make such users (should any ever turn up) call the out-of-line
  * versions.
  */
-#define small_const_nbits(nbits) \
-	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
-
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 07/13] tools: sync small_const_nbits() macro with the kernel
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (5 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 06/13] lib: extend the scope of small_const_nbits() macro Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 08/13] lib: inline _find_next_bit() wrappers Yury Norov
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Move the macro from tools/include/asm-generic/bitsperlong.h
to tools/include/linux/bitmap.h

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/asm-generic/bitsperlong.h | 3 +++
 tools/include/linux/bitmap.h            | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/include/asm-generic/bitsperlong.h b/tools/include/asm-generic/bitsperlong.h
index 8f2283052333..60b44bb4748b 100644
--- a/tools/include/asm-generic/bitsperlong.h
+++ b/tools/include/asm-generic/bitsperlong.h
@@ -18,4 +18,7 @@
 #define BITS_PER_LONG_LONG 64
 #endif
 
+#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < BITS_PER_LONG)
+#define small_const_nbits(n) SMALL_CONST((n) - 1)
+
 #endif /* __ASM_GENERIC_BITS_PER_LONG */
diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index b6e8430c8bc9..433997b89565 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -19,9 +19,6 @@ int __bitmap_equal(const unsigned long *bitmap1,
 		   const unsigned long *bitmap2, unsigned int bits);
 void bitmap_clear(unsigned long *map, unsigned int start, int len);
 
-#define small_const_nbits(nbits) \
-	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG)
-
 static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
 {
 	if (small_const_nbits(nbits))
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 08/13] lib: inline _find_next_bit() wrappers
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (6 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 07/13] tools: sync small_const_nbits() macro with the kernel Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 09/13] tools: sync find_next_bit implementation Yury Norov
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

lib/find_bit.c declares five single-line wrappers for _find_next_bit().
We may turn those wrappers to inline functions. It eliminates unneeded
function calls and opens room for compile-time optimizations.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/asm-generic/bitops/find.h | 28 ++++++++++++----
 include/asm-generic/bitops/le.h   | 17 +++++++---
 lib/find_bit.c                    | 56 ++-----------------------------
 3 files changed, 37 insertions(+), 64 deletions(-)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..7ad70dab8e93 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -2,6 +2,10 @@
 #ifndef _ASM_GENERIC_BITOPS_FIND_H_
 #define _ASM_GENERIC_BITOPS_FIND_H_
 
+extern unsigned long _find_next_bit(const unsigned long *addr1,
+		const unsigned long *addr2, unsigned long nbits,
+		unsigned long start, unsigned long invert, unsigned long le);
+
 #ifndef find_next_bit
 /**
  * find_next_bit - find the next set bit in a memory region
@@ -12,8 +16,12 @@
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
-		size, unsigned long offset);
+static inline
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_and_bit
@@ -27,9 +35,13 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_and_bit(const unsigned long *addr1,
+static inline
+unsigned long find_next_and_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long size,
-		unsigned long offset);
+		unsigned long offset)
+{
+	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_zero_bit
@@ -42,8 +54,12 @@ extern unsigned long find_next_and_bit(const unsigned long *addr1,
  * Returns the bit number of the next zero bit
  * If no bits are zero, returns @size.
  */
-extern unsigned long find_next_zero_bit(const unsigned long *addr, unsigned
-		long size, unsigned long offset);
+static inline
+unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
+				 unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+}
 #endif
 
 #ifdef CONFIG_GENERIC_FIND_FIRST_BIT
diff --git a/include/asm-generic/bitops/le.h b/include/asm-generic/bitops/le.h
index 188d3eba3ace..21305f6cea0b 100644
--- a/include/asm-generic/bitops/le.h
+++ b/include/asm-generic/bitops/le.h
@@ -2,6 +2,7 @@
 #ifndef _ASM_GENERIC_BITOPS_LE_H_
 #define _ASM_GENERIC_BITOPS_LE_H_
 
+#include <asm-generic/bitops/find.h>
 #include <asm/types.h>
 #include <asm/byteorder.h>
 
@@ -32,13 +33,21 @@ static inline unsigned long find_first_zero_bit_le(const void *addr,
 #define BITOP_LE_SWIZZLE	((BITS_PER_LONG-1) & ~0x7)
 
 #ifndef find_next_zero_bit_le
-extern unsigned long find_next_zero_bit_le(const void *addr,
-		unsigned long size, unsigned long offset);
+static inline
+unsigned long find_next_zero_bit_le(const void *addr, unsigned
+		long size, unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
+}
 #endif
 
 #ifndef find_next_bit_le
-extern unsigned long find_next_bit_le(const void *addr,
-		unsigned long size, unsigned long offset);
+static inline
+unsigned long find_next_bit_le(const void *addr, unsigned
+		long size, unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
+}
 #endif
 
 #ifndef find_first_zero_bit_le
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 8c2a71a18793..2470ae390f3c 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -29,7 +29,7 @@
  *    searching it for one bits.
  *  - The optional "addr2", which is anded with "addr1" if present.
  */
-static unsigned long _find_next_bit(const unsigned long *addr1,
+unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
 		unsigned long start, unsigned long invert, unsigned long le)
 {
@@ -68,37 +68,7 @@ static unsigned long _find_next_bit(const unsigned long *addr1,
 
 	return min(start + __ffs(tmp), nbits);
 }
-#endif
-
-#ifndef find_next_bit
-/*
- * Find the next set bit in a memory region.
- */
-unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
-			    unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
-}
-EXPORT_SYMBOL(find_next_bit);
-#endif
-
-#ifndef find_next_zero_bit
-unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
-				 unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
-}
-EXPORT_SYMBOL(find_next_zero_bit);
-#endif
-
-#if !defined(find_next_and_bit)
-unsigned long find_next_and_bit(const unsigned long *addr1,
-		const unsigned long *addr2, unsigned long size,
-		unsigned long offset)
-{
-	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
-}
-EXPORT_SYMBOL(find_next_and_bit);
+EXPORT_SYMBOL(_find_next_bit);
 #endif
 
 #ifndef find_first_bit
@@ -157,28 +127,6 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
 EXPORT_SYMBOL(find_last_bit);
 #endif
 
-#ifdef __BIG_ENDIAN
-
-#ifndef find_next_zero_bit_le
-unsigned long find_next_zero_bit_le(const void *addr, unsigned
-		long size, unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
-}
-EXPORT_SYMBOL(find_next_zero_bit_le);
-#endif
-
-#ifndef find_next_bit_le
-unsigned long find_next_bit_le(const void *addr, unsigned
-		long size, unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
-}
-EXPORT_SYMBOL(find_next_bit_le);
-#endif
-
-#endif /* __BIG_ENDIAN */
-
 unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
 			       unsigned long size, unsigned long offset)
 {
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 09/13] tools: sync find_next_bit implementation
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (7 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 08/13] lib: inline _find_next_bit() wrappers Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 10/13] lib: add fast path for find_next_*_bit() Yury Norov
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Sync the implementation with recent kernel changes.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/asm-generic/bitops/find.h | 27 ++++++++++---
 tools/lib/find_bit.c                    | 52 ++++++++++---------------
 2 files changed, 42 insertions(+), 37 deletions(-)

diff --git a/tools/include/asm-generic/bitops/find.h b/tools/include/asm-generic/bitops/find.h
index 16ed1982cb34..9fe62d10b084 100644
--- a/tools/include/asm-generic/bitops/find.h
+++ b/tools/include/asm-generic/bitops/find.h
@@ -2,6 +2,10 @@
 #ifndef _TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_
 #define _TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_
 
+extern unsigned long _find_next_bit(const unsigned long *addr1,
+		const unsigned long *addr2, unsigned long nbits,
+		unsigned long start, unsigned long invert, unsigned long le);
+
 #ifndef find_next_bit
 /**
  * find_next_bit - find the next set bit in a memory region
@@ -12,8 +16,12 @@
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
-		size, unsigned long offset);
+static inline
+unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
+			    unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_and_bit
@@ -27,13 +35,16 @@ extern unsigned long find_next_bit(const unsigned long *addr, unsigned long
  * Returns the bit number for the next set bit
  * If no bits are set, returns @size.
  */
-extern unsigned long find_next_and_bit(const unsigned long *addr1,
+static inline
+unsigned long find_next_and_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long size,
-		unsigned long offset);
+		unsigned long offset)
+{
+	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+}
 #endif
 
 #ifndef find_next_zero_bit
-
 /**
  * find_next_zero_bit - find the next cleared bit in a memory region
  * @addr: The address to base the search on
@@ -43,8 +54,12 @@ extern unsigned long find_next_and_bit(const unsigned long *addr1,
  * Returns the bit number of the next zero bit
  * If no bits are zero, returns @size.
  */
+static inline
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
-				 unsigned long offset);
+				 unsigned long offset)
+{
+	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+}
 #endif
 
 #ifndef find_first_bit
diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
index 49abb18549cc..c3378b291205 100644
--- a/tools/lib/find_bit.c
+++ b/tools/lib/find_bit.c
@@ -28,11 +28,12 @@
  *    searching it for one bits.
  *  - The optional "addr2", which is anded with "addr1" if present.
  */
-static inline unsigned long _find_next_bit(const unsigned long *addr1,
+unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
-		unsigned long start, unsigned long invert)
+		unsigned long start, unsigned long invert, unsigned long le)
 {
-	unsigned long tmp;
+	unsigned long tmp, mask;
+	(void) le;
 
 	if (unlikely(start >= nbits))
 		return nbits;
@@ -43,7 +44,19 @@ static inline unsigned long _find_next_bit(const unsigned long *addr1,
 	tmp ^= invert;
 
 	/* Handle 1st word. */
-	tmp &= BITS_LAST_MASK(start);
+	mask = BITS_LAST_MASK(start);
+
+	/*
+	 * Due to the lack of swab() in tools, and the fact that it doesn't
+	 * need little-endian support, just comment it out
+	 */
+#if (0)
+	if (le)
+		mask = swab(mask);
+#endif
+
+	tmp &= mask;
+
 	start = round_down(start, BITS_PER_LONG);
 
 	while (!tmp) {
@@ -57,18 +70,12 @@ static inline unsigned long _find_next_bit(const unsigned long *addr1,
 		tmp ^= invert;
 	}
 
-	return min(start + __ffs(tmp), nbits);
-}
+#if (0)
+	if (le)
+		tmp = swab(tmp);
 #endif
 
-#ifndef find_next_bit
-/*
- * Find the next set bit in a memory region.
- */
-unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
-			    unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, 0UL);
+	return min(start + __ffs(tmp), nbits);
 }
 #endif
 
@@ -105,20 +112,3 @@ unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
 	return size;
 }
 #endif
-
-#ifndef find_next_zero_bit
-unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
-				 unsigned long offset)
-{
-	return _find_next_bit(addr, NULL, size, offset, ~0UL);
-}
-#endif
-
-#ifndef find_next_and_bit
-unsigned long find_next_and_bit(const unsigned long *addr1,
-		const unsigned long *addr2, unsigned long size,
-		unsigned long offset)
-{
-	return _find_next_bit(addr1, addr2, size, offset, 0UL);
-}
-#endif
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 10/13] lib: add fast path for find_next_*_bit()
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (8 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 09/13] tools: sync find_next_bit implementation Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit() Yury Norov
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Similarly to bitmap functions, find_next_*_bit() users will benefit
if we'll handle a case of bitmaps that fit into a single word inline.
In the very best case, the compiler may replace a function call with
a few instructions.

This is the quite typical find_next_bit() user:

	unsigned int cpumask_next(int n, const struct cpumask *srcp)
	{
		/* -1 is a legal arg here. */
		if (n != -1)
			cpumask_check(n);
		return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
	}
	EXPORT_SYMBOL(cpumask_next);

Currently, on ARM64 the generated code looks like this:
	0000000000000000 <cpumask_next>:
	   0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
	   4:   11000402        add     w2, w0, #0x1
	   8:   aa0103e0        mov     x0, x1
	   c:   d2800401        mov     x1, #0x40                       // #64
	  10:   910003fd        mov     x29, sp
	  14:   93407c42        sxtw    x2, w2
	  18:   94000000        bl      0 <find_next_bit>
	  1c:   a8c17bfd        ldp     x29, x30, [sp], #16
	  20:   d65f03c0        ret
	  24:   d503201f        nop

After applying this patch:
	0000000000000140 <cpumask_next>:
	 140:   11000400        add     w0, w0, #0x1
	 144:   93407c00        sxtw    x0, w0
	 148:   f100fc1f        cmp     x0, #0x3f
	 14c:   54000168        b.hi    178 <cpumask_next+0x38>  // b.pmore
	 150:   f9400023        ldr     x3, [x1]
	 154:   92800001        mov     x1, #0xffffffffffffffff         // #-1
	 158:   9ac02020        lsl     x0, x1, x0
	 15c:   52800802        mov     w2, #0x40                       // #64
	 160:   8a030001        and     x1, x0, x3
	 164:   dac00020        rbit    x0, x1
	 168:   f100003f        cmp     x1, #0x0
	 16c:   dac01000        clz     x0, x0
	 170:   1a800040        csel    w0, w2, w0, eq  // eq = none
	 174:   d65f03c0        ret
	 178:   52800800        mov     w0, #0x40                       // #64
	 17c:   d65f03c0        ret

find_next_bit() call is replaced with 6 instructions. find_next_bit()
itself is 41 instructions plus function call overhead.

Despite inlining, the scripts/bloat-o-meter report smaller .text size
after applying the series:
	add/remove: 11/9 grow/shrink: 233/176 up/down: 5780/-6768 (-988)

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/asm-generic/bitops/find.h | 30 ++++++++++++++++++++++++++++++
 include/asm-generic/bitops/le.h   | 21 +++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 7ad70dab8e93..4148c74a1e4d 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -20,6 +20,16 @@ static inline
 unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
 			    unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
 }
 #endif
@@ -40,6 +50,16 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long size,
 		unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr1 & *addr2 & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
 }
 #endif
@@ -58,6 +78,16 @@ static inline
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
 				 unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr | ~GENMASK(size - 1, offset);
+		return val == ~0UL ? size : ffz(val);
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
 }
 #endif
diff --git a/include/asm-generic/bitops/le.h b/include/asm-generic/bitops/le.h
index 21305f6cea0b..5a28629cbf4d 100644
--- a/include/asm-generic/bitops/le.h
+++ b/include/asm-generic/bitops/le.h
@@ -5,6 +5,7 @@
 #include <asm-generic/bitops/find.h>
 #include <asm/types.h>
 #include <asm/byteorder.h>
+#include <linux/swab.h>
 
 #if defined(__LITTLE_ENDIAN)
 
@@ -37,6 +38,16 @@ static inline
 unsigned long find_next_zero_bit_le(const void *addr, unsigned
 		long size, unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val = *(const unsigned long *)addr;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = swab(val) | ~GENMASK(size - 1, offset);
+		return val == ~0UL ? size : ffz(val);
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
 }
 #endif
@@ -46,6 +57,16 @@ static inline
 unsigned long find_next_bit_le(const void *addr, unsigned
 		long size, unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val = *(const unsigned long *)addr;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = swab(val) & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
 }
 #endif
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit()
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (9 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 10/13] lib: add fast path for find_next_*_bit() Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-04-06 16:03   ` Guenter Roeck
  2021-03-16  1:54 ` [PATCH 12/13] tools: sync lib/find_bit implementation Yury Norov
  2021-03-16  1:54 ` [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API Yury Norov
  12 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Similarly to bitmap functions, users would benefit if we'll handle
a case of small-size bitmaps that fit into a single word.

While here, move the find_last_bit() declaration to bitops/find.h
where other find_*_bit() functions sit.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/asm-generic/bitops/find.h | 50 ++++++++++++++++++++++++++++---
 include/linux/bitops.h            | 12 --------
 lib/find_bit.c                    | 12 ++++----
 3 files changed, 52 insertions(+), 22 deletions(-)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 4148c74a1e4d..8d818b304869 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -5,6 +5,9 @@
 extern unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
 		unsigned long start, unsigned long invert, unsigned long le);
+extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
 
 #ifndef find_next_bit
 /**
@@ -102,8 +105,17 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
  * Returns the bit number of the first set bit.
  * If no bits are set, returns @size.
  */
-extern unsigned long find_first_bit(const unsigned long *addr,
-				    unsigned long size);
+static inline
+unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr & BITS_FIRST(size - 1);
+
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_first_bit(addr, size);
+}
 
 /**
  * find_first_zero_bit - find the first cleared bit in a memory region
@@ -113,8 +125,17 @@ extern unsigned long find_first_bit(const unsigned long *addr,
  * Returns the bit number of the first cleared bit.
  * If no bits are zero, returns @size.
  */
-extern unsigned long find_first_zero_bit(const unsigned long *addr,
-					 unsigned long size);
+static inline
+unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr | ~BITS_FIRST(size - 1);
+
+		return val == ~0UL ? size : ffz(val);
+	}
+
+	return _find_first_zero_bit(addr, size);
+}
 #else /* CONFIG_GENERIC_FIND_FIRST_BIT */
 
 #ifndef find_first_bit
@@ -126,6 +147,27 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
 
 #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
 
+#ifndef find_last_bit
+/**
+ * find_last_bit - find the last set bit in a memory region
+ * @addr: The address to start the search at
+ * @size: The number of bits to search
+ *
+ * Returns the bit number of the last set bit, or size.
+ */
+static inline
+unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr & BITS_FIRST(size - 1);
+
+		return val ? __fls(val) : size;
+	}
+
+	return _find_last_bit(addr, size);
+}
+#endif
+
 /**
  * find_next_clump8 - find next 8-bit clump with set bits in a memory region
  * @clump: location to store copy of found clump
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index a5a48303b0f1..26bf15e6cd35 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -286,17 +286,5 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
 })
 #endif
 
-#ifndef find_last_bit
-/**
- * find_last_bit - find the last set bit in a memory region
- * @addr: The address to start the search at
- * @size: The number of bits to search
- *
- * Returns the bit number of the last set bit, or size.
- */
-extern unsigned long find_last_bit(const unsigned long *addr,
-				   unsigned long size);
-#endif
-
 #endif /* __KERNEL__ */
 #endif
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 2470ae390f3c..e2c301d28568 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -75,7 +75,7 @@ EXPORT_SYMBOL(_find_next_bit);
 /*
  * Find the first set bit in a memory region.
  */
-unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
@@ -86,14 +86,14 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
 
 	return size;
 }
-EXPORT_SYMBOL(find_first_bit);
+EXPORT_SYMBOL(_find_first_bit);
 #endif
 
 #ifndef find_first_zero_bit
 /*
  * Find the first cleared bit in a memory region.
  */
-unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
@@ -104,11 +104,11 @@ unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
 
 	return size;
 }
-EXPORT_SYMBOL(find_first_zero_bit);
+EXPORT_SYMBOL(_find_first_zero_bit);
 #endif
 
 #ifndef find_last_bit
-unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_last_bit(const unsigned long *addr, unsigned long size)
 {
 	if (size) {
 		unsigned long val = BITS_FIRST_MASK(size - 1);
@@ -124,7 +124,7 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
 	}
 	return size;
 }
-EXPORT_SYMBOL(find_last_bit);
+EXPORT_SYMBOL(_find_last_bit);
 #endif
 
 unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 12/13] tools: sync lib/find_bit implementation
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (10 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit() Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16  1:54 ` [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API Yury Norov
  12 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Add fast paths to find_*_bit() functions as per kernel implementation.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 tools/include/asm-generic/bitops/find.h | 58 +++++++++++++++++++++++--
 tools/lib/find_bit.c                    |  4 +-
 2 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/tools/include/asm-generic/bitops/find.h b/tools/include/asm-generic/bitops/find.h
index 9fe62d10b084..bf807af1503f 100644
--- a/tools/include/asm-generic/bitops/find.h
+++ b/tools/include/asm-generic/bitops/find.h
@@ -5,6 +5,9 @@
 extern unsigned long _find_next_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
 		unsigned long start, unsigned long invert, unsigned long le);
+extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
+extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
 
 #ifndef find_next_bit
 /**
@@ -20,6 +23,16 @@ static inline
 unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
 			    unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
 }
 #endif
@@ -40,6 +53,16 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long size,
 		unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr1 & *addr2 & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
 	return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
 }
 #endif
@@ -58,6 +81,16 @@ static inline
 unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
 				 unsigned long offset)
 {
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr | ~GENMASK(size - 1, offset);
+		return val == ~0UL ? size : ffz(val);
+	}
+
 	return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
 }
 #endif
@@ -72,8 +105,17 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
  * Returns the bit number of the first set bit.
  * If no bits are set, returns @size.
  */
-extern unsigned long find_first_bit(const unsigned long *addr,
-				    unsigned long size);
+static inline
+unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr & BITS_FIRST(size - 1);
+
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_first_bit(addr, size);
+}
 
 #endif /* find_first_bit */
 
@@ -87,7 +129,17 @@ extern unsigned long find_first_bit(const unsigned long *addr,
  * Returns the bit number of the first cleared bit.
  * If no bits are zero, returns @size.
  */
-unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size);
+static inline
+unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val = *addr | ~BITS_FIRST(size - 1);
+
+		return val == ~0UL ? size : ffz(val);
+	}
+
+	return _find_first_zero_bit(addr, size);
+}
 #endif
 
 #endif /*_TOOLS_LINUX_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
index c3378b291205..a77884ca30ec 100644
--- a/tools/lib/find_bit.c
+++ b/tools/lib/find_bit.c
@@ -83,7 +83,7 @@ unsigned long _find_next_bit(const unsigned long *addr1,
 /*
  * Find the first set bit in a memory region.
  */
-unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
@@ -100,7 +100,7 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
 /*
  * Find the first cleared bit in a memory region.
  */
-unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
+unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size)
 {
 	unsigned long idx;
 
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API
  2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
                   ` (11 preceding siblings ...)
  2021-03-16  1:54 ` [PATCH 12/13] tools: sync lib/find_bit implementation Yury Norov
@ 2021-03-16  1:54 ` Yury Norov
  2021-03-16 11:45   ` Andy Shevchenko
  12 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-16  1:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

Add myself as maintainer for bitmap API and Andy and Rasmus as reviewers.

I'm an author of current implementation of lib/find_bit and an active
contributor to lib/bitmap. It was spotted that there's no maintainer for
bitmap API. I'm willing to maintain it.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 MAINTAINERS | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 3dd20015696e..44f94cdd5a20 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3151,6 +3151,22 @@ F:	Documentation/filesystems/bfs.rst
 F:	fs/bfs/
 F:	include/uapi/linux/bfs_fs.h
 
+BITMAP API
+M:	Yury Norov <yury.norov@gmail.com>
+R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
+R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
+S:	Maintained
+F:	include/asm-generic/bitops/find.h
+F:	include/linux/bitmap.h
+F:	lib/bitmap.c
+F:	lib/find_bit.c
+F:	lib/find_find_bit_benchmark.c
+F:	lib/test_bitmap.c
+F:	tools/include/asm-generic/bitops/find.h
+F:	tools/include/linux/bitmap.h
+F:	tools/lib/bitmap.c
+F:	tools/lib/find_bit.c
+
 BLINKM RGB LED DRIVER
 M:	Jan-Simon Moeller <jansimon.moeller@gmx.de>
 S:	Maintained
-- 
2.25.1


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/13] tools: disable -Wno-type-limits
  2021-03-16  1:54 ` [PATCH 01/13] tools: disable -Wno-type-limits Yury Norov
@ 2021-03-16  8:17   ` Rasmus Villemoes
  2021-03-16 11:39     ` Andy Shevchenko
  0 siblings, 1 reply; 31+ messages in thread
From: Rasmus Villemoes @ 2021-03-16  8:17 UTC (permalink / raw)
  To: Yury Norov, linux-kernel
  Cc: linux-m68k, linux-arch, linux-sh, Alexey Klimov, Andrew Morton,
	Andy Shevchenko, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On 16/03/2021 02.54, Yury Norov wrote:
> GENMASK(h, l) may be passed with unsigned types. In such case, type-limits
> warning is generated for example in case of GENMASK(h, 0).
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  tools/scripts/Makefile.include | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tools/scripts/Makefile.include b/tools/scripts/Makefile.include
> index 84dbf61a7eca..15e99905cb7d 100644
> --- a/tools/scripts/Makefile.include
> +++ b/tools/scripts/Makefile.include
> @@ -38,6 +38,7 @@ EXTRA_WARNINGS += -Wswitch-enum
>  EXTRA_WARNINGS += -Wundef
>  EXTRA_WARNINGS += -Wwrite-strings
>  EXTRA_WARNINGS += -Wformat
> +EXTRA_WARNINGS += -Wno-type-limits
>

I don't like that kind of collateral damage. I seem to recall another
instance where a macro was instead rewritten to avoid triggering the
type-limits warning (with a comment explaining the uglyness). Something like

foo > bar      is the same as
!(foo <= bar)  which is the same as
!(foo == bar || foo < bar)

Dunno if that would work here, but if it did, it would have the bonus
that when somebody builds the kernel proper with Wtype-limits enabled
(maybe W=1 or W=2) there would be no false positives from GENMASK to
wade through.

Alternatively, we really should consider making use of _Pragma to
locally disable/re-enable certain warnings.

Rasmus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 02/13] tools: bitmap: sync function declarations with the kernel
  2021-03-16  1:54 ` [PATCH 02/13] tools: bitmap: sync function declarations with the kernel Yury Norov
@ 2021-03-16  8:18   ` Rasmus Villemoes
  0 siblings, 0 replies; 31+ messages in thread
From: Rasmus Villemoes @ 2021-03-16  8:18 UTC (permalink / raw)
  To: Yury Norov, linux-kernel
  Cc: linux-m68k, linux-arch, linux-sh, Alexey Klimov, Andrew Morton,
	Andy Shevchenko, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On 16/03/2021 02.54, Yury Norov wrote:
> Some functions in tools/include/linux/bitmap.h declare nbits as int. In the
> kernel nbits is declared as unsigned int.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro
  2021-03-16  1:54 ` [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro Yury Norov
@ 2021-03-16  8:35   ` Rasmus Villemoes
  2021-03-16 11:42     ` Andy Shevchenko
  0 siblings, 1 reply; 31+ messages in thread
From: Rasmus Villemoes @ 2021-03-16  8:35 UTC (permalink / raw)
  To: Yury Norov, linux-kernel
  Cc: linux-m68k, linux-arch, linux-sh, Alexey Klimov, Andrew Morton,
	Andy Shevchenko, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On 16/03/2021 02.54, Yury Norov wrote:
> BITMAP_{LAST,FIRST}_WORD_MASK() in linux/bitmap.h duplicates the
> functionality of GENMASK(). The scope of BITMAP* macros is wider
> than just bitmaps. This patch defines 4 new macros: BITS_FIRST(),
> BITS_LAST(), BITS_FIRST_MASK() and BITS_LAST_MASK() in linux/bits.h
> on top of GENMASK() and replaces BITMAP_{LAST,FIRST}_WORD_MASK()
> to avoid duplication and increase the scope of the macros.
> 
> This change doesn't affect code generation. On ARM64:
> scripts/bloat-o-meter vmlinux.before vmlinux
> add/remove: 1/2 grow/shrink: 2/0 up/down: 17/-16 (1)
> Function                                     old     new   delta
> ethtool_get_drvinfo                          900     908      +8
> e843419@0cf2_0001309d_7f0                      -       8      +8
> vermagic                                      48      49      +1
> e843419@0d45_000138bb_f68                      8       -      -8
> e843419@0cc9_00012bce_198c                     8       -      -8

[what on earth are those weird symbols?]


> diff --git a/include/linux/bits.h b/include/linux/bits.h
> index 7f475d59a097..8c191c29506e 100644
> --- a/include/linux/bits.h
> +++ b/include/linux/bits.h
> @@ -37,6 +37,12 @@
>  #define GENMASK(h, l) \
>  	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
>  
> +#define BITS_FIRST(nr)		GENMASK((nr), 0)
> +#define BITS_LAST(nr)		GENMASK(BITS_PER_LONG - 1, (nr))
> +
> +#define BITS_FIRST_MASK(nr)	BITS_FIRST((nr) % BITS_PER_LONG)
> +#define BITS_LAST_MASK(nr)	BITS_LAST((nr) % BITS_PER_LONG)

I don't think it's a good idea to propagate the unusual closed-range
semantics of GENMASK to those wrappers. Almost all C and kernel code
uses the 'inclusive lower bound, exclusive upper bound', and I'd expect
BITS_FIRST(5) to result in a word with five bits set, not six. So I
think these changes as-is make the code much harder to read and understand.

Regardless, please add some comments on the valid input ranges to the
macros, whether that ends up being 0 <= nr < BITS_PER_LONG or 0 < nr <=
BITS_PER_LONG or whatnot.

It would also be much easier to review if you just redefined the
BITMAP_LAST_WORD_MASK macros etc. in terms of these new things, so you
wouldn't have to do a lot of mechanical changes at the same time as
introducing the new ones - especially when those mechanical changes
involve adding a "minus 1" everywhere.

Rasmus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 06/13] lib: extend the scope of small_const_nbits() macro
  2021-03-16  1:54 ` [PATCH 06/13] lib: extend the scope of small_const_nbits() macro Yury Norov
@ 2021-03-16  8:56   ` Rasmus Villemoes
  2021-03-17  4:53     ` Yury Norov
  0 siblings, 1 reply; 31+ messages in thread
From: Rasmus Villemoes @ 2021-03-16  8:56 UTC (permalink / raw)
  To: Yury Norov, linux-kernel
  Cc: linux-m68k, linux-arch, linux-sh, Alexey Klimov, Andrew Morton,
	Andy Shevchenko, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On 16/03/2021 02.54, Yury Norov wrote:
> find_bit would also benefit from small_const_nbits() optimizations.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  include/asm-generic/bitsperlong.h | 9 +++++++++
>  include/linux/bitmap.h            | 3 ---
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/include/asm-generic/bitsperlong.h b/include/asm-generic/bitsperlong.h
> index 3905c1c93dc2..96032f4f908f 100644
> --- a/include/asm-generic/bitsperlong.h
> +++ b/include/asm-generic/bitsperlong.h
> @@ -23,4 +23,13 @@
>  #define BITS_PER_LONG_LONG 64
>  #endif
>  
> +#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < BITS_PER_LONG)
> +
> +/*
> + * The valid number of bits for a bitmap to be small enough, or in other words,
> + * fit into a single machine word is 1 to BITS_PER_LONG inclusively. 0 is not a
> + * valid number for size, and most probably a sing of error.
> + */
> +#define small_const_nbits(n) SMALL_CONST((n) - 1)
> +

I still don't see the point of introducing SMALL_CONST and still don't
like the much-too-generic-name - especially since AFAICT you don't
actually use it anywhere outside the definition of small_const_nbits()?

>  #endif /* __ASM_GENERIC_BITS_PER_LONG */
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index adf7bd9f0467..bc13a890ecc1 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
>   * so make such users (should any ever turn up) call the out-of-line
>   * versions.
>   */

You added another comment near its new definition, but the left-behind
comment in bitmap.h is now somewhat confusing, no? I suggest expanding
your new comment a bit so it's clear why we're interested in whether a
bitmap is known at compile-time to consist of exactly one word:

/*
small_const_nbits(n) is true precisely when it is known at compile-time
that BITMAP_SIZE(n) is 1, i.e. 1 <= n <= BITS_PER_LONG. This allows
various bit/bitmap APIs to provide a fast inline implementation. Bitmaps
of size 0 are very rare, and a compile-time-known-size 0 is most likely
a sign of error. They will be handled correctly by the bit/bitmap APIs,
but using the out-of-line functions, so that the inline implementations
can unconditionally dereference the pointer(s).
*/

> -#define small_const_nbits(nbits) \
> -	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
> -
>  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
>  {
>  	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> 

Rasmus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 01/13] tools: disable -Wno-type-limits
  2021-03-16  8:17   ` Rasmus Villemoes
@ 2021-03-16 11:39     ` Andy Shevchenko
  0 siblings, 0 replies; 31+ messages in thread
From: Andy Shevchenko @ 2021-03-16 11:39 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Yury Norov, linux-kernel, linux-m68k, linux-arch, linux-sh,
	Alexey Klimov, Andrew Morton, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On Tue, Mar 16, 2021 at 09:17:24AM +0100, Rasmus Villemoes wrote:
> On 16/03/2021 02.54, Yury Norov wrote:
> > GENMASK(h, l) may be passed with unsigned types. In such case, type-limits
> > warning is generated for example in case of GENMASK(h, 0).

...

> I don't like that kind of collateral damage. I seem to recall another
> instance where a macro was instead rewritten to avoid triggering the
> type-limits warning (with a comment explaining the uglyness). Something like
> 
> foo > bar      is the same as
> !(foo <= bar)  which is the same as
> !(foo == bar || foo < bar)
> 
> Dunno if that would work here, but if it did, it would have the bonus
> that when somebody builds the kernel proper with Wtype-limits enabled
> (maybe W=1 or W=2) there would be no false positives from GENMASK to
> wade through.
> 
> Alternatively, we really should consider making use of _Pragma to
> locally disable/re-enable certain warnings.

Rasmus, in the kernel the same was fixed as per 355a3587d4ca.
I don't know why tools should be different to that.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro
  2021-03-16  8:35   ` Rasmus Villemoes
@ 2021-03-16 11:42     ` Andy Shevchenko
  2021-03-17  5:40       ` Yury Norov
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Shevchenko @ 2021-03-16 11:42 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Yury Norov, linux-kernel, linux-m68k, linux-arch, linux-sh,
	Alexey Klimov, Andrew Morton, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On Tue, Mar 16, 2021 at 09:35:35AM +0100, Rasmus Villemoes wrote:
> On 16/03/2021 02.54, Yury Norov wrote:
> > BITMAP_{LAST,FIRST}_WORD_MASK() in linux/bitmap.h duplicates the
> > functionality of GENMASK(). The scope of BITMAP* macros is wider
> > than just bitmaps. This patch defines 4 new macros: BITS_FIRST(),
> > BITS_LAST(), BITS_FIRST_MASK() and BITS_LAST_MASK() in linux/bits.h
> > on top of GENMASK() and replaces BITMAP_{LAST,FIRST}_WORD_MASK()
> > to avoid duplication and increase the scope of the macros.
> > 
> > This change doesn't affect code generation. On ARM64:
> > scripts/bloat-o-meter vmlinux.before vmlinux
> > add/remove: 1/2 grow/shrink: 2/0 up/down: 17/-16 (1)
> > Function                                     old     new   delta
> > ethtool_get_drvinfo                          900     908      +8
> > e843419@0cf2_0001309d_7f0                      -       8      +8
> > vermagic                                      48      49      +1
> > e843419@0d45_000138bb_f68                      8       -      -8
> > e843419@0cc9_00012bce_198c                     8       -      -8
> 
> [what on earth are those weird symbols?]
> 
> 
> > diff --git a/include/linux/bits.h b/include/linux/bits.h
> > index 7f475d59a097..8c191c29506e 100644
> > --- a/include/linux/bits.h
> > +++ b/include/linux/bits.h
> > @@ -37,6 +37,12 @@
> >  #define GENMASK(h, l) \
> >  	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> >  
> > +#define BITS_FIRST(nr)		GENMASK((nr), 0)
> > +#define BITS_LAST(nr)		GENMASK(BITS_PER_LONG - 1, (nr))
> > +
> > +#define BITS_FIRST_MASK(nr)	BITS_FIRST((nr) % BITS_PER_LONG)
> > +#define BITS_LAST_MASK(nr)	BITS_LAST((nr) % BITS_PER_LONG)
> 
> I don't think it's a good idea to propagate the unusual closed-range
> semantics of GENMASK to those wrappers. Almost all C and kernel code
> uses the 'inclusive lower bound, exclusive upper bound', and I'd expect
> BITS_FIRST(5) to result in a word with five bits set, not six. So I
> think these changes as-is make the code much harder to read and understand.
> 
> Regardless, please add some comments on the valid input ranges to the
> macros, whether that ends up being 0 <= nr < BITS_PER_LONG or 0 < nr <=
> BITS_PER_LONG or whatnot.
> 
> It would also be much easier to review if you just redefined the
> BITMAP_LAST_WORD_MASK macros etc. in terms of these new things, so you
> wouldn't have to do a lot of mechanical changes at the same time as
> introducing the new ones - especially when those mechanical changes
> involve adding a "minus 1" everywhere.

I tend to agree with Rasmus here.

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API
  2021-03-16  1:54 ` [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API Yury Norov
@ 2021-03-16 11:45   ` Andy Shevchenko
  2021-03-17  4:47     ` Yury Norov
  0 siblings, 1 reply; 31+ messages in thread
From: Andy Shevchenko @ 2021-03-16 11:45 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

On Mon, Mar 15, 2021 at 06:54:24PM -0700, Yury Norov wrote:
> Add myself as maintainer for bitmap API and Andy and Rasmus as reviewers.
> 
> I'm an author of current implementation of lib/find_bit and an active
> contributor to lib/bitmap. It was spotted that there's no maintainer for
> bitmap API. I'm willing to maintain it.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
>  MAINTAINERS | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3dd20015696e..44f94cdd5a20 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3151,6 +3151,22 @@ F:	Documentation/filesystems/bfs.rst
>  F:	fs/bfs/
>  F:	include/uapi/linux/bfs_fs.h
>  
> +BITMAP API
> +M:	Yury Norov <yury.norov@gmail.com>
> +R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> +R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
> +S:	Maintained
> +F:	include/asm-generic/bitops/find.h
> +F:	include/linux/bitmap.h
> +F:	lib/bitmap.c
> +F:	lib/find_bit.c

> +F:	lib/find_find_bit_benchmark.c

Does this file exist?
I guess checkpatch.pl nowadays has a MAINTAINER data base validation.

> +F:	lib/test_bitmap.c
> +F:	tools/include/asm-generic/bitops/find.h
> +F:	tools/include/linux/bitmap.h
> +F:	tools/lib/bitmap.c
> +F:	tools/lib/find_bit.c
> +
>  BLINKM RGB LED DRIVER
>  M:	Jan-Simon Moeller <jansimon.moeller@gmx.de>
>  S:	Maintained
> -- 
> 2.25.1
> 

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API
  2021-03-16 11:45   ` Andy Shevchenko
@ 2021-03-17  4:47     ` Yury Norov
  2021-03-17  4:57       ` Joe Perches
  0 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-17  4:47 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: linux-kernel, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato, Andy Whitcroft, Dwaipayan Ray, Lukas Bulwahn

[CC Andy Whitcroft, Joe Perches, Dwaipayan Ray, Lukas Bulwahn]

On Tue, Mar 16, 2021 at 01:45:51PM +0200, Andy Shevchenko wrote:
> On Mon, Mar 15, 2021 at 06:54:24PM -0700, Yury Norov wrote:
> > Add myself as maintainer for bitmap API and Andy and Rasmus as reviewers.
> > 
> > I'm an author of current implementation of lib/find_bit and an active
> > contributor to lib/bitmap. It was spotted that there's no maintainer for
> > bitmap API. I'm willing to maintain it.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > ---
> >  MAINTAINERS | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> > 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3dd20015696e..44f94cdd5a20 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -3151,6 +3151,22 @@ F:	Documentation/filesystems/bfs.rst
> >  F:	fs/bfs/
> >  F:	include/uapi/linux/bfs_fs.h
> >  
> > +BITMAP API
> > +M:	Yury Norov <yury.norov@gmail.com>
> > +R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > +R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > +S:	Maintained
> > +F:	include/asm-generic/bitops/find.h
> > +F:	include/linux/bitmap.h
> > +F:	lib/bitmap.c
> > +F:	lib/find_bit.c
> 
> > +F:	lib/find_find_bit_benchmark.c
> 
> Does this file exist?
> I guess checkpatch.pl nowadays has a MAINTAINER data base validation.

No lib/find_find_bit_benchmark.c doesn't exist. It's a typo, it should
be lib/find_bit_benchmark.c. Checkpatch doesn't warn:

yury:linux$ scripts/checkpatch.pl 0013-MAINTAINERS-Add-entry-for-the-bitmap-API.patch
total: 0 errors, 0 warnings, 22 lines checked

> > +F:	lib/test_bitmap.c
> > +F:	tools/include/asm-generic/bitops/find.h
> > +F:	tools/include/linux/bitmap.h
> > +F:	tools/lib/bitmap.c
> > +F:	tools/lib/find_bit.c
> > +
> >  BLINKM RGB LED DRIVER
> >  M:	Jan-Simon Moeller <jansimon.moeller@gmx.de>
> >  S:	Maintained
> > -- 
> > 2.25.1
> > 
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 06/13] lib: extend the scope of small_const_nbits() macro
  2021-03-16  8:56   ` Rasmus Villemoes
@ 2021-03-17  4:53     ` Yury Norov
  0 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-17  4:53 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: linux-kernel, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On Tue, Mar 16, 2021 at 09:56:49AM +0100, Rasmus Villemoes wrote:
> On 16/03/2021 02.54, Yury Norov wrote:
> > find_bit would also benefit from small_const_nbits() optimizations.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  include/asm-generic/bitsperlong.h | 9 +++++++++
> >  include/linux/bitmap.h            | 3 ---
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/asm-generic/bitsperlong.h b/include/asm-generic/bitsperlong.h
> > index 3905c1c93dc2..96032f4f908f 100644
> > --- a/include/asm-generic/bitsperlong.h
> > +++ b/include/asm-generic/bitsperlong.h
> > @@ -23,4 +23,13 @@
> >  #define BITS_PER_LONG_LONG 64
> >  #endif
> >  
> > +#define SMALL_CONST(n) (__builtin_constant_p(n) && (unsigned long)(n) < BITS_PER_LONG)
> > +
> > +/*
> > + * The valid number of bits for a bitmap to be small enough, or in other words,
> > + * fit into a single machine word is 1 to BITS_PER_LONG inclusively. 0 is not a
> > + * valid number for size, and most probably a sing of error.
> > + */
> > +#define small_const_nbits(n) SMALL_CONST((n) - 1)
> > +
> 
> I still don't see the point of introducing SMALL_CONST and still don't
> like the much-too-generic-name - especially since AFAICT you don't
> actually use it anywhere outside the definition of small_const_nbits()?
> 
> >  #endif /* __ASM_GENERIC_BITS_PER_LONG */
> > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> > index adf7bd9f0467..bc13a890ecc1 100644
> > --- a/include/linux/bitmap.h
> > +++ b/include/linux/bitmap.h
> > @@ -224,9 +224,6 @@ extern int bitmap_print_to_pagebuf(bool list, char *buf,
> >   * so make such users (should any ever turn up) call the out-of-line
> >   * versions.
> >   */
> 
> You added another comment near its new definition, but the left-behind
> comment in bitmap.h is now somewhat confusing, no? I suggest expanding
> your new comment a bit so it's clear why we're interested in whether a
> bitmap is known at compile-time to consist of exactly one word:
> 
> /*
> small_const_nbits(n) is true precisely when it is known at compile-time
> that BITMAP_SIZE(n) is 1, i.e. 1 <= n <= BITS_PER_LONG. This allows
> various bit/bitmap APIs to provide a fast inline implementation. Bitmaps
> of size 0 are very rare, and a compile-time-known-size 0 is most likely
> a sign of error. They will be handled correctly by the bit/bitmap APIs,
> but using the out-of-line functions, so that the inline implementations
> can unconditionally dereference the pointer(s).
> */

Ok, make sense

> > -#define small_const_nbits(nbits) \
> > -	(__builtin_constant_p(nbits) && (nbits) <= BITS_PER_LONG && (nbits) > 0)
> > -
> >  static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> >  {
> >  	unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> > 
> 
> Rasmus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API
  2021-03-17  4:47     ` Yury Norov
@ 2021-03-17  4:57       ` Joe Perches
  2021-03-17  6:40         ` Lukas Bulwahn
  0 siblings, 1 reply; 31+ messages in thread
From: Joe Perches @ 2021-03-17  4:57 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko
  Cc: linux-kernel, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Arnd Bergmann, David Sterba, Dennis Zhou,
	Geert Uytterhoeven, Jianpeng Ma, John Paul Adrian Glaubitz,
	Josh Poimboeuf, Rasmus Villemoes, Rich Felker, Stefano Brivio,
	Wei Yang, Wolfram Sang, Yoshinori Sato, Andy Whitcroft,
	Dwaipayan Ray, Lukas Bulwahn

On Tue, 2021-03-16 at 21:47 -0700, Yury Norov wrote:
> [CC Andy Whitcroft, Joe Perches, Dwaipayan Ray, Lukas Bulwahn]
> 
> On Tue, Mar 16, 2021 at 01:45:51PM +0200, Andy Shevchenko wrote:
> > On Mon, Mar 15, 2021 at 06:54:24PM -0700, Yury Norov wrote:
> > > Add myself as maintainer for bitmap API and Andy and Rasmus as reviewers.
> > > 
> > > I'm an author of current implementation of lib/find_bit and an active
> > > contributor to lib/bitmap. It was spotted that there's no maintainer for
> > > bitmap API. I'm willing to maintain it.
> > > 
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > > ---
> > >  MAINTAINERS | 16 ++++++++++++++++
> > >  1 file changed, 16 insertions(+)
> > > 
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index 3dd20015696e..44f94cdd5a20 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -3151,6 +3151,22 @@ F:	Documentation/filesystems/bfs.rst
> > >  F:	fs/bfs/
> > >  F:	include/uapi/linux/bfs_fs.h
> > >  
> > > 
> > > +BITMAP API
> > > +M:	Yury Norov <yury.norov@gmail.com>
> > > +R:	Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > +R:	Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > > +S:	Maintained
> > > +F:	include/asm-generic/bitops/find.h
> > > +F:	include/linux/bitmap.h
> > > +F:	lib/bitmap.c
> > > +F:	lib/find_bit.c
> > 
> > > +F:	lib/find_find_bit_benchmark.c
> > 
> > Does this file exist?
> > I guess checkpatch.pl nowadays has a MAINTAINER data base validation.
> 
> No lib/find_find_bit_benchmark.c doesn't exist. It's a typo, it should
> be lib/find_bit_benchmark.c. Checkpatch doesn't warn:
> 
> yury:linux$ scripts/checkpatch.pl 0013-MAINTAINERS-Add-entry-for-the-bitmap-API.patch
> total: 0 errors, 0 warnings, 22 lines checked

checkpatch does not validate filenames for each patch.

checkpatch does have a --self-test=patterns capability that does
validate file accessibility.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro
  2021-03-16 11:42     ` Andy Shevchenko
@ 2021-03-17  5:40       ` Yury Norov
  2021-03-17 19:58         ` Rasmus Villemoes
  0 siblings, 1 reply; 31+ messages in thread
From: Yury Norov @ 2021-03-17  5:40 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Rasmus Villemoes, linux-kernel, linux-m68k, linux-arch, linux-sh,
	Alexey Klimov, Andrew Morton, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On Tue, Mar 16, 2021 at 01:42:45PM +0200, Andy Shevchenko wrote:
> On Tue, Mar 16, 2021 at 09:35:35AM +0100, Rasmus Villemoes wrote:
> > On 16/03/2021 02.54, Yury Norov wrote:
> > > BITMAP_{LAST,FIRST}_WORD_MASK() in linux/bitmap.h duplicates the
> > > functionality of GENMASK(). The scope of BITMAP* macros is wider
> > > than just bitmaps. This patch defines 4 new macros: BITS_FIRST(),
> > > BITS_LAST(), BITS_FIRST_MASK() and BITS_LAST_MASK() in linux/bits.h
> > > on top of GENMASK() and replaces BITMAP_{LAST,FIRST}_WORD_MASK()
> > > to avoid duplication and increase the scope of the macros.
> > > 
> > > This change doesn't affect code generation. On ARM64:
> > > scripts/bloat-o-meter vmlinux.before vmlinux
> > > add/remove: 1/2 grow/shrink: 2/0 up/down: 17/-16 (1)
> > > Function                                     old     new   delta
> > > ethtool_get_drvinfo                          900     908      +8
> > > e843419@0cf2_0001309d_7f0                      -       8      +8
> > > vermagic                                      48      49      +1
> > > e843419@0d45_000138bb_f68                      8       -      -8
> > > e843419@0cc9_00012bce_198c                     8       -      -8
> > 
> > [what on earth are those weird symbols?]
> > 
> > 
> > > diff --git a/include/linux/bits.h b/include/linux/bits.h
> > > index 7f475d59a097..8c191c29506e 100644
> > > --- a/include/linux/bits.h
> > > +++ b/include/linux/bits.h
> > > @@ -37,6 +37,12 @@
> > >  #define GENMASK(h, l) \
> > >  	(GENMASK_INPUT_CHECK(h, l) + __GENMASK(h, l))
> > >  
> > > +#define BITS_FIRST(nr)		GENMASK((nr), 0)
> > > +#define BITS_LAST(nr)		GENMASK(BITS_PER_LONG - 1, (nr))
> > > +
> > > +#define BITS_FIRST_MASK(nr)	BITS_FIRST((nr) % BITS_PER_LONG)
> > > +#define BITS_LAST_MASK(nr)	BITS_LAST((nr) % BITS_PER_LONG)
> > 
> > I don't think it's a good idea to propagate the unusual closed-range
> > semantics of GENMASK to those wrappers. Almost all C and kernel code
> > uses the 'inclusive lower bound, exclusive upper bound', and I'd expect
> > BITS_FIRST(5) to result in a word with five bits set, not six. So I
> > think these changes as-is make the code much harder to read and understand.
> > 
> > Regardless, please add some comments on the valid input ranges to the
> > macros, whether that ends up being 0 <= nr < BITS_PER_LONG or 0 < nr <=
> > BITS_PER_LONG or whatnot.
> > 
> > It would also be much easier to review if you just redefined the
> > BITMAP_LAST_WORD_MASK macros etc. in terms of these new things, so you
> > wouldn't have to do a lot of mechanical changes at the same time as
> > introducing the new ones - especially when those mechanical changes
> > involve adding a "minus 1" everywhere.
> 
> I tend to agree with Rasmus here.

OK. All this plus terrible GENMASK(high, low) design, when high goes
first, makes me feel like we need to deprecate GENMASK and propose a
new interface.

What do you think about this:
BITS_FIRST(bitnum)      -> [0, bitnum)
BITS_LAST(bitnum)       -> [bitnum, BITS_PER_LONG)
BITS_RANGE(begin, end)  -> [begin, end)

We can pick BITS_{LAST,FIRST} implementation from existing BITMAP_*_WORD_MASK
analogues, and make the BITS_RANGE like:
        #define BITS_RANGE(begin, end) BITS_FIRST(end) & BITS_LAST(begin)

Regarding BITMAP_*_WORD_MASK, I can save them in bitmap.h as aliases
to BITS_{LAST,FIRST} to avoid massive renaming. (Should I?)

Would this all work for you?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API
  2021-03-17  4:57       ` Joe Perches
@ 2021-03-17  6:40         ` Lukas Bulwahn
  2021-03-17 19:29           ` Yury Norov
  0 siblings, 1 reply; 31+ messages in thread
From: Lukas Bulwahn @ 2021-03-17  6:40 UTC (permalink / raw)
  To: Joe Perches
  Cc: Yury Norov, Andy Shevchenko, Linux Kernel Mailing List,
	linux-m68k, linux-arch, linux-sh, Alexey Klimov, Andrew Morton,
	Arnd Bergmann, David Sterba, Dennis Zhou, Geert Uytterhoeven,
	Jianpeng Ma, John Paul Adrian Glaubitz, Josh Poimboeuf,
	Rasmus Villemoes, Rich Felker, Stefano Brivio, Wei Yang,
	Wolfram Sang, Yoshinori Sato, Andy Whitcroft, Dwaipayan Ray

On Wed, Mar 17, 2021 at 5:57 AM Joe Perches <joe@perches.com> wrote:
>
> On Tue, 2021-03-16 at 21:47 -0700, Yury Norov wrote:
> > [CC Andy Whitcroft, Joe Perches, Dwaipayan Ray, Lukas Bulwahn]
> >
> > On Tue, Mar 16, 2021 at 01:45:51PM +0200, Andy Shevchenko wrote:
> > > On Mon, Mar 15, 2021 at 06:54:24PM -0700, Yury Norov wrote:
> > > > Add myself as maintainer for bitmap API and Andy and Rasmus as reviewers.
> > > >
> > > > I'm an author of current implementation of lib/find_bit and an active
> > > > contributor to lib/bitmap. It was spotted that there's no maintainer for
> > > > bitmap API. I'm willing to maintain it.
> > > >
> > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > > > ---
> > > >  MAINTAINERS | 16 ++++++++++++++++
> > > >  1 file changed, 16 insertions(+)
> > > >
> > > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > > index 3dd20015696e..44f94cdd5a20 100644
> > > > --- a/MAINTAINERS
> > > > +++ b/MAINTAINERS
> > > > @@ -3151,6 +3151,22 @@ F: Documentation/filesystems/bfs.rst
> > > >  F:       fs/bfs/
> > > >  F:       include/uapi/linux/bfs_fs.h
> > > >
> > > >
> > > > +BITMAP API
> > > > +M:       Yury Norov <yury.norov@gmail.com>
> > > > +R:       Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > +R:       Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > > > +S:       Maintained
> > > > +F:       include/asm-generic/bitops/find.h
> > > > +F:       include/linux/bitmap.h
> > > > +F:       lib/bitmap.c
> > > > +F:       lib/find_bit.c
> > >
> > > > +F:       lib/find_find_bit_benchmark.c
> > >
> > > Does this file exist?
> > > I guess checkpatch.pl nowadays has a MAINTAINER data base validation.
> >
> > No lib/find_find_bit_benchmark.c doesn't exist. It's a typo, it should
> > be lib/find_bit_benchmark.c. Checkpatch doesn't warn:
> >
> > yury:linux$ scripts/checkpatch.pl 0013-MAINTAINERS-Add-entry-for-the-bitmap-API.patch
> > total: 0 errors, 0 warnings, 22 lines checked
>
> checkpatch does not validate filenames for each patch.
>
> checkpatch does have a --self-test=patterns capability that does
> validate file accessibility.

Joe meant: get_maintainers does have a --self-test=patterns capability
that does validate file accessibility.

You can run that before patch submission; otherwise, I run that script
on linux-next once a week and send out correction patches as far as my
"spare" time allows to do so.

Lukas

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API
  2021-03-17  6:40         ` Lukas Bulwahn
@ 2021-03-17 19:29           ` Yury Norov
  0 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-17 19:29 UTC (permalink / raw)
  To: Lukas Bulwahn
  Cc: Joe Perches, Andy Shevchenko, Linux Kernel Mailing List,
	linux-m68k, linux-arch, linux-sh, Alexey Klimov, Andrew Morton,
	Arnd Bergmann, David Sterba, Dennis Zhou, Geert Uytterhoeven,
	Jianpeng Ma, John Paul Adrian Glaubitz, Josh Poimboeuf,
	Rasmus Villemoes, Rich Felker, Stefano Brivio, Wei Yang,
	Wolfram Sang, Yoshinori Sato, Andy Whitcroft, Dwaipayan Ray

On Wed, Mar 17, 2021 at 07:40:04AM +0100, Lukas Bulwahn wrote:
> On Wed, Mar 17, 2021 at 5:57 AM Joe Perches <joe@perches.com> wrote:
> >
> > On Tue, 2021-03-16 at 21:47 -0700, Yury Norov wrote:
> > > [CC Andy Whitcroft, Joe Perches, Dwaipayan Ray, Lukas Bulwahn]
> > >
> > > On Tue, Mar 16, 2021 at 01:45:51PM +0200, Andy Shevchenko wrote:
> > > > On Mon, Mar 15, 2021 at 06:54:24PM -0700, Yury Norov wrote:
> > > > > Add myself as maintainer for bitmap API and Andy and Rasmus as reviewers.
> > > > >
> > > > > I'm an author of current implementation of lib/find_bit and an active
> > > > > contributor to lib/bitmap. It was spotted that there's no maintainer for
> > > > > bitmap API. I'm willing to maintain it.
> > > > >
> > > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > > Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > > > > ---
> > > > >  MAINTAINERS | 16 ++++++++++++++++
> > > > >  1 file changed, 16 insertions(+)
> > > > >
> > > > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > > > index 3dd20015696e..44f94cdd5a20 100644
> > > > > --- a/MAINTAINERS
> > > > > +++ b/MAINTAINERS
> > > > > @@ -3151,6 +3151,22 @@ F: Documentation/filesystems/bfs.rst
> > > > >  F:       fs/bfs/
> > > > >  F:       include/uapi/linux/bfs_fs.h
> > > > >
> > > > >
> > > > > +BITMAP API
> > > > > +M:       Yury Norov <yury.norov@gmail.com>
> > > > > +R:       Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > +R:       Rasmus Villemoes <linux@rasmusvillemoes.dk>
> > > > > +S:       Maintained
> > > > > +F:       include/asm-generic/bitops/find.h
> > > > > +F:       include/linux/bitmap.h
> > > > > +F:       lib/bitmap.c
> > > > > +F:       lib/find_bit.c
> > > >
> > > > > +F:       lib/find_find_bit_benchmark.c
> > > >
> > > > Does this file exist?
> > > > I guess checkpatch.pl nowadays has a MAINTAINER data base validation.
> > >
> > > No lib/find_find_bit_benchmark.c doesn't exist. It's a typo, it should
> > > be lib/find_bit_benchmark.c. Checkpatch doesn't warn:
> > >
> > > yury:linux$ scripts/checkpatch.pl 0013-MAINTAINERS-Add-entry-for-the-bitmap-API.patch
> > > total: 0 errors, 0 warnings, 22 lines checked
> >
> > checkpatch does not validate filenames for each patch.
> >
> > checkpatch does have a --self-test=patterns capability that does
> > validate file accessibility.
> 
> Joe meant: get_maintainers does have a --self-test=patterns capability
> that does validate file accessibility.
> 
> You can run that before patch submission; otherwise, I run that script
> on linux-next once a week and send out correction patches as far as my
> "spare" time allows to do so.

Thanks for the hint. I see it's able to detect the issue.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro
  2021-03-17  5:40       ` Yury Norov
@ 2021-03-17 19:58         ` Rasmus Villemoes
  2021-03-17 23:33           ` Yury Norov
  0 siblings, 1 reply; 31+ messages in thread
From: Rasmus Villemoes @ 2021-03-17 19:58 UTC (permalink / raw)
  To: Yury Norov, Andy Shevchenko
  Cc: Rasmus Villemoes, linux-kernel, linux-m68k, linux-arch, linux-sh,
	Alexey Klimov, Andrew Morton, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On 17/03/2021 06.40, Yury Norov wrote:
> On Tue, Mar 16, 2021 at 01:42:45PM +0200, Andy Shevchenko wrote:

>>> It would also be much easier to review if you just redefined the
>>> BITMAP_LAST_WORD_MASK macros etc. in terms of these new things, so you
>>> wouldn't have to do a lot of mechanical changes at the same time as
>>> introducing the new ones - especially when those mechanical changes
>>> involve adding a "minus 1" everywhere.
>>
>> I tend to agree with Rasmus here.
> 
> OK. All this plus terrible GENMASK(high, low) design, when high goes
> first, makes me feel like we need to deprecate GENMASK and propose a
> new interface.
> 
> What do you think about this:
> BITS_FIRST(bitnum)      -> [0, bitnum)
> BITS_LAST(bitnum)       -> [bitnum, BITS_PER_LONG)
> BITS_RANGE(begin, end)  -> [begin, end)

Better, though I'm not too happy about BITS_LAST(n) not producing a word
with the n highest bits set. I dunno, I don't have better names.
BITS_FROM/BITS_UPTO perhaps, but not really (and upto sounds like it is
inclusive). BITS_LOW/BITS_HIGH have the same problem as BITS_LAST.

Also, be careful to document what one can expect from the boundary
values 0/BITS_PER_LONG. Is BITS_FIRST(0) a valid invocation? Does it
yield 0UL? How about BITS_FIRST(BITS_PER_LONG), does that give ~0UL?
Note that BITMAP_{FIRST,LAST}_WORD_MASK never produce 0, they're never
used except with a word we know to be part of the bitmap.

> We can pick BITS_{LAST,FIRST} implementation from existing BITMAP_*_WORD_MASK
> analogues, and make the BITS_RANGE like:
>         #define BITS_RANGE(begin, end) BITS_FIRST(end) & BITS_LAST(begin)
> 
> Regarding BITMAP_*_WORD_MASK, I can save them in bitmap.h as aliases
> to BITS_{LAST,FIRST} to avoid massive renaming. (Should I?)

Yes, now that I read these again, I definitely think the
BITMAP_{FIRST,LAST}_WORD_MASK should stay (whether their implementation
change I don't care). Their names document what they do much better than
if you replace them with their potential new implementations:
BITMAP_FIRST_WORD_MASK(start) is obviously about having to mask off some
low bits of the first word we're looking at because we're looking at an
offset into the bitmap, and similarly BITMAP_LAST_WORD_MASK(nbits)
explains itself: nbits is such that the last word needs some masking.
But their replacements would be BITS_LAST(start) and BITS_FIRST(nbits)
respectively (possibly with those arguments reduced mod N), which is
quite confusing.

> Would this all work for you?

Maybe, I think I'd have to see the implementation and how those new
macros get used.

Thanks,
Rasmus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro
  2021-03-17 19:58         ` Rasmus Villemoes
@ 2021-03-17 23:33           ` Yury Norov
  0 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-03-17 23:33 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Andy Shevchenko, linux-kernel, linux-m68k, linux-arch, linux-sh,
	Alexey Klimov, Andrew Morton, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rich Felker,
	Stefano Brivio, Wei Yang, Wolfram Sang, Yoshinori Sato

On Wed, Mar 17, 2021 at 08:58:04PM +0100, Rasmus Villemoes wrote:
> On 17/03/2021 06.40, Yury Norov wrote:
> > On Tue, Mar 16, 2021 at 01:42:45PM +0200, Andy Shevchenko wrote:
> 
> >>> It would also be much easier to review if you just redefined the
> >>> BITMAP_LAST_WORD_MASK macros etc. in terms of these new things, so you
> >>> wouldn't have to do a lot of mechanical changes at the same time as
> >>> introducing the new ones - especially when those mechanical changes
> >>> involve adding a "minus 1" everywhere.
> >>
> >> I tend to agree with Rasmus here.
> > 
> > OK. All this plus terrible GENMASK(high, low) design, when high goes
> > first, makes me feel like we need to deprecate GENMASK and propose a
> > new interface.
> > 
> > What do you think about this:
> > BITS_FIRST(bitnum)      -> [0, bitnum)
> > BITS_LAST(bitnum)       -> [bitnum, BITS_PER_LONG)
> > BITS_RANGE(begin, end)  -> [begin, end)
> 
> Better, though I'm not too happy about BITS_LAST(n) not producing a word
> with the n highest bits set. I dunno, I don't have better names.
> BITS_FROM/BITS_UPTO perhaps, but not really (and upto sounds like it is
> inclusive). BITS_LOW/BITS_HIGH have the same problem as BITS_LAST.
> 
> Also, be careful to document what one can expect from the boundary
> values 0/BITS_PER_LONG. Is BITS_FIRST(0) a valid invocation? Does it
> yield 0UL? How about BITS_FIRST(BITS_PER_LONG), does that give ~0UL?
> Note that BITMAP_{FIRST,LAST}_WORD_MASK never produce 0, they're never
> used except with a word we know to be part of the bitmap.
> 
> > We can pick BITS_{LAST,FIRST} implementation from existing BITMAP_*_WORD_MASK
> > analogues, and make the BITS_RANGE like:
> >         #define BITS_RANGE(begin, end) BITS_FIRST(end) & BITS_LAST(begin)
> > 
> > Regarding BITMAP_*_WORD_MASK, I can save them in bitmap.h as aliases
> > to BITS_{LAST,FIRST} to avoid massive renaming. (Should I?)
> 
> Yes, now that I read these again, I definitely think the
> BITMAP_{FIRST,LAST}_WORD_MASK should stay (whether their implementation
> change I don't care). Their names document what they do much better than
> if you replace them with their potential new implementations:
> BITMAP_FIRST_WORD_MASK(start) is obviously about having to mask off some
> low bits of the first word we're looking at because we're looking at an
> offset into the bitmap, and similarly BITMAP_LAST_WORD_MASK(nbits)
> explains itself: nbits is such that the last word needs some masking.
> But their replacements would be BITS_LAST(start) and BITS_FIRST(nbits)
> respectively (possibly with those arguments reduced mod N), which is
> quite confusing.
> 
> > Would this all work for you?
> 
> Maybe, I think I'd have to see the implementation and how those new
> macros get used.
> 
> Thanks,
> Rasmus

I looked at this with a fresh eye this morning. All the noise we
discuss I made to just call BITS_FIRST() 3 times in find.h. I think
that for the purpose of this series, in find.h, it's worth to use
GENMASK(size - 1, 0) where needed.

Regarding the general view on this, all the problems come from the
fact that bitmap API is split between linux/bitmap.h and
asm_generic/bitops/find.h. Find.h is naturally a part of bitmaps,
but because of the split, it's hard to use handy bitmap.h macros
in find.h.

Joining the headers is far out of the scope of this series. If no
objections, I'd prefer to drop this patch now, and later carefully
figure out how to join find.h and bitmap.h.

Yury

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit()
  2021-03-16  1:54 ` [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit() Yury Norov
@ 2021-04-06 16:03   ` Guenter Roeck
  2021-04-06 18:15     ` Yury Norov
  0 siblings, 1 reply; 31+ messages in thread
From: Guenter Roeck @ 2021-04-06 16:03 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

On Mon, Mar 15, 2021 at 06:54:22PM -0700, Yury Norov wrote:
> Similarly to bitmap functions, users would benefit if we'll handle
> a case of small-size bitmaps that fit into a single word.
> 
> While here, move the find_last_bit() declaration to bitops/find.h
> where other find_*_bit() functions sit.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  include/asm-generic/bitops/find.h | 50 ++++++++++++++++++++++++++++---
>  include/linux/bitops.h            | 12 --------
>  lib/find_bit.c                    | 12 ++++----
>  3 files changed, 52 insertions(+), 22 deletions(-)
> 
> diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
> index 4148c74a1e4d..8d818b304869 100644
> --- a/include/asm-generic/bitops/find.h
> +++ b/include/asm-generic/bitops/find.h
> @@ -5,6 +5,9 @@
>  extern unsigned long _find_next_bit(const unsigned long *addr1,
>  		const unsigned long *addr2, unsigned long nbits,
>  		unsigned long start, unsigned long invert, unsigned long le);
> +extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
> +extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
> +extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
>  
>  #ifndef find_next_bit
>  /**
> @@ -102,8 +105,17 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
>   * Returns the bit number of the first set bit.
>   * If no bits are set, returns @size.
>   */
> -extern unsigned long find_first_bit(const unsigned long *addr,
> -				    unsigned long size);
> +static inline
> +unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
> +{
> +	if (small_const_nbits(size)) {
> +		unsigned long val = *addr & BITS_FIRST(size - 1);
> +
> +		return val ? __ffs(val) : size;

This patch results in:

include/asm-generic/bitops/find.h: In function 'find_last_bit':
include/asm-generic/bitops/find.h:164:16: error: implicit declaration of function '__fls'; did you mean '__ffs'?

and:

./include/asm-generic/bitops/__fls.h: At top level:
./include/asm-generic/bitops/__fls.h:13:38: error: conflicting types for '__fls'

when building scripts/mod/devicetable-offsets.o.

Seen with h8300 builds.

Guenter

> +	}
> +
> +	return _find_first_bit(addr, size);
> +}
>  
>  /**
>   * find_first_zero_bit - find the first cleared bit in a memory region
> @@ -113,8 +125,17 @@ extern unsigned long find_first_bit(const unsigned long *addr,
>   * Returns the bit number of the first cleared bit.
>   * If no bits are zero, returns @size.
>   */
> -extern unsigned long find_first_zero_bit(const unsigned long *addr,
> -					 unsigned long size);
> +static inline
> +unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
> +{
> +	if (small_const_nbits(size)) {
> +		unsigned long val = *addr | ~BITS_FIRST(size - 1);
> +
> +		return val == ~0UL ? size : ffz(val);
> +	}
> +
> +	return _find_first_zero_bit(addr, size);
> +}
>  #else /* CONFIG_GENERIC_FIND_FIRST_BIT */
>  
>  #ifndef find_first_bit
> @@ -126,6 +147,27 @@ extern unsigned long find_first_zero_bit(const unsigned long *addr,
>  
>  #endif /* CONFIG_GENERIC_FIND_FIRST_BIT */
>  
> +#ifndef find_last_bit
> +/**
> + * find_last_bit - find the last set bit in a memory region
> + * @addr: The address to start the search at
> + * @size: The number of bits to search
> + *
> + * Returns the bit number of the last set bit, or size.
> + */
> +static inline
> +unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
> +{
> +	if (small_const_nbits(size)) {
> +		unsigned long val = *addr & BITS_FIRST(size - 1);
> +
> +		return val ? __fls(val) : size;
> +	}
> +
> +	return _find_last_bit(addr, size);
> +}
> +#endif
> +
>  /**
>   * find_next_clump8 - find next 8-bit clump with set bits in a memory region
>   * @clump: location to store copy of found clump
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index a5a48303b0f1..26bf15e6cd35 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -286,17 +286,5 @@ static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
>  })
>  #endif
>  
> -#ifndef find_last_bit
> -/**
> - * find_last_bit - find the last set bit in a memory region
> - * @addr: The address to start the search at
> - * @size: The number of bits to search
> - *
> - * Returns the bit number of the last set bit, or size.
> - */
> -extern unsigned long find_last_bit(const unsigned long *addr,
> -				   unsigned long size);
> -#endif
> -
>  #endif /* __KERNEL__ */
>  #endif
> diff --git a/lib/find_bit.c b/lib/find_bit.c
> index 2470ae390f3c..e2c301d28568 100644
> --- a/lib/find_bit.c
> +++ b/lib/find_bit.c
> @@ -75,7 +75,7 @@ EXPORT_SYMBOL(_find_next_bit);
>  /*
>   * Find the first set bit in a memory region.
>   */
> -unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
> +unsigned long _find_first_bit(const unsigned long *addr, unsigned long size)
>  {
>  	unsigned long idx;
>  
> @@ -86,14 +86,14 @@ unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
>  
>  	return size;
>  }
> -EXPORT_SYMBOL(find_first_bit);
> +EXPORT_SYMBOL(_find_first_bit);
>  #endif
>  
>  #ifndef find_first_zero_bit
>  /*
>   * Find the first cleared bit in a memory region.
>   */
> -unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
> +unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size)
>  {
>  	unsigned long idx;
>  
> @@ -104,11 +104,11 @@ unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
>  
>  	return size;
>  }
> -EXPORT_SYMBOL(find_first_zero_bit);
> +EXPORT_SYMBOL(_find_first_zero_bit);
>  #endif
>  
>  #ifndef find_last_bit
> -unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
> +unsigned long _find_last_bit(const unsigned long *addr, unsigned long size)
>  {
>  	if (size) {
>  		unsigned long val = BITS_FIRST_MASK(size - 1);
> @@ -124,7 +124,7 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
>  	}
>  	return size;
>  }
> -EXPORT_SYMBOL(find_last_bit);
> +EXPORT_SYMBOL(_find_last_bit);
>  #endif
>  
>  unsigned long find_next_clump8(unsigned long *clump, const unsigned long *addr,

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit()
  2021-04-06 16:03   ` Guenter Roeck
@ 2021-04-06 18:15     ` Yury Norov
  0 siblings, 0 replies; 31+ messages in thread
From: Yury Norov @ 2021-04-06 18:15 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-kernel, linux-m68k, linux-arch, linux-sh, Alexey Klimov,
	Andrew Morton, Andy Shevchenko, Arnd Bergmann, David Sterba,
	Dennis Zhou, Geert Uytterhoeven, Jianpeng Ma, Joe Perches,
	John Paul Adrian Glaubitz, Josh Poimboeuf, Rasmus Villemoes,
	Rich Felker, Stefano Brivio, Wei Yang, Wolfram Sang,
	Yoshinori Sato

On Tue, Apr 06, 2021 at 09:03:27AM -0700, Guenter Roeck wrote:
> On Mon, Mar 15, 2021 at 06:54:22PM -0700, Yury Norov wrote:
> > Similarly to bitmap functions, users would benefit if we'll handle
> > a case of small-size bitmaps that fit into a single word.
> > 
> > While here, move the find_last_bit() declaration to bitops/find.h
> > where other find_*_bit() functions sit.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  include/asm-generic/bitops/find.h | 50 ++++++++++++++++++++++++++++---
> >  include/linux/bitops.h            | 12 --------
> >  lib/find_bit.c                    | 12 ++++----
> >  3 files changed, 52 insertions(+), 22 deletions(-)
> > 
> > diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
> > index 4148c74a1e4d..8d818b304869 100644
> > --- a/include/asm-generic/bitops/find.h
> > +++ b/include/asm-generic/bitops/find.h
> > @@ -5,6 +5,9 @@
> >  extern unsigned long _find_next_bit(const unsigned long *addr1,
> >  		const unsigned long *addr2, unsigned long nbits,
> >  		unsigned long start, unsigned long invert, unsigned long le);
> > +extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
> > +extern unsigned long _find_first_zero_bit(const unsigned long *addr, unsigned long size);
> > +extern unsigned long _find_last_bit(const unsigned long *addr, unsigned long size);
> >  
> >  #ifndef find_next_bit
> >  /**
> > @@ -102,8 +105,17 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
> >   * Returns the bit number of the first set bit.
> >   * If no bits are set, returns @size.
> >   */
> > -extern unsigned long find_first_bit(const unsigned long *addr,
> > -				    unsigned long size);
> > +static inline
> > +unsigned long find_first_bit(const unsigned long *addr, unsigned long size)
> > +{
> > +	if (small_const_nbits(size)) {
> > +		unsigned long val = *addr & BITS_FIRST(size - 1);
> > +
> > +		return val ? __ffs(val) : size;
> 
> This patch results in:
> 
> include/asm-generic/bitops/find.h: In function 'find_last_bit':
> include/asm-generic/bitops/find.h:164:16: error: implicit declaration of function '__fls'; did you mean '__ffs'?
> 
> and:
> 
> ./include/asm-generic/bitops/__fls.h: At top level:
> ./include/asm-generic/bitops/__fls.h:13:38: error: conflicting types for '__fls'
> 
> when building scripts/mod/devicetable-offsets.o.
> 
> Seen with h8300 builds.
> 
> Guenter

The patch is here:

https://lkml.org/lkml/2021/4/1/1184
 
Yury

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-04-06 18:15 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-16  1:54 [PATCH v4 00/13] lib/find_bit: fast path for small bitmaps Yury Norov
2021-03-16  1:54 ` [PATCH 01/13] tools: disable -Wno-type-limits Yury Norov
2021-03-16  8:17   ` Rasmus Villemoes
2021-03-16 11:39     ` Andy Shevchenko
2021-03-16  1:54 ` [PATCH 02/13] tools: bitmap: sync function declarations with the kernel Yury Norov
2021-03-16  8:18   ` Rasmus Villemoes
2021-03-16  1:54 ` [PATCH 03/13] arch: rearrange headers inclusion order in asm/bitops for m68k and sh Yury Norov
2021-03-16  1:54 ` [PATCH 04/13] lib: introduce BITS_{FIRST,LAST} macro Yury Norov
2021-03-16  8:35   ` Rasmus Villemoes
2021-03-16 11:42     ` Andy Shevchenko
2021-03-17  5:40       ` Yury Norov
2021-03-17 19:58         ` Rasmus Villemoes
2021-03-17 23:33           ` Yury Norov
2021-03-16  1:54 ` [PATCH 05/13] tools: sync BITS_MASK macros with the kernel Yury Norov
2021-03-16  1:54 ` [PATCH 06/13] lib: extend the scope of small_const_nbits() macro Yury Norov
2021-03-16  8:56   ` Rasmus Villemoes
2021-03-17  4:53     ` Yury Norov
2021-03-16  1:54 ` [PATCH 07/13] tools: sync small_const_nbits() macro with the kernel Yury Norov
2021-03-16  1:54 ` [PATCH 08/13] lib: inline _find_next_bit() wrappers Yury Norov
2021-03-16  1:54 ` [PATCH 09/13] tools: sync find_next_bit implementation Yury Norov
2021-03-16  1:54 ` [PATCH 10/13] lib: add fast path for find_next_*_bit() Yury Norov
2021-03-16  1:54 ` [PATCH 11/13] lib: add fast path for find_first_*_bit() and find_last_bit() Yury Norov
2021-04-06 16:03   ` Guenter Roeck
2021-04-06 18:15     ` Yury Norov
2021-03-16  1:54 ` [PATCH 12/13] tools: sync lib/find_bit implementation Yury Norov
2021-03-16  1:54 ` [PATCH 13/13] MAINTAINERS: Add entry for the bitmap API Yury Norov
2021-03-16 11:45   ` Andy Shevchenko
2021-03-17  4:47     ` Yury Norov
2021-03-17  4:57       ` Joe Perches
2021-03-17  6:40         ` Lukas Bulwahn
2021-03-17 19:29           ` Yury Norov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).