* [PATCH 0/7] cpumask: repair cpumask_check()
@ 2022-09-19 21:05 Yury Norov
  2022-09-19 21:05 ` [PATCH 1/7] cpumask: fix checking valid cpu range Yury Norov
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

After switching cpumask to use nr_cpu_ids in [1], cpumask_check() started
generating many false-positive warnings. There are some more issues with
cpumask_check() that break it.

This series fixes the cpumask_check() mess and addresses most of the
false-positive warnings observed on boot of x86_64 and arm64.

[1] https://lore.kernel.org/lkml/20220905230820.3295223-4-yury.norov@gmail.com/T/

Yury Norov (7):
  cpumask: fix checking valid cpu range
  net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
  cpumask: switch for_each_cpu{,_not} to use for_each_bit()
  lib/find_bit: add find_next{,_and}_bit_wrap
  lib/bitmap: introduce for_each_set_bit_wrap() macro
  lib/find: optimize for_each() macros
  lib/bitmap: add tests for for_each() iterators

 include/linux/cpumask.h   |  37 ++----
 include/linux/find.h      | 140 +++++++++++++++++-----
 include/linux/netdevice.h |  10 +-
 lib/cpumask.c             |  12 +-
 lib/test_bitmap.c         | 244 +++++++++++++++++++++++++++++++++++++-
 5 files changed, 375 insertions(+), 68 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/7] cpumask: fix checking valid cpu range
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-28 12:18   ` Valentin Schneider
  2022-09-19 21:05 ` [PATCH 2/7] net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and} Yury Norov
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
passed a shifted CPU index, and for them the valid range is
[-1, nr_cpu_ids-1). Currently those functions check the index against
[-1, nr_cpu_ids), which is wrong.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index e4f9136a4a63..a1cd4eb1a3d6 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
 static inline
 unsigned int cpumask_next(int n, const struct cpumask *srcp)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 	return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
 }
 
@@ -189,9 +188,8 @@ unsigned int cpumask_next(int n, const struct cpumask *srcp)
  */
 static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 	return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1);
 }
 
@@ -231,9 +229,8 @@ static inline
 unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
 		     const struct cpumask *src2p)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 	return find_next_and_bit(cpumask_bits(src1p), cpumask_bits(src2p),
 		nr_cpumask_bits, n + 1);
 }
@@ -267,8 +264,8 @@ static inline
 unsigned int cpumask_next_wrap(int n, const struct cpumask *mask, int start, bool wrap)
 {
 	cpumask_check(start);
-	if (n != -1)
-		cpumask_check(n);
+	/* n is a prior cpu */
+	cpumask_check(n + 1);
 
 	/*
 	 * Return the first available CPU when wrapping, or when starting before cpu0,
-- 
2.34.1



* [PATCH 2/7] net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
  2022-09-19 21:05 ` [PATCH 1/7] cpumask: fix checking valid cpu range Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-26 17:34   ` Jakub Kicinski
  2022-09-19 21:05 ` [PATCH 3/7] cpumask: switch for_each_cpu{,_not} to use for_each_bit() Yury Norov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

These functions are passed the CPU index prior to the one where the search
should start, so the valid input range is [-1, nr_cpu_ids-1). However, the
code checks against [-1, nr_cpu_ids).

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/netdevice.h | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 05d6f3facd5a..4d6d5a2dd82e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3643,9 +3643,8 @@ static inline bool netif_attr_test_online(unsigned long j,
 static inline unsigned int netif_attrmask_next(int n, const unsigned long *srcp,
 					       unsigned int nr_bits)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpu_max_bits_warn(n, nr_bits);
+	/* n is a prior cpu */
+	cpu_max_bits_warn(n + 1, nr_bits);
 
 	if (srcp)
 		return find_next_bit(srcp, nr_bits, n + 1);
@@ -3666,9 +3665,8 @@ static inline int netif_attrmask_next_and(int n, const unsigned long *src1p,
 					  const unsigned long *src2p,
 					  unsigned int nr_bits)
 {
-	/* -1 is a legal arg here. */
-	if (n != -1)
-		cpu_max_bits_warn(n, nr_bits);
+	/* n is a prior cpu */
+	cpu_max_bits_warn(n + 1, nr_bits);
 
 	if (src1p && src2p)
 		return find_next_and_bit(src1p, src2p, nr_bits, n + 1);
-- 
2.34.1



* [PATCH 3/7] cpumask: switch for_each_cpu{,_not} to use for_each_bit()
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
  2022-09-19 21:05 ` [PATCH 1/7] cpumask: fix checking valid cpu range Yury Norov
  2022-09-19 21:05 ` [PATCH 2/7] net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and} Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-19 21:05 ` [PATCH 4/7] lib/find_bit: add find_next{,_and}_bit_wrap Yury Norov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

The difference between for_each_cpu() and for_each_set_bit()
is that the former uses cpumask_next() instead of find_next_bit(),
and so calls cpumask_check().

This check is useless because the iterator value is not provided by the
user. It generates false positives for the very last iteration
of for_each_cpu().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 12 +++---------
 include/linux/find.h    |  5 +++++
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index a1cd4eb1a3d6..3a9566f1373a 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -243,9 +243,7 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
  * After the loop, cpu is >= nr_cpu_ids.
  */
 #define for_each_cpu(cpu, mask)				\
-	for ((cpu) = -1;				\
-		(cpu) = cpumask_next((cpu), (mask)),	\
-		(cpu) < nr_cpu_ids;)
+	for_each_set_bit(cpu, cpumask_bits(mask), nr_cpumask_bits)
 
 /**
  * for_each_cpu_not - iterate over every cpu in a complemented mask
@@ -255,9 +253,7 @@ unsigned int cpumask_next_and(int n, const struct cpumask *src1p,
  * After the loop, cpu is >= nr_cpu_ids.
  */
 #define for_each_cpu_not(cpu, mask)				\
-	for ((cpu) = -1;					\
-		(cpu) = cpumask_next_zero((cpu), (mask)),	\
-		(cpu) < nr_cpu_ids;)
+	for_each_clear_bit(cpu, cpumask_bits(mask), nr_cpumask_bits)
 
 #if NR_CPUS == 1
 static inline
@@ -310,9 +306,7 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  * After the loop, cpu is >= nr_cpu_ids.
  */
 #define for_each_cpu_and(cpu, mask1, mask2)				\
-	for ((cpu) = -1;						\
-		(cpu) = cpumask_next_and((cpu), (mask1), (mask2)),	\
-		(cpu) < nr_cpu_ids;)
+	for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), nr_cpumask_bits)
 
 /**
  * cpumask_any_but - return a "random" in a cpumask, but not this one.
diff --git a/include/linux/find.h b/include/linux/find.h
index b100944daba0..128615a3f93e 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -390,6 +390,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 	     (bit) < (size);					\
 	     (bit) = find_next_bit((addr), (size), (bit) + 1))
 
+#define for_each_and_bit(bit, addr1, addr2, size) \
+	for ((bit) = find_next_and_bit((addr1), (addr2), (size), 0);		\
+	     (bit) < (size);							\
+	     (bit) = find_next_and_bit((addr1), (addr2), (size), (bit) + 1))
+
 /* same as for_each_set_bit() but use bit as value to start with */
 #define for_each_set_bit_from(bit, addr, size) \
 	for ((bit) = find_next_bit((addr), (size), (bit));	\
-- 
2.34.1



* [PATCH 4/7] lib/find_bit: add find_next{,_and}_bit_wrap
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
                   ` (2 preceding siblings ...)
  2022-09-19 21:05 ` [PATCH 3/7] cpumask: switch for_each_cpu{,_not} to use for_each_bit() Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-19 21:05 ` [PATCH 5/7] lib/bitmap: introduce for_each_set_bit_wrap() macro Yury Norov
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

The new helpers are better optimized for the worst case: for an empty
cpumask, the current code traverses 2 * size bits:

  next = cpumask_next_and(prev, src1p, src2p);
  if (next >= nr_cpu_ids)
  	next = cpumask_first_and(src1p, src2p);

At the bitmap level we can stop earlier, after checking 'size + offset' bits.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/find.h | 46 ++++++++++++++++++++++++++++++++++++++++++++
 lib/cpumask.c        | 12 +++---------
 2 files changed, 49 insertions(+), 9 deletions(-)

diff --git a/include/linux/find.h b/include/linux/find.h
index 128615a3f93e..77c087b7a451 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -290,6 +290,52 @@ unsigned long find_last_bit(const unsigned long *addr, unsigned long size)
 }
 #endif
 
+/**
+ * find_next_and_bit_wrap - find the next set bit in both memory regions
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Returns the bit number for the next set bit, or the first set bit up to @offset.
+ * If no bits are set, returns @size.
+ */
+static inline
+unsigned long find_next_and_bit_wrap(const unsigned long *addr1,
+					const unsigned long *addr2,
+					unsigned long size, unsigned long offset)
+{
+	unsigned long bit = find_next_and_bit(addr1, addr2, size, offset);
+
+	if (bit < size)
+		return bit;
+
+	bit = find_first_and_bit(addr1, addr2, offset);
+	return bit < offset ? bit : size;
+}
+
+/**
+ * find_next_bit_wrap - find the next set bit in a memory region, wrapping around its end
+ * @addr: The address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Returns the bit number for the next set bit, or the first set bit up to @offset.
+ * If no bits are set, returns @size.
+ */
+static inline
+unsigned long find_next_bit_wrap(const unsigned long *addr,
+					unsigned long size, unsigned long offset)
+{
+	unsigned long bit = find_next_bit(addr, size, offset);
+
+	if (bit < size)
+		return bit;
+
+	bit = find_first_bit(addr, offset);
+	return bit < offset ? bit : size;
+}
+
 /**
  * find_next_clump8 - find next 8-bit clump with set bits in a memory region
  * @clump: location to store copy of found clump
diff --git a/lib/cpumask.c b/lib/cpumask.c
index 2c4a63b6f03f..c7c392514fd3 100644
--- a/lib/cpumask.c
+++ b/lib/cpumask.c
@@ -166,10 +166,8 @@ unsigned int cpumask_any_and_distribute(const struct cpumask *src1p,
 	/* NOTE: our first selection will skip 0. */
 	prev = __this_cpu_read(distribute_cpu_mask_prev);
 
-	next = cpumask_next_and(prev, src1p, src2p);
-	if (next >= nr_cpu_ids)
-		next = cpumask_first_and(src1p, src2p);
-
+	next = find_next_and_bit_wrap(cpumask_bits(src1p), cpumask_bits(src2p),
+					nr_cpumask_bits, prev + 1);
 	if (next < nr_cpu_ids)
 		__this_cpu_write(distribute_cpu_mask_prev, next);
 
@@ -183,11 +181,7 @@ unsigned int cpumask_any_distribute(const struct cpumask *srcp)
 
 	/* NOTE: our first selection will skip 0. */
 	prev = __this_cpu_read(distribute_cpu_mask_prev);
-
-	next = cpumask_next(prev, srcp);
-	if (next >= nr_cpu_ids)
-		next = cpumask_first(srcp);
-
+	next = find_next_bit_wrap(cpumask_bits(srcp), nr_cpumask_bits, prev + 1);
 	if (next < nr_cpu_ids)
 		__this_cpu_write(distribute_cpu_mask_prev, next);
 
-- 
2.34.1



* [PATCH 5/7] lib/bitmap: introduce for_each_set_bit_wrap() macro
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
                   ` (3 preceding siblings ...)
  2022-09-19 21:05 ` [PATCH 4/7] lib/find_bit: add find_next{,_and}_bit_wrap Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-19 21:05 ` [PATCH 6/7] lib/find: optimize for_each() macros Yury Norov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

Add the for_each_set_bit_wrap() macro and use it in for_each_cpu_wrap(). The
new macro is based on the __for_each_wrap() iterator, which is simpler and
smaller than cpumask_next_wrap().

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h |  6 ++----
 include/linux/find.h    | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 3a9566f1373a..286804bfe3b7 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -286,10 +286,8 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
  *
  * After the loop, cpu is >= nr_cpu_ids.
  */
-#define for_each_cpu_wrap(cpu, mask, start)					\
-	for ((cpu) = cpumask_next_wrap((start)-1, (mask), (start), false);	\
-	     (cpu) < nr_cpumask_bits;						\
-	     (cpu) = cpumask_next_wrap((cpu), (mask), (start), true))
+#define for_each_cpu_wrap(cpu, mask, start)				\
+	for_each_set_bit_wrap(cpu, cpumask_bits(mask), nr_cpumask_bits, start)
 
 /**
  * for_each_cpu_and - iterate over every cpu in both masks
diff --git a/include/linux/find.h b/include/linux/find.h
index 77c087b7a451..3b746a183216 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -336,6 +336,32 @@ unsigned long find_next_bit_wrap(const unsigned long *addr,
 	return bit < offset ? bit : size;
 }
 
+/*
+ * Helper for for_each_set_bit_wrap(). Make sure you're doing the right thing
+ * before using it alone.
+ */
+static inline
+unsigned long __for_each_wrap(const unsigned long *bitmap, unsigned long size,
+				 unsigned long start, unsigned long n)
+{
+	unsigned long bit;
+
+	/* If not wrapped around */
+	if (n > start) {
+		/* and have a bit, just return it. */
+		bit = find_next_bit(bitmap, size, n);
+		if (bit < size)
+			return bit;
+
+		/* Otherwise, wrap around and ... */
+		n = 0;
+	}
+
+	/* Search the other part. */
+	bit = find_next_bit(bitmap, start, n);
+	return bit < start ? bit : size;
+}
+
 /**
  * find_next_clump8 - find next 8-bit clump with set bits in a memory region
  * @clump: location to store copy of found clump
@@ -514,6 +540,19 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 	     (b) = find_next_zero_bit((addr), (size), (e) + 1),	\
 	     (e) = find_next_bit((addr), (size), (b) + 1))
 
+/**
+ * for_each_set_bit_wrap - iterate over all set bits starting from @start, and
+ * wrapping around the end of bitmap.
+ * @bit: offset for current iteration
+ * @addr: bitmap address to base the search on
+ * @size: bitmap size in number of bits
+ * @start: Starting bit for bitmap traversing, wrapping around the bitmap end
+ */
+#define for_each_set_bit_wrap(bit, addr, size, start) \
+	for ((bit) = find_next_bit_wrap((addr), (size), (start));		\
+	     (bit) < (size);							\
+	     (bit) = __for_each_wrap((addr), (size), (start), (bit) + 1))
+
 /**
  * for_each_set_clump8 - iterate over bitmap for each 8-bit clump with set bits
  * @start: bit offset to start search and to store the current iteration offset
-- 
2.34.1



* [PATCH 6/7] lib/find: optimize for_each() macros
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
                   ` (4 preceding siblings ...)
  2022-09-19 21:05 ` [PATCH 5/7] lib/bitmap: introduce for_each_set_bit_wrap() macro Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-19 21:05 ` [PATCH 7/7] lib/bitmap: add tests for for_each() loops Yury Norov
  2022-09-25 15:47 ` [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
  7 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

Moving the iterator of these macros into the conditional part of the
for-loop helps to generate better code. This was first implemented in commit
7baac8b91f9871ba ("cpumask: make for_each_cpu_mask a bit smaller").

Now that the cpumask for-loops are aliases of the bitmap loops, it's worth
optimizing them the same way.

Bloat-o-meter says:
add/remove: 8/12 grow/shrink: 147/592 up/down: 4876/-24416 (-19540)

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/find.h | 56 ++++++++++++++++++++------------------------
 1 file changed, 25 insertions(+), 31 deletions(-)

diff --git a/include/linux/find.h b/include/linux/find.h
index 3b746a183216..0cdfab9734a6 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -458,31 +458,25 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 #endif
 
 #define for_each_set_bit(bit, addr, size) \
-	for ((bit) = find_next_bit((addr), (size), 0);		\
-	     (bit) < (size);					\
-	     (bit) = find_next_bit((addr), (size), (bit) + 1))
+	for ((bit) = 0; (bit) = find_next_bit((addr), (size), (bit)), (bit) < (size); (bit)++)
 
 #define for_each_and_bit(bit, addr1, addr2, size) \
-	for ((bit) = find_next_and_bit((addr1), (addr2), (size), 0);		\
-	     (bit) < (size);							\
-	     (bit) = find_next_and_bit((addr1), (addr2), (size), (bit) + 1))
+	for ((bit) = 0;									\
+	     (bit) = find_next_and_bit((addr1), (addr2), (size), (bit)), (bit) < (size);\
+	     (bit)++)
 
 /* same as for_each_set_bit() but use bit as value to start with */
 #define for_each_set_bit_from(bit, addr, size) \
-	for ((bit) = find_next_bit((addr), (size), (bit));	\
-	     (bit) < (size);					\
-	     (bit) = find_next_bit((addr), (size), (bit) + 1))
+	for (; (bit) = find_next_bit((addr), (size), (bit)), (bit) < (size); (bit)++)
 
 #define for_each_clear_bit(bit, addr, size) \
-	for ((bit) = find_next_zero_bit((addr), (size), 0);	\
-	     (bit) < (size);					\
-	     (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
+	for ((bit) = 0;									\
+	     (bit) = find_next_zero_bit((addr), (size), (bit)), (bit) < (size);		\
+	     (bit)++)
 
 /* same as for_each_clear_bit() but use bit as value to start with */
 #define for_each_clear_bit_from(bit, addr, size) \
-	for ((bit) = find_next_zero_bit((addr), (size), (bit));	\
-	     (bit) < (size);					\
-	     (bit) = find_next_zero_bit((addr), (size), (bit) + 1))
+	for (; (bit) = find_next_zero_bit((addr), (size), (bit)), (bit) < (size); (bit)++)
 
 /**
  * for_each_set_bitrange - iterate over all set bit ranges [b; e)
@@ -492,11 +486,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned
  * @size: bitmap size in number of bits
  */
 #define for_each_set_bitrange(b, e, addr, size)			\
-	for ((b) = find_next_bit((addr), (size), 0),		\
-	     (e) = find_next_zero_bit((addr), (size), (b) + 1);	\
+	for ((b) = 0;						\
+	     (b) = find_next_bit((addr), (size), b),		\
+	     (e) = find_next_zero_bit((addr), (size), (b) + 1),	\
 	     (b) < (size);					\
-	     (b) = find_next_bit((addr), (size), (e) + 1),	\
-	     (e) = find_next_zero_bit((addr), (size), (b) + 1))
+	     (b) = (e) + 1)
 
 /**
  * for_each_set_bitrange_from - iterate over all set bit ranges [b; e)
@@ -506,11 +500,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned
  * @size: bitmap size in number of bits
  */
 #define for_each_set_bitrange_from(b, e, addr, size)		\
-	for ((b) = find_next_bit((addr), (size), (b)),		\
-	     (e) = find_next_zero_bit((addr), (size), (b) + 1);	\
+	for (;							\
+	     (b) = find_next_bit((addr), (size), (b)),		\
+	     (e) = find_next_zero_bit((addr), (size), (b) + 1),	\
 	     (b) < (size);					\
-	     (b) = find_next_bit((addr), (size), (e) + 1),	\
-	     (e) = find_next_zero_bit((addr), (size), (b) + 1))
+	     (b) = (e) + 1)
 
 /**
  * for_each_clear_bitrange - iterate over all unset bit ranges [b; e)
@@ -520,11 +514,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned
  * @size: bitmap size in number of bits
  */
 #define for_each_clear_bitrange(b, e, addr, size)		\
-	for ((b) = find_next_zero_bit((addr), (size), 0),	\
-	     (e) = find_next_bit((addr), (size), (b) + 1);	\
+	for ((b) = 0;						\
+	     (b) = find_next_zero_bit((addr), (size), (b)),	\
+	     (e) = find_next_bit((addr), (size), (b) + 1),	\
 	     (b) < (size);					\
-	     (b) = find_next_zero_bit((addr), (size), (e) + 1),	\
-	     (e) = find_next_bit((addr), (size), (b) + 1))
+	     (b) = (e) + 1)
 
 /**
  * for_each_clear_bitrange_from - iterate over all unset bit ranges [b; e)
@@ -534,11 +528,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned
  * @size: bitmap size in number of bits
  */
 #define for_each_clear_bitrange_from(b, e, addr, size)		\
-	for ((b) = find_next_zero_bit((addr), (size), (b)),	\
-	     (e) = find_next_bit((addr), (size), (b) + 1);	\
+	for (;							\
+	     (b) = find_next_zero_bit((addr), (size), (b)),	\
+	     (e) = find_next_bit((addr), (size), (b) + 1),	\
 	     (b) < (size);					\
-	     (b) = find_next_zero_bit((addr), (size), (e) + 1),	\
-	     (e) = find_next_bit((addr), (size), (b) + 1))
+	     (b) = (e) + 1)
 
 /**
  * for_each_set_bit_wrap - iterate over all set bits starting from @start, and
-- 
2.34.1



* [PATCH 7/7] lib/bitmap: add tests for for_each() loops
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
                   ` (5 preceding siblings ...)
  2022-09-19 21:05 ` [PATCH 6/7] lib/find: optimize for_each() macros Yury Norov
@ 2022-09-19 21:05 ` Yury Norov
  2022-09-25 15:47 ` [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
  7 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-19 21:05 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

We only have a test for for_each_set_clump8(). Add basic tests for the rest
of the for_each() iterators.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/test_bitmap.c | 244 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 243 insertions(+), 1 deletion(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index da52dc759c95..a8005ad3bd58 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -726,6 +726,239 @@ static void __init test_for_each_set_clump8(void)
 		expect_eq_clump8(start, CLUMP_EXP_NUMBITS, clump_exp, &clump);
 }
 
+static void __init test_for_each_set_bit_wrap(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int wr, bit;
+
+	bitmap_zero(orig, 500);
+
+	/* Set individual bits */
+	for (bit = 0; bit < 500; bit += 10)
+		bitmap_set(orig, bit, 1);
+
+	/* Set range of bits */
+	bitmap_set(orig, 100, 50);
+
+	for (wr = 0; wr < 500; wr++) {
+		bitmap_zero(copy, 500);
+
+		for_each_set_bit_wrap(bit, orig, 500, wr)
+			bitmap_set(copy, bit, 1);
+
+		expect_eq_bitmap(orig, copy, 500);
+	}
+}
+
+static void __init test_for_each_set_bit(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int bit;
+
+	bitmap_zero(orig, 500);
+	bitmap_zero(copy, 500);
+
+	/* Set individual bits */
+	for (bit = 0; bit < 500; bit += 10)
+		bitmap_set(orig, bit, 1);
+
+	/* Set range of bits */
+	bitmap_set(orig, 100, 50);
+
+	for_each_set_bit(bit, orig, 500)
+		bitmap_set(copy, bit, 1);
+
+	expect_eq_bitmap(orig, copy, 500);
+}
+
+static void __init test_for_each_set_bit_from(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int wr, bit;
+
+	bitmap_zero(orig, 500);
+
+	/* Set individual bits */
+	for (bit = 0; bit < 500; bit += 10)
+		bitmap_set(orig, bit, 1);
+
+	/* Set range of bits */
+	bitmap_set(orig, 100, 50);
+
+	for (wr = 0; wr < 500; wr++) {
+		DECLARE_BITMAP(tmp, 500);
+
+		bitmap_zero(copy, 500);
+		bit = wr;
+
+		for_each_set_bit_from(bit, orig, 500)
+			bitmap_set(copy, bit, 1);
+
+		bitmap_copy(tmp, orig, 500);
+		bitmap_clear(tmp, 0, wr);
+		expect_eq_bitmap(tmp, copy, 500);
+	}
+}
+
+static void __init test_for_each_clear_bit(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int bit;
+
+	bitmap_fill(orig, 500);
+	bitmap_fill(copy, 500);
+
+	/* Clear individual bits */
+	for (bit = 0; bit < 500; bit += 10)
+		bitmap_clear(orig, bit, 1);
+
+	/* Clear a range of bits */
+	bitmap_clear(orig, 100, 50);
+
+	for_each_clear_bit(bit, orig, 500)
+		bitmap_clear(copy, bit, 1);
+
+	expect_eq_bitmap(orig, copy, 500);
+}
+
+static void __init test_for_each_clear_bit_from(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int wr, bit;
+
+	bitmap_fill(orig, 500);
+
+	/* Clear individual bits */
+	for (bit = 0; bit < 500; bit += 10)
+		bitmap_clear(orig, bit, 1);
+
+	/* Clear a range of bits */
+	bitmap_clear(orig, 100, 50);
+
+	for (wr = 0; wr < 500; wr++) {
+		DECLARE_BITMAP(tmp, 500);
+
+		bitmap_fill(copy, 500);
+		bit = wr;
+
+		for_each_clear_bit_from(bit, orig, 500)
+			bitmap_clear(copy, bit, 1);
+
+		bitmap_copy(tmp, orig, 500);
+		bitmap_set(tmp, 0, wr);
+		expect_eq_bitmap(tmp, copy, 500);
+	}
+}
+
+static void __init test_for_each_set_bitrange(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int s, e;
+
+	bitmap_zero(orig, 500);
+	bitmap_zero(copy, 500);
+
+	/* Set individual bits */
+	for (s = 0; s < 500; s += 10)
+		bitmap_set(orig, s, 1);
+
+	/* Set range of bits */
+	bitmap_set(orig, 100, 50);
+
+	for_each_set_bitrange(s, e, orig, 500)
+		bitmap_set(copy, s, e-s);
+
+	expect_eq_bitmap(orig, copy, 500);
+}
+
+static void __init test_for_each_clear_bitrange(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int s, e;
+
+	bitmap_fill(orig, 500);
+	bitmap_fill(copy, 500);
+
+	/* Clear individual bits */
+	for (s = 0; s < 500; s += 10)
+		bitmap_clear(orig, s, 1);
+
+	/* Clear a range of bits */
+	bitmap_clear(orig, 100, 50);
+
+	for_each_clear_bitrange(s, e, orig, 500)
+		bitmap_clear(copy, s, e-s);
+
+	expect_eq_bitmap(orig, copy, 500);
+}
+
+static void __init test_for_each_set_bitrange_from(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int wr, s, e;
+
+	bitmap_zero(orig, 500);
+
+	/* Set individual bits */
+	for (s = 0; s < 500; s += 10)
+		bitmap_set(orig, s, 1);
+
+	/* Set range of bits */
+	bitmap_set(orig, 100, 50);
+
+	for (wr = 0; wr < 500; wr++) {
+		DECLARE_BITMAP(tmp, 500);
+
+		bitmap_zero(copy, 500);
+		s = wr;
+
+		for_each_set_bitrange_from(s, e, orig, 500)
+			bitmap_set(copy, s, e - s);
+
+		bitmap_copy(tmp, orig, 500);
+		bitmap_clear(tmp, 0, wr);
+		expect_eq_bitmap(tmp, copy, 500);
+	}
+}
+
+static void __init test_for_each_clear_bitrange_from(void)
+{
+	DECLARE_BITMAP(orig, 500);
+	DECLARE_BITMAP(copy, 500);
+	unsigned int wr, s, e;
+
+	bitmap_fill(orig, 500);
+
+	/* Clear individual bits */
+	for (s = 0; s < 500; s += 10)
+		bitmap_clear(orig, s, 1);
+
+	/* Set range of bits */
+	bitmap_set(orig, 100, 50);
+
+	for (wr = 0; wr < 500; wr++) {
+		DECLARE_BITMAP(tmp, 500);
+
+		bitmap_fill(copy, 500);
+		s = wr;
+
+		for_each_clear_bitrange_from(s, e, orig, 500)
+			bitmap_clear(copy, s, e - s);
+
+		bitmap_copy(tmp, orig, 500);
+		bitmap_set(tmp, 0, wr);
+		expect_eq_bitmap(tmp, copy, 500);
+	}
+}
+
 struct test_bitmap_cut {
 	unsigned int first;
 	unsigned int cut;
@@ -989,12 +1222,21 @@ static void __init selftest(void)
 	test_bitmap_parselist();
 	test_bitmap_printlist();
 	test_mem_optimisations();
-	test_for_each_set_clump8();
 	test_bitmap_cut();
 	test_bitmap_print_buf();
 	test_bitmap_const_eval();
 
 	test_find_nth_bit();
+	test_for_each_set_bit();
+	test_for_each_set_bit_from();
+	test_for_each_clear_bit();
+	test_for_each_clear_bit_from();
+	test_for_each_set_bitrange();
+	test_for_each_clear_bitrange();
+	test_for_each_set_bitrange_from();
+	test_for_each_clear_bitrange_from();
+	test_for_each_set_clump8();
+	test_for_each_set_bit_wrap();
 }
 
 KSTM_MODULE_LOADERS(test_bitmap);
-- 
2.34.1



* Re: [PATCH 0/7] cpumask: repair cpumask_check()
  2022-09-19 21:05 [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
                   ` (6 preceding siblings ...)
  2022-09-19 21:05 ` [PATCH 7/7] lib/bitmap: add tests for for_each() loops Yury Norov
@ 2022-09-25 15:47 ` Yury Norov
  2022-09-26 15:09   ` Jakub Kicinski
  7 siblings, 1 reply; 17+ messages in thread
From: Yury Norov @ 2022-09-25 15:47 UTC (permalink / raw)
  To: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes,
	Yury Norov

Ping?

On Mon, Sep 19, 2022 at 2:06 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> After switching cpumask to use nr_cpu_ids in [1], cpumask_check() started
> generating many false-positive warnings. There are some more issues with
> the cpumask_check() that brake it.
>
> This series fixes cpumask_check() mess and addresses most of the
> false-positive warnings observed on boot of x86_64 and arm64.
>
> [1] https://lore.kernel.org/lkml/20220905230820.3295223-4-yury.norov@gmail.com/T/
>
> Yury Norov (7):
>   cpumask: fix checking valid cpu range
>   net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
>   cpumask: switch for_each_cpu{,_not} to use for_each_bit()
>   lib/find_bit: add find_next{,_and}_bit_wrap
>   lib/bitmap: introduce for_each_set_bit_wrap() macro
>   lib/find: optimize for_each() macros
>   lib/bitmap: add tests for for_each() iterators
>
>  include/linux/cpumask.h   |  37 ++----
>  include/linux/find.h      | 140 +++++++++++++++++-----
>  include/linux/netdevice.h |  10 +-
>  lib/cpumask.c             |  12 +-
>  lib/test_bitmap.c         | 244 +++++++++++++++++++++++++++++++++++++-
>  5 files changed, 375 insertions(+), 68 deletions(-)
>
> --
> 2.34.1
>

* Re: [PATCH 0/7] cpumask: repair cpumask_check()
  2022-09-25 15:47 ` [PATCH 0/7] cpumask: repair cpumask_check() Yury Norov
@ 2022-09-26 15:09   ` Jakub Kicinski
  2022-09-26 16:27     ` Yury Norov
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2022-09-26 15:09 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Paolo Abeni, Rasmus Villemoes

On Sun, 25 Sep 2022 08:47:24 -0700 Yury Norov wrote:
> Ping?

Sugar sweet, you really need to say more than ping. You put the entire
recipient list in the To:, I have no idea what kind of feedback you
expect and from whom.

* Re: [PATCH 0/7] cpumask: repair cpumask_check()
  2022-09-26 15:09   ` Jakub Kicinski
@ 2022-09-26 16:27     ` Yury Norov
  0 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-26 16:27 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Paolo Abeni, Rasmus Villemoes

On Mon, Sep 26, 2022 at 08:09:10AM -0700, Jakub Kicinski wrote:
> On Sun, 25 Sep 2022 08:47:24 -0700 Yury Norov wrote:
> > Ping?
> 
> Sugar sweet, you really need to say more than ping. You put the entire
> recipient list in the To:, I have no idea what kind of feedback you
> expect and from whom.

From you I'd like to have feedback on patch #2

* Re: [PATCH 2/7] net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
  2022-09-19 21:05 ` [PATCH 2/7] net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and} Yury Norov
@ 2022-09-26 17:34   ` Jakub Kicinski
  2022-09-26 17:47     ` Yury Norov
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2022-09-26 17:34 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Paolo Abeni, Rasmus Villemoes

On Mon, 19 Sep 2022 14:05:54 -0700 Yury Norov wrote:
> The functions must be passed a cpu index prior to the first one to
> start the search from, so the valid input range is [-1, nr_cpu_ids-1).
> However, the code checks against [-1, nr_cpu_ids).

Yup, the analysis looks correct:

Acked-by: Jakub Kicinski <kuba@kernel.org>

* Re: [PATCH 2/7] net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
  2022-09-26 17:34   ` Jakub Kicinski
@ 2022-09-26 17:47     ` Yury Norov
  0 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-09-26 17:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Paolo Abeni, Rasmus Villemoes

On Mon, Sep 26, 2022 at 10:34:37AM -0700, Jakub Kicinski wrote:
> On Mon, 19 Sep 2022 14:05:54 -0700 Yury Norov wrote:
> > The functions must be passed a cpu index prior to the first one to
> > start the search from, so the valid input range is [-1, nr_cpu_ids-1).
> > However, the code checks against [-1, nr_cpu_ids).
> 
> Yup, the analysis looks correct:
> 
> Acked-by: Jakub Kicinski <kuba@kernel.org>

Thanks. If there are no other comments from you or others, I'll take the
series into bitmap-for-next.

* Re: [PATCH 1/7] cpumask: fix checking valid cpu range
  2022-09-19 21:05 ` [PATCH 1/7] cpumask: fix checking valid cpu range Yury Norov
@ 2022-09-28 12:18   ` Valentin Schneider
  2022-09-28 14:49     ` Yury Norov
  0 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2022-09-28 12:18 UTC (permalink / raw)
  To: Yury Norov, linux-kernel, netdev, Andy Shevchenko,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Rasmus Villemoes, Yury Norov

On 19/09/22 14:05, Yury Norov wrote:
> The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
> passed with a shifted CPU index, and for them, the valid range is
> [-1, nr_cpu_ids-1). Currently for those functions, we check the index
> against [-1, nr_cpu_ids), which is wrong.
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  include/linux/cpumask.h | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index e4f9136a4a63..a1cd4eb1a3d6 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
>  static inline
>  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>  {
> -	/* -1 is a legal arg here. */
> -	if (n != -1)
> -		cpumask_check(n);
> +	/* n is a prior cpu */
> +	cpumask_check(n + 1);
>       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);

I'm confused, this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
warning. The documentation does state:

* @n: the cpu prior to the place to search (ie. return will be > @n)

So n is a valid CPU number (with -1 being the exception for scan
initialization), this shouldn't exclude nr_cpu_ids-1.

IMO passing nr_cpu_ids-1 should be treated the same as passing the
last set bit in a bitmap: no warning, and a return of the bitmap
size. Otherwise reaching nr_cpu_ids-1 has to be special-cased by the
calling code, which seems like unnecessary boilerplate.

For instance, I trigger the cpumask_check() warning there:

3d2dcab932d0:block/blk-mq.c @l2047
        if (--hctx->next_cpu_batch <= 0) {
select_cpu:
                next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
                                cpu_online_mask);
                if (next_cpu >= nr_cpu_ids)
                        next_cpu = blk_mq_first_mapped_cpu(hctx);
                hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
        }

next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
we do want it to reach nr_cpu_ids-1.


* Re: [PATCH 1/7] cpumask: fix checking valid cpu range
  2022-09-28 12:18   ` Valentin Schneider
@ 2022-09-28 14:49     ` Yury Norov
  2022-09-30 17:04       ` Valentin Schneider
  0 siblings, 1 reply; 17+ messages in thread
From: Yury Norov @ 2022-09-28 14:49 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes

On Wed, Sep 28, 2022 at 01:18:20PM +0100, Valentin Schneider wrote:
> On 19/09/22 14:05, Yury Norov wrote:
> > The range of valid CPUs is [0, nr_cpu_ids). Some cpumask functions are
> > passed with a shifted CPU index, and for them, the valid range is
> > [-1, nr_cpu_ids-1). Currently for those functions, we check the index
> > against [-1, nr_cpu_ids), which is wrong.
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  include/linux/cpumask.h | 19 ++++++++-----------
> >  1 file changed, 8 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> > index e4f9136a4a63..a1cd4eb1a3d6 100644
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
> >  static inline
> >  unsigned int cpumask_next(int n, const struct cpumask *srcp)
> >  {
> > -	/* -1 is a legal arg here. */
> > -	if (n != -1)
> > -		cpumask_check(n);
> > +	/* n is a prior cpu */
> > +	cpumask_check(n + 1);
> >       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
> 
> I'm confused, this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
> warning. The documentation does state:
> 
> * @n: the cpu prior to the place to search (ie. return will be > @n)
> 
> So n is a valid CPU number (with -1 being the exception for scan
> initialization), this shouldn't exclude nr_cpu_ids-1.

For a regular cpumask function, like cpumask_any_but(), the valid range is
[0, nr_cpu_ids).

'Special' functions shift by 1 when calling the underlying find API:

  static inline
  unsigned int cpumask_next(int n, const struct cpumask *srcp)
  {
          /* n is a prior cpu */
          cpumask_check(n + 1);
          return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
  }

So, for them the valid range [0, nr_cpu_ids) must be shifted in the other
direction: [-1, nr_cpu_ids-1).
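
The shifted check can be modeled in a few lines of plain C. This is a
userspace sketch for illustration only; NR_CPU_IDS, check_ok() and
next_arg_ok() are stand-ins, not the kernel's cpumask_check() and
cpumask_next():

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPU_IDS 4	/* stand-in for nr_cpu_ids */

/* Model of cpumask_check(): a cpu index is valid iff it is in [0, nr). */
static bool check_ok(int cpu)
{
	return cpu >= 0 && cpu < NR_CPU_IDS;
}

/*
 * cpumask_next(n, ...) passes n + 1 to the underlying find API, so the
 * check must shift the same way: n itself is valid in [-1, NR_CPU_IDS - 1).
 */
static bool next_arg_ok(int n)
{
	return check_ok(n + 1);
}
```

With NR_CPU_IDS = 4, next_arg_ok() accepts -1..2 and rejects 3, which is
exactly the shift the patch applies.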

> IMO passing nr_cpu_ids-1 should be treated the same as passing the
> last set bit in a bitmap: no warning, and returns the bitmap
> size.

This is how cpumask_check() works for normal functions. For
cpumask_next() passing nr_cpu_ids-1 is the same as passing nr_cpu_ids
for cpumask_any_but(), and it should trigger a warning in both cases.
(Or should not, but it's a different story.)

> calling code, which seems like unnecessary boilerplate.
> 
> For instance, I trigger the cpumask_check() warning there:
> 
> 3d2dcab932d0:block/blk-mq.c @l2047
>         if (--hctx->next_cpu_batch <= 0) {
> select_cpu:
>                 next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
>                                 cpu_online_mask);
>                 if (next_cpu >= nr_cpu_ids)
>                         next_cpu = blk_mq_first_mapped_cpu(hctx);
>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>         }
> 
> next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
> we do want it to reach nr_cpu_ids-1.

next_cpu is a valid CPU number everywhere else, but not for cpumask_next().
The warning is valid. If we are already at the very last cpu, what would
we be looking for next?

The snippet above should be fixed like this:

          if (--hctx->next_cpu_batch <= 0) {
  select_cpu:
                  if (next_cpu == nr_cpu_ids - 1)
                          next_cpu = nr_cpu_ids;
                  else
                          next_cpu = cpumask_next_and(next_cpu,
                                                      hctx->cpumask,
                                                      cpu_online_mask);
                  if (next_cpu >= nr_cpu_ids)
                          next_cpu = blk_mq_first_mapped_cpu(hctx);
                  hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
          }

The original motivation for these special shifted semantics was to
avoid passing '+1' to cpumask_next() everywhere it's used to
iterate over a cpumask. This is especially ugly because it brings negative
semantics into such a simple thing as an index, and confuses people.
It was a bad decision, but it's now so broadly used that we have to live
with it.

The strategy to mitigate this is to minimize use of those 'special'
functions, which are all cpumask_next()-like. In this series I reworked
for_each_cpu() to not use cpumask_next().
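
The rework can be sketched as a for_each_bit()-style loop that starts the
scan from bit 0 and only ever feeds in-range indices to the find routine.
This is a userspace model; find_next(), for_each_set() and sum_set() are
illustrative stand-ins for the kernel's find_next_bit() and for_each_cpu():

```c
#include <assert.h>

#define NR 4	/* stand-in for nr_cpu_ids */

/* Model of find_next_bit(): first set bit at index >= start, or NR if none. */
static unsigned int find_next(unsigned int mask, unsigned int start)
{
	unsigned int i;

	for (i = start; i < NR; i++)
		if (mask & (1u << i))
			return i;
	return NR;
}

/*
 * for_each_bit-style iteration: the scan starts from 0 and each step
 * passes the last found bit + 1, so no 'prior cpu' index with shifted
 * semantics is ever exposed to the caller.
 */
#define for_each_set(bit, mask) \
	for ((bit) = find_next((mask), 0); \
	     (bit) < NR; \
	     (bit) = find_next((mask), (bit) + 1))

/* Example user: sum of the set bit indices. */
static unsigned int sum_set(unsigned int mask)
{
	unsigned int bit, sum = 0;

	for_each_set(bit, mask)
		sum += bit;
	return sum;
}
```

Note that when the last set bit is NR - 1, the loop simply calls
find_next(mask, NR), which returns NR and terminates cleanly, with no
range check to trip.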

Often, cpumask_next() is part of an open-coded for_each_cpu(), and that
is relatively easy to fix. In the case of blk_mq_hctx_next_cpu() that you
mentioned above, the cpumask_next_and() usage looks unavoidable, and
there's nothing to do about it except being careful.

It didn't trigger the warning in my test setup, so I didn't fix it.
Feel free to submit a patch if you observe the warning yourself.

Maybe we should treat nr_cpu_ids as a special valid index for
cpumask_check(), a sign of the end of an array. This would help to
silence many warnings, like this one. For now I'm leaning towards this
being more of a hack than a meaningful change.

Thanks,
Yury

* Re: [PATCH 1/7] cpumask: fix checking valid cpu range
  2022-09-28 14:49     ` Yury Norov
@ 2022-09-30 17:04       ` Valentin Schneider
  2022-10-01  2:02         ` Yury Norov
  0 siblings, 1 reply; 17+ messages in thread
From: Valentin Schneider @ 2022-09-30 17:04 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes

On 28/09/22 07:49, Yury Norov wrote:
> On Wed, Sep 28, 2022 at 01:18:20PM +0100, Valentin Schneider wrote:
>> On 19/09/22 14:05, Yury Norov wrote:
>> > @@ -174,9 +174,8 @@ static inline unsigned int cpumask_last(const struct cpumask *srcp)
>> >  static inline
>> >  unsigned int cpumask_next(int n, const struct cpumask *srcp)
>> >  {
>> > -	/* -1 is a legal arg here. */
>> > -	if (n != -1)
>> > -		cpumask_check(n);
>> > +	/* n is a prior cpu */
>> > +	cpumask_check(n + 1);
>> >       return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
>>
>> I'm confused, this makes passing nr_cpu_ids-1 to cpumask_next*() trigger a
>> warning. The documentation does state:
>>
>> * @n: the cpu prior to the place to search (ie. return will be > @n)
>>
>> So n is a valid CPU number (with -1 being the exception for scan
>> initialization), this shouldn't exclude nr_cpu_ids-1.
>
> For a regular cpumask function, like cpumask_any_but(), the valid range is
> [0, nr_cpu_ids).
>
> 'Special' functions shift by 1 when calling the underlying find API:
>
>   static inline
>   unsigned int cpumask_next(int n, const struct cpumask *srcp)
>   {
>           /* n is a prior cpu */
>           cpumask_check(n + 1);
>           return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
>   }
>
> So, for them the valid range [0, nr_cpu_ids) must be shifted in the other
> direction: [-1, nr_cpu_ids-1).
>

The way I've been seeing this is that the [0, nr_cpu_ids) range is extended
to [-1, nr_cpu_ids) to accommodate iteration starts.

>> IMO passing nr_cpu_ids-1 should be treated the same as passing the
>> last set bit in a bitmap: no warning, and returns the bitmap
>> size.
>
> This is how cpumask_check() works for normal functions. For
> cpumask_next() passing nr_cpu_ids-1 is the same as passing nr_cpu_ids
> for cpumask_any_but(), and it should trigger warning in both cases.
> (Or should not, but it's a different story.)
>
>> calling code, which seems like unnecessary boilerplate.
>>
>> For instance, I trigger the cpumask_check() warning there:
>>
>> 3d2dcab932d0:block/blk-mq.c @l2047
>>         if (--hctx->next_cpu_batch <= 0) {
>> select_cpu:
>>                 next_cpu = cpumask_next_and(next_cpu, hctx->cpumask, <-----
>>                                 cpu_online_mask);
>>                 if (next_cpu >= nr_cpu_ids)
>>                         next_cpu = blk_mq_first_mapped_cpu(hctx);
>>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>>         }
>>
>> next_cpu is a valid CPU number, shifting it doesn't seem to make sense, and
>> we do want it to reach nr_cpu_ids-1.
>
> next_cpu is a valid CPU number for all, but not for cpumask_next().
> The warning is valid. If we are at the very last cpu, what for we look
> for next?
>

Consider:

  nr_cpu_ids=4

  A)
  cpumask: 0.1.1.0
  CPU      0 1 2 3
  n            ^
  result: nr_cpu_ids

  B)
  cpumask: 0.0.1.1
  CPU      0 1 2 3
  n              ^
  result: nr_cpu_ids + WARN

Both scenarios are identical from a user perspective: a valid CPU number
was passed in (either from smp_processor_id() or from a previous call to
cpumask_next*()), but there are no more bits set in the cpumask. There are
no more CPUs to search for in either scenario, but only one produces a WARN.
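
The two scenarios can be condensed into a runnable model. This is a
userspace sketch; next_cpu() mimics cpumask_next() with the patch's shifted
check, and the warned flag stands in for the cpumask_check() WARN:

```c
#include <assert.h>
#include <stdbool.h>

#define NR 4	/* nr_cpu_ids in the scenarios above */

/* Model of find_next_bit(): first set bit at index >= start, or NR if none. */
static unsigned int find_next(unsigned int mask, unsigned int start)
{
	unsigned int i;

	for (i = start; i < NR; i++)
		if (mask & (1u << i))
			return i;
	return NR;
}

/* cpumask_next() with the patch's check: warn unless n + 1 is in [0, NR). */
static unsigned int next_cpu(int n, unsigned int mask, bool *warned)
{
	*warned = (n + 1 < 0 || n + 1 >= NR);
	return find_next(mask, (unsigned int)(n + 1));
}
```

Scenario A (mask 0b0110, n = 2) and scenario B (mask 0b1100, n = 3) both
return NR, but only B sets the flag, matching the asymmetry described
above.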

> The snippet above should be fixed like this:
>
>           if (--hctx->next_cpu_batch <= 0) {
>   select_cpu:
>                   if (next_cpu == nr_cpu_ids - 1)
>                           next_cpu = nr_cpu_ids;
>                   else
>                           next_cpu = cpumask_next_and(next_cpu,
>                                                       hctx->cpumask,
>                                                       cpu_online_mask);
>                   if (next_cpu >= nr_cpu_ids)
>                           next_cpu = blk_mq_first_mapped_cpu(hctx);
>                   hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>           }
>
> The original motivation for this special shifted semantics was to
> avoid passing '+1' in cpumask_next() everywhere where it's used to
> iterate over cpumask. This is especially ugly because it brings negative
> semantics in such a simple thing like an index, and makes people confused.
> It was a bad decision, but now it's so broadly used that we have to live
> with it.
>
> The strategy to mitigate this is to minimize using of that 'special'
> functions. They all are cpumask_next()-like. In this series I reworked
> for_each_cpu() to not use cpumask_next().
>
> Often, cpumask_next() is a part of opencoded for_each_cpu(), and this
> is relatively easy to fix. In case of blk_mq_hctx_next_cpu() that you
> mentioned above, cpumask_next_and() usage looks unavoidable, and
> there's nothing to do with that, except that being careful.
>
> It didn't trigger the warning in my test setup, so I didn't fix it.
> Feel free to submit a patch, if you observe the warning for yourself.
>
> Maybe we should consider nr_cpu_ids as a special valid index for
> cpumask_check(), a sign of the end of an array. This would help to
> silence many warnings, like this one. For now I'm leaning towards that
> it's more a hack than a meaningful change.
>

I agree, we definitely want to warn for e.g.

  cpumask_set_cpu(nr_cpu_ids, ...);

Could we instead make cpumask_next*() immediately return nr_cpu_ids when
passed n=nr_cpu_ids-1?

Also, what about cpumask_next_wrap()? That uses cpumask_next() under the
hood and is bound to warn when wrapping after n=nr_cpu_ids-1, I think.
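
A userspace sketch makes the concern concrete. The retry-from--1 structure
below paraphrases the kernel's cpumask_next_wrap() of that era; NR,
checked_bad, next_cpu() and next_wrap() are illustrative stand-ins, not the
actual kernel code:

```c
#include <assert.h>
#include <stdbool.h>

#define NR 4	/* stand-in for nr_cpu_ids */

static bool checked_bad;	/* records a would-be cpumask_check() WARN */

/* Model of find_next_bit(): first set bit at index >= start, or NR if none. */
static unsigned int find_next(unsigned int mask, unsigned int start)
{
	unsigned int i;

	for (i = start; i < NR; i++)
		if (mask & (1u << i))
			return i;
	return NR;
}

/* cpumask_next() model with the shifted check. */
static unsigned int next_cpu(int n, unsigned int mask)
{
	if (n + 1 < 0 || n + 1 >= NR)
		checked_bad = true;
	return find_next(mask, (unsigned int)(n + 1));
}

/* cpumask_next_wrap() model: on hitting the end, restart the scan from -1. */
static unsigned int next_wrap(int n, unsigned int mask, int start, bool wrap)
{
	unsigned int next;
again:
	next = next_cpu(n, mask);
	if (wrap && n < start && next >= (unsigned int)start)
		return NR;	/* came full circle past the starting point */
	if (next >= NR) {
		wrap = true;
		n = -1;
		goto again;
	}
	return next;
}
```

Wrapping from n = NR - 1 (e.g. mask with bits 0 and 2 set, start = 2) does
reach cpu 0, but the inner next_cpu(NR - 1, ...) call trips the shifted
check on the way, which is the warning being discussed.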


> Thanks,
> Yury


* Re: [PATCH 1/7] cpumask: fix checking valid cpu range
  2022-09-30 17:04       ` Valentin Schneider
@ 2022-10-01  2:02         ` Yury Norov
  0 siblings, 0 replies; 17+ messages in thread
From: Yury Norov @ 2022-10-01  2:02 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-kernel, netdev, Andy Shevchenko, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Rasmus Villemoes

On Fri, Sep 30, 2022 at 06:04:08PM +0100, Valentin Schneider wrote:
[...]

> > next_cpu is a valid CPU number everywhere else, but not for cpumask_next().
> > The warning is valid. If we are already at the very last cpu, what would
> > we be looking for next?
> >
> 
> Consider:
> 
>   nr_cpu_ids=4
> 
>   A)
>   cpumask: 0.1.1.0
>   CPU      0 1 2 3
>   n            ^
>   result: nr_cpu_ids
> 
>   B)
>   cpumask: 0.0.1.1
>   CPU      0 1 2 3
>   n              ^
>   result: nr_cpu_ids + WARN
> 
> Both scenarios are identical from a user perspective: a valid CPU number
> was passed in (either from smp_processor_id() or from a previous call to
> cpumask_next*()), but there are no more bits set in the cpumask. There's no
> more CPUs to search for in both scenarios, but only one produces as WARN.

It seems I have to repeat it for the 3rd time.

cpumask_next() takes a shifted cpu index. That's why cpumask_check()
must shift the index in the other direction to keep all the
checking logic consistent.

This is a bad design, and all users of cpumask_next() must be aware of
this pitfall.
 
[...]

> > Maybe we should consider nr_cpu_ids as a special valid index for
> > cpumask_check(), a sign of the end of an array. This would help to
> > silence many warnings, like this one. For now I'm leaning towards that
> > it's more a hack than a meaningful change.
> >
> 
> I agree, we definitely want to warn for e.g.
> 
>   cpumask_set_cpu(nr_cpu_ids, ...);
> 
> Could we instead make cpumask_next*() immediately return nr_cpu_ids when
> passed n=nr_cpu_ids-1?

This is what FIND_NEXT_BIT() does. If you're suggesting we silence the
warning, then what do we need it for at all?
 
> Also, what about cpumask_next_wrap()? That uses cpumask_next() under the
> hood and is bound to warn when wrapping after n=nr_cpu_ids-1, I think.

I'm working on a fix for it. Hopefully I'll merge it in the next window.

Thanks,
Yury
