linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot()
@ 2022-10-03 15:34 Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu() Valentin Schneider
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Valentin Schneider @ 2022-10-03 15:34 UTC (permalink / raw)
  To: linux-block, linux-kernel
  Cc: Jens Axboe, Yury Norov, Andy Shevchenko, Rasmus Villemoes,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

As suggested by Yury, this is the bitmap/cpumask specific bits of [1]. This now
also contains an extra fix for blk_mq.

This is based on top of Yury's bitmap-for-next [2].

A note on treewide use of for_each_cpu_andnot()
===============================================

I've used the below coccinelle script to find places that could be patched (I
couldn't figure out the valid syntax to patch from coccinelle itself):

,-----
@tmpandnot@
expression tmpmask;
iterator for_each_cpu;
position p;
statement S;
@@
cpumask_andnot(tmpmask, ...);

...

(
for_each_cpu@p(..., tmpmask, ...)
	S
|
for_each_cpu@p(..., tmpmask, ...)
{
	...
}
)

@script:python depends on tmpandnot@
p << tmpandnot.p;
@@
coccilib.report.print_report(p[0], "andnot loop here")
'-----

Which yields (against c40e8341e3b3):

.//arch/powerpc/kernel/smp.c:1587:1-13: andnot loop here
.//arch/powerpc/kernel/smp.c:1530:1-13: andnot loop here
.//arch/powerpc/kernel/smp.c:1440:1-13: andnot loop here
.//arch/powerpc/platforms/powernv/subcore.c:306:2-14: andnot loop here
.//arch/x86/kernel/apic/x2apic_cluster.c:62:1-13: andnot loop here
.//drivers/acpi/acpi_pad.c:110:1-13: andnot loop here
.//drivers/cpufreq/armada-8k-cpufreq.c:148:1-13: andnot loop here
.//drivers/cpufreq/powernv-cpufreq.c:931:1-13: andnot loop here
.//drivers/net/ethernet/sfc/efx_channels.c:73:1-13: andnot loop here
.//drivers/net/ethernet/sfc/siena/efx_channels.c:73:1-13: andnot loop here
.//kernel/sched/core.c:345:1-13: andnot loop here
.//kernel/sched/core.c:366:1-13: andnot loop here
.//net/core/dev.c:3058:1-13: andnot loop here

A lot of those are actually of the shape

  for_each_cpu(cpu, mask) {
      ...
      cpumask_andnot(mask, ...);
  }

I think *some* of the powerpc ones would be a match for for_each_cpu_andnot(),
but I decided to just stick to the one obvious one in __sched_core_flip().

[1]: https://lore.kernel.org/all/20220923132527.1001870-1-vschneid@redhat.com/
[2]: https://github.com/norov/linux.git/ -b bitmap-for-next

Cheers,
Valentin

Valentin Schneider (5):
  blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu()
  lib/find_bit: Introduce find_next_andnot_bit()
  cpumask: Introduce for_each_cpu_andnot()
  lib/test_cpumask: Add for_each_cpu_and(not) tests
  sched/core: Merge cpumask_andnot()+for_each_cpu() into
    for_each_cpu_andnot()

 block/blk-mq.c          |  9 +++++++--
 include/linux/cpumask.h | 18 ++++++++++++++++++
 include/linux/find.h    | 38 ++++++++++++++++++++++++++++++++++++++
 kernel/sched/core.c     |  5 +----
 lib/cpumask_kunit.c     | 19 +++++++++++++++++++
 lib/find_bit.c          |  9 +++++++++
 6 files changed, 92 insertions(+), 6 deletions(-)

--
2.31.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu()
  2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
@ 2022-10-03 15:34 ` Valentin Schneider
  2022-10-03 17:54   ` Yury Norov
  2022-10-03 15:34 ` [PATCH bitmap-for-next 2/5] lib/find_bit: Introduce find_next_andnot_bit() Valentin Schneider
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Valentin Schneider @ 2022-10-03 15:34 UTC (permalink / raw)
  To: linux-block, linux-kernel
  Cc: Yury Norov, Jens Axboe, Andy Shevchenko, Rasmus Villemoes,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

A recent commit made cpumask_next*() trigger a warning when passed
n = nr_cpu_ids - 1. This means extra care must be taken when feeding CPU
numbers back into cpumask_next*().

The warning occurs nearly every boot on QEMU:

[    5.398453] WARNING: CPU: 3 PID: 162 at include/linux/cpumask.h:110 __blk_mq_delay_run_hw_queue+0x16b/0x180
[    5.399317] Modules linked in:
[    5.399646] CPU: 3 PID: 162 Comm: ssh-keygen Tainted: G                 N 6.0.0-rc4-00004-g93003cb24006 #55
[    5.400135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[    5.405430] Call Trace:
[    5.406152]  <TASK>
[    5.406452]  blk_mq_sched_insert_requests+0x67/0x150
[    5.406759]  blk_mq_flush_plug_list+0xd0/0x280
[    5.406987]  ? bit_wait+0x60/0x60
[    5.407317]  __blk_flush_plug+0xdb/0x120
[    5.407561]  ? bit_wait+0x60/0x60
[    5.407765]  io_schedule_prepare+0x38/0x40
[    5.407974]  io_schedule+0x6/0x40
[    5.408283]  bit_wait_io+0x8/0x60
[    5.408485]  __wait_on_bit+0x72/0x90
[    5.408668]  out_of_line_wait_on_bit+0x8c/0xb0
[    5.408879]  ? swake_up_one+0x30/0x30
[    5.409158]  ext4_read_bh+0x7a/0x80
[    5.409381]  ext4_get_branch+0xc0/0x130
[    5.409583]  ext4_ind_map_blocks+0x1ac/0xb30
[    5.409844]  ? __es_remove_extent+0x61/0x6d0
[    5.410128]  ? blk_account_io_merge_bio+0x67/0xd0
[    5.410416]  ? percpu_counter_add_batch+0x59/0xb0
[    5.410720]  ? percpu_counter_add_batch+0x59/0xb0
[    5.410949]  ? _raw_read_unlock+0x13/0x30
[    5.411323]  ext4_map_blocks+0xc2/0x590
[    5.411609]  ? xa_load+0x7c/0xa0
[    5.411859]  ext4_mpage_readpages+0x4a2/0xaa0
[    5.412192]  read_pages+0x69/0x2b0
[    5.412477]  ? folio_add_lru+0x4f/0x70
[    5.412696]  page_cache_ra_unbounded+0x11d/0x170
[    5.412960]  filemap_get_pages+0x109/0x5d0
[    5.413340]  ? __vma_adjust+0x348/0x930
[    5.413576]  filemap_read+0xb7/0x380
[    5.413805]  ? vma_merge+0x2e9/0x330
[    5.414067]  ? vma_set_page_prot+0x43/0x80
[    5.414309]  ? __inode_security_revalidate+0x5e/0x80
[    5.414581]  ? selinux_file_permission+0x107/0x130
[    5.414889]  vfs_read+0x205/0x2e0
[    5.415162]  ksys_read+0x54/0xd0
[    5.415348]  do_syscall_64+0x3a/0x90
[    5.415584]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 78e5a3399421 ("cpumask: fix checking valid cpu range")
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 block/blk-mq.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c96c8c4f751b..30ae51eda95e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2046,8 +2046,13 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 
 	if (--hctx->next_cpu_batch <= 0) {
 select_cpu:
-		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
-				cpu_online_mask);
+		if (next_cpu == nr_cpu_ids - 1)
+			next_cpu = nr_cpu_ids;
+		else
+			next_cpu = cpumask_next_and(next_cpu,
+						    hctx->cpumask,
+						    cpu_online_mask);
+
 		if (next_cpu >= nr_cpu_ids)
 			next_cpu = blk_mq_first_mapped_cpu(hctx);
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bitmap-for-next 2/5] lib/find_bit: Introduce find_next_andnot_bit()
  2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu() Valentin Schneider
@ 2022-10-03 15:34 ` Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 3/5] cpumask: Introduce for_each_cpu_andnot() Valentin Schneider
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Valentin Schneider @ 2022-10-03 15:34 UTC (permalink / raw)
  To: linux-block, linux-kernel
  Cc: Jens Axboe, Yury Norov, Andy Shevchenko, Rasmus Villemoes,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

In preparation of introducing for_each_cpu_andnot(), add a variant of
find_next_bit() that negate the bits in @addr2 when ANDing them with the
bits in @addr1.

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 include/linux/find.h | 33 +++++++++++++++++++++++++++++++++
 lib/find_bit.c       |  9 +++++++++
 2 files changed, 42 insertions(+)

diff --git a/include/linux/find.h b/include/linux/find.h
index 0cdfab9734a6..2235af19e7e1 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -12,6 +12,8 @@ unsigned long _find_next_bit(const unsigned long *addr1, unsigned long nbits,
 				unsigned long start);
 unsigned long _find_next_and_bit(const unsigned long *addr1, const unsigned long *addr2,
 					unsigned long nbits, unsigned long start);
+unsigned long _find_next_andnot_bit(const unsigned long *addr1, const unsigned long *addr2,
+					unsigned long nbits, unsigned long start);
 unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long nbits,
 					 unsigned long start);
 extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
@@ -91,6 +93,37 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
 }
 #endif
 
+#ifndef find_next_andnot_bit
+/**
+ * find_next_andnot_bit - find the next set bit in *addr1 excluding all the bits
+ *                        in *addr2
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @size: The bitmap size in bits
+ * @offset: The bitnumber to start searching at
+ *
+ * Returns the bit number for the next set bit
+ * If no bits are set, returns @size.
+ */
+static inline
+unsigned long find_next_andnot_bit(const unsigned long *addr1,
+		const unsigned long *addr2, unsigned long size,
+		unsigned long offset)
+{
+	if (small_const_nbits(size)) {
+		unsigned long val;
+
+		if (unlikely(offset >= size))
+			return size;
+
+		val = *addr1 & ~*addr2 & GENMASK(size - 1, offset);
+		return val ? __ffs(val) : size;
+	}
+
+	return _find_next_andnot_bit(addr1, addr2, size, offset);
+}
+#endif
+
 #ifndef find_next_zero_bit
 /**
  * find_next_zero_bit - find the next cleared bit in a memory region
diff --git a/lib/find_bit.c b/lib/find_bit.c
index 25609974cbe4..18bc0a7ac8ee 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -164,6 +164,15 @@ unsigned long _find_next_and_bit(const unsigned long *addr1, const unsigned long
 EXPORT_SYMBOL(_find_next_and_bit);
 #endif
 
+#ifndef find_next_andnot_bit
+unsigned long _find_next_andnot_bit(const unsigned long *addr1, const unsigned long *addr2,
+					unsigned long nbits, unsigned long start)
+{
+	return FIND_NEXT_BIT(addr1[idx] & ~addr2[idx], /* nop */, nbits, start);
+}
+EXPORT_SYMBOL(_find_next_andnot_bit);
+#endif
+
 #ifndef find_next_zero_bit
 unsigned long _find_next_zero_bit(const unsigned long *addr, unsigned long nbits,
 					 unsigned long start)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bitmap-for-next 3/5] cpumask: Introduce for_each_cpu_andnot()
  2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu() Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 2/5] lib/find_bit: Introduce find_next_andnot_bit() Valentin Schneider
@ 2022-10-03 15:34 ` Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 4/5] lib/test_cpumask: Add for_each_cpu_and(not) tests Valentin Schneider
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Valentin Schneider @ 2022-10-03 15:34 UTC (permalink / raw)
  To: linux-block, linux-kernel
  Cc: Jens Axboe, Yury Norov, Andy Shevchenko, Rasmus Villemoes,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

for_each_cpu_and() is very convenient as it saves having to allocate a
temporary cpumask to store the result of cpumask_and(). The same issue
applies to cpumask_andnot() which doesn't actually need temporary storage
for iteration purposes.

Following what has been done for for_each_cpu_and(), introduce
for_each_cpu_andnot().

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 include/linux/cpumask.h | 18 ++++++++++++++++++
 include/linux/find.h    |  5 +++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 286804bfe3b7..2f065ad97541 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -306,6 +306,24 @@ unsigned int __pure cpumask_next_wrap(int n, const struct cpumask *mask, int sta
 #define for_each_cpu_and(cpu, mask1, mask2)				\
 	for_each_and_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), nr_cpumask_bits)
 
+/**
+ * for_each_cpu_andnot - iterate over every cpu present in one mask, excluding
+ *			 those present in another.
+ * @cpu: the (optionally unsigned) integer iterator
+ * @mask1: the first cpumask pointer
+ * @mask2: the second cpumask pointer
+ *
+ * This saves a temporary CPU mask in many places.  It is equivalent to:
+ *	struct cpumask tmp;
+ *	cpumask_andnot(&tmp, &mask1, &mask2);
+ *	for_each_cpu(cpu, &tmp)
+ *		...
+ *
+ * After the loop, cpu is >= nr_cpu_ids.
+ */
+#define for_each_cpu_andnot(cpu, mask1, mask2)				\
+	for_each_andnot_bit(cpu, cpumask_bits(mask1), cpumask_bits(mask2), nr_cpumask_bits)
+
 /**
  * cpumask_any_but - return a "random" in a cpumask, but not this one.
  * @mask: the cpumask to search
diff --git a/include/linux/find.h b/include/linux/find.h
index 2235af19e7e1..ccaf61a0f5fd 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -498,6 +498,11 @@ unsigned long find_next_bit_le(const void *addr, unsigned
 	     (bit) = find_next_and_bit((addr1), (addr2), (size), (bit)), (bit) < (size);\
 	     (bit)++)
 
+#define for_each_andnot_bit(bit, addr1, addr2, size) \
+	for ((bit) = 0;									\
+	     (bit) = find_next_andnot_bit((addr1), (addr2), (size), (bit)), (bit) < (size);\
+	     (bit)++)
+
 /* same as for_each_set_bit() but use bit as value to start with */
 #define for_each_set_bit_from(bit, addr, size) \
 	for (; (bit) = find_next_bit((addr), (size), (bit)), (bit) < (size); (bit)++)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bitmap-for-next 4/5] lib/test_cpumask: Add for_each_cpu_and(not) tests
  2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
                   ` (2 preceding siblings ...)
  2022-10-03 15:34 ` [PATCH bitmap-for-next 3/5] cpumask: Introduce for_each_cpu_andnot() Valentin Schneider
@ 2022-10-03 15:34 ` Valentin Schneider
  2022-10-03 15:34 ` [PATCH bitmap-for-next 5/5] sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot() Valentin Schneider
  2022-10-03 18:52 ` [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Yury Norov
  5 siblings, 0 replies; 10+ messages in thread
From: Valentin Schneider @ 2022-10-03 15:34 UTC (permalink / raw)
  To: linux-block, linux-kernel
  Cc: Yury Norov, Jens Axboe, Andy Shevchenko, Rasmus Villemoes,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

Following the recent introduction of for_each_andnot(), add some tests to
ensure for_each_cpu_and(not) results in the same as iterating over the
result of cpumask_and(not)().

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 lib/cpumask_kunit.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/lib/cpumask_kunit.c b/lib/cpumask_kunit.c
index ecbeec72221e..d1fc6ece21f3 100644
--- a/lib/cpumask_kunit.c
+++ b/lib/cpumask_kunit.c
@@ -33,6 +33,19 @@
 		KUNIT_EXPECT_EQ_MSG((test), nr_cpu_ids - mask_weight, iter, MASK_MSG(mask));	\
 	} while (0)
 
+#define EXPECT_FOR_EACH_CPU_OP_EQ(test, op, mask1, mask2)			\
+	do {									\
+		const cpumask_t *m1 = (mask1);					\
+		const cpumask_t *m2 = (mask2);					\
+		int weight;                                                     \
+		int cpu, iter = 0;						\
+		cpumask_##op(&mask_tmp, m1, m2);                                \
+		weight = cpumask_weight(&mask_tmp);				\
+		for_each_cpu_##op(cpu, mask1, mask2)				\
+			iter++;							\
+		KUNIT_EXPECT_EQ((test), weight, iter);				\
+	} while (0)
+
 #define EXPECT_FOR_EACH_CPU_WRAP_EQ(test, mask)			\
 	do {							\
 		const cpumask_t *m = (mask);			\
@@ -54,6 +67,7 @@
 
 static cpumask_t mask_empty;
 static cpumask_t mask_all;
+static cpumask_t mask_tmp;
 
 static void test_cpumask_weight(struct kunit *test)
 {
@@ -101,10 +115,15 @@ static void test_cpumask_iterators(struct kunit *test)
 	EXPECT_FOR_EACH_CPU_EQ(test, &mask_empty);
 	EXPECT_FOR_EACH_CPU_NOT_EQ(test, &mask_empty);
 	EXPECT_FOR_EACH_CPU_WRAP_EQ(test, &mask_empty);
+	EXPECT_FOR_EACH_CPU_OP_EQ(test, and, &mask_empty, &mask_empty);
+	EXPECT_FOR_EACH_CPU_OP_EQ(test, and, cpu_possible_mask, &mask_empty);
+	EXPECT_FOR_EACH_CPU_OP_EQ(test, andnot, &mask_empty, &mask_empty);
 
 	EXPECT_FOR_EACH_CPU_EQ(test, cpu_possible_mask);
 	EXPECT_FOR_EACH_CPU_NOT_EQ(test, cpu_possible_mask);
 	EXPECT_FOR_EACH_CPU_WRAP_EQ(test, cpu_possible_mask);
+	EXPECT_FOR_EACH_CPU_OP_EQ(test, and, cpu_possible_mask, cpu_possible_mask);
+	EXPECT_FOR_EACH_CPU_OP_EQ(test, andnot, cpu_possible_mask, &mask_empty);
 }
 
 static void test_cpumask_iterators_builtin(struct kunit *test)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH bitmap-for-next 5/5] sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot()
  2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
                   ` (3 preceding siblings ...)
  2022-10-03 15:34 ` [PATCH bitmap-for-next 4/5] lib/test_cpumask: Add for_each_cpu_and(not) tests Valentin Schneider
@ 2022-10-03 15:34 ` Valentin Schneider
  2022-10-03 18:52 ` [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Yury Norov
  5 siblings, 0 replies; 10+ messages in thread
From: Valentin Schneider @ 2022-10-03 15:34 UTC (permalink / raw)
  To: linux-block, linux-kernel
  Cc: Yury Norov, Jens Axboe, Andy Shevchenko, Rasmus Villemoes,
	Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira

This removes the second use of the sched_core_mask temporary mask.

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 kernel/sched/core.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ee28253c9ac0..b4c3112b0095 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -360,10 +360,7 @@ static void __sched_core_flip(bool enabled)
 	/*
 	 * Toggle the offline CPUs.
 	 */
-	cpumask_copy(&sched_core_mask, cpu_possible_mask);
-	cpumask_andnot(&sched_core_mask, &sched_core_mask, cpu_online_mask);
-
-	for_each_cpu(cpu, &sched_core_mask)
+	for_each_cpu_andnot(cpu, cpu_possible_mask, cpu_online_mask)
 		cpu_rq(cpu)->core_enabled = enabled;
 
 	cpus_read_unlock();
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu()
  2022-10-03 15:34 ` [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu() Valentin Schneider
@ 2022-10-03 17:54   ` Yury Norov
  2022-10-05 12:19     ` Valentin Schneider
  0 siblings, 1 reply; 10+ messages in thread
From: Yury Norov @ 2022-10-03 17:54 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-block, linux-kernel, Jens Axboe, Andy Shevchenko,
	Rasmus Villemoes, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira

On Mon, Oct 03, 2022 at 04:34:16PM +0100, Valentin Schneider wrote:
> A recent commit made cpumask_next*() trigger a warning when passed
> n = nr_cpu_ids - 1. This means extra care must be taken when feeding CPU
> numbers back into cpumask_next*().
> 
> The warning occurs nearly every boot on QEMU:

[...]
 
> Fixes: 78e5a3399421 ("cpumask: fix checking valid cpu range")

No! It fixes blk-mq bug, which has been revealed after 78e5a3399421.

> Suggested-by: Yury Norov <yury.norov@gmail.com>

OK, maybe I suggested something like this. But after looking into the code
of blk_mq_hctx_next_cpu() code for more, I have a feeling that this should
be overridden deeper. 

Can you check - did this warning raise because hctx->next_cpu, or
because cpumask_next_and() was called twice after jumping on
select_cpu label?

> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
>  block/blk-mq.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c96c8c4f751b..30ae51eda95e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2046,8 +2046,13 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>  
>  	if (--hctx->next_cpu_batch <= 0) {
>  select_cpu:

Because we have backward looking goto, I have a strong feeling that the
code should be reorganized.

> -		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
> -				cpu_online_mask);
> +		if (next_cpu == nr_cpu_ids - 1)
> +			next_cpu = nr_cpu_ids;
> +		else
> +			next_cpu = cpumask_next_and(next_cpu,
> +						    hctx->cpumask,
> +						    cpu_online_mask);
> +
>  		if (next_cpu >= nr_cpu_ids)
>  			next_cpu = blk_mq_first_mapped_cpu(hctx);

This simply means 'let's start from the beginning', and should be
replaced with cpumask_next_and_wrap().

>  		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;


Maybe something like this would work?

        if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu)) {
                hctx->next_cpu = next_cpu;
                return next_cpu;
        }

        next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask)
        if (next_cpu < nr_cpu_ids) {
                hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
                hctx->next_cpu = next_cpu;
                return next_cpu;
        }

        /*
         * Make sure to re-select CPU next time once after CPUs
         * in hctx->cpumask become online again.
         */
        hctx->next_cpu = next_cpu;
        hctx->next_cpu_batch = 1;
        return WORK_CPU_UNBOUND;

I didn't test it and likely screwed some corner case. I'm just
trying to say that picking next cpu should be an easier thing.

Thanks,
Yury

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot()
  2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
                   ` (4 preceding siblings ...)
  2022-10-03 15:34 ` [PATCH bitmap-for-next 5/5] sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot() Valentin Schneider
@ 2022-10-03 18:52 ` Yury Norov
  2022-10-06 12:58   ` Yury Norov
  5 siblings, 1 reply; 10+ messages in thread
From: Yury Norov @ 2022-10-03 18:52 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-block, linux-kernel, Jens Axboe, Andy Shevchenko,
	Rasmus Villemoes, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira

On Mon, Oct 03, 2022 at 04:34:15PM +0100, Valentin Schneider wrote:
> As suggested by Yury, this is the bitmap/cpumask specific bits of [1]. This now
> also contains an extra fix for blk_mq.

Patches ##2-5 look good to me, but I'd like to give them some testing.
Let's also wait for other comments, and if nothing wrong will be spotted,
I'll take it.

Thanks,
Yury
 
> This is based on top of Yury's bitmap-for-next [2].
> 
> A note on treewide use of for_each_cpu_andnot()
> ===============================================
> 
> I've used the below coccinelle script to find places that could be patched (I
> couldn't figure out the valid syntax to patch from coccinelle itself):
> 
> ,-----
> @tmpandnot@
> expression tmpmask;
> iterator for_each_cpu;
> position p;
> statement S;
> @@
> cpumask_andnot(tmpmask, ...);
> 
> ...
> 
> (
> for_each_cpu@p(..., tmpmask, ...)
> 	S
> |
> for_each_cpu@p(..., tmpmask, ...)
> {
> 	...
> }
> )
> 
> @script:python depends on tmpandnot@
> p << tmpandnot.p;
> @@
> coccilib.report.print_report(p[0], "andnot loop here")
> '-----
> 
> Which yields (against c40e8341e3b3):
> 
> .//arch/powerpc/kernel/smp.c:1587:1-13: andnot loop here
> .//arch/powerpc/kernel/smp.c:1530:1-13: andnot loop here
> .//arch/powerpc/kernel/smp.c:1440:1-13: andnot loop here
> .//arch/powerpc/platforms/powernv/subcore.c:306:2-14: andnot loop here
> .//arch/x86/kernel/apic/x2apic_cluster.c:62:1-13: andnot loop here
> .//drivers/acpi/acpi_pad.c:110:1-13: andnot loop here
> .//drivers/cpufreq/armada-8k-cpufreq.c:148:1-13: andnot loop here
> .//drivers/cpufreq/powernv-cpufreq.c:931:1-13: andnot loop here
> .//drivers/net/ethernet/sfc/efx_channels.c:73:1-13: andnot loop here
> .//drivers/net/ethernet/sfc/siena/efx_channels.c:73:1-13: andnot loop here
> .//kernel/sched/core.c:345:1-13: andnot loop here
> .//kernel/sched/core.c:366:1-13: andnot loop here
> .//net/core/dev.c:3058:1-13: andnot loop here
> 
> A lot of those are actually of the shape
> 
>   for_each_cpu(cpu, mask) {
>       ...
>       cpumask_andnot(mask, ...);
>   }
> 
> I think *some* of the powerpc ones would be a match for for_each_cpu_andnot(),
> but I decided to just stick to the one obvious one in __sched_core_flip().
> 
> [1]: https://lore.kernel.org/all/20220923132527.1001870-1-vschneid@redhat.com/
> [2]: https://github.com/norov/linux.git/ -b bitmap-for-next
> 
> Cheers,
> Valentin
> 
> Valentin Schneider (5):
>   blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu()
>   lib/find_bit: Introduce find_next_andnot_bit()
>   cpumask: Introduce for_each_cpu_andnot()
>   lib/test_cpumask: Add for_each_cpu_and(not) tests
>   sched/core: Merge cpumask_andnot()+for_each_cpu() into
>     for_each_cpu_andnot()
> 
>  block/blk-mq.c          |  9 +++++++--
>  include/linux/cpumask.h | 18 ++++++++++++++++++
>  include/linux/find.h    | 38 ++++++++++++++++++++++++++++++++++++++
>  kernel/sched/core.c     |  5 +----
>  lib/cpumask_kunit.c     | 19 +++++++++++++++++++
>  lib/find_bit.c          |  9 +++++++++
>  6 files changed, 92 insertions(+), 6 deletions(-)
> 
> --
> 2.31.1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu()
  2022-10-03 17:54   ` Yury Norov
@ 2022-10-05 12:19     ` Valentin Schneider
  0 siblings, 0 replies; 10+ messages in thread
From: Valentin Schneider @ 2022-10-05 12:19 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-block, linux-kernel, Jens Axboe, Andy Shevchenko,
	Rasmus Villemoes, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira

On 03/10/22 10:54, Yury Norov wrote:
> On Mon, Oct 03, 2022 at 04:34:16PM +0100, Valentin Schneider wrote:
>> A recent commit made cpumask_next*() trigger a warning when passed
>> n = nr_cpu_ids - 1. This means extra care must be taken when feeding CPU
>> numbers back into cpumask_next*().
>>
>> The warning occurs nearly every boot on QEMU:
>
> [...]
>
>> Fixes: 78e5a3399421 ("cpumask: fix checking valid cpu range")
>
> No! It fixes blk-mq bug, which has been revealed after 78e5a3399421.
>
>> Suggested-by: Yury Norov <yury.norov@gmail.com>
>
> OK, maybe I suggested something like this. But after looking into the code
> of blk_mq_hctx_next_cpu() code for more, I have a feeling that this should
> be overridden deeper.
>
> Can you check - did this warning raise because hctx->next_cpu, or
> because cpumask_next_and() was called twice after jumping on
> select_cpu label?
>

It seems to always happen when hctx->next_cpu == nr_cpu_ids-1 at the start
of the function - no jumping involved.

>> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
>> ---
>>  block/blk-mq.c | 9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index c96c8c4f751b..30ae51eda95e 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2046,8 +2046,13 @@ static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>>
>>      if (--hctx->next_cpu_batch <= 0) {
>>  select_cpu:
>
> Because we have backward looking goto, I have a strong feeling that the
> code should be reorganized.
>
>> -		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
>> -				cpu_online_mask);
>> +		if (next_cpu == nr_cpu_ids - 1)
>> +			next_cpu = nr_cpu_ids;
>> +		else
>> +			next_cpu = cpumask_next_and(next_cpu,
>> +						    hctx->cpumask,
>> +						    cpu_online_mask);
>> +
>>              if (next_cpu >= nr_cpu_ids)
>>                      next_cpu = blk_mq_first_mapped_cpu(hctx);
>
> This simply means 'let's start from the beginning', and should be
> replaced with cpumask_next_and_wrap().

I hadn't looked in depth there, but that's a strange behaviour.
If we get to the end of the cpumask, blk_mq_first_mapped_cpu() does:

        int cpu = cpumask_first_and(hctx->cpumask, cpu_online_mask);

        if (cpu >= nr_cpu_ids)
                cpu = cpumask_first(hctx->cpumask);
        return cpu;

That if branch means the returned CPU is offline, which then triggers:

        if (!cpu_online(next_cpu)) {
                if (!tried) {
                        tried = true;
                        goto select_cpu;
                }

but going back to select_cpu doesn't make much sense, we've already checked
that hctx->cpumask and cpu_online_mask were disjoint.

>
>>              hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>
>
> Maybe something like this would work?
>
>         if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu)) {
>                 hctx->next_cpu = next_cpu;
>                 return next_cpu;
>         }
>
>         next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask)
>         if (next_cpu < nr_cpu_ids) {
>                 hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
>                 hctx->next_cpu = next_cpu;
>                 return next_cpu;
>         }
>
>         /*
>          * Make sure to re-select CPU next time once after CPUs
>          * in hctx->cpumask become online again.
>          */
>         hctx->next_cpu = next_cpu;
>         hctx->next_cpu_batch = 1;
>         return WORK_CPU_UNBOUND;
>
> I didn't test it and likely screwed some corner case. I'm just
> trying to say that picking next cpu should be an easier thing.
>

Agreed, your suggestion looks sane, let me play with that a bit.

> Thanks,
> Yury


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot()
  2022-10-03 18:52 ` [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Yury Norov
@ 2022-10-06 12:58   ` Yury Norov
  0 siblings, 0 replies; 10+ messages in thread
From: Yury Norov @ 2022-10-06 12:58 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: linux-block, linux-kernel, Jens Axboe, Andy Shevchenko,
	Rasmus Villemoes, Ingo Molnar, Peter Zijlstra, Juri Lelli,
	Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
	Mel Gorman, Daniel Bristot de Oliveira

On Mon, Oct 03, 2022 at 11:54:20AM -0700, Yury Norov wrote:
> On Mon, Oct 03, 2022 at 04:34:15PM +0100, Valentin Schneider wrote:
> > As suggested by Yury, this is the bitmap/cpumask specific bits of [1]. This now
> > also contains an extra fix for blk_mq.
> 
> Patches ##2-5 look good to me, but I'd like to give them some testing.
> Let's also wait for other comments, and if nothing wrong will be spotted,
> I'll take it.

OK, taking those in bitmap-for-next.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-10-06 13:01 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-03 15:34 [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Valentin Schneider
2022-10-03 15:34 ` [PATCH bitmap-for-next 1/5] blk_mq: Fix cpumask_check() warning in blk_mq_hctx_next_cpu() Valentin Schneider
2022-10-03 17:54   ` Yury Norov
2022-10-05 12:19     ` Valentin Schneider
2022-10-03 15:34 ` [PATCH bitmap-for-next 2/5] lib/find_bit: Introduce find_next_andnot_bit() Valentin Schneider
2022-10-03 15:34 ` [PATCH bitmap-for-next 3/5] cpumask: Introduce for_each_cpu_andnot() Valentin Schneider
2022-10-03 15:34 ` [PATCH bitmap-for-next 4/5] lib/test_cpumask: Add for_each_cpu_and(not) tests Valentin Schneider
2022-10-03 15:34 ` [PATCH bitmap-for-next 5/5] sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot() Valentin Schneider
2022-10-03 18:52 ` [PATCH bitmap-for-next 0/5] bitmap,cpumask: Add for_each_cpu_andnot() Yury Norov
2022-10-06 12:58   ` Yury Norov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).