All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues
@ 2024-02-14 18:39 Tejun Heo
  2024-02-14 19:03 ` Linus Torvalds
  2024-02-16  1:23 ` [PATCH wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK Tejun Heo
  0 siblings, 2 replies; 7+ messages in thread
From: Tejun Heo @ 2024-02-14 18:39 UTC (permalink / raw)
  To: Lai Jiangshan; +Cc: linux-kernel, torvalds

From 2f34d7337d98f3eae7bd3d1270efaf9d8a17cfc6 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Wed, 14 Feb 2024 08:33:55 -1000

When queue_work_on() is used to queue a BH work item on a remote CPU, the
work item is queued on that CPU but kick_pool() raises softirq on the local
CPU. This leads to stalls as the work item won't be executed until something
else on the remote CPU schedules a BH work item or tasklet locally.

Fix it by bouncing raising softirq to the target CPU using per-cpu irq_work.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 4cb1ef64609f ("workqueue: Implement BH workqueues to eventually replace tasklets")
---
Oops, I just found out that I was raising softirq on the wrong CPU when BH
work items are queued on remote CPUs. This fixes it. Applied to wq/for-6.9.

Thanks.

 kernel/workqueue.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4950bfc2cdcc..04e35dbe6799 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -54,6 +54,7 @@
 #include <linux/nmi.h>
 #include <linux/kvm_para.h>
 #include <linux/delay.h>
+#include <linux/irq_work.h>
 
 #include "workqueue_internal.h"
 
@@ -457,6 +458,10 @@ static bool wq_debug_force_rr_cpu = false;
 #endif
 module_param_named(debug_force_rr_cpu, wq_debug_force_rr_cpu, bool, 0644);
 
+/* to raise softirq for the BH worker pools on other CPUs */
+static DEFINE_PER_CPU_SHARED_ALIGNED(struct irq_work [NR_STD_WORKER_POOLS],
+				     bh_pool_irq_works);
+
 /* the BH worker pools */
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct worker_pool [NR_STD_WORKER_POOLS],
 				     bh_worker_pools);
@@ -1197,6 +1202,13 @@ static bool assign_work(struct work_struct *work, struct worker *worker,
 	return true;
 }
 
+static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
+{
+	int high = pool->attrs->nice == HIGHPRI_NICE_LEVEL ? 1 : 0;
+
+	return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
+}
+
 /**
  * kick_pool - wake up an idle worker if necessary
  * @pool: pool to kick
@@ -1215,10 +1227,15 @@ static bool kick_pool(struct worker_pool *pool)
 		return false;
 
 	if (pool->flags & POOL_BH) {
-		if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
-			raise_softirq_irqoff(HI_SOFTIRQ);
-		else
-			raise_softirq_irqoff(TASKLET_SOFTIRQ);
+		if (likely(pool->cpu == smp_processor_id())) {
+			if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
+				raise_softirq_irqoff(HI_SOFTIRQ);
+			else
+				raise_softirq_irqoff(TASKLET_SOFTIRQ);
+		} else {
+			irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
+		}
+
 		return true;
 	}
 
@@ -7367,6 +7384,16 @@ static inline void wq_watchdog_init(void) { }
 
 #endif	/* CONFIG_WQ_WATCHDOG */
 
+static void bh_pool_kick_normal(struct irq_work *irq_work)
+{
+	raise_softirq_irqoff(TASKLET_SOFTIRQ);
+}
+
+static void bh_pool_kick_highpri(struct irq_work *irq_work)
+{
+	raise_softirq_irqoff(HI_SOFTIRQ);
+}
+
 static void __init restrict_unbound_cpumask(const char *name, const struct cpumask *mask)
 {
 	if (!cpumask_intersects(wq_unbound_cpumask, mask)) {
@@ -7408,6 +7435,8 @@ void __init workqueue_init_early(void)
 {
 	struct wq_pod_type *pt = &wq_pod_types[WQ_AFFN_SYSTEM];
 	int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL };
+	void (*irq_work_fns[2])(struct irq_work *) = { bh_pool_kick_normal,
+						       bh_pool_kick_highpri };
 	int i, cpu;
 
 	BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long));
@@ -7455,8 +7484,10 @@ void __init workqueue_init_early(void)
 
 		i = 0;
 		for_each_bh_worker_pool(pool, cpu) {
-			init_cpu_worker_pool(pool, cpu, std_nice[i++]);
+			init_cpu_worker_pool(pool, cpu, std_nice[i]);
 			pool->flags |= POOL_BH;
+			init_irq_work(bh_pool_irq_work(pool), irq_work_fns[i]);
+			i++;
 		}
 
 		i = 0;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues
  2024-02-14 18:39 [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues Tejun Heo
@ 2024-02-14 19:03 ` Linus Torvalds
  2024-02-14 19:16   ` Tejun Heo
  2024-02-16  1:23 ` [PATCH wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK Tejun Heo
  1 sibling, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2024-02-14 19:03 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel

On Wed, 14 Feb 2024 at 10:39, Tejun Heo <tj@kernel.org> wrote:
>
> When queue_work_on() is used to queue a BH work item on a remote CPU, the
> work item is queued on that CPU but kick_pool() raises softirq on the local
> CPU.

Now, does it make a lot of sense to ask to queue a BH work on another
CPU in the first place?

I don't think tasklets supported that. And while the workqueues
obviously do - and you fix that case - I wonder if we shouldn't say
"that operation makes no sense, please don't do it" rather than
actually support it?

What made you notice this issue?

                  Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues
  2024-02-14 19:03 ` Linus Torvalds
@ 2024-02-14 19:16   ` Tejun Heo
  0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2024-02-14 19:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Lai Jiangshan, linux-kernel

Hello,

On Wed, Feb 14, 2024 at 11:03:46AM -0800, Linus Torvalds wrote:
> On Wed, 14 Feb 2024 at 10:39, Tejun Heo <tj@kernel.org> wrote:
> >
> > When queue_work_on() is used to queue a BH work item on a remote CPU, the
> > work item is queued on that CPU but kick_pool() raises softirq on the local
> > CPU.
> 
> Now, does it make a lot of sense to ask to queue a BH work on another
> CPU in the first place?
> 
> I don't think tasklets supported that. And while the workqueues
> obviously do - and you fix that case - I wonder if we shouldn't say
> "that operation makes no sense, please don't do it" rather than
> actually support it?
> 
> What made you notice this issue?

It's test code I'm using to verify new features to cover
tasklet_disable/enable(), so not a real use case. For tasklet migration,
this isn't necessary but at the same time all the mechanics are already
there, so it's nice to keep the API orthogonal and one can conceive
plausible use cases - e.g. "I'm using per-cpu work items to collect per-cpu
states but that comes with really high tail latencies, lemme switch to
per-cpu BH work items".

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK
  2024-02-14 18:39 [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues Tejun Heo
  2024-02-14 19:03 ` Linus Torvalds
@ 2024-02-16  1:23 ` Tejun Heo
  2024-02-16  5:10   ` [PATCH v2 " Tejun Heo
  1 sibling, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2024-02-16  1:23 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, torvalds, Naresh Kamboju, Anders Roxell,
	Peter Zijlstra, Valentin Schneider, Sebastian Andrzej Siewior

From 9d6efa8d0dd012db22958a225246812441b25405 Mon Sep 17 00:00:00 2001
From: Tejun Heo <tj@kernel.org>
Date: Thu, 15 Feb 2024 15:02:30 -1000

2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues") added
irq_work usage to workqueue; however, it turns out irq_work is actually
optional and the change breaks build on configuration which doesn't have
CONFIG_IRQ_WORK enabled.

Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
be better to just always enable it. However, this still saves a small bit of
memory for tiny UP configs and also the least amount of change, so, for now,
let's keep it conditional.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
---
Hello,

As this is breaking build for some configs, I'll apply this to wq/for-6.9.
If there's any objection, please let me know. I'll back out and work on a
different approach.

Thanks.

 init/Kconfig       |  2 ++
 kernel/workqueue.c | 24 +++++++++++++++---------
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8df18f3a9748..41be05a8ba5e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -107,6 +107,8 @@ config CONSTRUCTORS
 
 config IRQ_WORK
 	bool
+	depends on SMP
+	default y
 
 config BUILDTIME_TABLE_SORT
 	bool
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 04e35dbe6799..6ae441e13804 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1209,6 +1209,20 @@ static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
 	return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
 }
 
+static void kick_bh_pool(struct worker_pool *pool)
+{
+#ifdef CONFIG_SMP
+	if (unlikely(pool->cpu != smp_processor_id())) {
+		irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
+		return;
+	}
+#endif
+	if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
+		raise_softirq_irqoff(HI_SOFTIRQ);
+	else
+		raise_softirq_irqoff(TASKLET_SOFTIRQ);
+}
+
 /**
  * kick_pool - wake up an idle worker if necessary
  * @pool: pool to kick
@@ -1227,15 +1241,7 @@ static bool kick_pool(struct worker_pool *pool)
 		return false;
 
 	if (pool->flags & POOL_BH) {
-		if (likely(pool->cpu == smp_processor_id())) {
-			if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
-				raise_softirq_irqoff(HI_SOFTIRQ);
-			else
-				raise_softirq_irqoff(TASKLET_SOFTIRQ);
-		} else {
-			irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
-		}
-
+		kick_bh_pool(pool);
 		return true;
 	}
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK
  2024-02-16  1:23 ` [PATCH wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK Tejun Heo
@ 2024-02-16  5:10   ` Tejun Heo
  2024-02-16 10:24     ` Anders Roxell
  0 siblings, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2024-02-16  5:10 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: linux-kernel, torvalds, Naresh Kamboju, Anders Roxell,
	Peter Zijlstra, Valentin Schneider, Sebastian Andrzej Siewior,
	Stephen Rothwell

2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues") added
irq_work usage to workqueue; however, it turns out irq_work is actually
optional and the change breaks build on configuration which doesn't have
CONFIG_IRQ_WORK enabled.

Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
be better to just always enable it. However, this still saves a small bit of
memory for tiny UP configs and also the least amount of change, so, for now,
let's keep it conditional.

Verified to do the right thing for x86_64 allnoconfig and defconfig, and
aarch64 allnoconfig, allnoconfig + prink disable (SMP but nothing selects
IRQ_WORK) and a modified aarch64 Kconfig where !SMP and nothing selects
IRQ_WORK.

v2: `depends on SMP` leads to Kconfig warnings when CONFIG_IRQ_WORK is
    selected by something else when !CONFIG_SMP. Use `def_bool y if SMP`
    instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues")
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
---
Hello,

Unfortunately, the previous patch triggers Kconfig warnings when IRQ_WORK is
selected by something else but !CONFIG_SMP. This one seems to do the right
thing in all cases.

Naresh, Anders, can you please test it again?

Thanks.

 init/Kconfig       |  2 +-
 kernel/workqueue.c | 24 +++++++++++++++---------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 8df18f3a9748..0d21c9e0398f 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -106,7 +106,7 @@ config CONSTRUCTORS
 	bool
 
 config IRQ_WORK
-	bool
+	def_bool y if SMP
 
 config BUILDTIME_TABLE_SORT
 	bool
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 04e35dbe6799..6ae441e13804 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1209,6 +1209,20 @@ static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
 	return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
 }
 
+static void kick_bh_pool(struct worker_pool *pool)
+{
+#ifdef CONFIG_SMP
+	if (unlikely(pool->cpu != smp_processor_id())) {
+		irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
+		return;
+	}
+#endif
+	if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
+		raise_softirq_irqoff(HI_SOFTIRQ);
+	else
+		raise_softirq_irqoff(TASKLET_SOFTIRQ);
+}
+
 /**
  * kick_pool - wake up an idle worker if necessary
  * @pool: pool to kick
@@ -1227,15 +1241,7 @@ static bool kick_pool(struct worker_pool *pool)
 		return false;
 
 	if (pool->flags & POOL_BH) {
-		if (likely(pool->cpu == smp_processor_id())) {
-			if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
-				raise_softirq_irqoff(HI_SOFTIRQ);
-			else
-				raise_softirq_irqoff(TASKLET_SOFTIRQ);
-		} else {
-			irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
-		}
-
+		kick_bh_pool(pool);
 		return true;
 	}
 
-- 
2.43.2


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK
  2024-02-16  5:10   ` [PATCH v2 " Tejun Heo
@ 2024-02-16 10:24     ` Anders Roxell
  2024-02-16 16:35       ` Tejun Heo
  0 siblings, 1 reply; 7+ messages in thread
From: Anders Roxell @ 2024-02-16 10:24 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Lai Jiangshan, linux-kernel, torvalds, Naresh Kamboju,
	Peter Zijlstra, Valentin Schneider, Sebastian Andrzej Siewior,
	Stephen Rothwell

On Fri, 16 Feb 2024 at 06:10, Tejun Heo <tj@kernel.org> wrote:
>
> 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues") added
> irq_work usage to workqueue; however, it turns out irq_work is actually
> optional and the change breaks build on configuration which doesn't have
> CONFIG_IRQ_WORK enabled.
>
> Fix build by making workqueue use irq_work only when CONFIG_SMP and enabling
> CONFIG_IRQ_WORK when CONFIG_SMP is set. It's reasonable to argue that it may
> be better to just always enable it. However, this still saves a small bit of
> memory for tiny UP configs and also the least amount of change, so, for now,
> let's keep it conditional.
>
> Verified to do the right thing for x86_64 allnoconfig and defconfig, and
> aarch64 allnoconfig, allnoconfig + prink disable (SMP but nothing selects
> IRQ_WORK) and a modified aarch64 Kconfig where !SMP and nothing selects
> IRQ_WORK.
>
> v2: `depends on SMP` leads to Kconfig warnings when CONFIG_IRQ_WORK is
>     selected by something else when !CONFIG_SMP. Use `def_bool y if SMP`
>     instead.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
> Fixes: 2f34d7337d98 ("workqueue: Fix queue_work_on() with BH workqueues")
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>
> ---
> Hello,
>
> Unfortunately, the previous patch triggers Kconfig warnings when IRQ_WORK is
> selected by something else but !CONFIG_SMP. This one seems to do the right
> thing in all cases.
>
> Naresh, Anders, can you please test it again?

I applied this patch on yesterdays next tag next-20240215 and I was able to
build the arm64 tinyconfig.

Tested-by: Anders Roxell <anders.roxell@linaro.org>

>
> Thanks.
>
>  init/Kconfig       |  2 +-
>  kernel/workqueue.c | 24 +++++++++++++++---------
>  2 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index 8df18f3a9748..0d21c9e0398f 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -106,7 +106,7 @@ config CONSTRUCTORS
>         bool
>
>  config IRQ_WORK
> -       bool
> +       def_bool y if SMP
>
>  config BUILDTIME_TABLE_SORT
>         bool
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 04e35dbe6799..6ae441e13804 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1209,6 +1209,20 @@ static struct irq_work *bh_pool_irq_work(struct worker_pool *pool)
>         return &per_cpu(bh_pool_irq_works, pool->cpu)[high];
>  }
>
> +static void kick_bh_pool(struct worker_pool *pool)
> +{
> +#ifdef CONFIG_SMP
> +       if (unlikely(pool->cpu != smp_processor_id())) {
> +               irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
> +               return;
> +       }
> +#endif
> +       if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
> +               raise_softirq_irqoff(HI_SOFTIRQ);
> +       else
> +               raise_softirq_irqoff(TASKLET_SOFTIRQ);
> +}
> +
>  /**
>   * kick_pool - wake up an idle worker if necessary
>   * @pool: pool to kick
> @@ -1227,15 +1241,7 @@ static bool kick_pool(struct worker_pool *pool)
>                 return false;
>
>         if (pool->flags & POOL_BH) {
> -               if (likely(pool->cpu == smp_processor_id())) {
> -                       if (pool->attrs->nice == HIGHPRI_NICE_LEVEL)
> -                               raise_softirq_irqoff(HI_SOFTIRQ);
> -                       else
> -                               raise_softirq_irqoff(TASKLET_SOFTIRQ);
> -               } else {
> -                       irq_work_queue_on(bh_pool_irq_work(pool), pool->cpu);
> -               }
> -
> +               kick_bh_pool(pool);
>                 return true;
>         }
>
> --
> 2.43.2
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK
  2024-02-16 10:24     ` Anders Roxell
@ 2024-02-16 16:35       ` Tejun Heo
  0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2024-02-16 16:35 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Lai Jiangshan, linux-kernel, torvalds, Naresh Kamboju,
	Peter Zijlstra, Valentin Schneider, Sebastian Andrzej Siewior,
	Stephen Rothwell

On Fri, Feb 16, 2024 at 11:24:26AM +0100, Anders Roxell wrote:
> I applied this patch on yesterdays next tag next-20240215 and I was able to
> build the arm64 tinyconfig.
> 
> Tested-by: Anders Roxell <anders.roxell@linaro.org>

Thanks for confirming. Applying to wq/for-6.9. Hopefully, this won't cause
any new breakages.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-02-16 16:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-14 18:39 [PATCH wq/for-6.9] workqueue: Fix queue_work_on() with BH workqueues Tejun Heo
2024-02-14 19:03 ` Linus Torvalds
2024-02-14 19:16   ` Tejun Heo
2024-02-16  1:23 ` [PATCH wq/for-6.9] workqueue, irq_work: Build fix for !CONFIG_IRQ_WORK Tejun Heo
2024-02-16  5:10   ` [PATCH v2 " Tejun Heo
2024-02-16 10:24     ` Anders Roxell
2024-02-16 16:35       ` Tejun Heo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.