Subject: [PATCH 0/2] workqueue: Fix migrate_disable hotplug changes vs kworker affinity
From: Valentin Schneider
Date: 2020-12-10 16:38 UTC
To: linux-kernel
Cc: Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vincent.donnefort, tj, ouwen210, Qian Cai

Hi folks,

This should fix the issue reported by Qian at [1]. Dietmar mentioned some
other issue with hotplug & deadline tasks, but that's being investigated by
someone else ATM.

I would like to mention I suspect this bug comes straight from $hell, as
I've never ever had to fight off so many (mostly unrelated) issues while
looking into it. Distro kernel being mangled, git tree corruption, periods
of heisenbugism...

Cheers,
Valentin

[1]: https://lore.kernel.org/r/ff62e3ee994efb3620177bf7b19fab16f4866845.camel@redhat.com

Valentin Schneider (2):
  stop_machine: Add caller debug info to queue_stop_cpus_work
  workqueue: Fix affinity of kworkers attached during late hotplug

 kernel/stop_machine.c |  1 +
 kernel/workqueue.c    | 24 +++++++++++++++++-------
 2 files changed, 18 insertions(+), 7 deletions(-)

--
2.27.0
Subject: [PATCH 1/2] stop_machine: Add caller debug info to queue_stop_cpus_work
From: Valentin Schneider
Date: 2020-12-10 16:38 UTC
To: linux-kernel
Cc: Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vincent.donnefort, tj, ouwen210, Qian Cai

Most callsites were covered by commit a8b62fd08505 ("stop_machine: Add
function and caller debug info") but this skipped queue_stop_cpus_work().
Add caller debug info to it.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/stop_machine.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 971d8acceaec..cbc30271ea4d 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -409,6 +409,7 @@ static bool queue_stop_cpus_work(const struct cpumask *cpumask,
 		work->fn = fn;
 		work->arg = arg;
 		work->done = done;
+		work->caller = _RET_IP_;
 		if (cpu_stop_queue_work(cpu, work))
 			queued = true;
 	}
--
2.27.0
Subject: [tip: sched/core] stop_machine: Add caller debug info to queue_stop_cpus_work
From: tip-bot2 for Valentin Schneider
Date: 2021-03-23 15:08 UTC
To: linux-tip-commits
Cc: Valentin Schneider, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     2a2f80ff63bc36a874ed569bcaef932a8fe43514
Gitweb:        https://git.kernel.org/tip/2a2f80ff63bc36a874ed569bcaef932a8fe43514
Author:        Valentin Schneider <valentin.schneider@arm.com>
AuthorDate:    Thu, 10 Dec 2020 16:38:29
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 23 Mar 2021 16:01:58 +01:00

stop_machine: Add caller debug info to queue_stop_cpus_work

Most callsites were covered by commit a8b62fd08505 ("stop_machine: Add
function and caller debug info") but this skipped queue_stop_cpus_work().
Add caller debug info to it.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201210163830.21514-2-valentin.schneider@arm.com
---
 kernel/stop_machine.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 971d8ac..cbc3027 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -409,6 +409,7 @@ static bool queue_stop_cpus_work(const struct cpumask *cpumask,
 		work->fn = fn;
 		work->arg = arg;
 		work->done = done;
+		work->caller = _RET_IP_;
 		if (cpu_stop_queue_work(cpu, work))
 			queued = true;
 	}
Subject: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Valentin Schneider
Date: 2020-12-10 16:38 UTC
To: linux-kernel
Cc: Qian Cai, Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vincent.donnefort, tj, ouwen210

Per-CPU kworkers forcefully migrated away by hotplug via
workqueue_offline_cpu() can end up spawning more kworkers via

  manage_workers() -> maybe_create_worker()

Workers created at this point will be bound using

  pool->attrs->cpumask

which in this case is wrong, as the hotplug state machine already migrated
all pinned kworkers away from this CPU. This ends up triggering the BUG_ON
condition in sched_cpu_dying() (i.e. there's a kworker enqueued on the
dying rq).

Special-case workers being attached to DISASSOCIATED pools and bind them to
cpu_active_mask, mimicking them being present when workqueue_offline_cpu()
was invoked.

Link: https://lore.kernel.org/r/ff62e3ee994efb3620177bf7b19fab16f4866845.camel@redhat.com
Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
Reported-by: Qian Cai <cai@redhat.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
---
 kernel/workqueue.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 9880b6c0e272..fb1418edf85c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1848,19 +1848,29 @@ static void worker_attach_to_pool(struct worker *worker,
 {
 	mutex_lock(&wq_pool_attach_mutex);
 
-	/*
-	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
-	 * online CPUs. It'll be re-applied when any of the CPUs come up.
-	 */
-	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
-
 	/*
 	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
 	 * stable across this function. See the comments above the flag
 	 * definition for details.
+	 *
+	 * Worker might get attached to a pool *after* workqueue_offline_cpu()
+	 * was run - e.g. created by manage_workers() from a kworker which was
+	 * forcefully moved away by hotplug. Kworkers created from this point on
+	 * need to have their affinity changed as if they were present during
+	 * workqueue_offline_cpu().
+	 *
+	 * This will be resolved in rebind_workers().
 	 */
-	if (pool->flags & POOL_DISASSOCIATED)
+	if (pool->flags & POOL_DISASSOCIATED) {
 		worker->flags |= WORKER_UNBOUND;
+		set_cpus_allowed_ptr(worker->task, cpu_active_mask);
+	} else {
+		/*
+		 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
+		 * online CPUs. It'll be re-applied when any of the CPUs come up.
+		 */
+		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+	}
 
 	list_add_tail(&worker->node, &pool->workers);
 	worker->pool = pool;
--
2.27.0
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Vincent Donnefort
Date: 2020-12-11 11:39 UTC
To: Valentin Schneider
Cc: linux-kernel, Qian Cai, Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, tj, ouwen210

Hi Valentin,

On Thu, Dec 10, 2020 at 04:38:30PM +0000, Valentin Schneider wrote:
> Per-CPU kworkers forcefully migrated away by hotplug via
> workqueue_offline_cpu() can end up spawning more kworkers via
>
>   manage_workers() -> maybe_create_worker()
>
> Workers created at this point will be bound using
>
>   pool->attrs->cpumask
>
> which in this case is wrong, as the hotplug state machine already migrated
> all pinned kworkers away from this CPU. This ends up triggering the BUG_ON
> condition in sched_cpu_dying() (i.e. there's a kworker enqueued on the
> dying rq).
>
> Special-case workers being attached to DISASSOCIATED pools and bind them to
> cpu_active_mask, mimicking them being present when workqueue_offline_cpu()
> was invoked.
>
> Link: https://lore.kernel.org/r/ff62e3ee994efb3620177bf7b19fab16f4866845.camel@redhat.com
> Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")

Isn't the problem introduced by 1cf12e0 ("sched/hotplug: Consolidate task
migration on CPU unplug")?

Previously we had:

  AP_WORKQUEUE_ONLINE -> set POOL_DISASSOCIATED
  ...
  TEARDOWN_CPU -> clear CPU in cpu_online_mask
   |
   |-AP_SCHED_STARTING -> migrate_tasks()
   |
  AP_OFFLINE

worker_attach_to_pool() is "protected" by the cpu_online_mask in
set_cpus_allowed_ptr(). IIUC, now that tasks are migrated before
cpu_online_mask is actually flipped, there's a window between
CPUHP_AP_SCHED_WAIT_EMPTY and CPUHP_TEARDOWN_CPU where a kworker can wake
up a new one, for the hotunplugged pool, that wouldn't be caught by the
hotunplug migration.

> Reported-by: Qian Cai <cai@redhat.com>
> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
> ---
>  kernel/workqueue.c | 24 +++++++++++++++++-------
>  1 file changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 9880b6c0e272..fb1418edf85c 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1848,19 +1848,29 @@ static void worker_attach_to_pool(struct worker *worker,
>  {
>  	mutex_lock(&wq_pool_attach_mutex);
>
> -	/*
> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> -	 * online CPUs. It'll be re-applied when any of the CPUs come up.
> -	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> -
>  	/*
>  	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>  	 * stable across this function. See the comments above the flag
>  	 * definition for details.
> +	 *
> +	 * Worker might get attached to a pool *after* workqueue_offline_cpu()
> +	 * was run - e.g. created by manage_workers() from a kworker which was
> +	 * forcefully moved away by hotplug. Kworkers created from this point on
> +	 * need to have their affinity changed as if they were present during
> +	 * workqueue_offline_cpu().
> +	 *
> +	 * This will be resolved in rebind_workers().
>  	 */
> -	if (pool->flags & POOL_DISASSOCIATED)
> +	if (pool->flags & POOL_DISASSOCIATED) {
>  		worker->flags |= WORKER_UNBOUND;
> +		set_cpus_allowed_ptr(worker->task, cpu_active_mask);
> +	} else {
> +		/*
> +		 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> +		 * online CPUs. It'll be re-applied when any of the CPUs come up.
> +		 */

Does this comment still stand? IIUC, we should always be in the
POOL_DISASSOCIATED case if the CPU from cpumask is offline. Unless a
pool->attrs->cpumask can have several CPUs. In that case maybe we should
check for the cpu_active_mask here too?

--
Vincent

> +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +	}
>
>  	list_add_tail(&worker->node, &pool->workers);
>  	worker->pool = pool;
> --
> 2.27.0
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Peter Zijlstra
Date: 2020-12-11 12:41 UTC
To: Vincent Donnefort
Cc: Valentin Schneider, linux-kernel, Qian Cai, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, tj, ouwen210

On Fri, Dec 11, 2020 at 11:39:21AM +0000, Vincent Donnefort wrote:
> Hi Valentin,
>
> On Thu, Dec 10, 2020 at 04:38:30PM +0000, Valentin Schneider wrote:
> > Per-CPU kworkers forcefully migrated away by hotplug via
> > workqueue_offline_cpu() can end up spawning more kworkers via
> >
> >   manage_workers() -> maybe_create_worker()
> >
> > Workers created at this point will be bound using
> >
> >   pool->attrs->cpumask
> >
> > which in this case is wrong, as the hotplug state machine already migrated
> > all pinned kworkers away from this CPU. This ends up triggering the BUG_ON
> > condition in sched_cpu_dying() (i.e. there's a kworker enqueued on the
> > dying rq).
> >
> > Special-case workers being attached to DISASSOCIATED pools and bind them to
> > cpu_active_mask, mimicking them being present when workqueue_offline_cpu()
> > was invoked.
> >
> > Link: https://lore.kernel.org/r/ff62e3ee994efb3620177bf7b19fab16f4866845.camel@redhat.com
> > Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
>
> Isn't the problem introduced by 1cf12e0 ("sched/hotplug: Consolidate task
> migration on CPU unplug")?
>
> Previously we had:
>
>   AP_WORKQUEUE_ONLINE -> set POOL_DISASSOCIATED
>   ...
>   TEARDOWN_CPU -> clear CPU in cpu_online_mask
>    |
>    |-AP_SCHED_STARTING -> migrate_tasks()
>    |
>   AP_OFFLINE
>
> worker_attach_to_pool() is "protected" by the cpu_online_mask in
> set_cpus_allowed_ptr(). IIUC, now that tasks are migrated before
> cpu_online_mask is actually flipped, there's a window between
> CPUHP_AP_SCHED_WAIT_EMPTY and CPUHP_TEARDOWN_CPU where a kworker can wake
> up a new one, for the hotunplugged pool, that wouldn't be caught by the
> hotunplug migration.

Yes, very much so, however the commit Valentin picked was supposed to
preemptively fix this. So we can consider this a fix for the fix. But I
don't mind an alternative or perhaps even a second Fixes tag on this.
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Peter Zijlstra
Date: 2020-12-11 12:51 UTC
To: Vincent Donnefort
Cc: Valentin Schneider, linux-kernel, Qian Cai, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, tj, ouwen210

On Fri, Dec 11, 2020 at 11:39:21AM +0000, Vincent Donnefort wrote:
> On Thu, Dec 10, 2020 at 04:38:30PM +0000, Valentin Schneider wrote:
> > +	if (pool->flags & POOL_DISASSOCIATED) {
> >  		worker->flags |= WORKER_UNBOUND;
> > +		set_cpus_allowed_ptr(worker->task, cpu_active_mask);
> > +	} else {
> > +		/*
> > +		 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> > +		 * online CPUs. It'll be re-applied when any of the CPUs come up.
> > +		 */
>
> Does this comment still stand? IIUC, we should always be in the
> POOL_DISASSOCIATED case if the CPU from cpumask is offline. Unless a
> pool->attrs->cpumask can have several CPUs. In that case maybe we should
> check for the cpu_active_mask here too?

IIUC it can be a numa mask, and would still be valid in that case.

> > +		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> > +	}
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Valentin Schneider
Date: 2020-12-11 12:51 UTC
To: Vincent Donnefort
Cc: linux-kernel, Qian Cai, Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, tj, ouwen210

Hi Vincent,

On 11/12/20 11:39, Vincent Donnefort wrote:
> Hi Valentin,
>
> On Thu, Dec 10, 2020 at 04:38:30PM +0000, Valentin Schneider wrote:
>> Fixes: 06249738a41a ("workqueue: Manually break affinity on hotplug")
>
> Isn't the problem introduced by 1cf12e0 ("sched/hotplug: Consolidate task
> migration on CPU unplug")?
>
> Previously we had:
>
>   AP_WORKQUEUE_ONLINE -> set POOL_DISASSOCIATED
>   ...
>   TEARDOWN_CPU -> clear CPU in cpu_online_mask
>    |
>    |-AP_SCHED_STARTING -> migrate_tasks()
>    |
>   AP_OFFLINE
>
> worker_attach_to_pool() is "protected" by the cpu_online_mask in
> set_cpus_allowed_ptr(). IIUC, now that tasks are migrated before
> cpu_online_mask is actually flipped, there's a window between
> CPUHP_AP_SCHED_WAIT_EMPTY and CPUHP_TEARDOWN_CPU where a kworker can wake
> up a new one, for the hotunplugged pool, that wouldn't be caught by the
> hotunplug migration.
>

You're right, the splat should only happen with that other commit. That
said, this fix complements the one referred to in Fixes:, which is the
"logic" I went for.

>> Reported-by: Qian Cai <cai@redhat.com>
>> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
>> ---
>>  kernel/workqueue.c | 24 +++++++++++++++++-------
>>  1 file changed, 17 insertions(+), 7 deletions(-)
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 9880b6c0e272..fb1418edf85c 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -1848,19 +1848,29 @@ static void worker_attach_to_pool(struct worker *worker,
>>  {
>>  	mutex_lock(&wq_pool_attach_mutex);
>>
>> -	/*
>> -	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>> -	 * online CPUs. It'll be re-applied when any of the CPUs come up.
>> -	 */
>> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>> -
>>  	/*
>>  	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>  	 * stable across this function. See the comments above the flag
>>  	 * definition for details.
>> +	 *
>> +	 * Worker might get attached to a pool *after* workqueue_offline_cpu()
>> +	 * was run - e.g. created by manage_workers() from a kworker which was
>> +	 * forcefully moved away by hotplug. Kworkers created from this point on
>> +	 * need to have their affinity changed as if they were present during
>> +	 * workqueue_offline_cpu().
>> +	 *
>> +	 * This will be resolved in rebind_workers().
>>  	 */
>> -	if (pool->flags & POOL_DISASSOCIATED)
>> +	if (pool->flags & POOL_DISASSOCIATED) {
>>  		worker->flags |= WORKER_UNBOUND;
>> +		set_cpus_allowed_ptr(worker->task, cpu_active_mask);
>> +	} else {
>> +		/*
>> +		 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>> +		 * online CPUs. It'll be re-applied when any of the CPUs come up.
>> +		 */
>
> Does this comment still stand? IIUC, we should always be in the
> POOL_DISASSOCIATED case if the CPU from cpumask is offline. Unless a
> pool->attrs->cpumask can have several CPUs.

AIUI that should be the case for unbound pools.

> In that case maybe we should check for the cpu_active_mask here too?

Looking at it again, I think we might need to.

IIUC you can end up with pools bound to a single NUMA node (?). In that
case, say the last CPU of a node is going down, then:

  workqueue_offline_cpu()
    wq_update_unbound_numa()
      alloc_unbound_pwq()
        get_unbound_pool()

would still pick that node, because it doesn't look at the online / active
mask. And at this point, we would affine the kworkers to that node, and
we're back to having kworkers enqueued on a (!active, online) CPU that is
going down...

The annoying thing is we can't just compare attrs->cpumask with
cpu_active_mask, because workqueue_offline_cpu() happens a few steps below
sched_cpu_deactivate() (CPUHP_AP_ACTIVE):

  CPUHP_ONLINE -> CPUHP_AP_ACTIVE  # CPU X is !active

    # Some new kworker gets created here
    worker_attach_to_pool()
      !cpumask_subset(attrs->cpumask, cpu_active_mask) ->
        affine worker to active CPUs

  CPUHP_AP_ACTIVE -> CPUHP_ONLINE  # CPU X is active

  # Nothing will ever correct the kworker's affinity :(
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Valentin Schneider
Date: 2020-12-11 13:13 UTC
To: Vincent Donnefort
Cc: linux-kernel, Qian Cai, Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, tj, ouwen210

On 11/12/20 12:51, Valentin Schneider wrote:
>> In that case maybe we should check for the cpu_active_mask here too?
>
> Looking at it again, I think we might need to.
>
> IIUC you can end up with pools bound to a single NUMA node (?). In that
> case, say the last CPU of a node is going down, then:
>
>   workqueue_offline_cpu()
>     wq_update_unbound_numa()
>       alloc_unbound_pwq()
>         get_unbound_pool()
>
> would still pick that node, because it doesn't look at the online / active
> mask. And at this point, we would affine the kworkers to that node, and
> we're back to having kworkers enqueued on a (!active, online) CPU that is
> going down...

Assuming a node covers at least 2 CPUs, that can't actually happen per
is_cpu_allowed().
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of kworkers attached during late hotplug
From: Vincent Donnefort
Date: 2020-12-11 13:16 UTC
To: Valentin Schneider
Cc: linux-kernel, Qian Cai, Peter Zijlstra, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, tj, ouwen210

On Fri, Dec 11, 2020 at 01:13:35PM +0000, Valentin Schneider wrote:
> On 11/12/20 12:51, Valentin Schneider wrote:
> >> In that case maybe we should check for the cpu_active_mask here too?
> >
> > Looking at it again, I think we might need to.
> >
> > IIUC you can end up with pools bound to a single NUMA node (?). In that
> > case, say the last CPU of a node is going down, then:
> >
> >   workqueue_offline_cpu()
> >     wq_update_unbound_numa()
> >       alloc_unbound_pwq()
> >         get_unbound_pool()
> >
> > would still pick that node, because it doesn't look at the online / active
> > mask. And at this point, we would affine the kworkers to that node, and
> > we're back to having kworkers enqueued on a (!active, online) CPU that is
> > going down...
>
> Assuming a node covers at least 2 CPUs, that can't actually happen per
> is_cpu_allowed().

Yes indeed, my bad, no problem here.
Subject: Re: [PATCH 0/2] workqueue: Fix migrate_disable hotplug changes vs kworker affinity
From: Peter Zijlstra
Date: 2020-12-11 12:52 UTC
To: Valentin Schneider
Cc: linux-kernel, tglx, mingo, bigeasy, qais.yousef, swood, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vincent.donnefort, tj, ouwen210, Qian Cai

On Thu, Dec 10, 2020 at 04:38:28PM +0000, Valentin Schneider wrote:
> Valentin Schneider (2):
>   stop_machine: Add caller debug info to queue_stop_cpus_work
>   workqueue: Fix affinity of kworkers attached during late hotplug
>
>  kernel/stop_machine.c |  1 +
>  kernel/workqueue.c    | 24 +++++++++++++++++-------
>  2 files changed, 18 insertions(+), 7 deletions(-)

TJ, would you be okay if I take these through the sched tree that contains
the migrate_disable() patches, such that problem and fix stay together?