All of lore.kernel.org
 help / color / mirror / Atom feed
From: Schspa Shi <schspa@gmail.com>
To: jiangshanlai@gmail.com, tj@kernel.org
Cc: linux-kernel@vger.kernel.org, zhaohui.shi@horizon.ai,
	Schspa Shi <schspa@gmail.com>
Subject: [PATCH v3] workqueue: Use active mask for new worker when pool is DISASSOCIATED
Date: Thu, 14 Jul 2022 11:16:45 +0800	[thread overview]
Message-ID: <20220714031645.28004-1-schspa@gmail.com> (raw)

When CPU-[un]hotplugs, all workers will be bound to active CPU via
unbind_workers().

But the unbound worker still has a chance to create a new worker, which
has bound the newly created task to pool->attrs->cpumask. But the CPU has
been unplugged.

Please refer to the following scenarios.

           CPU0                                  CPU1
------------------------------------------------------------------
sched_cpu_deactivate(cpu_active_mask clear)
workqueue_offline_cpu(work pool POOL_DISASSOCIATED)
  -- all worker will migrate to another cpu --
                                    worker_thread
                                    -- will create new worker if
                                       pool->worklist is not empty
                                       create_worker()
                                     -- new kworker will bound to CPU0
                               (pool->attrs->cpumask will be mask of CPU0).
      kworker/0:x will running on rq

sched_cpu_dying
  if (rq->nr_running != 1 || rq_has_pinned_tasks(rq))
    WARN(true, "Dying CPU not properly vacated!");
      ---------OOPS-------------

And the crash log as fellowing.

The crash log is as follows:
[ 1622.829074] ------------[ cut here ]------------
[ 1622.829081] Dying CPU not properly vacated!
[ 1622.829091] WARNING: CPU: 3 PID: 31 at kernel/sched/core.c:7756 sched_cpu_dying+0x74/0x204
[ 1622.829374] CPU: 3 PID: 31 Comm: migration/3 Tainted: P           O      5.10.59-rt52 #2
[ 1622.829386] Stopper: multi_cpu_stop+0x0/0x160 <- 0x0
[ 1622.829400] pstate: 60c00009 (nZCv daif +PAN +UAO -TCO BTYPE=--)
[ 1622.829408] pc : sched_cpu_dying+0x74/0x204
[ 1622.829415] lr : sched_cpu_dying+0x74/0x204
[ 1622.829421] sp : ffff800012933c70
[ 1622.829424] pmr_save: 00000060
[ 1622.829426] x29: ffff800012933c70 x28: 0000000000000000
[ 1622.829435] x27: 0000000000000000 x26: 0000000000000001
[ 1622.829444] x25: 0000000000000000 x24: ffff800018353c2c
[ 1622.829452] x23: 0000000000000003 x22: 0000000000000003
[ 1622.829460] x21: 0000000000000059 x20: 0000000000000000
[ 1622.829468] x19: ffff00027ee68a80 x18: 0000000000000000
[ 1622.829477] x17: 0000000000000000 x16: 0000000000000000
[ 1622.829485] x15: ffffffffffffffff x14: ffff80001169ae30
[ 1622.829493] x13: ffffffffffc38c07 x12: ffffffffffffffff
[ 1622.829501] x11: ffffffffffe00000 x10: ffff80001169ae58
[ 1622.829510] x9 : 000000000000001e x8 : ffff80001169ae30
[ 1622.829518] x7 : ffff800012933ab0 x6 : 00000000ffff0e20
[ 1622.829526] x5 : ffff00027ee62988 x4 : 00000000ffff0e20
[ 1622.829535] x3 : ffff800011e30180 x2 : 0000000100000002
[ 1622.829543] x1 : 0000000000000000 x0 : 0000000000000000
[ 1622.829552] Call trace:
[ 1622.829555]  sched_cpu_dying+0x74/0x204
[ 1622.829562]  cpuhp_invoke_callback+0xc0/0x1b0
[ 1622.829571]  take_cpu_down+0xbc/0xd4
[ 1622.829577]  multi_cpu_stop+0x138/0x160
[ 1622.829584]  cpu_stopper_thread+0x9c/0x118
[ 1622.829591]  smpboot_thread_fn+0x1e8/0x1ec
[ 1622.829598]  kthread+0x114/0x124
[ 1622.829606]  ret_from_fork+0x10/0x30
[ 1622.829615] ---[ end trace 0000000000000002 ]---
[ 1623.830273] CPU3 enqueued tasks (2 total):
[ 1623.830291] 	pid: 31, name: migration/3
[ 1623.830440] 	pid: 25654, name: kworker/3:0
[ 1623.830444] task:kworker/3:0     state:R  running task     stack:    0 pid:25654 ppid:     2 flags:0x00000028
[ 1623.830458] Call trace:
[ 1623.830460]  __switch_to+0x164/0x17c
[ 1623.830472]  __schedule+0x4cc/0x5c0
[ 1623.830483]  schedule+0x7c/0xcc
[ 1623.830491]  schedule_preempt_disabled+0x14/0x24
[ 1623.830500]  kthread+0xd8/0x124
[ 1623.830509]  ret_from_fork+0x10/0x30

In the crash log, the error enqueued kworker(pid: 25654, name: kworker/3:0)
is a newly created thread, and have bind to a unpluged CPU 3.

Can crash dump can also verify this.
crash> task -R nr_cpus_allowed,cpus_mask 25654
PID: 25654  TASK: ffff000181ff0000  CPU: 3   COMMAND: "kworker/3:0"
  nr_cpus_allowed = 1,
  cpus_mask = {
    bits = {8, 0}
  },

crash> struct worker_pool.cpu,nr_workers,attrs 0xffff00027ee68380
  cpu = 3,
  nr_workers = 3,
  attrs = 0xffff000180004480,
crash> struct workqueue_attrs 0xffff000180004480
struct workqueue_attrs {
  nice = 0,
  cpumask = {{
      bits = {8, 0}
    }},
  no_numa = false
}

The stack trace of the bad running task was dumped via the following patch:
Link: https://lore.kernel.org/all/20220519161125.41144-1-schspa@gmail.com/
And I think this debug patch needs to be added to the mainline,
it can help us to debug this kind of problem

To fix it, we can use cpu_active_mask when work pool is DISASSOCIATED.

Signed-off-by: Schspa Shi <schspa@gmail.com>

--

Changelog:
v1 -> v2:
        - Move worker task bind to worker_attach_to_pool, remove extra
        wq_pool_attach_mutex added.
        - Add a timing diagram to make this question clearer.
v2 -> v3:
        - Add missing PF_NO_SETAFFINITY, use cpumask_intersects to
        avoid setting bad mask for unbound work pool as Lai Jiangshan
        advised.
        - Call kthread_set_pre_cpu correctly for unbound worker.
---
 kernel/workqueue.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 1ea50f6be843..b3e9289d9640 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1860,8 +1860,16 @@ static struct worker *alloc_worker(int node)
 static void worker_attach_to_pool(struct worker *worker,
 				   struct worker_pool *pool)
 {
+	const struct cpumask *cpu_mask;
+
 	mutex_lock(&wq_pool_attach_mutex);
 
+	if (cpumask_intersects(pool->attrs->cpumask, cpu_active_mask))
+		cpu_mask = pool->attrs->cpumask;
+	else
+		cpu_mask = wq_unbound_cpumask;
+
+	set_cpus_allowed_ptr(worker->task, cpu_mask);
 	/*
 	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
 	 * stable across this function.  See the comments above the flag
@@ -1870,10 +1878,8 @@ static void worker_attach_to_pool(struct worker *worker,
 	if (pool->flags & POOL_DISASSOCIATED)
 		worker->flags |= WORKER_UNBOUND;
 	else
-		kthread_set_per_cpu(worker->task, pool->cpu);
-
-	if (worker->rescue_wq)
-		set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+		kthread_set_per_cpu(worker->task,
+				cpu_mask == wq_unbound_cpumask ? -1 : pool->cpu);
 
 	list_add_tail(&worker->node, &pool->workers);
 	worker->pool = pool;
@@ -1952,8 +1958,8 @@ static struct worker *create_worker(struct worker_pool *pool)
 		goto fail;
 
 	set_user_nice(worker->task, pool->attrs->nice);
-	kthread_bind_mask(worker->task, pool->attrs->cpumask);
 
+	worker->task->flags |= PF_NO_SETAFFINITY;
 	/* successful, attach the worker to the pool */
 	worker_attach_to_pool(worker, pool);
 
-- 
2.29.0


             reply	other threads:[~2022-07-14  3:17 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-14  3:16 Schspa Shi [this message]
2022-07-14  8:34 ` [PATCH v3] workqueue: Use active mask for new worker when pool is DISASSOCIATED Lai Jiangshan
2022-07-14  8:58   ` Zhaohui Shi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220714031645.28004-1-schspa@gmail.com \
    --to=schspa@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    --cc=zhaohui.shi@horizon.ai \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.