linux-kernel.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: Workqueues splat due to ending up on wrong CPU
Date: Tue, 26 Nov 2019 10:33:34 -0800
Message-ID: <20191126183334.GE2867037@devbig004.ftw2.facebook.com>
In-Reply-To: <20191125230312.GP2889@paulmck-ThinkPad-P72>

Hello, Paul.

On Mon, Nov 25, 2019 at 03:03:12PM -0800, Paul E. McKenney wrote:
> I am seeing this occasionally during rcutorture runs in the presence
> of CPU hotplug.  This is on v5.4-rc1 in process_one_work() at the first
> WARN_ON():
> 
> 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> 		     raw_smp_processor_id() != pool->cpu);

Hmm... so it's saying that this worker's pool is supposed to be bound
to a cpu but it's currently running on the wrong cpu.
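
To spell the condition out, here is a minimal userspace sketch of the
predicate that WARN_ON_ONCE() evaluates.  It is a model only, not the
kernel code: struct pool_model and on_wrong_cpu() are made-up names for
illustration, the flag value is illustrative, and the caller passes in
the current CPU where the kernel would use raw_smp_processor_id().

```c
#include <stdbool.h>

/* Illustrative flag value for this sketch; only the bit test matters. */
#define POOL_DISASSOCIATED (1u << 2)

/* Hypothetical stand-in for the two struct worker_pool fields the check reads. */
struct pool_model {
	unsigned int flags;	/* POOL_* flags */
	int cpu;		/* CPU the pool is supposed to be bound to */
};

/*
 * Returns true when the warning would fire: the pool is still
 * associated with its CPU (POOL_DISASSOCIATED is clear), yet the
 * worker is running on some other CPU.
 */
static bool on_wrong_cpu(const struct pool_model *pool, int current_cpu)
{
	return !(pool->flags & POOL_DISASSOCIATED) &&
	       current_cpu != pool->cpu;
}
```

So a pool with POOL_DISASSOCIATED set never trips the check no matter
where its workers run; only a pool that is still bound to its CPU can.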

> What should I do to help further debug this?

Do you always see rescuer_thread in the backtrace?  Can you please
apply the following patch and reproduce the problem?

Thanks.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 914b845ad4ff..6f7f185cd146 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1842,13 +1842,18 @@ static struct worker *alloc_worker(int node)
 static void worker_attach_to_pool(struct worker *worker,
 				   struct worker_pool *pool)
 {
+	int ret;
+
 	mutex_lock(&wq_pool_attach_mutex);
 
 	/*
 	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
 	 * online CPUs.  It'll be re-applied when any of the CPUs come up.
 	 */
-	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+	ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+	if (ret && !(pool->flags & POOL_DISASSOCIATED))
+		printk("XXX worker pid %d failed to attach to cpus of pool %d, ret=%d\n",
+		       task_pid_nr(worker->task), pool->id, ret);
 
 	/*
 	 * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
@@ -2183,8 +2188,10 @@ __acquires(&pool->lock)
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
 	/* ensure we're on the correct CPU */
-	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
-		     raw_smp_processor_id() != pool->cpu);
+	WARN_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
+		  raw_smp_processor_id() != pool->cpu,
+		  "expected on cpu %d but on cpu %d, pool %d, workfn=%ps\n",
+		  pool->cpu, raw_smp_processor_id(), pool->id, work->func);
 
 	/*
 	 * A single work shouldn't be executed concurrently by

-- 
tejun
