From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753827AbdFMU6l (ORCPT );
	Tue, 13 Jun 2017 16:58:41 -0400
Received: from mail-yb0-f176.google.com ([209.85.213.176]:34490 "EHLO
	mail-yb0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752846AbdFMU6j (ORCPT );
	Tue, 13 Jun 2017 16:58:39 -0400
Date: Tue, 13 Jun 2017 16:58:37 -0400
From: Tejun Heo
To: "Paul E. McKenney"
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?
Message-ID: <20170613205837.GB7359@htj.duckdns.org>
References: <20170501165747.GA993@linux.vnet.ibm.com>
 <20170501183807.GA7054@linux.vnet.ibm.com>
 <20170501184402.GB8921@htj.duckdns.org>
 <20170501185819.GJ3956@linux.vnet.ibm.com>
 <20170505171159.GA10296@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170505171159.GA10296@linux.vnet.ibm.com>
User-Agent: Mutt/1.8.2 (2017-04-18)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Paul.

On Fri, May 05, 2017 at 10:11:59AM -0700, Paul E. McKenney wrote:
> Just following up... I have hit this bug a couple of times over the
> past few days. Anything I can do to help?

My apologies for dropping the ball on this. I've gone over the hotplug
code in workqueue several times but can't really find how this would
happen. Can you please apply the following patch and see what it says
when the problem happens?

Thanks.
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c74bf39ef764..bd2ce3cbfb41 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1691,13 +1691,20 @@ static struct worker *alloc_worker(int node)
 static void worker_attach_to_pool(struct worker *worker,
 				   struct worker_pool *pool)
 {
+	int ret;
+
 	mutex_lock(&pool->attach_mutex);
 
 	/*
 	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
 	 * online CPUs. It'll be re-applied when any of the CPUs come up.
 	 */
-	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+	ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+
+	WARN(ret && !(pool->flags & POOL_DISASSOCIATED),
+	     "set_cpus_allowed_ptr failed, ret=%d pool->cpu/flags=%d/0x%x cpumask=%*pbl online=%*pbl active=%*pbl\n",
+	     ret, pool->cpu, pool->flags, cpumask_pr_args(pool->attrs->cpumask),
+	     cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));
 
 	/*
 	 * The pool->attach_mutex ensures %POOL_DISASSOCIATED remains
@@ -2037,8 +2044,11 @@ __acquires(&pool->lock)
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
 	/* ensure we're on the correct CPU */
-	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
-		     raw_smp_processor_id() != pool->cpu);
+	if (WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
+			 raw_smp_processor_id() != pool->cpu))
+		printk_once("XXX workfn=%pf pool->cpu/flags=%d/0x%x curcpu=%d online=%*pbl active=%*pbl\n",
+			    work->func, pool->cpu, pool->flags, raw_smp_processor_id(),
+			    cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));
 
 	/*
 	 * A single work shouldn't be executed concurrently by