Date: Mon, 1 May 2017 14:42:50 -0400
From: Tejun Heo
To: "Paul E. McKenney", Steven Rostedt
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?
Message-ID: <20170501184250.GA8921@htj.duckdns.org>
References: <20170501165747.GA993@linux.vnet.ibm.com>
In-Reply-To: <20170501165747.GA993@linux.vnet.ibm.com>

Hello, Paul.

Hmmm... Steven reported a similar issue.

  http://lkml.kernel.org/r/20170405151628.33df783f@gandalf.local.home

On Mon, May 01, 2017 at 09:57:47AM -0700, Paul E. McKenney wrote:
> Hello!
>
> I am hitting this WARN_ON_ONCE() in process_one_work() and am wondering
> what I did wrong to make this happen:
>
> ------------------------------------------------------------------------
>
> static void process_one_work(struct worker *worker, struct work_struct *work)
> __releases(&pool->lock)
> __acquires(&pool->lock)
> {
> 	struct pool_workqueue *pwq = get_work_pwq(work);
> 	struct worker_pool *pool = worker->pool;
> 	bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
> 	int work_color;
> 	struct worker *collision;
> #ifdef CONFIG_LOCKDEP
> 	/*
> 	 * It is permissible to free the struct work_struct from
> 	 * inside the function that is called from it, this we need to
> 	 * take into account for lockdep too.  To avoid bogus "held
> 	 * lock freed" warnings as well as problems when looking into
> 	 * work->lockdep_map, make a copy and use that here.
> 	 */
> 	struct lockdep_map lockdep_map;
>
> 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
> #endif
> 	/* ensure we're on the correct CPU */
> 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> 		     raw_smp_processor_id() != pool->cpu);
>
> ------------------------------------------------------------------------
>
> Here is the splat:
>
> ------------------------------------------------------------------------
>
> [12600.593006] WARNING: CPU: 0 PID: 6 at /home/paulmck/public_git/linux-rcu/kernel/workqueue.c:2041 process_one_work+0x46c/0x4d0
> [12600.593006] Modules linked in:
> [12600.593006] CPU: 0 PID: 6 Comm: mm_percpu_wq Not tainted 4.11.0-rc7+ #1
> [12600.593006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> [12600.593006] Call Trace:
> [12600.593006] dump_stack+0x4f/0x72
> [12600.593006] __warn+0xc6/0xe0
> [12600.593006] warn_slowpath_null+0x18/0x20
> [12600.593006] process_one_work+0x46c/0x4d0
> [12600.593006] rescuer_thread+0x20e/0x3b0
> [12600.593006] kthread+0x104/0x140
> [12600.593006] ? worker_thread+0x4e0/0x4e0
> [12600.593006] ? kthread_create_on_node+0x40/0x40
> [12600.593006] ret_from_fork+0x29/0x40
>
> ------------------------------------------------------------------------
>
> This happens about 3.5 hours into the TREE03 rcutorture scenario, .config
> attached.

Steven's involved a rescuer too.  One possibility was cpuset being
involved somehow and messing up the affinity of the rescuer kthread
unexpectedly.  Is cpuset involved in any way?

Thanks.

-- 
tejun