* [WARNING] kernel/workqueue.c:2041 process_one_work (when cpu goes offline)
@ 2017-04-05 19:16 Steven Rostedt
  2017-04-11  0:08 ` Tejun Heo
  0 siblings, 1 reply; 3+ messages in thread
From: Steven Rostedt @ 2017-04-05 19:16 UTC
  To: Tejun Heo; +Cc: LKML

Hi Tejun,

My tests have started to recently trigger this warning quite often,
which causes my tests to fail. The test that triggers this is running
the mmiotracer which forces all but one CPU offline.

------------[ cut here ]------------
WARNING: CPU: 0 PID: 6 at /work/autotest/nobackup/linux-test.git/kernel/workqueue.c:2041 process_one_work+0x90/0x485
Modules linked in: ppdev parport_pc parport [last unloaded: trace_events_sample]
CPU: 0 PID: 6 Comm: vmstat Not tainted 4.11.0-rc5-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
Call Trace:
 dump_stack+0x68/0x92
 __warn+0xc2/0xdd
 warn_slowpath_null+0x1d/0x1f
 process_one_work+0x90/0x485
 process_scheduled_works+0x2c/0x33
 rescuer_thread+0x19c/0x295
 ? process_scheduled_works+0x33/0x33
 kthread+0xf4/0xf9
 ? __list_del_entry+0x22/0x22
 ret_from_fork+0x2e/0x40
---[ end trace ed53fc9d3ce10aa8 ]---

#ifdef CONFIG_LOCKDEP
	/*
	 * It is permissible to free the struct work_struct from
	 * inside the function that is called from it, this we need to
	 * take into account for lockdep too.  To avoid bogus "held
	 * lock freed" warnings as well as problems when looking into
	 * work->lockdep_map, make a copy and use that here.
	 */
	struct lockdep_map lockdep_map;

	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
#endif
	/* ensure we're on the correct CPU */
	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&   <<--- line 2041
		     raw_smp_processor_id() != pool->cpu);

	/*
	 * A single work shouldn't be executed concurrently by
	 * multiple workers on a single cpu.  Check whether anyone is
	 * already processing the work.  If so, defer the work to the
	 * currently executing one.
	 */


I'm assuming that this thread was migrated due to the CPU offlining and
causes pool->cpu not to equal raw_smp_processor_id(). Or should that
not be happening?

Thoughts?

-- Steve


* Re: [WARNING] kernel/workqueue.c:2041 process_one_work (when cpu goes offline)
  2017-04-05 19:16 [WARNING] kernel/workqueue.c:2041 process_one_work (when cpu goes offline) Steven Rostedt
@ 2017-04-11  0:08 ` Tejun Heo
  2017-04-18  8:12   ` Tejun Heo
  0 siblings, 1 reply; 3+ messages in thread
From: Tejun Heo @ 2017-04-11  0:08 UTC
  To: Steven Rostedt; +Cc: LKML

Hello, Steven.

On Wed, Apr 05, 2017 at 03:16:28PM -0400, Steven Rostedt wrote:
> My tests have started to recently trigger this warning quite often,
> which causes my tests to fail. The test that triggers this is running
> the mmiotracer which forces all but one CPU offline.
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 6 at /work/autotest/nobackup/linux-test.git/kernel/workqueue.c:2041 process_one_work+0x90/0x485
> Modules linked in: ppdev parport_pc parport [last unloaded: trace_events_sample]
> CPU: 0 PID: 6 Comm: vmstat Not tainted 4.11.0-rc5-test+ #3
> Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> Call Trace:
>  dump_stack+0x68/0x92
>  __warn+0xc2/0xdd
>  warn_slowpath_null+0x1d/0x1f
>  process_one_work+0x90/0x485
>  process_scheduled_works+0x2c/0x33
>  rescuer_thread+0x19c/0x295
>  ? process_scheduled_works+0x33/0x33
>  kthread+0xf4/0xf9
>  ? __list_del_entry+0x22/0x22
>  ret_from_fork+0x2e/0x40
> ---[ end trace ed53fc9d3ce10aa8 ]---
> 
> #ifdef CONFIG_LOCKDEP
> 	/*
> 	 * It is permissible to free the struct work_struct from
> 	 * inside the function that is called from it, this we need to
> 	 * take into account for lockdep too.  To avoid bogus "held
> 	 * lock freed" warnings as well as problems when looking into
> 	 * work->lockdep_map, make a copy and use that here.
> 	 */
> 	struct lockdep_map lockdep_map;
> 
> 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
> #endif
> 	/* ensure we're on the correct CPU */
> 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&   <<--- line 2041
> 		     raw_smp_processor_id() != pool->cpu);
> 
> 	/*
> 	 * A single work shouldn't be executed concurrently by
> 	 * multiple workers on a single cpu.  Check whether anyone is
> 	 * already processing the work.  If so, defer the work to the
> 	 * currently executing one.
> 	 */
> 
> 
> I'm assuming that this thread was migrated due to the CPU offlining and
> causes pool->cpu not to equal raw_smp_processor_id(). Or should that
> not be happening?

If this happens while the CPU is going down, the pool should have
POOL_DISASSOCIATED set by that point and the actual affinity shouldn't
matter.  Maybe I messed up the rescuer part of it.  I'll look into it.
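
For readers following along, the condition behind the warning can be
modelled in a few lines of userspace C.  This is only a sketch of the
boolean logic in the quoted WARN_ON_ONCE(), not kernel code: struct
pool_model, wrong_cpu_warning() and the flag value used here are
hypothetical stand-ins, and current_cpu takes the place of
raw_smp_processor_id().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel flag; the value is illustrative. */
#define POOL_DISASSOCIATED (1u << 2)

/* Minimal model of the two worker_pool fields the check looks at. */
struct pool_model {
	unsigned int flags;	/* POOL_* flags */
	int cpu;		/* CPU the pool is bound to */
};

/*
 * Mirrors the quoted condition: the warning fires only when the pool is
 * still associated with its CPU and the caller runs on a different CPU.
 */
static bool wrong_cpu_warning(const struct pool_model *pool, int current_cpu)
{
	assert(pool != NULL);
	return !(pool->flags & POOL_DISASSOCIATED) &&
	       current_cpu != pool->cpu;
}
```

With this model, a pool that has POOL_DISASSOCIATED set never trips the
check regardless of where the worker runs, which is why a warning during
CPU offlining suggests that either the flag was not yet set or the
thread's affinity was changed by something else.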

Thanks.

-- 
tejun


* Re: [WARNING] kernel/workqueue.c:2041 process_one_work (when cpu goes offline)
  2017-04-11  0:08 ` Tejun Heo
@ 2017-04-18  8:12   ` Tejun Heo
  0 siblings, 0 replies; 3+ messages in thread
From: Tejun Heo @ 2017-04-18  8:12 UTC
  To: Steven Rostedt; +Cc: LKML

Hello,

On Tue, Apr 11, 2017 at 09:08:37AM +0900, Tejun Heo wrote:
> On Wed, Apr 05, 2017 at 03:16:28PM -0400, Steven Rostedt wrote:
> > My tests have started to recently trigger this warning quite often,
> > which causes my tests to fail. The test that triggers this is running
> > the mmiotracer which forces all but one CPU offline.

So, the rescuer handling seems fine and nothing really changed on the
workqueue side.  Any chance cpuset is involved?  If so, there was a
recent race condition fix 77f88796cee8 ("cgroup, kthread: close race
window where new kthreads can be migrated to non-root cgroups").  If
that's not it, it'd be great if you could explain the test case so that
I can reproduce the problem.

Thanks.

-- 
tejun
