[bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning

* [bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning
@ 2016-07-27 12:54 Heiko Carstens
  2016-07-27 15:23 ` Thomas Gleixner
  0 siblings, 1 reply; 16+ messages in thread
From: Heiko Carstens @ 2016-07-27 12:54 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Thomas Gleixner

Hi Peter,

I get the following warning on s390 when using fake NUMA beginning with
your patch
e9d867a67fd0 "sched: Allow per-cpu kernel threads to run on online && !active"

[    3.162909] WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
[    3.162911] Modules linked in:
[    3.162914] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a67fd0-dirty #28
[    3.162917] task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
[    3.162918] Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
[    3.162923]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
               Krnl GPRS: 00000000009f3d00 00000001dd270008 0000000000000100 00000000000000f4
[    3.162927]            0000000000eaf4e2 0000000000000000 00000001eccb7bc0 0400000000000001
[    3.162929]            00000001ec660950 0000000000000000 00000001ec660520 00000001ec660008
[    3.162930]            0000000000000100 000000000099c1a0 0000000000176c30 00000001eccb7a70
[    3.162940] Krnl Code: 0000000000176c4a: a774002f            brc     7,176ca8
                          0000000000176c4e: 92014000            mvi     0(%r4),1
                         #0000000000176c52: a7f40001            brc     15,176c54
                         >0000000000176c56: a7f40029            brc     15,176ca8
                          0000000000176c5a: 95004000            cli     0(%r4),0
                          0000000000176c5e: a7740006            brc     7,176c6a
                          0000000000176c62: 92014000            mvi     0(%r4),1
                          0000000000176c66: a7f40001            brc     15,176c68
[    3.162958] Call Trace:
[    3.162961] ([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
[    3.162963] ([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
[    3.162968] ([<000000000015d46c>] create_worker+0x174/0x1c0)
[    3.162971] ([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
[    3.162973] ([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
[    3.162975] ([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
[    3.162977] ([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
[    3.162979] ([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
[    3.162983] ([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
[    3.162986] ([<0000000000100270>] do_one_initcall+0x140/0x1d8)
[    3.162990] ([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
[    3.162993] ([<0000000000984a7a>] kernel_init+0x2a/0x150)
[    3.162996] ([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
[    3.162998] ([<00000000009913f4>] kernel_thread_starter+0x0/0xc)
[    3.163000] 4 locks held by swapper/0/1:
[    3.163002]  #0:  (cpu_hotplug.lock){++++++}, at: [<000000000013ebe0>] get_online_cpus+0x48/0xb8
[    3.163010]  #1:  (wq_pool_mutex){+.+.+.}, at: [<0000000000162ae2>] apply_workqueue_attrs+0x42/0x78
[    3.163016]  #2:  (&pool->lock/1){......}, at: [<000000000015d44a>] create_worker+0x152/0x1c0
[    3.163022]  #3:  (&p->pi_lock){..-...}, at: [<0000000000177ac4>] try_to_wake_up+0x44/0x478
[    3.163028] Last Breaking-Event-Address:
[    3.163030]  [<0000000000176c52>] select_task_rq+0xe2/0x1a8

For some unknown reason select_task_rq() gets called with a task that has
nr_cpus_allowed == 0. Hence "cpu = cpumask_any(tsk_cpus_allowed(p));"
within select_task_rq() will set cpu to nr_cpu_ids which in turn causes the
warning later on.

It only happens with more than one node, otherwise it seems to work fine.

Any idea what could be wrong here?

^ permalink raw reply	[flat|nested] 16+ messages in thread