From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755944AbcG0MyV (ORCPT ); Wed, 27 Jul 2016 08:54:21 -0400
Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:48314 "EHLO
	mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL)
	by vger.kernel.org with ESMTP id S1754421AbcG0MyT (ORCPT );
	Wed, 27 Jul 2016 08:54:19 -0400
X-IBM-Helo: d06dlp02.portsmouth.uk.ibm.com
X-IBM-MailFrom: heiko.carstens@de.ibm.com
X-IBM-RcptTo: linux-kernel@vger.kernel.org
Date: Wed, 27 Jul 2016 14:54:12 +0200
From: Heiko Carstens
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner
Subject: [bisected] "sched: Allow per-cpu kernel threads to run on online && !active" causes warning
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16072712-0040-0000-0000-000002B575C2
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 16072712-0041-0000-0000-00001C62B20F
Message-Id: <20160727125412.GB3912@osiris>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-07-27_07:,,
	signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0
	suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0
	classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000
	definitions=main-1607270132
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

I get the following warning on s390 when using fake NUMA beginning with
your patch e9d867a67fd0 "sched: Allow per-cpu kernel threads to run on
online && !active"

[    3.162909] WARNING: CPU: 0 PID: 1 at include/linux/cpumask.h:121 select_task_rq+0xe6/0x1a8
[    3.162911] Modules linked in:
[    3.162914] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc6-00001-ge9d867a67fd0-dirty #28
[    3.162917] task: 00000001dd270008 ti: 00000001eccb4000 task.ti: 00000001eccb4000
[    3.162918] Krnl PSW : 0404c00180000000 0000000000176c56 (select_task_rq+0xe6/0x1a8)
[    3.162923]            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
Krnl GPRS: 00000000009f3d00 00000001dd270008 0000000000000100 00000000000000f4
[    3.162927]            0000000000eaf4e2 0000000000000000 00000001eccb7bc0 0400000000000001
[    3.162929]            00000001ec660950 0000000000000000 00000001ec660520 00000001ec660008
[    3.162930]            0000000000000100 000000000099c1a0 0000000000176c30 00000001eccb7a70
[    3.162940] Krnl Code: 0000000000176c4a: a774002f            brc     7,176ca8
                          0000000000176c4e: 92014000            mvi     0(%r4),1
                         #0000000000176c52: a7f40001            brc     15,176c54
                         >0000000000176c56: a7f40029            brc     15,176ca8
                          0000000000176c5a: 95004000            cli     0(%r4),0
                          0000000000176c5e: a7740006            brc     7,176c6a
                          0000000000176c62: 92014000            mvi     0(%r4),1
                          0000000000176c66: a7f40001            brc     15,176c68
[    3.162958] Call Trace:
[    3.162961] ([<0000000000176c30>] select_task_rq+0xc0/0x1a8)
[    3.162963] ([<0000000000177d64>] try_to_wake_up+0x2e4/0x478)
[    3.162968] ([<000000000015d46c>] create_worker+0x174/0x1c0)
[    3.162971] ([<0000000000161a98>] alloc_unbound_pwq+0x360/0x438)
[    3.162973] ([<0000000000162550>] apply_wqattrs_prepare+0x200/0x2a0)
[    3.162975] ([<000000000016266a>] apply_workqueue_attrs_locked+0x7a/0xb0)
[    3.162977] ([<0000000000162af0>] apply_workqueue_attrs+0x50/0x78)
[    3.162979] ([<000000000016441c>] __alloc_workqueue_key+0x304/0x520)
[    3.162983] ([<0000000000ee3706>] default_bdi_init+0x3e/0x70)
[    3.162986] ([<0000000000100270>] do_one_initcall+0x140/0x1d8)
[    3.162990] ([<0000000000ec9da8>] kernel_init_freeable+0x220/0x2d8)
[    3.162993] ([<0000000000984a7a>] kernel_init+0x2a/0x150)
[    3.162996] ([<00000000009913fa>] kernel_thread_starter+0x6/0xc)
[    3.162998] ([<00000000009913f4>] kernel_thread_starter+0x0/0xc)
[    3.163000] 4 locks held by swapper/0/1:
[    3.163002]  #0:  (cpu_hotplug.lock){++++++}, at: [<000000000013ebe0>] get_online_cpus+0x48/0xb8
[    3.163010]  #1:  (wq_pool_mutex){+.+.+.}, at: [<0000000000162ae2>] apply_workqueue_attrs+0x42/0x78
[    3.163016]  #2:  (&pool->lock/1){......}, at: [<000000000015d44a>] create_worker+0x152/0x1c0
[    3.163022]  #3:  (&p->pi_lock){..-...}, at: [<0000000000177ac4>] try_to_wake_up+0x44/0x478
[    3.163028] Last Breaking-Event-Address:
[    3.163030]  [<0000000000176c52>] select_task_rq+0xe2/0x1a8

For some unknown reason select_task_rq() gets called with a task that has
nr_cpus_allowed == 0. Hence "cpu = cpumask_any(tsk_cpus_allowed(p));"
within select_task_rq() will set cpu to nr_cpu_ids which in turn causes the
warning later on.

It only happens with more than one node, otherwise it seems to work fine.

Any idea what could be wrong here?
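
To illustrate the failure mode described above, here is a minimal user-space
sketch of what happens when a "pick any CPU" lookup runs on an empty mask.
This is not kernel code: struct toy_cpumask, TOY_NR_CPU_IDS and the toy_*
functions are hypothetical stand-ins for struct cpumask, nr_cpu_ids,
cpumask_any() and the range check that fires the warning. It only assumes the
behaviour stated in the mail, namely that the lookup yields nr_cpu_ids when no
bit is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for nr_cpu_ids; the real value is system dependent. */
#define TOY_NR_CPU_IDS 8u

/* Hypothetical stand-in for struct cpumask: one bit per possible CPU. */
struct toy_cpumask {
	unsigned long bits;
};

/*
 * Models the lookup: return the first CPU set in the mask, or
 * TOY_NR_CPU_IDS (an out-of-range value) when the mask is empty.
 */
unsigned int toy_cpumask_any(const struct toy_cpumask *mask)
{
	unsigned int cpu;

	for (cpu = 0; cpu < TOY_NR_CPU_IDS; cpu++) {
		if (mask->bits & (1UL << cpu))
			return cpu;
	}
	return TOY_NR_CPU_IDS;
}

/* Models the range check: true means the real kernel would warn here. */
bool toy_cpu_out_of_range(unsigned int cpu)
{
	return cpu >= TOY_NR_CPU_IDS;
}

/* Walks through the case from the mail: a task with no allowed CPUs. */
void toy_demo(void)
{
	struct toy_cpumask empty = { 0 };
	unsigned int cpu = toy_cpumask_any(&empty);

	assert(cpu == TOY_NR_CPU_IDS);     /* lookup yields the sentinel */
	assert(toy_cpu_out_of_range(cpu)); /* a later range check trips */
}
```

Calling toy_demo() from a main() passes both assertions, which mirrors the
sequence in the trace: the empty mask itself is not flagged, only the later
use of the out-of-range CPU number is. The sketch says nothing about why the
mask ends up empty in the first place.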