From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 15 Jun 2016 15:49:36 +0530
From: Gautham R Shenoy <ego@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Gautham R. Shenoy", Thomas Gleixner, Tejun Heo, Michael Ellerman,
	Abdul Haleem, Aneesh Kumar, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of an unbound worker of a node with 1 online CPU
Reply-To: ego@linux.vnet.ibm.com
In-Reply-To: <20160614112234.GF30154@twins.programming.kicks-ass.net>
References: <20160614112234.GF30154@twins.programming.kicks-ass.net>
Message-Id: <20160615101936.GA31671@in.ibm.com>

Hi Peter,

On Tue, Jun 14, 2016 at 01:22:34PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 07, 2016 at 08:44:03PM +0530, Gautham R. Shenoy wrote:
>
> I'm still puzzled why we don't see this on x86. Afaict there's nothing
> PPC specific about this.

You are right. On PPC, at boot time, we hit the WARN_ON() roughly once
in five boots. Using some debug prints, I have verified that these are
the instances when the workqueue subsystem gets initialized before all
the CPUs have come online. On x86, I have never been able to hit this,
since there the workqueues appear to get initialized only after all
the CPUs have come online.

PPC doesn't use any specific unbound workqueue early in boot. The
unbound workqueue causing the WARN_ON() was the "events_unbound"
workqueue, which is created by workqueue_init():

=================================================================================
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 0-127. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 0-31. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 32-63. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 64-95. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 96-127. online mask 0
=================================================================================

Also, with the first patch in the series (which ensures that
restore_unbound_workers() is called *after* the new workers for the
newly onlined CPUs are created) and without this one, you can
reproduce this WARN_ON() on both x86 and PPC by offlining all the CPUs
of a node and bringing just one of them back online. So essentially
the bug fixed by the previous patch is currently hiding this bug,
which is why we are not able to reproduce this WARN_ON() with CPU
hotplug once the system has booted.

> > This patch sets the affinity of the worker to
> > a) the only online CPU in the cpumask of the worker pool when it comes
> >    online.
> > b) the cpumask of the worker pool when the second CPU in the pool's
> >    cpumask comes online.
>
> This basically works around the WARN conditions, which I suppose is fair
> enough, but I would like a note here to revisit this once the whole cpu
> hotplug rework has settled.

Sure.

> The real problem is that workqueues seem to want to create worker
> threads before there's anybody who would use them or something like
> that.

I am not sure about that. The workqueue subsystem creates unbound
workers for a node via wq_update_unbound_numa() whenever the first CPU
of that node comes online. That seems legitimate. It then tries to
affine these workers to the cpumask of that node, which again seems
right. As an optimization, it does this only when the first CPU of the
node comes online. Since this online CPU is not yet active, and since
nr_cpus_allowed > 1, we will hit the WARN_ON().
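To make that condition concrete, here is a tiny user-space model of it
(illustrative names and toy 8-bit masks, not kernel code):

#include <stdio.h>
#include <stdint.h>

/*
 * User-space model of the scenario above -- NOT kernel code; the
 * names and the 8-bit "cpumasks" are illustrative.
 */
typedef uint8_t cpumask_t;		/* one bit per CPU, CPUs 0-7   */

static int weight(cpumask_t m)
{
	return __builtin_popcount(m);
}

int main(void)
{
	cpumask_t node_mask = 0x0f;	/* pool spans CPUs 0-3          */
	cpumask_t online    = 0x01;	/* CPU0 has just come online... */
	cpumask_t active    = 0x00;	/* ...but is not yet active     */

	/*
	 * wq_update_unbound_numa()-style step: on the node's first
	 * CPU_ONLINE, affine the worker to the full node cpumask.
	 */
	cpumask_t cpus_allowed = node_mask;
	int nr_cpus_allowed = weight(cpus_allowed);

	/*
	 * The scheduler must place the task on an *active* CPU.  With
	 * nr_cpus_allowed > 1 and no active CPU in the allowed mask,
	 * the real kernel trips its WARN_ON().
	 */
	if (nr_cpus_allowed > 1 && !(cpus_allowed & active))
		printf("WARN: %d CPUs allowed, none active (online=%#x)\n",
		       nr_cpus_allowed, (unsigned)online);

	return 0;
}

Built stand-alone, this prints the WARN line; affining the worker to
the lone online CPU first (as this patch does) keeps
nr_cpus_allowed == 1 and avoids it.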
However, I agree with you that during boot-up, the workqueue subsystem
needs to create unbound worker threads only for the online CPUs
(instead of for all possible CPUs, as it currently does!) and let the
CPU_ONLINE notification take care of creating the remaining workers
when they are really required.

> Or is that what PPC does funny? Use an unbound workqueue this early in
> cpu bringup?

Like I pointed out above, PPC doesn't use an unbound workqueue early
in CPU bringup.

--
Thanks and Regards
gautham.