From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 15 Jun 2016 15:49:36 +0530
From: Gautham R Shenoy <ego@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: "Gautham R. Shenoy", Thomas Gleixner, Tejun Heo, Michael Ellerman,
	Abdul Haleem, Aneesh Kumar, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] workqueue: Fix affinity of an unbound worker of a node with 1 online CPU
Reply-To: ego@linux.vnet.ibm.com
In-Reply-To: <20160614112234.GF30154@twins.programming.kicks-ass.net>
References: <20160614112234.GF30154@twins.programming.kicks-ass.net>
Message-Id: <20160615101936.GA31671@in.ibm.com>

Hi Peter,

On Tue, Jun 14, 2016 at 01:22:34PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 07, 2016 at 08:44:03PM +0530, Gautham R. Shenoy wrote:
>
> I'm still puzzled why we don't see this on x86. Afaict there's nothing
> PPC specific about this.

You are right. On PPC, at boot time, we hit the WARN_ON() roughly once
in five boots. Using some debug prints, I have verified that these are
the instances when the workqueue subsystem gets initialized before all
the CPUs have come online. On x86, I have never been able to hit this,
since there the workqueues appear to get initialized only after all
the CPUs have come online.

PPC doesn't use any specific unbound workqueue early in boot. The
unbound workqueue causing the WARN_ON() was the "events_unbound"
workqueue, which is created by workqueue_init():

=================================================================================
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 0-127. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 0-31. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 32-63. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 64-95. online mask 0
[WQ] Creating Unbound workers for WQ events_unbound,cpumask 96-127. online mask 0
=================================================================================

Also, with the first patch in the series (which ensures that
restore_unbound_workers() is called *after* the new workers for the
newly onlined CPUs are created) and without this one, you can
reproduce this WARN_ON() on both x86 and PPC by offlining all the CPUs
of a node and bringing just one of them back online. So essentially
the bug fixed by the previous patch is currently hiding this bug,
which is why we are not able to reproduce this WARN_ON() with CPU
hotplug once the system has booted.

> > This patch sets the affinity of the worker to
> > a) the only online CPU in the cpumask of the worker pool when it comes
> >    online.
> > b) the cpumask of the worker pool when the second CPU in the pool's
> >    cpumask comes online.
>
> This basically works around the WARN conditions, which I suppose is fair
> enough, but I would like a note here to revisit this once the whole cpu
> hotplug rework has settled.

Sure.

> The real problem is that workqueues seem to want to create worker
> threads before there's anybody who would use them or something like
> that.

I am not sure about that. The workqueue subsystem creates unbound
workers for a node via wq_update_unbound_numa() whenever the first CPU
of that node comes online. That seems legitimate. It then tries to
affine these workers to the cpumask of that node, which again seems
right. As an optimization, it does this only when the first CPU of the
node comes online. Since this online CPU is not yet active, and since
nr_cpus_allowed > 1, we will hit the WARN_ON().
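To make that condition concrete, here is a tiny user-space model of it
(illustrative names and toy 8-bit masks, not kernel code):

#include <stdio.h>
#include <stdint.h>

/*
 * User-space model of the scenario above -- NOT kernel code; the
 * names and the 8-bit "cpumasks" are illustrative.
 */
typedef uint8_t cpumask_t;		/* one bit per CPU, CPUs 0-7   */

static int weight(cpumask_t m)
{
	return __builtin_popcount(m);
}

int main(void)
{
	cpumask_t node_mask = 0x0f;	/* pool spans CPUs 0-3          */
	cpumask_t online    = 0x01;	/* CPU0 has just come online... */
	cpumask_t active    = 0x00;	/* ...but is not yet active     */

	/*
	 * wq_update_unbound_numa()-style step: on the node's first
	 * CPU_ONLINE, affine the worker to the full node cpumask.
	 */
	cpumask_t cpus_allowed = node_mask;
	int nr_cpus_allowed = weight(cpus_allowed);

	/*
	 * The scheduler must place the task on an *active* CPU.  With
	 * nr_cpus_allowed > 1 and no active CPU in the allowed mask,
	 * the real kernel trips its WARN_ON().
	 */
	if (nr_cpus_allowed > 1 && !(cpus_allowed & active))
		printf("WARN: %d CPUs allowed, none active (online=%#x)\n",
		       nr_cpus_allowed, (unsigned)online);

	return 0;
}

Built stand-alone, this prints the WARN line; affining the worker to
the lone online CPU first (as this patch does) keeps
nr_cpus_allowed == 1 and avoids it.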
However, I agree with you that during boot-up, the workqueue subsystem
needs to create unbound worker threads only for the online CPUs
(instead of for all possible CPUs, as it currently does!) and let the
CPU_ONLINE notification take care of creating the remaining workers
when they are really required.

> Or is that what PPC does funny? Use an unbound workqueue this early in
> cpu bringup?

Like I pointed out above, PPC doesn't use an unbound workqueue early
in CPU bringup.

--
Thanks and Regards
gautham.