Date: Thu, 22 Dec 2016 08:33:30 -0800
From: "Paul E. McKenney"
To: Peter Zijlstra
Cc: Will Deacon, Mark Rutland, linux-kernel@vger.kernel.org, Ingo Molnar,
    Arnaldo Carvalho de Melo, Thomas Gleixner, Sebastian Andrzej Siewior,
    jeremy.linton@arm.com, Boqun Feng
Subject: Re: Perf hotplug lockup in v4.9-rc8
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20161222163330.GT3924@linux.vnet.ibm.com>
In-Reply-To: <20161222140010.GY3174@twins.programming.kicks-ass.net>
References: <20161207135217.GA25605@leverpostej>
    <20161207175347.GB13840@leverpostej>
    <20161207183455.GQ3124@twins.programming.kicks-ass.net>
    <20161209135900.GU3174@twins.programming.kicks-ass.net>
    <20161212114640.GD21248@arm.com>
    <20161212124228.GE3124@twins.programming.kicks-ass.net>
    <20161222084509.GX3174@twins.programming.kicks-ass.net>
    <20161222140010.GY3174@twins.programming.kicks-ass.net>

On Thu, Dec 22, 2016 at 03:00:10PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 22, 2016 at 09:45:09AM +0100, Peter Zijlstra wrote:
> > On Mon, Dec 12, 2016 at 01:42:28PM +0100, Peter Zijlstra wrote:
> >
> > > > What are you trying to order here?
> > >
> > > I suppose something like this:
> > >
> > >
> > > CPU0                            CPU1            CPU2
> > >
> > >                                 (current == t)
> > >
> > > t->perf_event_ctxp[] = ctx;
> > > smp_mb();
> > > cpu = task_cpu(t);
> > >
> > >                                 switch(t, n);
> > >                                         migrate(t, 2);
> > >                                                 switch(p, t);
> > >
> > >                                                 ctx = t->perf_event_ctxp[]; // must not be NULL
> > >
> >
> > So I think I can cast the above into a test like:
> >
> >   W[x] = 1                W[y] = 1                R[z] = 1
> >   mb                      mb                      mb
> >   R[y] = 0                W[z] = 1                R[x] = 0
> >
> > Where x is the perf_event_ctxp[], y is our task's cpu and z is our task
> > being placed on the rq of cpu2.
> >
> > See also commit: 8643cda549ca ("sched/core, locking: Document
> > Program-Order guarantees"), Independent of which cpu initiates the
> > migration between CPU1 and CPU2 there is ordering between the CPUs.
>
> I think that when we assume RCpc locks, the above CPU1 mb ends up being
> something like an smp_wmb() (ie. non transitive). CPU2 needs to do a
> context switch between observing the task on its runqueue and getting to
> switching in perf-events for the task, which keeps that a full mb.
>
> Now, if only this model would have locks in ;-)

Yeah, we are slow.  ;-)

But you should be able to emulate them with xchg_acquire() and
smp_store_release().
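[Editorial sketch, not part of the original message.] The emulation suggested above can be illustrated in userspace C11, where atomic_exchange_explicit() with memory_order_acquire stands in for xchg_acquire() and atomic_store_explicit() with memory_order_release stands in for smp_store_release(). The names emu_lock_t, emu_trylock(), emu_lock() and emu_unlock() are hypothetical and appear nowhere in the thread; this is a minimal sketch under those assumptions, not the kernel's lock implementation.

```c
#include <stdatomic.h>

/* Hypothetical emulated lock: 0 means free, 1 means held. */
typedef struct {
	atomic_int held;
} emu_lock_t;

/*
 * Try once to take the lock. Returns 1 on success, 0 if it was already held.
 * Userspace analogue of: xchg_acquire(&l->held, 1) == 0
 */
static int emu_trylock(emu_lock_t *l)
{
	return atomic_exchange_explicit(&l->held, 1, memory_order_acquire) == 0;
}

/* Spin until the exchange observes the lock as free. */
static void emu_lock(emu_lock_t *l)
{
	while (!emu_trylock(l))
		;
}

/* Userspace analogue of: smp_store_release(&l->held, 0) */
static void emu_unlock(emu_lock_t *l)
{
	atomic_store_explicit(&l->held, 0, memory_order_release);
}
```

In a litmus test the same shape would wrap the accesses that the real code performs under the lock: an acquire exchange before them and a release store after them.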

							Thanx, Paul

> > This would then translate into something like:
> >
> >   C C-peterz
> >
> >   {
> >   }
> >
> >   P0(int *x, int *y)
> >   {
> >           int r1;
> >
> >           WRITE_ONCE(*x, 1);
> >           smp_mb();
> >           r1 = READ_ONCE(*y);
> >   }
> >
> >   P1(int *y, int *z)
> >   {
> >           WRITE_ONCE(*y, 1);
> >           smp_mb();
>
> And this modified to: smp_wmb()
>
> >           WRITE_ONCE(*z, 1);
> >   }
> >
> >   P2(int *x, int *z)
> >   {
> >           int r1;
> >           int r2;
> >
> >           r1 = READ_ONCE(*z);
> >           smp_mb();
> >           r2 = READ_ONCE(*x);
> >   }
> >
> >   exists
> >   (0:r1=0 /\ 2:r1=1 /\ 2:r2=0)
>
> Still results in the same outcome.
>
> If however we change P2's barrier into a smp_rmb() it does become
> possible, but as said above, there's a context switch in between which
> implies a full barrier so no worries.
>
> Similar if I replace everything z with smp_store_release() and
> smp_load_acquire().
>
>
> Of course, its entirely possible the litmus test doesn't reflect
> reality, I still find it somewhat hard to write these things.
>
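[Editorial sketch, not part of the original message.] The all-smp_mb() variant of the quoted litmus test can be sanity-checked by brute force, assuming that a full barrier between every pair of accesses makes each execution behave like a sequentially consistent interleaving. The program below enumerates every interleaving of the three two-step threads and counts how many satisfy the exists clause (0:r1=0 /\ 2:r1=1 /\ 2:r2=0). The names struct state, step(), explore() and count_forbidden() are made up for illustration. It does not model the smp_wmb() or smp_rmb() variants, which need a real memory-model tool.

```c
/* Shared variables and per-thread registers for one simulated run. */
struct state {
	int x, y, z;
	int p0_r1, p2_r1, p2_r2;
};

/* Execute step pc (0 or 1) of thread tid on state s. */
static void step(struct state *s, int tid, int pc)
{
	switch (tid * 2 + pc) {
	case 0: s->x = 1; break;         /* P0: WRITE_ONCE(*x, 1)  */
	case 1: s->p0_r1 = s->y; break;  /* P0: r1 = READ_ONCE(*y) */
	case 2: s->y = 1; break;         /* P1: WRITE_ONCE(*y, 1)  */
	case 3: s->z = 1; break;         /* P1: WRITE_ONCE(*z, 1)  */
	case 4: s->p2_r1 = s->z; break;  /* P2: r1 = READ_ONCE(*z) */
	case 5: s->p2_r2 = s->x; break;  /* P2: r2 = READ_ONCE(*x) */
	}
}

/*
 * Recursively run every interleaving from state s. Returns how many
 * completed runs match the exists clause; *total counts completed runs.
 */
static int explore(struct state s, const int pc[3], int *total)
{
	int hits = 0;
	int done = 1;

	for (int tid = 0; tid < 3; tid++) {
		if (pc[tid] < 2) {
			struct state next = s;
			int npc[3] = { pc[0], pc[1], pc[2] };

			done = 0;
			step(&next, tid, npc[tid]);
			npc[tid]++;
			hits += explore(next, npc, total);
		}
	}
	if (done) {
		(*total)++;
		if (s.p0_r1 == 0 && s.p2_r1 == 1 && s.p2_r2 == 0)
			hits++;
	}
	return hits;
}

/* Entry point: all variables start at 0, as in the empty init block. */
int count_forbidden(int *total)
{
	struct state init = { 0 };
	int pc[3] = { 0, 0, 0 };

	*total = 0;
	return explore(init, pc, total);
}
```

Under that assumption no interleaving should match: 2:r1=1 puts P1's write to y before P2's read of x, and 2:r2=0 puts that read before P0's write to x, so P0's later read of y has to see 1.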