Date: Tue, 6 Sep 2016 13:17:53 +0200
From: Peter Zijlstra
To: Will Deacon
Cc: Linus Torvalds, Oleg Nesterov, Paul McKenney, Benjamin Herrenschmidt,
    Michael Ellerman, linux-kernel@vger.kernel.org, Nicholas Piggin,
    Ingo Molnar, Alan Stern
Subject: Re: Question on smp_mb__before_spinlock
Message-ID: <20160906111753.GA10121@twins.programming.kicks-ass.net>
References: <20160905093753.GN10138@twins.programming.kicks-ass.net> <20160905101021.GC2649@arm.com>
In-Reply-To: <20160905101021.GC2649@arm.com>

On Mon, Sep 05, 2016 at 11:10:22AM +0100, Will Deacon wrote:
> > The second issue I wondered about is spinlock transitivity. All except
> > powerpc have RCsc locks, and since Power already does a full mb, would
> > it not make sense to put it _after_ the spin_lock(), which would provide
> > the same guarantee, but also upgrades the section to RCsc.
> >
> > That would make all schedule() calls fully transitive against one
> > another.
>
> It would also match the way in which the arm64 atomic_*_return ops
> are implemented, since full barrier semantics are required there.

Hmm, are you sure; the way I read arch/arm64/include/asm/atomic_ll_sc.h
is that you do ll/sc-rel + mb.

> > That is, would something like the below make sense?
>
> Works for me, but I'll do a fix to smp_mb__before_spinlock anyway for
> the stable tree.

Indeed, thanks!
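
To make the two shapes discussed above concrete, here is a minimal userspace sketch in C11. It is not the arm64 or powerpc code: every name (sketch_lock_t, sketch_add_return, sketch_lock_mb_after, sketch_unlock) is hypothetical, and atomic_thread_fence(memory_order_seq_cst) is only assumed to play the role of a full mb. C11 fences are not guaranteed to match the kernel's smp_mb() in every respect.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a kernel spinlock; not real arch code. */
typedef struct { atomic_flag locked; } sketch_lock_t;

static atomic_int sketch_counter;
static sketch_lock_t sketch_lock = { ATOMIC_FLAG_INIT };

/*
 * "ll/sc-rel + mb": a release-ordered read-modify-write followed by a
 * full fence, returning the new value like an atomic_*_return op.
 */
static int sketch_add_return(int i, atomic_int *v)
{
	int old = atomic_fetch_add_explicit(v, i, memory_order_release);

	atomic_thread_fence(memory_order_seq_cst); /* assumed stand-in for mb */
	return old + i;
}

/*
 * Acquire-only lock with the full fence placed after the acquisition,
 * i.e. the "put it _after_ the spin_lock()" proposal.
 */
static void sketch_lock_mb_after(sketch_lock_t *l)
{
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		; /* spin until the flag was previously clear */
	atomic_thread_fence(memory_order_seq_cst); /* assumed stand-in for mb */
}

/* Release-ordered unlock. */
static void sketch_unlock(sketch_lock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

With the fence after the acquire, accesses before the lock call and accesses inside the critical section are both ordered against everything that follows the fence, which is the property the "upgrade to RCsc" argument relies on.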

> The only slight annoyance is that, on arm64 anyway, a store-release
> appearing in program order before the LOCK operation will be observed
> in order, so if the write of CONDITION=1 in the try_to_wake_up case
> used smp_store_release, we wouldn't need this barrier at all.

Right, but this is because your load-acquire and store-release are much
stronger than Linux's. Not only are they RCsc, they are also globally
ordered irrespective of the variable (iirc).

This wouldn't work for PPC (even if we could find all such prior
stores).

OK, I suppose I'll go stare at what we can do about the mm_types.h use
and spin a patch with Changelog.
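
For reference, the CONDITION=1 / try_to_wake_up pattern mentioned above can be sketched in userspace C11 roughly as follows. This is an illustration under assumptions, not the scheduler code: condition, task_state, pi_lock_sketch, sleeper_should_sleep and waker_try_wake are hypothetical names, and the seq_cst fence on the waker side is only assumed to fill the role of smp_mb__before_spinlock().

```c
#include <stdatomic.h>

enum { SKETCH_RUNNING = 0, SKETCH_SLEEPING = 1 };

static atomic_int condition;   /* CONDITION from the thread */
static atomic_int task_state;  /* hypothetical stand-in for p->state */
static atomic_flag pi_lock_sketch = ATOMIC_FLAG_INIT;

/*
 * Sleeper side: publish the sleeping state, full fence, then test
 * CONDITION. Returns 1 if the caller would go to sleep.
 */
static int sleeper_should_sleep(void)
{
	atomic_store_explicit(&task_state, SKETCH_SLEEPING,
			      memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&condition, memory_order_relaxed);
}

/*
 * Waker side: write CONDITION=1, full fence before taking the lock,
 * then read the task state under the lock. Returns 1 if a sleeping
 * task was found and marked running.
 */
static int waker_try_wake(void)
{
	int woke = 0;

	atomic_store_explicit(&condition, 1, memory_order_relaxed);
	/* Assumed role of smp_mb__before_spinlock(): order the store
	 * above against the state load inside the locked section. */
	atomic_thread_fence(memory_order_seq_cst);

	while (atomic_flag_test_and_set_explicit(&pi_lock_sketch,
						 memory_order_acquire))
		; /* spin */
	if (atomic_load_explicit(&task_state, memory_order_relaxed) ==
	    SKETCH_SLEEPING) {
		atomic_store_explicit(&task_state, SKETCH_RUNNING,
				      memory_order_relaxed);
		woke = 1;
	}
	atomic_flag_clear_explicit(&pi_lock_sketch, memory_order_release);
	return woke;
}
```

The two fences form a store-buffering pair: either the sleeper sees CONDITION set, or the waker sees the sleeping state. An acquire-only lock does not by itself keep the CONDITION store from being reordered past the state load, which is why a barrier is needed on the waker side.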