Date: Tue, 2 Feb 2016 17:51:27 +0000
From: Will Deacon
To: Linus Torvalds
Cc: Boqun Feng, Paul McKenney, Peter Zijlstra, "Maciej W. Rozycki",
    David Daney, Måns Rullgård, Ralf Baechle,
    Linux Kernel Mailing List
Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock()
Message-ID: <20160202175127.GO10166@arm.com>
References: <20160129095958.GA4541@arm.com>
    <20160129102253.GG4503@linux.vnet.ibm.com>
    <20160201135621.GD6828@arm.com>
    <20160202035458.GF6719@linux.vnet.ibm.com>
    <20160202051904.GC1239@fixme-laptop.cn.ibm.com>
    <20160202064433.GG6719@linux.vnet.ibm.com>
    <20160202093440.GD1239@fixme-laptop.cn.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 02, 2016 at 09:30:26AM -0800, Linus Torvalds wrote:
> On Tue, Feb 2, 2016 at 1:34 AM, Boqun Feng wrote:
> >
> > Just to be clear, what Will, Paul and I are discussing here is about
> > local transitivity,
>
> I really don't think that changes the picture.

For the general point about mixed methods, perhaps not, but it does mean
that we can't describe all of the issues using fewer than three processors.

> Given that
>
>  (a) we already mix ordering methods and there are good reasons for
> it, and I'd expect transitivity only makes that more likely
>
>  (b) we expect transitivity from the individual ordering methods
>
>  (c) I don't think that there are any relevant CPU's that violate this anyway
>
> I really think that not expecting that to hold for mixed accesses
> would be a complete disaster. It will confuse the hell out of people.
>
> And the basic argument really stands: we should make the memory
> ordering expectations as strong as we can, given the existing relevant
> architecture constraints (ie x86/arm/power).
>
> If that then means that some other architecture might need to add
> extra serialization that that architecture doesn't _want_ to add,
> tough luck. I absolutely hate the fact that alpha forced us to add
> that crazy read-depends barrier, and I want to discourage that a lot.
>
> In fact, I'd be willing to strengthen our existing orderings just in
> the name of sanity, and say that "rcu_dereference()" should just be an
> acquire, and say that if the architecture makes that more expensive,
> then who the hell cares? I have not been very happy with the "consume"
> memory ordering discussions for C++. Yes, it would hurt pre-lwsync
> power a bit, and it would hurt 32-bit arm, but enough that we should
> have the headache of the existing semantics?

Given that the vast majority of weakly ordered architectures respect
address dependencies, I would expect all of them to be hurt if they were
forced to use barrier instructions instead, even those where the
microarchitecture is fairly strongly ordered in practice.

Even load-acquire on ARMv8 has more work to do than a plain old address
dependency, so I'd be sad to see us upgrading rcu_dereference like this,
particularly when it's a relatively uncontentious, easy to understand part
of the kernel memory model.

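To make the two options concrete, here is a minimal user-space sketch
using C11 atomics. It is not kernel code and not how rcu_dereference is
implemented: the names struct item, shared_slot, publish, read_dependent
and read_acquire are all hypothetical, and memory_order_consume stands in
for the address-dependency ordering discussed above. Note the assumption
that current compilers typically promote consume to acquire, so this
illustrates the semantics rather than the cost difference.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct item {
    int value;
};

/* Hypothetical shared pointer that readers dereference. */
static _Atomic(struct item *) shared_slot = NULL;

/*
 * Publisher: initialise the payload, then publish the pointer with a
 * release store so the initialisation is ordered before the pointer
 * becomes visible.
 */
static void publish(struct item *p, int value)
{
    p->value = value;
    atomic_store_explicit(&shared_slot, p, memory_order_release);
}

/*
 * Reader relying on the address dependency: the load of p->value
 * depends on the value of p. memory_order_consume is the C11 spelling
 * of that ordering; compilers commonly treat it as acquire.
 */
static int read_dependent(void)
{
    struct item *p = atomic_load_explicit(&shared_slot,
                                          memory_order_consume);
    return p ? p->value : -1;
}

/*
 * Reader using a full load-acquire: this orders all later accesses
 * after the pointer load, not just the dependent ones.
 */
static int read_acquire(void)
{
    struct item *p = atomic_load_explicit(&shared_slot,
                                          memory_order_acquire);
    return p ? p->value : -1;
}
```

Both readers return the published value once publish() has run, or -1
while the slot is still NULL. The difference is in what the hardware has
to do: on ARMv8 the acquire load is an ldar, whereas a reader that only
needs the dependent access ordered can get away with a plain ldr.
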
As far as I understand it, the problems with "consume" have centred
largely around compiler and specification issues, which we don't have with
rcu_dereference (i.e. we ignore thin-air and use volatile casts/barrier()
to keep the optimizer at bay).

Will