From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754191AbcBBIHc (ORCPT ); Tue, 2 Feb 2016 03:07:32 -0500 Received: from mail-io0-f169.google.com ([209.85.223.169]:34449 "EHLO mail-io0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753110AbcBBIH3 (ORCPT ); Tue, 2 Feb 2016 03:07:29 -0500 MIME-Version: 1.0 In-Reply-To: <20160202064433.GG6719@linux.vnet.ibm.com> References: <20160127145421.GT6357@twins.programming.kicks-ass.net> <20160127152158.GJ2390@arm.com> <20160127233836.GQ4503@linux.vnet.ibm.com> <20160128095718.GC30928@arm.com> <20160128223131.GV4503@linux.vnet.ibm.com> <20160129095958.GA4541@arm.com> <20160129102253.GG4503@linux.vnet.ibm.com> <20160201135621.GD6828@arm.com> <20160202035458.GF6719@linux.vnet.ibm.com> <20160202051904.GC1239@fixme-laptop.cn.ibm.com> <20160202064433.GG6719@linux.vnet.ibm.com> Date: Tue, 2 Feb 2016 00:07:28 -0800 X-Google-Sender-Auth: USBbv5OE3u2vvt8RMIyvtJqBs3w Message-ID: Subject: Re: [RFC][PATCH] mips: Fix arch_spin_unlock() From: Linus Torvalds To: Paul McKenney Cc: Boqun Feng , Will Deacon , Peter Zijlstra , "Maciej W. Rozycki" , David Daney , =?UTF-8?B?TcOlbnMgUnVsbGfDpXJk?= , Ralf Baechle , Linux Kernel Mailing List Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Feb 1, 2016 at 10:44 PM, Paul E. McKenney wrote: > On Tue, Feb 02, 2016 at 01:19:04PM +0800, Boqun Feng wrote: >> Hi Paul, >> If so, do we also need to take the following pairing into consideration? >> >> o smp_store_release() -> READ_ONCE(); if ;smp_rmb(); > > It looks like we will be restricing smp_rmb() and smp_wmb() to pairwise > scenarios only. So no transitivity in any scenarios involving these > two primitives. NO! Stop this idiocy now, Paul. We are not trying to make the kernel memory ordering rules pander to shit. > But where there is uncertainty, we should -not- assume ordering. It is > easy to tighten the rules later, but hard to loosen them. Bullshit. The rules should absolutely be as tight as at all humanly possible, given real hardware constraints. It's also entirely wrong to say that "we can tighten the rules later". No way. Because before those rules get tightened, we'll see confused people who then try to "fix" code that doesn't want fixing, and adding new odd constructs. We already had that with the whole bogus "control dependency" shit. That crap came exactly from the fact that people thought it was a good idea to make weak rules. So you had better re-think the whole thing. I refuse to pander to the idiotic and wrongheaded "weak memory ordering is good" braindamage. Weak memory ordering is *not* good. It's shit. It's hard to think about, and people won't even realize that the assumptions they make (unconsciously!) may be wrong. So the memory ordering rules should be as strong as at all possible, and should be as intuitive as possible, and follow peoples expectations. And quite frankly, if some crazy shit-for-brains architecture gets its memory ordering wrong (like alpha did), then that crazy architecture should pay the price. There is absolutely _zero_ valid reason why smp_store_release should not pair fine with a smp_rmb() on the other side. Can you actually point to a relevant architecture that doesn't do that? I can't even imagine how you'd make that pairing not work. If the releasing store doesn't imply that it is ordered wrt previous stores, then it damn well isn't a "release". And if the other side does a "smp-rmb()", then the loads are ordered on the other side, so it had damn well just work. How could it not work? So I can't even see how your "we won't guarantee" that wouldn't work. It sounds insane. But more importantly, your whole notion of "let's make things as weak as possible" is *wrong*. That is now AT ALL what we should strive for. We should strive for the strictest possible memory ordering that we can get away with, and have as few "undefined" cases as at all possible. So we *absolutely* should say that *OF COURSE* these things work: - CPU A: .. initialize data structure -> smp_wmb() -> WRITE_ONCE(ptr); - CPU B: smp_load_acquire(ptr) - we can rely on things behind "ptr" being initialized and that the above mirror of that (ie smp_store_release paired with READ_ONCE+smp_rmb) also works. There are often reasons to prefer one model over the other, and sometimes the reasons might argue for mixed access models. For example, we might decide that the producer uses "smp_wmb()", because that is universally fairly fast, and avoids a full memory barrier on those architectures that don't really have "release" semantics natively. But at the same time the _consumer_ side might want a "smp_load_acquire()" simply because it requires subsequent loads and stores to be ordered if it sees that state (see for example our "sk_state_store/load()" cases in networking - right now those pair with release/acquire, but I suspect that "release" really could be a "smp_wmb() + WRITE_ONCE()" instead, which could be faster on some architectures).. Because I really don't think there is any sane reason why that kind of mixed access shouldn't work. If those pairings don't work, there's something wrong with the architecture, and the architecture will need to do whatever barriers it needs to make it work in its store-release/load-acquire so that it pairs with smp_wmb/rmb. And if there against all sanity really is some crazy reason why that pairing can be problematic, then that insane reason needs to be extensively documented. Because that reason is so insane that it needs a big honking description, so that people are aware of it. But at no point should the logic be "let's leave it weakly defined". Because at that point the documentation is actively _worse_ than not having documentation at all, and just letting sanity prevail. Linus