Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire
To: Will Deacon, Peter Zijlstra
CC: Andrea Parri, Alan Stern, "Paul E. McKenney", LKMM Maintainers -- Akira Yokosawa, Boqun Feng, David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin, Kernel development list
References: <20180710093821.GA5414@andrea> <20180711094310.GA13963@arm.com> <20180711123421.GA9673@andrea> <20180712074040.GA4920@worktop.programming.kicks-ass.net> <20180712093432.GV2512@hirez.programming.kicks-ass.net> <20180712094510.GA23415@arm.com>
From: Daniel Lustig
Message-ID: <96aa1571-d9b9-9554-7017-d173e1a795e9@nvidia.com>
Date: Thu, 12 Jul 2018 19:17:46 -0700
In-Reply-To: <20180712094510.GA23415@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 7/12/2018 2:45 AM, Will Deacon wrote:
> On Thu, Jul 12, 2018 at 11:34:32AM +0200, Peter Zijlstra wrote:
>> On Thu, Jul 12, 2018 at 09:40:40AM +0200, Peter Zijlstra wrote:
>>> And I think if we raise atomic*_acquire() to require TSO (but ideally
>>> raise it to RCsc) we're there.
>>
>> To clarify, just the RmW-acquire. Things like atomic_read_acquire() can
>> stay smp_load_acquire() and be RCpc.
>
> I don't have strong opinions about strengthening RmW atomics to TSO, so
> if it helps to unblock Alan's patch (which doesn't go near this!) then I'll
> go with it. The important part is that we continue to allow roach motel
> into the RmW for other accesses in the non-fully-ordered cases.
>
> Daniel -- your AMO instructions are cool with this, right? It's just the
> fence-based implementations that will need help?
>
> Will

Right, let me pull this part out of the overly-long response I just gave
on the thread with Linus :)

If we pair AMOs with AMOs, we get RCsc, and everything is fine. If we
start mixing in fences (mostly because we don't currently have native
load-acquire or store-release opcodes), then that's when all the rest of
the complexity comes in.

Dan
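
For readers following along, the AMO-versus-fence distinction above can be sketched with C11 atomics rather than the kernel's own primitives. This is only an illustration: the helper names are hypothetical, and the RISC-V sequences in the comments follow my reading of the ISA manual's recommended C/C++ mapping, so a given compiler may emit something different. The single-threaded check at the end exercises the calls; it does not and cannot demonstrate the ordering behaviour under discussion.

```c
#include <stdatomic.h>

/* RmW with acquire semantics. On RISC-V this can be a single AMO
 * carrying the .aq bit (for example amoadd.w.aq), assuming the
 * recommended mapping is used. */
static int rmw_acquire(atomic_int *p)
{
    return atomic_fetch_add_explicit(p, 1, memory_order_acquire);
}

/* RmW with release semantics. Likewise a single AMO with the .rl bit
 * (for example amoadd.w.rl) under the same assumption. */
static int rmw_release(atomic_int *p)
{
    return atomic_fetch_sub_explicit(p, 1, memory_order_release);
}

/* Plain load-acquire. With no native load-acquire opcode, the
 * recommended mapping is a load followed by a fence (lw; fence r,rw).
 * This is the fence-based case mentioned above. */
static int load_acquire(atomic_int *p)
{
    return atomic_load_explicit(p, memory_order_acquire);
}

/* Plain store-release. Recommended mapping is a fence followed by a
 * store (fence rw,w; sw), again fence-based. */
static void store_release(atomic_int *p, int v)
{
    atomic_store_explicit(p, v, memory_order_release);
}

/* Single-threaded sanity check: returns 1 if the values are as
 * expected. It checks arithmetic only, not memory ordering. */
static int sanity_check(void)
{
    atomic_int counter = 0;
    int first = rmw_acquire(&counter);   /* returns old value 0, counter becomes 1 */
    int second = rmw_release(&counter);  /* returns old value 1, counter becomes 0 */

    store_release(&counter, 42);
    return first == 0 && second == 1 && load_acquire(&counter) == 42;
}
```

The first two helpers correspond to the "AMOs paired with AMOs" case; the last two are where fences enter and where, per the message above, the extra complexity comes from.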