Date: Thu, 21 Jun 2018 13:27:12 -0400 (EDT)
From: Alan Stern
To: LKMM Maintainers -- Akira Yokosawa, Andrea Parri, Boqun Feng,
	David Howells, Jade Alglave, Luc Maranget, Nicholas Piggin,
	"Paul E. McKenney", Peter Zijlstra, Will Deacon
cc: Kernel development list
Subject: [PATCH 2/2] tools/memory-model: Add write ordering by release-acquire and by locks

More than one kernel developer has expressed the opinion that the LKMM
should enforce ordering of writes by release-acquire chains and by
locking.  In other words, given the following code:

	WRITE_ONCE(x, 1);
	spin_unlock(&s);
	spin_lock(&s);
	WRITE_ONCE(y, 1);

or the following:

	smp_store_release(&x, 1);
	r1 = smp_load_acquire(&x);	// r1 = 1
	WRITE_ONCE(y, 1);

the stores to x and y should be propagated in order to all other
CPUs, even though those other CPUs might not access the lock s or be
part of the release-acquire chain.  In terms of the memory model, this
means that rel-rf-acq-po should be part of the cumul-fence relation.

All the architectures supported by the Linux kernel (including RISC-V)
do behave this way, albeit for varying reasons.  Therefore this patch
changes the model in accordance with the developers' wishes.

Signed-off-by: Alan Stern

---

[as1871]

 tools/memory-model/Documentation/explanation.txt |   81 +++++++++++++++++++++++
 tools/memory-model/linux-kernel.cat              |    2 
 2 files changed, 82 insertions(+), 1 deletion(-)
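Not part of the patch proper: the release-acquire example from the
changelog, rewritten as a self-contained litmus test.  The file name
and exact layout are made up for illustration, but it should run as
"herd7 -conf linux-kernel.cfg rel-acq-write-ordering.litmus" from
tools/memory-model, and with this change applied the expectation is
that the "exists" clause below is never satisfied (the unmodified
model allows it):

	C rel-acq-write-ordering

	(*
	 * Illustrative only.  P0's stores to x and y must propagate
	 * to P2 in order, because P1's acquire reads from P0's
	 * release and P1's store to z follows it in program order.
	 *)

	{}

	P0(int *x, int *y)
	{
		WRITE_ONCE(*x, 1);
		smp_store_release(y, 1);
	}

	P1(int *y, int *z)
	{
		int r1;

		r1 = smp_load_acquire(y);
		WRITE_ONCE(*z, 1);
	}

	P2(int *x, int *y, int *z)
	{
		int r2;
		int r3;
		int r4;

		r2 = READ_ONCE(*z);
		smp_rmb();
		r3 = READ_ONCE(*x);
		r4 = READ_ONCE(*y);
	}

	exists (1:r1=1 /\ 2:r2=1 /\ (2:r3=0 \/ 2:r4=0))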
Index: usb-4.x/tools/memory-model/linux-kernel.cat
===================================================================
--- usb-4.x.orig/tools/memory-model/linux-kernel.cat
+++ usb-4.x/tools/memory-model/linux-kernel.cat
@@ -66,7 +66,7 @@ let ppo = to-r | to-w | fence
 
 (* Propagation: Ordering from release operations and strong fences. *)
 let A-cumul(r) = rfe? ; r
-let cumul-fence = A-cumul(strong-fence | po-rel) | wmb
+let cumul-fence = A-cumul(strong-fence | po-rel) | wmb | rel-rf-acq-po
 let prop = (overwrite & ext)? ; cumul-fence* ; rfe?
 
 (*
Index: usb-4.x/tools/memory-model/Documentation/explanation.txt
===================================================================
--- usb-4.x.orig/tools/memory-model/Documentation/explanation.txt
+++ usb-4.x/tools/memory-model/Documentation/explanation.txt
@@ -1897,3 +1897,84 @@ non-deadlocking executions.  For example
 Is it possible to end up with r0 = 36 at the end?  The LKMM will tell
 you it is not, but the model won't mention that this is because P1
 will self-deadlock in the executions where it stores 36 in y.
+
+In the LKMM, locks and release-acquire chains cause stores to
+propagate in order.  For example:
+
+	int x, y, z;
+
+	P0()
+	{
+		WRITE_ONCE(x, 1);
+		smp_store_release(&y, 1);
+	}
+
+	P1()
+	{
+		int r1;
+
+		r1 = smp_load_acquire(&y);
+		WRITE_ONCE(z, 1);
+	}
+
+	P2()
+	{
+		int r2, r3, r4;
+
+		r2 = READ_ONCE(z);
+		smp_rmb();
+		r3 = READ_ONCE(x);
+		r4 = READ_ONCE(y);
+	}
+
+If r1 = 1 and r2 = 1 at the end, then both r3 and r4 must also be 1.
+In other words, the smp_store_release() and the smp_load_acquire()
+that reads from it together act as a sort of inter-processor fence,
+forcing the stores to x and y to propagate to P2 before the store to
+z does, even though P2 doesn't execute any release or acquire
+instructions.  This conclusion would hold even if P0 and P1 were on
+the same CPU, so long as r1 = 1.
+
+We have mentioned that the LKMM treats locks as acquires and unlocks
+as releases.  Therefore it should not be surprising that something
+analogous to this ordering also holds for locks:
+
+	int x, y;
+	spinlock_t s;
+
+	P0()
+	{
+		spin_lock(&s);
+		WRITE_ONCE(x, 1);
+		spin_unlock(&s);
+	}
+
+	P1()
+	{
+		int r1;
+
+		spin_lock(&s);
+		r1 = READ_ONCE(x);
+		WRITE_ONCE(y, 1);
+		spin_unlock(&s);
+	}
+
+	P2()
+	{
+		int r2, r3;
+
+		r2 = READ_ONCE(y);
+		smp_rmb();
+		r3 = READ_ONCE(x);
+	}
+
+If r1 = 1 at the end (implying that P1's critical section executes
+after P0's) and r2 = 1, then r3 must be 1; the ordering of the
+critical sections forces the store to x to propagate to P2 before
+the store to y does.
+
+In both versions of this scenario, the store-propagation ordering is
+not required by the operational model.  However, it does happen on
+all the architectures supported by the Linux kernel, and kernel
+developers seem to expect it; they have requested that this behavior
+be included in the LKMM.
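
A matching check for the lock-based example above, similar in spirit
to ISA2+pooncelock+pooncelock+pombonce.litmus in
tools/memory-model/litmus-tests; the sketch below is illustrative and
its file name is made up.  With this patch applied the "exists"
clause should again never be satisfied:

	C lock-write-ordering

	(*
	 * Illustrative only.  If P1's critical section comes after
	 * P0's (r1 = 1) and P2 sees P1's store to y, then P2 must
	 * also see P0's store to x, even though P2 never takes the
	 * lock.
	 *)

	{}

	P0(int *x, spinlock_t *s)
	{
		spin_lock(s);
		WRITE_ONCE(*x, 1);
		spin_unlock(s);
	}

	P1(int *x, int *y, spinlock_t *s)
	{
		int r1;

		spin_lock(s);
		r1 = READ_ONCE(*x);
		WRITE_ONCE(*y, 1);
		spin_unlock(s);
	}

	P2(int *x, int *y)
	{
		int r2;
		int r3;

		r2 = READ_ONCE(*y);
		smp_rmb();
		r3 = READ_ONCE(*x);
	}

	exists (1:r1=1 /\ 2:r2=1 /\ 2:r3=0)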