From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0C50C433E0 for ; Tue, 30 Jun 2020 17:38:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9FFCA20829 for ; Tue, 30 Jun 2020 17:38:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1593538704; bh=bP3p7dVexcSkteyD/dA+wmwP54RNKmjexoS8VX/yL5Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=ubfpwztRNlj4YwTPf2Jp25PoiyL0407MfEG+V68hkyZYEfOa6DyD/hohr87luaATQ ZXJHXxEsWkh4C7WWa7q/1bfCT6dqiQBepou4O4w+J4ZLgDPqJgf0jWJoMjFRyj5SyP 000g48lLPRlZfRWG6GaGcgP/aH4Q3ltr1iuRVx8o= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390402AbgF3RiX (ORCPT ); Tue, 30 Jun 2020 13:38:23 -0400 Received: from mail.kernel.org ([198.145.29.99]:51342 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2390353AbgF3RiW (ORCPT ); Tue, 30 Jun 2020 13:38:22 -0400 Received: from localhost.localdomain (236.31.169.217.in-addr.arpa [217.169.31.236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1B1162083E; Tue, 30 Jun 2020 17:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1593538701; bh=bP3p7dVexcSkteyD/dA+wmwP54RNKmjexoS8VX/yL5Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QdbuWMrkMfSwq4uJ0dE+MaBUXfWAXh4mUItpEYk3m3ySP5Ykil0SoWaDHGKe4AiaC la5/K0HaSktEtvbrsQE4tBkfI7p4ADAZX8LRUg7jiqX69JQDIH670SxaFC6z+ZnYgr GkiwruxZBoqBoqCmBhyJFi/VqHNWjElPMsoOSr/g= From: Will Deacon To: linux-kernel@vger.kernel.org Cc: Will Deacon , Sami Tolvanen , Nick Desaulniers , Kees Cook , Marco Elver , "Paul E. McKenney" , Josh Triplett , Matt Turner , Ivan Kokshaysky , Richard Henderson , Peter Zijlstra , Alan Stern , "Michael S. Tsirkin" , Jason Wang , Arnd Bergmann , Boqun Feng , Catalin Marinas , Mark Rutland , linux-arm-kernel@lists.infradead.org, linux-alpha@vger.kernel.org, virtualization@lists.linux-foundation.org, kernel-team@android.com Subject: [PATCH 09/18] Documentation/barriers: Remove references to [smp_]read_barrier_depends() Date: Tue, 30 Jun 2020 18:37:25 +0100 Message-Id: <20200630173734.14057-10-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200630173734.14057-1-will@kernel.org> References: <20200630173734.14057-1-will@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The [smp_]read_barrier_depends() barrier macros no longer exist as part of the Linux memory model, so remove all references to them from the Documentation/ directory. Although this is fairly mechanical on the whole, we drop the "CACHE COHERENCY" section entirely from 'memory-barriers.txt' as it doesn't make any sense now that the dependency barriers have been removed. Acked-by: Paul E. McKenney Signed-off-by: Will Deacon --- .../RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/memory-barriers.txt | 156 +----------------- 2 files changed, 9 insertions(+), 149 deletions(-) diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 75b8ca007a11..50d5c43c48b0 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -463,7 +463,7 @@ again without disrupting RCU readers. This guarantee was only partially premeditated. DYNIX/ptx used an explicit memory barrier for publication, but had nothing resembling ``rcu_dereference()`` for subscription, nor did it have anything -resembling the ``smp_read_barrier_depends()`` that was later subsumed +resembling the dependency-ordering barrier that was later subsumed into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The need for these operations made itself known quite suddenly at a late-1990s meeting with the DEC Alpha architects, back in the days when diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index eaabc3134294..4e55aba3eb4a 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -553,12 +553,12 @@ There are certain things that the Linux kernel memory barriers do not guarantee: DATA DEPENDENCY BARRIERS (HISTORICAL) ------------------------------------- -As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was -added to READ_ONCE(), which means that about the only people who -need to pay attention to this section are those working on DEC Alpha -architecture-specific code and those working on READ_ONCE() itself. -For those who need it, and for those who are interested in the history, -here is the story of data-dependency barriers. +As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for +DEC Alpha, which means that about the only people who need to pay attention +to this section are those working on DEC Alpha architecture-specific code +and those working on READ_ONCE() itself. For those who need it, and for +those who are interested in the history, here is the story of +data-dependency barriers. The usage requirements of data dependency barriers are a little subtle, and it's not always obvious that they're needed. To illustrate, consider the @@ -2708,144 +2708,6 @@ the properties of the memory window through which devices are accessed and/or the use of any special device communication instructions the CPU may have. -CACHE COHERENCY ---------------- - -Life isn't quite as simple as it may appear above, however: for while the -caches are expected to be coherent, there's no guarantee that that coherency -will be ordered. This means that while changes made on one CPU will -eventually become visible on all CPUs, there's no guarantee that they will -become apparent in the same order on those other CPUs. - - -Consider dealing with a system that has a pair of CPUs (1 & 2), each of which -has a pair of parallel data caches (CPU 1 has A/B, and CPU 2 has C/D): - - : - : +--------+ - : +---------+ | | - +--------+ : +--->| Cache A |<------->| | - | | : | +---------+ | | - | CPU 1 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache B |<------->| | - : +---------+ | | - : | Memory | - : +---------+ | System | - +--------+ : +--->| Cache C |<------->| | - | | : | +---------+ | | - | CPU 2 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache D |<------->| | - : +---------+ | | - : +--------+ - : - -Imagine the system has the following properties: - - (*) an odd-numbered cache line may be in cache A, cache C or it may still be - resident in memory; - - (*) an even-numbered cache line may be in cache B, cache D or it may still be - resident in memory; - - (*) while the CPU core is interrogating one cache, the other cache may be - making use of the bus to access the rest of the system - perhaps to - displace a dirty cacheline or to do a speculative load; - - (*) each cache has a queue of operations that need to be applied to that cache - to maintain coherency with the rest of the system; - - (*) the coherency queue is not flushed by normal loads to lines already - present in the cache, even though the contents of the queue may - potentially affect those loads. - -Imagine, then, that two writes are made on the first CPU, with a write barrier -between them to guarantee that they will appear to reach that CPU's caches in -the requisite order: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); Make sure change to v is visible before - change to p - v is now in cache A exclusively - p = &v; - p is now in cache B exclusively - -The write memory barrier forces the other CPUs in the system to perceive that -the local CPU's caches have apparently been updated in the correct order. But -now imagine that the second CPU wants to read those values: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - ... - q = p; - x = *q; - -The above pair of reads may then fail to happen in the expected order, as the -cacheline holding p may get updated in one of the second CPU's caches while -the update to the cacheline holding v is delayed in the other of the second -CPU's caches by some other cache event: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - x = *q; - Reads from v before v updated in cache - - - -Basically, while both cachelines will be updated on CPU 2 eventually, there's -no guarantee that, without intervention, the order of update will be the same -as that committed on CPU 1. - - -To intervene, we need to interpolate a data dependency barrier or a read -barrier between the loads (which as of v4.15 is supplied unconditionally -by the READ_ONCE() macro). This will force the cache to commit its -coherency queue before processing any further requests: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - smp_read_barrier_depends() - - - x = *q; - Reads from v after v updated in cache - - -This sort of problem can be encountered on DEC Alpha processors as they have a -split cache that improves performance by making better use of the data bus. -While most CPUs do imply a data dependency barrier on the read when a memory -access depends on a read, not all do, so it may not be relied on. - -Other CPUs may also have split caches, but must coordinate between the various -cachelets for normal memory accesses. The semantics of the Alpha removes the -need for hardware coordination in the absence of memory barriers, which -permitted Alpha to sport higher CPU clock rates back in the day. However, -please note that (again, as of v4.15) smp_read_barrier_depends() should not -be used except in Alpha arch-specific code and within the READ_ONCE() macro. - - CACHE COHERENCY VS DMA ---------------------- @@ -3009,10 +2871,8 @@ caches with the memory coherence system, thus making it seem like pointer changes vs new data occur in the right order. The Alpha defines the Linux kernel's memory model, although as of v4.15 -the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE() -greatly reduced Alpha's impact on the memory model. - -See the subsection on "Cache Coherency" above. +the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly +reduced its impact on the memory model. VIRTUAL MACHINE GUESTS -- 2.27.0.212.ge8ba1cc988-goog From mboxrd@z Thu Jan 1 00:00:00 1970 From: Will Deacon Subject: [PATCH 09/18] Documentation/barriers: Remove references to [smp_]read_barrier_depends() Date: Tue, 30 Jun 2020 18:37:25 +0100 Message-ID: <20200630173734.14057-10-will@kernel.org> References: <20200630173734.14057-1-will@kernel.org> Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Return-path: In-Reply-To: <20200630173734.14057-1-will@kernel.org> Sender: linux-kernel-owner@vger.kernel.org To: linux-kernel@vger.kernel.org Cc: Will Deacon , Sami Tolvanen , Nick Desaulniers , Kees Cook , Marco Elver , "Paul E. McKenney" , Josh Triplett , Matt Turner , Ivan Kokshaysky , Richard Henderson , Peter Zijlstra , Alan Stern , "Michael S. Tsirkin" , Jason Wang , Arnd Bergmann , Boqun Feng , Catalin Marinas , Mark Rutland , linux-arm-kernel@lists.infradead.org, linux-alpha@vger.kernel.orgv List-Id: virtualization@lists.linuxfoundation.org The [smp_]read_barrier_depends() barrier macros no longer exist as part of the Linux memory model, so remove all references to them from the Documentation/ directory. Although this is fairly mechanical on the whole, we drop the "CACHE COHERENCY" section entirely from 'memory-barriers.txt' as it doesn't make any sense now that the dependency barriers have been removed. Acked-by: Paul E. McKenney Signed-off-by: Will Deacon --- .../RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/memory-barriers.txt | 156 +----------------- 2 files changed, 9 insertions(+), 149 deletions(-) diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 75b8ca007a11..50d5c43c48b0 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -463,7 +463,7 @@ again without disrupting RCU readers. This guarantee was only partially premeditated. DYNIX/ptx used an explicit memory barrier for publication, but had nothing resembling ``rcu_dereference()`` for subscription, nor did it have anything -resembling the ``smp_read_barrier_depends()`` that was later subsumed +resembling the dependency-ordering barrier that was later subsumed into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The need for these operations made itself known quite suddenly at a late-1990s meeting with the DEC Alpha architects, back in the days when diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index eaabc3134294..4e55aba3eb4a 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -553,12 +553,12 @@ There are certain things that the Linux kernel memory barriers do not guarantee: DATA DEPENDENCY BARRIERS (HISTORICAL) ------------------------------------- -As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was -added to READ_ONCE(), which means that about the only people who -need to pay attention to this section are those working on DEC Alpha -architecture-specific code and those working on READ_ONCE() itself. -For those who need it, and for those who are interested in the history, -here is the story of data-dependency barriers. +As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for +DEC Alpha, which means that about the only people who need to pay attention +to this section are those working on DEC Alpha architecture-specific code +and those working on READ_ONCE() itself. For those who need it, and for +those who are interested in the history, here is the story of +data-dependency barriers. The usage requirements of data dependency barriers are a little subtle, and it's not always obvious that they're needed. To illustrate, consider the @@ -2708,144 +2708,6 @@ the properties of the memory window through which devices are accessed and/or the use of any special device communication instructions the CPU may have. -CACHE COHERENCY ---------------- - -Life isn't quite as simple as it may appear above, however: for while the -caches are expected to be coherent, there's no guarantee that that coherency -will be ordered. This means that while changes made on one CPU will -eventually become visible on all CPUs, there's no guarantee that they will -become apparent in the same order on those other CPUs. - - -Consider dealing with a system that has a pair of CPUs (1 & 2), each of which -has a pair of parallel data caches (CPU 1 has A/B, and CPU 2 has C/D): - - : - : +--------+ - : +---------+ | | - +--------+ : +--->| Cache A |<------->| | - | | : | +---------+ | | - | CPU 1 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache B |<------->| | - : +---------+ | | - : | Memory | - : +---------+ | System | - +--------+ : +--->| Cache C |<------->| | - | | : | +---------+ | | - | CPU 2 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache D |<------->| | - : +---------+ | | - : +--------+ - : - -Imagine the system has the following properties: - - (*) an odd-numbered cache line may be in cache A, cache C or it may still be - resident in memory; - - (*) an even-numbered cache line may be in cache B, cache D or it may still be - resident in memory; - - (*) while the CPU core is interrogating one cache, the other cache may be - making use of the bus to access the rest of the system - perhaps to - displace a dirty cacheline or to do a speculative load; - - (*) each cache has a queue of operations that need to be applied to that cache - to maintain coherency with the rest of the system; - - (*) the coherency queue is not flushed by normal loads to lines already - present in the cache, even though the contents of the queue may - potentially affect those loads. - -Imagine, then, that two writes are made on the first CPU, with a write barrier -between them to guarantee that they will appear to reach that CPU's caches in -the requisite order: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); Make sure change to v is visible before - change to p - v is now in cache A exclusively - p = &v; - p is now in cache B exclusively - -The write memory barrier forces the other CPUs in the system to perceive that -the local CPU's caches have apparently been updated in the correct order. But -now imagine that the second CPU wants to read those values: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - ... - q = p; - x = *q; - -The above pair of reads may then fail to happen in the expected order, as the -cacheline holding p may get updated in one of the second CPU's caches while -the update to the cacheline holding v is delayed in the other of the second -CPU's caches by some other cache event: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - x = *q; - Reads from v before v updated in cache - - - -Basically, while both cachelines will be updated on CPU 2 eventually, there's -no guarantee that, without intervention, the order of update will be the same -as that committed on CPU 1. - - -To intervene, we need to interpolate a data dependency barrier or a read -barrier between the loads (which as of v4.15 is supplied unconditionally -by the READ_ONCE() macro). This will force the cache to commit its -coherency queue before processing any further requests: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - smp_read_barrier_depends() - - - x = *q; - Reads from v after v updated in cache - - -This sort of problem can be encountered on DEC Alpha processors as they have a -split cache that improves performance by making better use of the data bus. -While most CPUs do imply a data dependency barrier on the read when a memory -access depends on a read, not all do, so it may not be relied on. - -Other CPUs may also have split caches, but must coordinate between the various -cachelets for normal memory accesses. The semantics of the Alpha removes the -need for hardware coordination in the absence of memory barriers, which -permitted Alpha to sport higher CPU clock rates back in the day. However, -please note that (again, as of v4.15) smp_read_barrier_depends() should not -be used except in Alpha arch-specific code and within the READ_ONCE() macro. - - CACHE COHERENCY VS DMA ---------------------- @@ -3009,10 +2871,8 @@ caches with the memory coherence system, thus making it seem like pointer changes vs new data occur in the right order. The Alpha defines the Linux kernel's memory model, although as of v4.15 -the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE() -greatly reduced Alpha's impact on the memory model. - -See the subsection on "Cache Coherency" above. +the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly +reduced its impact on the memory model. VIRTUAL MACHINE GUESTS -- 2.27.0.212.ge8ba1cc988-goog From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.0 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE, SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2186EC433E0 for ; Tue, 30 Jun 2020 17:41:14 +0000 (UTC) Received: from merlin.infradead.org (merlin.infradead.org [205.233.59.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D7D2C20775 for ; Tue, 30 Jun 2020 17:41:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="iKFwwrxb"; dkim=fail reason="signature verification failed" (1024-bit key) header.d=kernel.org header.i=@kernel.org header.b="QdbuWMrk" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D7D2C20775 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=merlin.20170209; h=Sender:Content-Transfer-Encoding: Content-Type:Cc:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=9blqwa3EErdccJJ24D8sgHJA4dfbj0Xik+uOibIdCYE=; b=iKFwwrxbAxrqGcpKGhk29MnuG 7DdTPyewKRxyyVWHC5oul6HXEfAlFY0bJTYpyteQqqztLqS7DrllZMOjkmoYLIsyvClJ2v8cy/QR2 yL6s2BPEKcbKhBnX3cK+Iv/nKHeqROw+TOTCdUMxagB9VbrRJTgeQDouSkOeVxVEkNTcZCEqsltbF G3HHuS6NpnbexZOaptHv50RI5lp88aTqnlESUX1uG9SlsmtpHYeXNVf6Zgz26HEwYQoxvsRa5Z8sX RtdbU+C3+RUqxTurOKpsqhxINJIJeTJyQ6TYKwHXS3K/R8o6NoIW4sVwJLSfMsaa576Wsmzy/HovV QCltq0ljQ==; Received: from localhost ([::1] helo=merlin.infradead.org) by merlin.infradead.org with esmtp (Exim 4.92.3 #3 (Red Hat Linux)) id 1jqKE4-0001i4-D3; Tue, 30 Jun 2020 17:39:20 +0000 Received: from mail.kernel.org ([198.145.29.99]) by merlin.infradead.org with esmtps (Exim 4.92.3 #3 (Red Hat Linux)) id 1jqKD8-0001FG-80 for linux-arm-kernel@lists.infradead.org; Tue, 30 Jun 2020 17:38:24 +0000 Received: from localhost.localdomain (236.31.169.217.in-addr.arpa [217.169.31.236]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 1B1162083E; Tue, 30 Jun 2020 17:38:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1593538701; bh=bP3p7dVexcSkteyD/dA+wmwP54RNKmjexoS8VX/yL5Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QdbuWMrkMfSwq4uJ0dE+MaBUXfWAXh4mUItpEYk3m3ySP5Ykil0SoWaDHGKe4AiaC la5/K0HaSktEtvbrsQE4tBkfI7p4ADAZX8LRUg7jiqX69JQDIH670SxaFC6z+ZnYgr GkiwruxZBoqBoqCmBhyJFi/VqHNWjElPMsoOSr/g= From: Will Deacon To: linux-kernel@vger.kernel.org Subject: [PATCH 09/18] Documentation/barriers: Remove references to [smp_]read_barrier_depends() Date: Tue, 30 Jun 2020 18:37:25 +0100 Message-Id: <20200630173734.14057-10-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200630173734.14057-1-will@kernel.org> References: <20200630173734.14057-1-will@kernel.org> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20200630_133822_596556_FA7441D6 X-CRM114-Status: GOOD ( 26.10 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Mark Rutland , "Michael S. Tsirkin" , Peter Zijlstra , Catalin Marinas , Jason Wang , virtualization@lists.linux-foundation.org, Will Deacon , Arnd Bergmann , Alan Stern , Sami Tolvanen , Matt Turner , kernel-team@android.com, Marco Elver , Kees Cook , "Paul E. McKenney" , Boqun Feng , Josh Triplett , Ivan Kokshaysky , linux-arm-kernel@lists.infradead.org, Richard Henderson , Nick Desaulniers , linux-alpha@vger.kernel.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org The [smp_]read_barrier_depends() barrier macros no longer exist as part of the Linux memory model, so remove all references to them from the Documentation/ directory. Although this is fairly mechanical on the whole, we drop the "CACHE COHERENCY" section entirely from 'memory-barriers.txt' as it doesn't make any sense now that the dependency barriers have been removed. Acked-by: Paul E. McKenney Signed-off-by: Will Deacon --- .../RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/memory-barriers.txt | 156 +----------------- 2 files changed, 9 insertions(+), 149 deletions(-) diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 75b8ca007a11..50d5c43c48b0 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -463,7 +463,7 @@ again without disrupting RCU readers. This guarantee was only partially premeditated. DYNIX/ptx used an explicit memory barrier for publication, but had nothing resembling ``rcu_dereference()`` for subscription, nor did it have anything -resembling the ``smp_read_barrier_depends()`` that was later subsumed +resembling the dependency-ordering barrier that was later subsumed into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The need for these operations made itself known quite suddenly at a late-1990s meeting with the DEC Alpha architects, back in the days when diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index eaabc3134294..4e55aba3eb4a 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -553,12 +553,12 @@ There are certain things that the Linux kernel memory barriers do not guarantee: DATA DEPENDENCY BARRIERS (HISTORICAL) ------------------------------------- -As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was -added to READ_ONCE(), which means that about the only people who -need to pay attention to this section are those working on DEC Alpha -architecture-specific code and those working on READ_ONCE() itself. -For those who need it, and for those who are interested in the history, -here is the story of data-dependency barriers. +As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for +DEC Alpha, which means that about the only people who need to pay attention +to this section are those working on DEC Alpha architecture-specific code +and those working on READ_ONCE() itself. For those who need it, and for +those who are interested in the history, here is the story of +data-dependency barriers. The usage requirements of data dependency barriers are a little subtle, and it's not always obvious that they're needed. To illustrate, consider the @@ -2708,144 +2708,6 @@ the properties of the memory window through which devices are accessed and/or the use of any special device communication instructions the CPU may have. -CACHE COHERENCY ---------------- - -Life isn't quite as simple as it may appear above, however: for while the -caches are expected to be coherent, there's no guarantee that that coherency -will be ordered. This means that while changes made on one CPU will -eventually become visible on all CPUs, there's no guarantee that they will -become apparent in the same order on those other CPUs. - - -Consider dealing with a system that has a pair of CPUs (1 & 2), each of which -has a pair of parallel data caches (CPU 1 has A/B, and CPU 2 has C/D): - - : - : +--------+ - : +---------+ | | - +--------+ : +--->| Cache A |<------->| | - | | : | +---------+ | | - | CPU 1 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache B |<------->| | - : +---------+ | | - : | Memory | - : +---------+ | System | - +--------+ : +--->| Cache C |<------->| | - | | : | +---------+ | | - | CPU 2 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache D |<------->| | - : +---------+ | | - : +--------+ - : - -Imagine the system has the following properties: - - (*) an odd-numbered cache line may be in cache A, cache C or it may still be - resident in memory; - - (*) an even-numbered cache line may be in cache B, cache D or it may still be - resident in memory; - - (*) while the CPU core is interrogating one cache, the other cache may be - making use of the bus to access the rest of the system - perhaps to - displace a dirty cacheline or to do a speculative load; - - (*) each cache has a queue of operations that need to be applied to that cache - to maintain coherency with the rest of the system; - - (*) the coherency queue is not flushed by normal loads to lines already - present in the cache, even though the contents of the queue may - potentially affect those loads. - -Imagine, then, that two writes are made on the first CPU, with a write barrier -between them to guarantee that they will appear to reach that CPU's caches in -the requisite order: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); Make sure change to v is visible before - change to p - v is now in cache A exclusively - p = &v; - p is now in cache B exclusively - -The write memory barrier forces the other CPUs in the system to perceive that -the local CPU's caches have apparently been updated in the correct order. But -now imagine that the second CPU wants to read those values: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - ... - q = p; - x = *q; - -The above pair of reads may then fail to happen in the expected order, as the -cacheline holding p may get updated in one of the second CPU's caches while -the update to the cacheline holding v is delayed in the other of the second -CPU's caches by some other cache event: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - x = *q; - Reads from v before v updated in cache - - - -Basically, while both cachelines will be updated on CPU 2 eventually, there's -no guarantee that, without intervention, the order of update will be the same -as that committed on CPU 1. - - -To intervene, we need to interpolate a data dependency barrier or a read -barrier between the loads (which as of v4.15 is supplied unconditionally -by the READ_ONCE() macro). This will force the cache to commit its -coherency queue before processing any further requests: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - smp_read_barrier_depends() - - - x = *q; - Reads from v after v updated in cache - - -This sort of problem can be encountered on DEC Alpha processors as they have a -split cache that improves performance by making better use of the data bus. -While most CPUs do imply a data dependency barrier on the read when a memory -access depends on a read, not all do, so it may not be relied on. - -Other CPUs may also have split caches, but must coordinate between the various -cachelets for normal memory accesses. The semantics of the Alpha removes the -need for hardware coordination in the absence of memory barriers, which -permitted Alpha to sport higher CPU clock rates back in the day. However, -please note that (again, as of v4.15) smp_read_barrier_depends() should not -be used except in Alpha arch-specific code and within the READ_ONCE() macro. - - CACHE COHERENCY VS DMA ---------------------- @@ -3009,10 +2871,8 @@ caches with the memory coherence system, thus making it seem like pointer changes vs new data occur in the right order. The Alpha defines the Linux kernel's memory model, although as of v4.15 -the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE() -greatly reduced Alpha's impact on the memory model. - -See the subsection on "Cache Coherency" above. +the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly +reduced its impact on the memory model. VIRTUAL MACHINE GUESTS -- 2.27.0.212.ge8ba1cc988-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel From mboxrd@z Thu Jan 1 00:00:00 1970 From: Will Deacon Subject: [PATCH 09/18] Documentation/barriers: Remove references to [smp_]read_barrier_depends() Date: Tue, 30 Jun 2020 18:37:25 +0100 Message-ID: <20200630173734.14057-10-will@kernel.org> References: <20200630173734.14057-1-will@kernel.org> Mime-Version: 1.0 Content-Transfer-Encoding: 8bit Return-path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1593538701; bh=bP3p7dVexcSkteyD/dA+wmwP54RNKmjexoS8VX/yL5Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QdbuWMrkMfSwq4uJ0dE+MaBUXfWAXh4mUItpEYk3m3ySP5Ykil0SoWaDHGKe4AiaC la5/K0HaSktEtvbrsQE4tBkfI7p4ADAZX8LRUg7jiqX69JQDIH670SxaFC6z+ZnYgr GkiwruxZBoqBoqCmBhyJFi/VqHNWjElPMsoOSr/g= In-Reply-To: <20200630173734.14057-1-will@kernel.org> Sender: linux-kernel-owner@vger.kernel.org List-ID: Content-Type: text/plain; charset="us-ascii" To: linux-kernel@vger.kernel.org Cc: Will Deacon , Sami Tolvanen , Nick Desaulniers , Kees Cook , Marco Elver , "Paul E. McKenney" , Josh Triplett , Matt Turner , Ivan Kokshaysky , Richard Henderson , Peter Zijlstra , Alan Stern , "Michael S. Tsirkin" , Jason Wang , Arnd Bergmann , Boqun Feng , Catalin Marinas , Mark Rutland , linux-arm-kernel@lists.infradead.org, linux-alpha@vger.kernel.org, v The [smp_]read_barrier_depends() barrier macros no longer exist as part of the Linux memory model, so remove all references to them from the Documentation/ directory. Although this is fairly mechanical on the whole, we drop the "CACHE COHERENCY" section entirely from 'memory-barriers.txt' as it doesn't make any sense now that the dependency barriers have been removed. Acked-by: Paul E. McKenney Signed-off-by: Will Deacon --- .../RCU/Design/Requirements/Requirements.rst | 2 +- Documentation/memory-barriers.txt | 156 +----------------- 2 files changed, 9 insertions(+), 149 deletions(-) diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst index 75b8ca007a11..50d5c43c48b0 100644 --- a/Documentation/RCU/Design/Requirements/Requirements.rst +++ b/Documentation/RCU/Design/Requirements/Requirements.rst @@ -463,7 +463,7 @@ again without disrupting RCU readers. This guarantee was only partially premeditated. DYNIX/ptx used an explicit memory barrier for publication, but had nothing resembling ``rcu_dereference()`` for subscription, nor did it have anything -resembling the ``smp_read_barrier_depends()`` that was later subsumed +resembling the dependency-ordering barrier that was later subsumed into ``rcu_dereference()`` and later still into ``READ_ONCE()``. The need for these operations made itself known quite suddenly at a late-1990s meeting with the DEC Alpha architects, back in the days when diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt index eaabc3134294..4e55aba3eb4a 100644 --- a/Documentation/memory-barriers.txt +++ b/Documentation/memory-barriers.txt @@ -553,12 +553,12 @@ There are certain things that the Linux kernel memory barriers do not guarantee: DATA DEPENDENCY BARRIERS (HISTORICAL) ------------------------------------- -As of v4.15 of the Linux kernel, an smp_read_barrier_depends() was -added to READ_ONCE(), which means that about the only people who -need to pay attention to this section are those working on DEC Alpha -architecture-specific code and those working on READ_ONCE() itself. -For those who need it, and for those who are interested in the history, -here is the story of data-dependency barriers. +As of v4.15 of the Linux kernel, an smp_mb() was added to READ_ONCE() for +DEC Alpha, which means that about the only people who need to pay attention +to this section are those working on DEC Alpha architecture-specific code +and those working on READ_ONCE() itself. For those who need it, and for +those who are interested in the history, here is the story of +data-dependency barriers. The usage requirements of data dependency barriers are a little subtle, and it's not always obvious that they're needed. To illustrate, consider the @@ -2708,144 +2708,6 @@ the properties of the memory window through which devices are accessed and/or the use of any special device communication instructions the CPU may have. -CACHE COHERENCY ---------------- - -Life isn't quite as simple as it may appear above, however: for while the -caches are expected to be coherent, there's no guarantee that that coherency -will be ordered. This means that while changes made on one CPU will -eventually become visible on all CPUs, there's no guarantee that they will -become apparent in the same order on those other CPUs. - - -Consider dealing with a system that has a pair of CPUs (1 & 2), each of which -has a pair of parallel data caches (CPU 1 has A/B, and CPU 2 has C/D): - - : - : +--------+ - : +---------+ | | - +--------+ : +--->| Cache A |<------->| | - | | : | +---------+ | | - | CPU 1 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache B |<------->| | - : +---------+ | | - : | Memory | - : +---------+ | System | - +--------+ : +--->| Cache C |<------->| | - | | : | +---------+ | | - | CPU 2 |<---+ | | - | | : | +---------+ | | - +--------+ : +--->| Cache D |<------->| | - : +---------+ | | - : +--------+ - : - -Imagine the system has the following properties: - - (*) an odd-numbered cache line may be in cache A, cache C or it may still be - resident in memory; - - (*) an even-numbered cache line may be in cache B, cache D or it may still be - resident in memory; - - (*) while the CPU core is interrogating one cache, the other cache may be - making use of the bus to access the rest of the system - perhaps to - displace a dirty cacheline or to do a speculative load; - - (*) each cache has a queue of operations that need to be applied to that cache - to maintain coherency with the rest of the system; - - (*) the coherency queue is not flushed by normal loads to lines already - present in the cache, even though the contents of the queue may - potentially affect those loads. - -Imagine, then, that two writes are made on the first CPU, with a write barrier -between them to guarantee that they will appear to reach that CPU's caches in -the requisite order: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); Make sure change to v is visible before - change to p - v is now in cache A exclusively - p = &v; - p is now in cache B exclusively - -The write memory barrier forces the other CPUs in the system to perceive that -the local CPU's caches have apparently been updated in the correct order. But -now imagine that the second CPU wants to read those values: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - ... - q = p; - x = *q; - -The above pair of reads may then fail to happen in the expected order, as the -cacheline holding p may get updated in one of the second CPU's caches while -the update to the cacheline holding v is delayed in the other of the second -CPU's caches by some other cache event: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - x = *q; - Reads from v before v updated in cache - - - -Basically, while both cachelines will be updated on CPU 2 eventually, there's -no guarantee that, without intervention, the order of update will be the same -as that committed on CPU 1. - - -To intervene, we need to interpolate a data dependency barrier or a read -barrier between the loads (which as of v4.15 is supplied unconditionally -by the READ_ONCE() macro). This will force the cache to commit its -coherency queue before processing any further requests: - - CPU 1 CPU 2 COMMENT - =============== =============== ======================================= - u == 0, v == 1 and p == &u, q == &u - v = 2; - smp_wmb(); - - - p = &v; q = p; - - - - smp_read_barrier_depends() - - - x = *q; - Reads from v after v updated in cache - - -This sort of problem can be encountered on DEC Alpha processors as they have a -split cache that improves performance by making better use of the data bus. -While most CPUs do imply a data dependency barrier on the read when a memory -access depends on a read, not all do, so it may not be relied on. - -Other CPUs may also have split caches, but must coordinate between the various -cachelets for normal memory accesses. The semantics of the Alpha removes the -need for hardware coordination in the absence of memory barriers, which -permitted Alpha to sport higher CPU clock rates back in the day. However, -please note that (again, as of v4.15) smp_read_barrier_depends() should not -be used except in Alpha arch-specific code and within the READ_ONCE() macro. - - CACHE COHERENCY VS DMA ---------------------- @@ -3009,10 +2871,8 @@ caches with the memory coherence system, thus making it seem like pointer changes vs new data occur in the right order. The Alpha defines the Linux kernel's memory model, although as of v4.15 -the Linux kernel's addition of smp_read_barrier_depends() to READ_ONCE() -greatly reduced Alpha's impact on the memory model. - -See the subsection on "Cache Coherency" above. +the Linux kernel's addition of smp_mb() to READ_ONCE() on Alpha greatly +reduced its impact on the memory model. VIRTUAL MACHINE GUESTS -- 2.27.0.212.ge8ba1cc988-goog