From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sasha Levin
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Peter Zijlstra, Linus Torvalds, Thomas Gleixner, Ingo Molnar,
	Sasha Levin, linux-arch@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH AUTOSEL 5.2 117/249] x86/atomic: Fix smp_mb__{before,after}_atomic()
Date: Mon, 15 Jul 2019 09:44:42 -0400
Message-Id: <20190715134655.4076-117-sashal@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190715134655.4076-1-sashal@kernel.org>
References: <20190715134655.4076-1-sashal@kernel.org>
MIME-Version: 1.0
X-stable: review
X-Patchwork-Hint: Ignore
Content-Transfer-Encoding: 8bit

From: Peter Zijlstra

[ Upstream commit 69d927bba39517d0980462efc051875b7f4db185 ]

Recent probing at the Linux Kernel Memory Model uncovered a
'surprise'. Strongly ordered architectures where the atomic RmW
primitive implies full memory ordering and
smp_mb__{before,after}_atomic() are a simple barrier() (such as x86)
fail for:

	*x = 1;
	atomic_inc(u);
	smp_mb__after_atomic();
	r0 = *y;

Because, while the atomic_inc() implies memory order, it
(surprisingly) does not provide a compiler barrier.
This then allows the compiler to re-order like so:

	atomic_inc(u);
	*x = 1;
	smp_mb__after_atomic();
	r0 = *y;

Which the CPU is then allowed to re-order (under TSO rules) like:

	atomic_inc(u);
	r0 = *y;
	*x = 1;

And this very much was not intended. Therefore strengthen the atomic
RmW ops to include a compiler barrier.

NOTE: atomic_{or,and,xor} and the bitops already had the compiler
barrier.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar
Signed-off-by: Sasha Levin
---
 Documentation/atomic_t.txt         | 3 +++
 arch/x86/include/asm/atomic.h      | 8 ++++----
 arch/x86/include/asm/atomic64_64.h | 8 ++++----
 arch/x86/include/asm/barrier.h     | 4 ++--
 4 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/Documentation/atomic_t.txt b/Documentation/atomic_t.txt
index dca3fb0554db..65bb09a29324 100644
--- a/Documentation/atomic_t.txt
+++ b/Documentation/atomic_t.txt
@@ -194,6 +194,9 @@ These helper barriers exist because architectures have varying implicit
 ordering on their SMP atomic primitives. For example our TSO architectures
 provide full ordered atomics and these barriers are no-ops.
 
+NOTE: when the atomic RmW ops are fully ordered, they should also imply a
+compiler barrier.
+
 Thus:
 
 	atomic_fetch_add();
diff --git a/arch/x86/include/asm/atomic.h b/arch/x86/include/asm/atomic.h
index ea3d95275b43..115127c7ad28 100644
--- a/arch/x86/include/asm/atomic.h
+++ b/arch/x86/include/asm/atomic.h
@@ -54,7 +54,7 @@ static __always_inline void arch_atomic_add(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "addl %1,%0"
 		     : "+m" (v->counter)
-		     : "ir" (i));
+		     : "ir" (i) : "memory");
 }
 
 /**
@@ -68,7 +68,7 @@ static __always_inline void arch_atomic_sub(int i, atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "subl %1,%0"
 		     : "+m" (v->counter)
-		     : "ir" (i));
+		     : "ir" (i) : "memory");
 }
 
 /**
@@ -95,7 +95,7 @@ static __always_inline bool arch_atomic_sub_and_test(int i, atomic_t *v)
 static __always_inline void arch_atomic_inc(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "incl %0"
-		     : "+m" (v->counter));
+		     : "+m" (v->counter) :: "memory");
 }
 #define arch_atomic_inc arch_atomic_inc
 
@@ -108,7 +108,7 @@ static __always_inline void arch_atomic_inc(atomic_t *v)
 static __always_inline void arch_atomic_dec(atomic_t *v)
 {
 	asm volatile(LOCK_PREFIX "decl %0"
-		     : "+m" (v->counter));
+		     : "+m" (v->counter) :: "memory");
 }
 #define arch_atomic_dec arch_atomic_dec
 
diff --git a/arch/x86/include/asm/atomic64_64.h b/arch/x86/include/asm/atomic64_64.h
index dadc20adba21..5e86c0d68ac1 100644
--- a/arch/x86/include/asm/atomic64_64.h
+++ b/arch/x86/include/asm/atomic64_64.h
@@ -45,7 +45,7 @@ static __always_inline void arch_atomic64_add(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "addq %1,%0"
 		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
+		     : "er" (i), "m" (v->counter) : "memory");
 }
 
 /**
@@ -59,7 +59,7 @@ static inline void arch_atomic64_sub(long i, atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "subq %1,%0"
 		     : "=m" (v->counter)
-		     : "er" (i), "m" (v->counter));
+		     : "er" (i), "m" (v->counter) : "memory");
 }
 
 /**
@@ -87,7 +87,7 @@ static __always_inline void arch_atomic64_inc(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "incq %0"
 		     : "=m" (v->counter)
-		     : "m" (v->counter));
+		     : "m" (v->counter) : "memory");
 }
 #define arch_atomic64_inc arch_atomic64_inc
 
@@ -101,7 +101,7 @@ static __always_inline void arch_atomic64_dec(atomic64_t *v)
 {
 	asm volatile(LOCK_PREFIX "decq %0"
 		     : "=m" (v->counter)
-		     : "m" (v->counter));
+		     : "m" (v->counter) : "memory");
 }
 #define arch_atomic64_dec arch_atomic64_dec
 
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 14de0432d288..84f848c2541a 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -80,8 +80,8 @@ do { \
 })
 
 /* Atomic operations are already serializing on x86 */
-#define __smp_mb__before_atomic()	barrier()
-#define __smp_mb__after_atomic()	barrier()
+#define __smp_mb__before_atomic()	do { } while (0)
+#define __smp_mb__after_atomic()	do { } while (0)
 
 #include <asm-generic/barrier.h>
 
-- 
2.20.1
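
[ Editor's sketch, not part of the patch: for readers following along
  outside the kernel tree, below is a minimal user-space illustration of
  the difference the added "memory" clobber makes. It assumes GCC/Clang
  extended inline asm on x86-64; the inc_no_barrier() and
  inc_with_barrier() helpers are invented for this example and are not
  kernel APIs, they merely mirror the LOCK-prefixed RmW pattern the
  patch touches. ]

/*
 * Illustrative sketch only: user-space C, not kernel code, and not part
 * of the patch above. Helper names are made up for this example.
 */
#include <stdio.h>

static int x, y, u;

/* LOCK-prefixed RmW without a "memory" clobber: the CPU sees a fully
 * ordered atomic, but the compiler may still move unrelated loads and
 * stores across the asm statement. */
static inline void inc_no_barrier(int *v)
{
	asm volatile("lock incl %0" : "+m" (*v));
}

/* The same RmW with a "memory" clobber, as the patch adds: the asm now
 * also acts as a compiler barrier, so plain accesses cannot be moved
 * across it and smp_mb__{before,after}_atomic() can stay a no-op on x86. */
static inline void inc_with_barrier(int *v)
{
	asm volatile("lock incl %0" : "+m" (*v) :: "memory");
}

int main(void)
{
	int r0;

	x = 1;
	inc_no_barrier(&u);	/* the x store may be sunk below this line   */
	inc_with_barrier(&u);	/* ...but never below this one               */
	r0 = y;			/* and the y load cannot be hoisted above it */

	printf("x=%d u=%d r0=%d\n", x, u, r0);
	return 0;
}

[ Built with something like "gcc -O2", the variant without the clobber
  leaves the compiler free to schedule the plain accesses to x and y
  around the lock'ed instruction, which is exactly the re-ordering
  described in the commit message; the clobbered variant pins them in
  place at compile time, while the LOCK prefix already provides the
  CPU-side ordering on x86. ]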