Date: Wed, 1 Jul 2020 11:24:28 +0100
From: Will Deacon
To: Marco Elver
Cc: Mark Rutland, Kees Cook, "Paul E. McKenney", "Michael S. Tsirkin",
    Peter Zijlstra, Catalin Marinas, Jason Wang, Nick Desaulniers,
    Josh Triplett, LKML, Ivan Kokshaysky,
    linux-arm-kernel@lists.infradead.org, Sami Tolvanen,
    linux-alpha@vger.kernel.org, Alan Stern, Matt Turner,
    virtualization@lists.linux-foundation.org, Android Kernel Team,
    Boqun Feng, Arnd Bergmann, Richard Henderson
Subject: Re: [PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y
Message-ID: <20200701102427.GD14959@willie-the-truck>
References: <20200630173734.14057-1-will@kernel.org> <20200630173734.14057-19-will@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 30, 2020 at 09:47:30PM +0200, Marco Elver wrote:
> On Tue, 30 Jun 2020 at 19:39, Will Deacon wrote:
> >
> > When building with LTO, there is an increased risk of the compiler
> > converting an address dependency headed by a READ_ONCE() invocation
> > into a control dependency and consequently allowing for harmful
> > reordering by the CPU.
> >
> > Ensure that such transformations are harmless by overriding the generic
> > READ_ONCE() definition with one that provides acquire semantics when
> > building with LTO.
> >
> > Signed-off-by: Will Deacon
> > ---
> >  arch/arm64/include/asm/rwonce.h   | 63 +++++++++++++++++++++++++++++++
> >  arch/arm64/kernel/vdso/Makefile   |  2 +-
> >  arch/arm64/kernel/vdso32/Makefile |  2 +-
> >  3 files changed, 65 insertions(+), 2 deletions(-)
> >  create mode 100644 arch/arm64/include/asm/rwonce.h
>
> This seems reasonable, given we can't realistically tell the compiler
> about dependent loads. What (if any), is the performance impact? I
> guess this also heavily depends on the actual silicon.

Right, it depends both on the CPU micro-architecture and also the workload.
When we ran some basic tests, the overhead wasn't greater than the benefit
seen by enabling LTO, so it seems like a reasonable trade-off (given that
LTO is a dependency for CFI, so it's not just about performance).

Will
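
[Illustrative note, not part of the patch or the original message: the
sketch below tries to show the kind of transformation the quoted commit
message is worried about. It uses C11 atomics in userspace as a rough
stand-in for the kernel's READ_ONCE(); the names item, slot_a, slot_b,
published and the three read_*() helpers are all hypothetical, and the
actual rwonce.h in the patch is not reproduced here.]

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical data: a pointer published to readers, two possible targets. */
struct item { int value; };

static struct item slot_a = { .value = 1 };
static struct item slot_b = { .value = 2 };
static _Atomic(struct item *) published = &slot_a;

/*
 * Relaxed load, used here as an approximation of READ_ONCE(). Ordering of
 * the second load (p->value) relies on its address being computed from p,
 * i.e. on an address dependency surviving compilation.
 */
static int read_relaxed(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_relaxed);
	return p->value;
}

/*
 * What a compiler with whole-program knowledge might in principle produce
 * if it can prove p only ever points at slot_a or slot_b: the address of
 * the second load no longer derives from p, only the branch does. That is
 * a control dependency, which does not order a later load on arm64.
 */
static int read_transformed(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_relaxed);
	if (p == &slot_a)
		return slot_a.value;
	return slot_b.value;
}

/*
 * Acquire load, approximating the strengthened READ_ONCE(): later accesses
 * are ordered after it whether or not the dependency is preserved.
 */
static int read_acquire(void)
{
	struct item *p = atomic_load_explicit(&published, memory_order_acquire);
	return p->value;
}
```

[All three helpers return the same value in a single-threaded run; the
difference only matters when another CPU publishes a new pointer
concurrently, which is also where the extra cost of the acquire load
discussed above would show up.]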