From: David Laight
To: 'Peter Zijlstra', "Paul E. McKenney"
CC: Marco Elver, Nick Desaulniers, Sami Tolvanen, Masahiro Yamada,
 Will Deacon, Greg Kroah-Hartman, Kees Cook, clang-built-linux,
 Kernel Hardening, linux-arch, Linux ARM, Linux Kbuild mailing list,
 LKML, "linux-pci@vger.kernel.org",
 "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
Subject: RE: [PATCH 00/22] add support for Clang LTO
Date: Wed, 1 Jul 2020 14:20:13 +0000
Message-ID: <4427b0f825324da4b1640e32265b04bd@AcuMS.aculab.com>
In-Reply-To: <20200701091054.GW4781@hirez.programming.kicks-ass.net>
References: <20200624203200.78870-1-samitolvanen@google.com>
 <20200624211540.GS4817@hirez.programming.kicks-ass.net>
 <20200625080313.GY4817@hirez.programming.kicks-ass.net>
 <20200625082433.GC117543@hirez.programming.kicks-ass.net>
 <20200625085745.GD117543@hirez.programming.kicks-ass.net>
 <20200630191931.GA884155@elver.google.com>
 <20200630201243.GD4817@hirez.programming.kicks-ass.net>
 <20200630203016.GI9247@paulmck-ThinkPad-P72>
 <20200701091054.GW4781@hirez.programming.kicks-ass.net>
List-ID: linux-kernel@vger.kernel.org

From: Peter Zijlstra
> Sent: 01 July 2020 10:11
> On Tue, Jun 30, 2020 at 01:30:16PM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 30, 2020 at 10:12:43PM +0200, Peter Zijlstra wrote:
>
> > > I'm not convinced C11 memory_order_consume would actually work for us,
> > > even if it would work. That is, given:
> > >
> > >   https://lore.kernel.org/lkml/20150520005510.GA23559@linux.vnet.ibm.com/
> > >
> > > only pointers can have consume, but like I pointed out, we have code
> > > that relies on dependent loads from integers.
> >
> > I agree that C11 memory_order_consume is not normally what we want,
> > given that it is universally promoted to memory_order_acquire.
> >
> > However, dependent loads from integers are, if anything, more difficult
> > to defend from the compiler than are control dependencies. This applies
> > doubly to integers that are used to index two-element arrays, in which
> > case you are just asking the compiler to destroy your dependent loads
> > by converting them into control dependencies.
>
> Yes, I'm aware. However, as you might know, I'm firmly in the 'C is a
> glorified assembler' camp (as I expect most actual OS people are, out of
> necessity if nothing else) and if I wanted a control dependency I
> would've bloody well written one.

I write in C because doing register tracking is hard :-)
I've got an hdlc implementation in C that is carefully adjusted
so that the worst case path is bounded.
I probably know every one of the 1000 instructions in it.

Would an asm statement that uses the same 'register' for input and
output but doesn't actually do anything help?
It won't generate any code, but the compiler ought to assume that
it might change the value - so can't do optimisations that track
the value across the call.

> I think an optimizing compiler is awesome, but only in so far as that
> optimization is actually helpful -- and yes, I just stepped into a giant
> twilight zone there. That is, any optimization that has _any_
> controversy should be controllable (like -fno-strict-overflow
> -fno-strict-aliasing) and I'd very much like the same here.

I'm fed up of gcc generating the code that uses SIMD instructions
for the 'tail' loop at the end of a function that is already doing SIMD
operations for the main part of the loop.
And compilers that convert a byte copy loop to 'rep movsb'.
If I'm copying 3 or 4 bytes I don't want a 40 clock overhead.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
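
A minimal sketch of the empty asm statement suggested above, assuming GCC or Clang extended asm syntax. The names hide_value(), table and lookup() are hypothetical and appear in neither the thread nor the kernel. The "+r" constraint names the same register as both input and output, so no instruction is emitted, but the compiler has to assume the value may have changed and cannot carry what it knew about the value (for example that it is 0 or 1) past that point. The kernel's OPTIMIZER_HIDE_VAR() macro is built on a similar empty asm.

```c
/*
 * Hypothetical helper: an empty asm that takes v as both input and
 * output in the same register. It emits no instructions, but the
 * compiler must treat the result as an unknown value.
 */
static inline unsigned int hide_value(unsigned int v)
{
	__asm__ ("" : "+r" (v));
	return v;
}

/* Illustrative two-element array, as in the case discussed above. */
static int table[2] = { 10, 20 };

/*
 * Index the array through the hidden value. After hide_value() the
 * compiler no longer knows that idx is limited to 0 or 1.
 */
int lookup(unsigned int flag)
{
	unsigned int idx = hide_value(flag & 1);

	return table[idx];
}
```

With this in place lookup(0) returns 10 and lookup(1) returns 20. The asm only limits what the compiler can infer about the value. Whether that is enough to keep a dependent load from being turned into a control dependency in every case is the open question in this thread, and the sketch does not add any hardware ordering of its own.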