Date: Wed, 12 May 2021 09:31:05 -0500
From: Segher Boessenkool
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] powerpc: Force inlining of csum_add()
Message-ID: <20210512143105.GW10366@gate.crashing.org>
References: <20210511105154.GJ10366@gate.crashing.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 12, 2021 at 02:56:56PM +0200, Christophe Leroy wrote:
> On 11/05/2021 at 12:51, Segher Boessenkool wrote:
> >Something seems to have decided this asm is more expensive than it is.
> >That isn't always avoidable -- the compiler cannot look inside asms --
> >but it seems it could be improved here.
> >
> >Do you have (or can make) a self-contained testcase?
>
> I have not tried, and I fear it might be difficult, because on a kernel
> build with dozens of calls to csum_add(), only ip6_tunnel.o exhibits such
> an issue.

Yeah.  Sometimes you can force some of the decisions, but that usually
requires knowing too many GCC internals :-/

> >>And there is even one completely unused instance of csum_add().
> >
> >That is strange, that should never happen.
>
> It seems that several .o include unused versions of csum_add.  After the
> final link, one remains (in addition to the used one) in vmlinux.

But it is a static function, so it should not end up in any object file
where it isn't used.

> >>In the non-inlined version, the first sum with 0 was performed.
> >>Here it is skipped.
> >
> >That is because of how __builtin_constant_p works, most likely.  As we
> >discussed elsewhere it is evaluated before all forms of loop unrolling.
>
> But we are not talking about loop unrolling here, are we?

Oh, right you are, but that doesn't change much.  The
__builtin_constant_p(len) is evaluated long before the compiler sees len
is a constant here.

> It seems that the reason here is that __builtin_constant_p() is evaluated
> long after GCC decided to not inline that call to csum_add().

Yes, it seems we do not currently do even trivial inlining except very
early in the compiler.

Thanks,

Segher
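
The interaction discussed in this thread, a zero fast path guarded by __builtin_constant_p() that only helps when the call is inlined early enough, can be sketched in portable C. This is an illustration under stated assumptions, not the powerpc implementation: the real csum_add() uses inline asm, while the sketch below uses plain 32-bit arithmetic with an end-around carry. The names csum_add_sketch, fold_from_zero and SKETCH_ALWAYS_INLINE are hypothetical and stand in for the kernel's own function and its __always_inline macro.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __always_inline macro (GCC/Clang). */
#define SKETCH_ALWAYS_INLINE inline __attribute__((__always_inline__))

/*
 * Sketch of a 32-bit ones' complement add. Plain C is used here instead
 * of the inline asm the thread talks about, so the compiler can see the
 * whole body.
 */
static SKETCH_ALWAYS_INLINE uint32_t csum_add_sketch(uint32_t csum,
						      uint32_t addend)
{
	uint32_t res;

	/*
	 * These fast paths are taken only when the compiler can already
	 * prove the argument constant at the point where it folds
	 * __builtin_constant_p(). Otherwise the builtin yields 0 and the
	 * general path below computes the same value.
	 */
	if (__builtin_constant_p(csum) && csum == 0)
		return addend;
	if (__builtin_constant_p(addend) && addend == 0)
		return csum;

	res = csum + addend;
	return res + (res < addend);	/* add the carry back in */
}

/* Hypothetical caller that passes a literal 0 as the first argument. */
static uint32_t fold_from_zero(uint32_t addend)
{
	return csum_add_sketch(0, addend);
}
```

Both paths return the same value, so the choice between them affects only the generated code. Forcing inlining, as the patch under discussion does, is one way to make sure the builtin is evaluated in the caller's context, where a literal argument is visible.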