From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.3 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, NICE_REPLY_A,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D76A6C388F2 for ; Thu, 22 Oct 2020 04:01:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 72FB62225D for ; Thu, 22 Oct 2020 04:01:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=infradead.org header.i=@infradead.org header.b="HuUJDCw6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725819AbgJVEBJ (ORCPT ); Thu, 22 Oct 2020 00:01:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49308 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725818AbgJVEBI (ORCPT ); Thu, 22 Oct 2020 00:01:08 -0400 Received: from merlin.infradead.org (merlin.infradead.org [IPv6:2001:8b0:10b:1231::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 59C3EC0613CE for ; Wed, 21 Oct 2020 21:01:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=Content-Type:In-Reply-To:MIME-Version: Date:Message-ID:From:References:Cc:To:Subject:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=IS6O1nYxSaGRmR4NjmVJ09OkAhgrTCyEwvw2xsOY+D8=; b=HuUJDCw6X1IriJ2zar5AR39Yji YjSrGPlgMOy3x4StB9UZ9tuWOFPZOGhvlEJO/ulpTLUE8nuBCQFRgwZw5AnO0jqFDCdqm2usVr0r1 EZTz5/QPtozHM1t0a0rK5L6oARd6ZdKBj/Np45SlMvWEb4epYFZzL2OZYUg++9WCEsIa8ZlAqBcvt bPnmPjcG1i/MbkpUunhXInQUY8TubwKdgFXDRLHhNVuEGLQtQLXTQwVkIuV7YZdES6nrPVUKeAHYQ SWh5ea4DS3U8yBeB129tR3aD/sVx4PSYG/JezDwRs3lbl38HQeZ8CD9L8B2S5s4G4taG9I/aZPBP4 vAxmJWOA==; Received: from [2601:1c0:6280:3f0::507c] by merlin.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1kVRmj-0007Vh-Gq; Thu, 22 Oct 2020 04:01:05 +0000 Subject: Re: ilog2 vs. GCC inlining heuristics To: Christophe Leroy , Peter Zijlstra , Jakub Jelinek Cc: linux-toolchains@vger.kernel.org References: <20201021132718.GB2176@tucnak> <20201021151947.GL2628@hirez.programming.kicks-ass.net> <21556974-eea1-ed6a-ea6f-3e97a6eea4bc@csgroup.eu> From: Randy Dunlap Message-ID: Date: Wed, 21 Oct 2020 21:01:01 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 MIME-Version: 1.0 In-Reply-To: <21556974-eea1-ed6a-ea6f-3e97a6eea4bc@csgroup.eu> Content-Type: multipart/mixed; boundary="------------C5962A6D2E2BA618E743ADA5" Content-Language: en-US Precedence: bulk List-ID: X-Mailing-List: linux-toolchains@vger.kernel.org This is a multi-part message in MIME format. --------------C5962A6D2E2BA618E743ADA5 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit On 10/21/20 11:40 AM, Christophe Leroy wrote: > > > Le 21/10/2020 à 17:19, Peter Zijlstra a écrit : >> On Wed, Oct 21, 2020 at 03:27:18PM +0200, Jakub Jelinek wrote: >>> Hi! >>> >>> Based on the GCC PR97445 discussions, I'd like to propose following change, >>> which should significantly decrease the amount of code in inline functions >>> that use ilog2, but as I'm already two decades out of the Linux kernel >>> development, I'd appreciate if some kernel developer could try that (all >>> I have done is check that it gives the same results as before) and if it >>> works submit it for inclusion into the kernel? >> >> I'll stick it in my queue and feed it to the robots. >> > > I did a mpc885_ads_defconfig build with your patch. That's far better, even better than with gcc 9 without the patch. > > I only have two instances of get_order() in vmlinux, one of it is used twice, the other is never user. > > With -Winline, the reason for not inlining is for both because "the call is unlikely and the code size would grow" > > Christophe Do we want a lib/test_log2.c test? Although Jakub says that he already verified that is gives the same results as the previous code. I ran a (new) lib/test_log2.ko before and after Jakub's patch and got the same results, although I am not claiming that it has an exhaustive set of tests in it. [it is attached] -- ~Randy --------------C5962A6D2E2BA618E743ADA5 Content-Type: text/x-patch; charset=UTF-8; name="lib-test-log2.patch" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="lib-test-log2.patch" From: Randy Dunlap // modified from lib/test_bitops.c Signed-off-by: Randy Dunlap --- lib/Kconfig.debug | 11 +++ lib/Makefile | 1 lib/test_log2.c | 139 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 151 insertions(+) --- linux-next-20201021.orig/lib/Kconfig.debug +++ linux-next-20201021/lib/Kconfig.debug @@ -2136,6 +2136,17 @@ config TEST_BITOPS If unsure, say N. +config TEST_LOG2 + tristate "Test module for log2 interfaces" + depends on m + help + This builds the "test_log2" module that does basic testing of + all interfaces. + It has no dependencies and doesn't run or load unless + explicitly requested by name. For example: modprobe test_log2. + + If unsure, say N. + config TEST_VMALLOC tristate "Test module for stress/performance analysis of vmalloc allocator" default n --- linux-next-20201021.orig/lib/Makefile +++ linux-next-20201021/lib/Makefile @@ -62,6 +62,7 @@ obj-$(CONFIG_TEST_BPF) += test_bpf.o obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o obj-$(CONFIG_TEST_BITOPS) += test_bitops.o CFLAGS_test_bitops.o += -Werror +obj-$(CONFIG_TEST_LOG2) += test_log2.o obj-$(CONFIG_TEST_SYSCTL) += test_sysctl.o obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o obj-$(CONFIG_TEST_IDA) += test_ida.o --- /dev/null +++ linux-next-20201021/lib/test_log2.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +// currently missing tests of roundup/rounddown macros; + +#include +#include +#include +#include +#include + +/* + * a tiny module for testing all interfaces in + */ + +static unsigned int test_power_of_2[][2] = { + {0, 0}, + {PAGE_SIZE, 1}, + {65535, 0}, + {65536, 1}, + {(unsigned int)~0, 0}, + {0x80000000, 1}, + {0x80008000, 0}, +}; + +static unsigned int test_ilog2_uint[][2] = { + //{0, -1}, + {1, 0}, + {2, 1}, + {4, 2}, + {(unsigned int)~0, 31}, + {0x80000000, 31}, + {0x80008000, 31}, +}; + +static unsigned long test_ilog2_ulong[][2] = { + //{0, -1}, + {1, 0}, + {2, 1}, + {4, 2}, + {(unsigned long)~0, 63}, + {0x80000000UL, 31}, + {0x80008000UL, 31}, +}; + +static unsigned int order_base2[][2] = { + {0x00000003, 2}, + {0x00000004, 2}, + {0x00001fff, 13}, + {0x00002000, 13}, + {0x50000000, 31}, + {0x80000000, 31}, + {0x80003000, 32}, +}; + +//#ifdef CONFIG_64BIT +static unsigned long order_base2_long[][2] = { + {0x0000000300000000, 34}, + {0x0000000400000000, 34}, + {0x00001fff00000000, 45}, + {0x0000200000000000, 45}, + {0x5000000000000000, 63}, + {0x8000000000000000, 63}, + {0x8000300000000000, 64}, +}; +//#endif + +static unsigned int test_bits_per[][2] = { + {0, 1}, + {1, 1}, + {2, 2}, + {3, 2}, + {4, 3}, + {0x0fff, 12}, + {0x1000, 13}, + {0x80003000, 32}, +}; + +static int __init test_log2_start(void) +{ + int i; + + pr_info("Starting log2 tests\n"); + + for (i = 0; i < ARRAY_SIZE(test_power_of_2); i++) { + if (is_power_of_2(test_power_of_2[i][0]) != test_power_of_2[i][1]) + pr_warn("is_power_of_2 wrong for %x\n", + test_power_of_2[i][0]); + } + + for (i = 0; i < ARRAY_SIZE(test_ilog2_uint); i++) { + if (ilog2(test_ilog2_uint[i][0]) != test_ilog2_uint[i][1]) + pr_warn("ilog2 wrong for uint %x\n", + test_ilog2_uint[i][0]); + } + + for (i = 0; i < ARRAY_SIZE(test_ilog2_ulong); i++) { + if (ilog2(test_ilog2_ulong[i][0]) != test_ilog2_ulong[i][1]) + pr_warn("ilog2 wrong for ulong %x\n", + test_ilog2_ulong[i][0]); + } + + for (i = 0; i < ARRAY_SIZE(order_base2); i++) { + if (order_base_2(order_base2[i][0]) != order_base2[i][1]) + pr_warn("order_base_2 wrong for %x\n", + order_base2[i][0]); + } + +//#ifdef CONFIG_64BIT + for (i = 0; i < ARRAY_SIZE(order_base2_long); i++) { + if (order_base_2(order_base2_long[i][0]) != + order_base2_long[i][1]) + pr_warn("order_base_2 wrong for %lx\n", + order_base2_long[i][0]); + } +//#endif + + for (i = 0; i < ARRAY_SIZE(test_bits_per); i++) { + if (bits_per(test_bits_per[i][0]) != test_bits_per[i][1]) + pr_warn("bits_per wrong for %x\n", + test_bits_per[i][0]); + } + + pr_info("Completed log2 tests\n"); + + return 0; +} + +static void __exit test_log2_fini(void) +{ +} + +module_init(test_log2_start); +module_exit(test_log2_fini); + +MODULE_AUTHOR("Randy Dunlap "); +MODULE_LICENSE("GPL"); +MODULE_DESCRIPTION("log2 test module"); --------------C5962A6D2E2BA618E743ADA5--