From mboxrd@z Thu Jan 1 00:00:00 1970
From: Arvind Sankar
Date: Sun, 8 Nov 2020 12:40:14 -0500
To: Adrian Ratiu
Cc: linux-arm-kernel@lists.infradead.org, Nathan Chancellor, Nick Desaulniers, Arnd Bergmann, clang-built-linux@googlegroups.com, Russell King, linux-kernel@vger.kernel.org, kernel@collabora.com
Subject: Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization
Message-ID: <20201108174014.GA219672@rani.riverdale.lan>
References: <20201106051436.2384842-1-adrian.ratiu@collabora.com> <20201106051436.2384842-3-adrian.ratiu@collabora.com>
In-Reply-To: <20201106051436.2384842-3-adrian.ratiu@collabora.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 06, 2020 at 07:14:36AM +0200, Adrian Ratiu wrote:
> Due to a Clang bug [1] neon autoloop vectorization does not happen or
> happens badly with no gains and considering previous GCC experiences
> which generated unoptimized code which was worse than the default asm
> implementation, it is safer to default clang builds to the known good
> generic implementation.
>
> The kernel currently supports a minimum Clang version of v10.0.1, see
> commit 1f7a44f63e6c ("compiler-clang: add build check for clang 10.0.1").
>
> When the bug gets eventually fixed, this commit could be reverted or,
> if the minimum clang version bump takes a long time, a warning could
> be added for users to upgrade their compilers like was done for GCC.
>
> [1] https://bugs.llvm.org/show_bug.cgi?id=40976
>
> Signed-off-by: Adrian Ratiu
> ---
>  arch/arm/include/asm/xor.h | 3 ++-
>  arch/arm/lib/Makefile      | 3 +++
>  arch/arm/lib/xor-neon.c    | 4 ++++
>  3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/include/asm/xor.h b/arch/arm/include/asm/xor.h
> index aefddec79286..49937dafaa71 100644
> --- a/arch/arm/include/asm/xor.h
> +++ b/arch/arm/include/asm/xor.h
> @@ -141,7 +141,8 @@ static struct xor_block_template xor_block_arm4regs = {
>  		NEON_TEMPLATES;	\
>  	} while (0)
>
> -#ifdef CONFIG_KERNEL_MODE_NEON
> +/* disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976 */
> +#if defined(CONFIG_KERNEL_MODE_NEON) && !defined(CONFIG_CC_IS_CLANG)
>
>  extern struct xor_block_template const xor_block_neon_inner;
>
> diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
> index 6d2ba454f25b..53f9e7dd9714 100644
> --- a/arch/arm/lib/Makefile
> +++ b/arch/arm/lib/Makefile
> @@ -43,8 +43,11 @@ endif
>  $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
>  $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
>
> +# disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976
> +ifndef CONFIG_CC_IS_CLANG
>  ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
>    NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
>    CFLAGS_xor-neon.o += $(NEON_FLAGS)
>    obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
>  endif
> +endif
> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
> index e1e76186ec23..84c91c48dfa2 100644
> --- a/arch/arm/lib/xor-neon.c
> +++ b/arch/arm/lib/xor-neon.c
> @@ -18,6 +18,10 @@ MODULE_LICENSE("GPL");
>   * Pull in the reference implementations while instructing GCC (through
>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>   * NEON instructions.
> +
> + * On Clang the loop vectorizer is enabled by default, but due to a bug
> + * (https://bugs.llvm.org/show_bug.cgi?id=40976) vectorization is broke
> + * so xor-neon is disabled in favor of the default reg implementations.
>   */
>  #ifdef CONFIG_CC_IS_GCC
>  #pragma GCC optimize "tree-vectorize"
> --
> 2.29.0
>

It's actually a bad idea to use #pragma GCC optimize. This is basically
the same as tagging all the functions with __attribute__((optimize)),
which GCC does not recommend for production use, as it _replaces_
optimization options rather than appending to them, and has been
observed to result in dropping important compiler flags.

There've been a few discussions recently around other such cases:
https://lore.kernel.org/lkml/20201028171506.15682-1-ardb@kernel.org/
https://lore.kernel.org/lkml/20201028081123.GT2628@hirez.programming.kicks-ass.net/

For this file, given that it is supposed to use -ftree-vectorize for the
whole file anyway, is there any reason it's not just added to CFLAGS via
the Makefile? This seems to be the only use of pragma optimize in the
kernel.
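
For illustration, a minimal sketch of what moving the flag into
arch/arm/lib/Makefile could look like. This is an untested assumption
about how the suggestion might be written, not a submitted patch: it
reuses the existing NEON block quoted above and appends -ftree-vectorize
to the per-object CFLAGS, guarded by CONFIG_CC_IS_GCC to mirror the
existing #ifdef around the pragma in xor-neon.c.

```make
# Sketch only: per-object flags for xor-neon.o in arch/arm/lib/Makefile.
# Adding -ftree-vectorize here is the idea under discussion, not what the
# tree contained when this thread was written.
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
  NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
  CFLAGS_xor-neon.o += $(NEON_FLAGS)
  # Assumption: keep the flag GCC-only, matching the
  # "#ifdef CONFIG_CC_IS_GCC" guard around the pragma.
  ifdef CONFIG_CC_IS_GCC
    CFLAGS_xor-neon.o += -ftree-vectorize
  endif
  obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif
```

Because CFLAGS_xor-neon.o is appended to the normal kernel flags for that
one object, the rest of the command line is left in place, which is the
property the pragma does not guarantee. The pragma in xor-neon.c would
then be dropped.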