From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751837AbeAPWae (ORCPT + 1 other); Tue, 16 Jan 2018 17:30:34 -0500 Received: from mail-ot0-f193.google.com ([74.125.82.193]:40145 "EHLO mail-ot0-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751774AbeAPWac (ORCPT ); Tue, 16 Jan 2018 17:30:32 -0500 X-Google-Smtp-Source: ACJfBot/RkpUJtpOFjwmkBD+kj5eM8lqYpiUCJi1z/tndFfhGAvAe4KNzq7Fka6GeYSil7Nm8WlPGDxiLqIbCIamSIU= MIME-Version: 1.0 In-Reply-To: References: <20180116162905.1404585-1-arnd@arndb.de> From: Arnd Bergmann Date: Tue, 16 Jan 2018 23:30:30 +0100 X-Google-Sender-Auth: a2TE1qIoVHQ4VucV51qBibYcax8 Message-ID: Subject: Re: [PATCH] ARM: make memzero optimization smarter To: Nicolas Pitre Cc: Russell King , Ard Biesheuvel , Zhichang Yuan , Linux ARM , Linux Kernel Mailing List Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On Tue, Jan 16, 2018 at 6:10 PM, Nicolas Pitre wrote: > On Tue, 16 Jan 2018, Arnd Bergmann wrote: > >> While testing with a gcc-8.0.0 snapshot, I ran into a harmless >> build warning: >> >> In file included from include/linux/string.h:18:0, >> ... >> from drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:10: >> drivers/net/ethernet/hisilicon/hns/hns_ethtool.c: In function '__lb_other_process': >> arch/arm/include/asm/string.h:50:5: error: 'memset' specified size 4294967295 exceeds maximum object size 2147483647 [-Werror=stringop-overflow=] >> memset((__p),(v),(__n)); \ >> ^~~~~~~~~~~~~~~~~~~~~~~ >> drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:394:3: note: in expansion of macro 'memset' >> memset(&skb->data[frame_size / 2], 0xAA, frame_size / 2 - 1); >> ^~~~~~ >> >> I think the warning is unintentional here, and gcc should not actually >> warn, so I reported this in the gcc bugzilla as pr82103. From the >> discussion there, it seems unlikely that this gets addressed in >> the compiler. >> >> The problem here is that testing the 'frame_size' variable for non-zero >> in the first memset() macro invocation leads to a code path in which >> gcc thinks it may be zero, and that code path would lead to an overly >> large length for the following memset that is now "(u32)-1". We know >> this won't happen as the skb len is already guaranteed to be nonzero >> when we get here (it has just been allocated with a nonzero size). >> >> However, we can avoid this class of bogus warnings for the memset() macro >> by only doing the micro-optimization for zero-length arguments when the >> length is a compile-time constant. This should also reduce code size by >> a few bytes, and avoid an extra branch for the cases that a variable-length >> argument is always nonzero, which is probably the common case anyway. >> >> I have made sure that the __memzero implementation can safely handle >> a zero length argument. > > Why not simply drop the test on (__n) != 0 then? I fail to see what the > advantage is in that case. Good point. We might actually get even better results by dropping the __memzero path entirely, since gcc has can optimize trivial memset() operations and inline them. If I read arch/arm/lib/memzero.S correctly, it saves exactly two 'orr' instructions compared to the memset.S implementation, but calling memset() rather than __memzero() from C code ends up saving a function call at least some of the time. Building a defconfig kernel with gcc-7.2.1, I see 1919 calls to __memzero() and 636 calls to memset() in vmlinux. If I remove the macro entirely, I get 1775 calls to memset() instead, so 780 memzero instances got inlined, and kernel shrinks by 5488 bytes (0.03%), not counting the __memzero implementation that we could possibly also drop. FWIW, the zero-length check saves five references to __memzero() and one reference to memset(), or 16 bytes in kernel size, I have not checked what those are. Arnd