Date: Tue, 31 Jan 2023 11:32:00 -0500
To: "Beckius, Mikael", "lttng-dev@lists.lttng.org"
References: <46f36a1a-c748-773b-8f6d-d481c9c8ad1b@efficios.com>
Subject: Re: [lttng-dev] lttng-consumerd crash on aarch64 due to x86 arch specific optimization
From: Mathieu Desnoyers via lttng-dev

On 2023-01-31 11:18, Mathieu Desnoyers wrote:
> On 2023-01-31 11:08, Mathieu Desnoyers wrote:
>> On 2023-01-30 01:50, Beckius, Mikael via lttng-dev wrote:
>>> Hello Matthieu!
>>>
>>> I have looked at this in place of Anders and as far as I can tell
>>> this is not an arm64 issue but an arm issue. And even on arm
>>> __ARM_FEATURE_UNALIGNED is 1 so it seems the problem only occurs if
>>> size equals 8.
>>
>> So for ARM, perhaps we should do the following in
>> include/lttng/ust-arch.h:
>>
>> #if defined(LTTNG_UST_ARCH_ARM) && defined(__ARM_FEATURE_UNALIGNED)
>> #define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
>> #endif
>>
>> And refer to
>> https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#ARM-Options
>>
>> Based on that documentation, it is possible to build with
>> -mno-unaligned-access, and for all pre-ARMv6, all ARMv6-M and for
>> ARMv8-M Baseline architectures, unaligned accesses are not enabled.
>>
>> I would only push this kind of change into the master branch though,
>> due to its impact and the fact that this is only a performance
>> improvement.
>
> But setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for arm32
> when __ARM_FEATURE_UNALIGNED is defined would still cause issues for
> 8-byte lttng_inline_memcpy with my proposed patch right ?
>
> AFAIU 32-bit arm with __ARM_FEATURE_UNALIGNED has unaligned accesses for
> 2 and 4 bytes accesses, but somehow traps for unaligned 8-bytes
> accesses ?
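
For illustration only, the kind of guard discussed in the quoted text can be sketched as below. This is not the lttng-ust implementation of lttng_inline_memcpy: every `sketch_*` name and the `SKETCH_EFFICIENT_UNALIGNED_ACCESS` switch are hypothetical, and the `__may_alias__` typedefs assume a GCC- or Clang-compatible compiler. The idea is that a direct 2-, 4- or 8-byte load/store is used only when the switch is set or both pointers are suitably aligned, and everything else falls back to memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical switch for this sketch only. It plays the role that
 * LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS plays in the thread.
 * Leave it at 0 unless every access size used below is known to be
 * safe when unaligned on the target.
 */
#ifndef SKETCH_EFFICIENT_UNALIGNED_ACCESS
#define SKETCH_EFFICIENT_UNALIGNED_ACCESS 0
#endif

/* GCC/Clang extension: these types are allowed to alias other types. */
typedef uint16_t __attribute__((__may_alias__)) sketch_u16;
typedef uint32_t __attribute__((__may_alias__)) sketch_u32;
typedef uint64_t __attribute__((__may_alias__)) sketch_u64;

/* Nonzero when both pointers are multiples of size (a power of two). */
static inline int sketch_both_aligned(const void *a, const void *b, size_t size)
{
	return ((((uintptr_t) a) | ((uintptr_t) b)) & (size - 1)) == 0;
}

/* Nonzero when a direct load/store of the given size is allowed. */
static inline int sketch_direct_ok(const void *dest, const void *src, size_t size)
{
	return SKETCH_EFFICIENT_UNALIGNED_ACCESS
		|| sketch_both_aligned(dest, src, size);
}

static inline void sketch_inline_memcpy(void *dest, const void *src, size_t len)
{
	switch (len) {
	case 1:
		/* Single bytes have no alignment requirement. */
		*(uint8_t *) dest = *(const uint8_t *) src;
		return;
	case 2:
		if (sketch_direct_ok(dest, src, 2)) {
			*(sketch_u16 *) dest = *(const sketch_u16 *) src;
			return;
		}
		break;
	case 4:
		if (sketch_direct_ok(dest, src, 4)) {
			*(sketch_u32 *) dest = *(const sketch_u32 *) src;
			return;
		}
		break;
	case 8:
		if (sketch_direct_ok(dest, src, 8)) {
			*(sketch_u64 *) dest = *(const sketch_u64 *) src;
			return;
		}
		break;
	default:
		break;
	}
	/* Unaligned pointers or any other size: let memcpy handle it. */
	memcpy(dest, src, len);
}
```

With a single switch like this one, the arm32 behaviour described in the thread (2 and 4 bytes fine, 8 bytes trapping) cannot be expressed: the switch would have to stay at 0 there, or be split per access size.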

Re-reading your analysis, I may have mistakenly concluded that using the
LTTng-UST ring buffer in "packed" mode would be faster than aligned mode
on arm32 and aarch64, but that's not really what you have benchmarked
there.

So forget what I said about setting
LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS to 1 for arm32 and aarch64.

There is a distinction between having efficient unaligned access and
supporting unaligned accesses at all. For aarch64, it appears to support
unaligned accesses, but they may be slower than aligned accesses AFAIU.
For arm32, it supports unaligned accesses for 2 and 4 bytes when
__ARM_FEATURE_UNALIGNED is set, but not for 8 bytes (it traps). Then
it's not clear whether a 2- or 4-byte access is slower when unaligned
compared to aligned.

At the end of the day, it's a question of compactness of the generated
trace data (padding adds throughput overhead) vs the CPU time required
to perform an unaligned access compared to an aligned one.

Thoughts ?

Thanks,

Mathieu

>
> Thanks,
>
> Mathieu
>
>>
>>>
>>> In addition I did some performance testing of lttng_inline_memcpy by
>>> extracting it and adding it to a simple test program. It appears that
>>> the general performance increases on arm, arm64, arm on arm64
>>> hardware and x86-64. But it also appears that on arm if you end up in
>>> memcpy the old code where you call memcpy directly is actually
>>> slightly faster.
>>
>> Nothing unexpected here. Just make sure that your test program does
>> not call lttng_inline_memcpy with constant size values which end up
>> optimizing away branches. In the context where lttng_inline_memcpy
>> is used, most of the time its arguments are not constants.
>>
>>>
>>> Skipping the memcpy fallback on arm for unaligned copies of sizes 2
>>> and 4 further improves the performance
>>
>> This would be naturally done on your board if we conditionally
>> set LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 for
>> __ARM_FEATURE_UNALIGNED
>> right ?
>>
>> and setting LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1 yields the
>> best performance on arm64.
>>
>> This could go into lttng-ust master branch as well, e.g.:
>>
>> #if defined(LTTNG_UST_ARCH_AARCH64)
>> #define LTTNG_UST_ARCH_HAS_EFFICIENT_UNALIGNED_ACCESS 1
>> #endif
>>
>> Thanks!
>>
>> Mathieu
>>
>>>
>>> Micke
>>> _______________________________________________
>>> lttng-dev mailing list
>>> lttng-dev@lists.lttng.org
>>> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>>
>

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
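
The compactness side of the trade-off raised in the reply (packed versus aligned trace data) can be put in numbers with a small sketch. It assumes, purely for illustration, that in aligned mode each field starts at a multiple of its own size and that field sizes are powers of two; it does not claim to reproduce the actual LTTng-UST ring buffer layout. The `sketch_*` names are hypothetical.

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align is a power of two). */
static inline size_t sketch_align_up(size_t offset, size_t align)
{
	return (offset + align - 1) & ~(align - 1);
}

/*
 * Payload size for fields of the given sizes, laid out in order.
 * packed != 0: fields are placed back to back, with no padding.
 * packed == 0: each field starts at a multiple of its own size
 * (natural alignment, an assumption of this sketch).
 */
static inline size_t sketch_payload_size(const size_t *field_sizes,
		size_t nr_fields, int packed)
{
	size_t offset = 0;

	for (size_t i = 0; i < nr_fields; i++) {
		if (!packed)
			offset = sketch_align_up(offset, field_sizes[i]);
		offset += field_sizes[i];
	}
	return offset;
}
```

For fields of 1, 8, 2 and 4 bytes in that order, this gives 15 bytes packed and 24 bytes aligned. The 9 bytes of padding are the throughput cost to weigh against the CPU time of unaligned accesses.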