Date: Tue, 8 Jun 2021 15:43:54 +0100
From: Catalin Marinas
To: Peter Collingbourne
Cc: Vincenzo Frascino, Will Deacon, Evgenii Stepanov,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH] arm64: mte: allow async MTE to be upgraded to sync on a per-CPU basis
Message-ID: <20210608144353.GF17957@arm.com>
References: <20210602232445.3829248-1-pcc@google.com>
	<5ee9d9a1-5b13-ea21-67df-e713c76fc163@arm.com>
Content-Type: text/plain; charset="us-ascii"
On Thu, Jun 03, 2021 at 10:49:24AM -0700, Peter Collingbourne wrote:
> On Thu, Jun 3, 2021 at 6:01 AM Vincenzo Frascino
> wrote:
> > On 6/3/21 12:24 AM, Peter Collingbourne wrote:
> > > On some CPUs the performance of MTE in synchronous mode is the same
> > > as that of asynchronous mode. This makes it worthwhile to enable
> > > synchronous mode on those CPUs when asynchronous mode is requested,
> > > in order to gain the error detection benefits of synchronous mode
> > > without the performance downsides. Therefore, make it possible for CPUs
> > > to opt into upgrading to synchronous mode via a new mte-prefer-sync
> > > device tree attribute.
> >
> > I had a look at your patch and I think there are a few points worth
> > mentioning:
> > 1) The approach you are using is per-CPU, hence we might end up with a
> > system that has some PEs configured as sync and some configured as
> > async. We currently support only a system-wide setting.
>
> This is the intent. On e.g. a big/little system this means that we
> would effectively have sampling of sync MTE faults at a higher rate
> than a pure userspace implementation could achieve, at zero cost.
>
> > 2) async and sync have slightly different semantics (e.g. in sync mode
> > the access does not take place and it requires emulation), which means
> > that a mixed configuration affects the ABI.
>
> We considered the ABI question and think that it is somewhat academic.
> While it's true that we would prevent the first access from succeeding
> (and, more visibly, use SEGV_MTESERR in the signal rather than
> SEGV_MTEAERR), I'm not aware of a reasonable way that a userspace
> program could depend on the access succeeding.

It's more about whether some software relies on the async mode only for
logging, without any intention of handling synchronous faults. In such a
scenario, the async signal handler is fairly simple: it logs and
continues safely (well, as "safe" as before MTE). With the sync mode,
however, the signal handler will have to ensure the access takes place
before continuing, either by emulating it, by restarting the instruction
with PSTATE.TCO set, or by falling back to the async mode (a rough
sketch of the difference is at the end of this message). IOW, I don't
expect all programs making use of MTE to simply die on an MTE fault
(I guess they will be a minority, but we still need to allow such
scenarios).

> While it's slightly
> more plausible that there could be a dependency on the signal type, we
> don't depend on that in Android, at least not in a way that would lead
> to worse outcomes if we get MTESERR instead of MTEAERR (it would lead
> to better outcomes, in the form of a more accurate/detailed crash
> report, which is what motivates this change). I also checked glibc and
> they don't appear to have any dependencies on the signal type, or
> indeed have any detailed crash reporting at all as far as I can tell.
> Furthermore, basically nobody has hardware at the moment so I don't
> think we would be breaking any actual users by doing this.

While there's no user-space out there yet, given that MTE was merged in
5.10 and that's an LTS kernel, we'll see software running on this kernel
version at some point in the next few years. So any fix will need
backporting, but an ABI change for better performance on specific SoCs
hardly counts as a fix.
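
For reference, the current ABI is the per-task prctl() interface below;
this is a minimal sketch using the 5.10 UAPI constants (the fallback
defines are only in case the installed headers predate MTE), with error
handling and the actual tagging of pointers omitted:

	/*
	 * Minimal sketch of the existing per-task interface (nothing here
	 * is from the patch under discussion): the tag check fault mode is
	 * a single process-wide choice made via prctl().
	 */
	#include <stdio.h>
	#include <sys/prctl.h>

	#ifndef PR_SET_TAGGED_ADDR_CTRL
	#define PR_SET_TAGGED_ADDR_CTRL	55
	#define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
	#endif
	#ifndef PR_MTE_TCF_ASYNC
	#define PR_MTE_TCF_SHIFT	1
	#define PR_MTE_TCF_SYNC		(1UL << PR_MTE_TCF_SHIFT)
	#define PR_MTE_TCF_ASYNC	(2UL << PR_MTE_TCF_SHIFT)
	#define PR_MTE_TAG_SHIFT	3
	#endif

	int main(void)
	{
		/* Enable tagged addresses, request async checking, allow tags 1-15. */
		unsigned long ctrl = PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_ASYNC |
				     (0xfffeUL << PR_MTE_TAG_SHIFT);

		if (prctl(PR_SET_TAGGED_ADDR_CTRL, ctrl, 0, 0, 0))
			perror("PR_SET_TAGGED_ADDR_CTRL");

		/* Loads/stores through mistagged pointers are now checked. */
		return 0;
	}

Whatever "upgrade" mechanism gets added would have to leave the
behaviour of existing binaries issuing exactly this call unchanged.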
I'm ok with improving the ABI but not breaking the current one, hence
the suggestion for a new PR_MTE_TCF_* field or maybe a new bit that
allows the mode to be "upgraded", for some definition of this (for
example, if TCF_NONE is as fast as TCF_ASYNC, should NONE be upgraded
as well?).

> > 3) In your patch you use DT to enforce sync mode on a CPU; it is
> > probably better to have an MIDR scheme to mark these CPUs.
>
> Okay, so in your scheme we would say that e.g. all Cortex-A510 CPUs
> should be subject to this treatment. Can we guarantee that all
> Cortex-A510 CPUs would have the same performance for sync and async or
> could the system designer tweak some aspect of the system such that
> they could get different performance? The possibility of the latter is
> what led me to specify the information via DT.

While it's more of a CPU microarchitecture issue, there's indeed a good
chance that the SoC has a non-trivial impact on the performance of the
synchronous mode, so it may tip the balance one way or another.

Another idea would be to introduce a PR_MTE_TCF_DEFAULT mode together
with some /sys/*/cpu*/ entries to control the default MTE mode per CPU.
We'd leave it entirely to user-space (a privileged user); maybe it even
wants to run some checks before tuning the default mode per CPU. I'm
pretty sure the sync vs async decision is not clear-cut (e.g. sync may
still be slower but within a tolerable margin for certain benchmarks).
Leaving the decision to the hardware vendor, hard-coded in the DT, is
probably not the best option (nor is the MIDR). Some future benchmark
with a weird memory access pattern could expose slowness in the sync
mode and vendors would scramble over how to change the DT already
deployed (maybe they can do this OTA already).

-- 
Catalin
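
The rough sketch referred to above: a handler written for async-only
logging can simply return on SEGV_MTEAERR, while SEGV_MTESERR means the
faulting access was suppressed and has to be dealt with (or the process
has to die). The si_code values are from the arm64 UAPI; the handler
below is only an illustration, not part of the patch, with error
handling omitted:

	#define _GNU_SOURCE
	#include <signal.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	#ifndef SEGV_MTEAERR
	#define SEGV_MTEAERR	8	/* asynchronous tag check fault */
	#define SEGV_MTESERR	9	/* synchronous tag check fault */
	#endif

	static void mte_handler(int sig, siginfo_t *info, void *ucontext)
	{
		(void)sig;
		(void)ucontext;

		switch (info->si_code) {
		case SEGV_MTEAERR:
			/* Async: imprecise, the access already happened; log and carry on. */
			fprintf(stderr, "async MTE tag check fault\n");
			return;
		case SEGV_MTESERR:
			/*
			 * Sync: precise, the access at si_addr did not take place.
			 * A real handler would emulate it, retry with PSTATE.TCO
			 * set, downgrade the task to async mode, or bail out.
			 */
			fprintf(stderr, "sync MTE tag check fault at %p\n",
				info->si_addr);
			abort();
		default:
			abort();
		}
	}

	int main(void)
	{
		struct sigaction sa;

		memset(&sa, 0, sizeof(sa));
		sa.sa_sigaction = mte_handler;
		sa.sa_flags = SA_SIGINFO;
		sigaction(SIGSEGV, &sa, NULL);

		/* ... enable MTE via prctl() and touch mistagged memory ... */
		return 0;
	}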