Date: Wed, 19 May 2021 19:12:57 +0100
From: Catalin Marinas
To: Peter Collingbourne
Cc: Evgenii Stepanov, Andrey Ryabinin, Alexander Potapenko,
    Andrey Konovalov, Dmitry Vyukov, Will Deacon, Steven Price,
    kasan-dev, Linux ARM, Linux Kernel Mailing List
Subject: Re: [PATCH v3] kasan: speed up mte_set_mem_tag_range
Message-ID: <20210519181225.GF21619@arm.com>
References: <20210517235546.3038875-1-eugenis@google.com>
    <20210518174439.GA28491@arm.com>

On Tue, May 18, 2021 at 11:11:52AM -0700, Peter Collingbourne wrote:
> On Tue, May 18, 2021 at 10:44 AM Catalin Marinas wrote:
> > If we want to get the best performance out of this, we should look at
> > the memset implementation and do something similar. In principle it's
> > not that far from a memzero, though depending on the microarchitecture
> > it may behave slightly differently.
>
> For Scudo I compared our storeTags implementation linked above against
> __mtag_tag_zero_region from the arm-optimized-routines repository
> (which I think is basically an improved version of that memset
> implementation rewritten to use STG and DC GZVA), and our
> implementation performed better on the hardware that we have access
> to.

That's the advantage of having hardware early ;).

> > Anyway, before that I wonder if we wrote all this in C + inline asm
> > (three while loops or maybe two and some goto), what's the performance
> > difference? It has the advantage of being easier to maintain even if we
> > used some C macros to generate gva/gzva variants.
>
> I'm not sure I agree that it will be easier to maintain. Due to the
> number of "unusual" instructions required here it seems more readable
> to have the code in pure assembly than to require readers to switch
> contexts between C and asm. If we did move it to inline asm then I
> think it should basically be a large blob of asm like the Scudo code
> that I linked.

I was definitely not thinking of a big asm block, that's even less
readable than a separate .S file. It's more like adding dedicated macros
for single STG or DC GVA uses and using them in while loops.

Anyway, let's see a better commented .S implementation first. Given that
tagging is very sensitive to the performance of this function, we'd
probably benefit from a (few percent I suspect) perf improvement with
the hand-coded assembly.

-- 
Catalin

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
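
For readers following the thread, the "dedicated macros in while loops" idea can be illustrated roughly as below. This is only a sketch under stated assumptions, not code from the patch or from Scudo: set_tag_granule() and set_tag_block() are hypothetical stand-ins for one-instruction STG and DC GVA wrappers, simulated here with a shadow array so the loop structure can be run anywhere. The 64-byte block size is also an assumption (real code would derive it from DCZID_EL0), and the address and size are assumed to be multiples of the 16-byte tag granule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE_SIZE 16u   /* MTE tag granule: one tag per 16 bytes */
#define BLOCK_SIZE   64u   /* ASSUMED DC GVA block size, not read from hardware */
#define ARENA_SIZE   1024u /* size of the simulated address space */

/* Simulated tag storage: one byte per granule of a fake address space. */
static uint8_t shadow_tags[ARENA_SIZE / GRANULE_SIZE];
static unsigned granule_ops, block_ops;

/* Hypothetical stand-in for a single-STG wrapper (tags one granule). */
static inline void set_tag_granule(uintptr_t addr, uint8_t tag)
{
    shadow_tags[addr / GRANULE_SIZE] = tag;
    granule_ops++;
}

/* Hypothetical stand-in for a single-DC-GVA wrapper (tags one block). */
static inline void set_tag_block(uintptr_t addr, uint8_t tag)
{
    memset(&shadow_tags[addr / GRANULE_SIZE], tag, BLOCK_SIZE / GRANULE_SIZE);
    block_ops++;
}

/* Three while loops: unaligned head, block-sized body, remaining tail.
 * Assumes addr and size are multiples of GRANULE_SIZE. */
static void set_mem_tag_range(uintptr_t addr, size_t size, uint8_t tag)
{
    uintptr_t end = addr + size;

    /* Head: single granules until addr reaches a block boundary. */
    while (addr < end && (addr % BLOCK_SIZE) != 0) {
        set_tag_granule(addr, tag);
        addr += GRANULE_SIZE;
    }

    /* Body: whole blocks, one block operation each. */
    while (end - addr >= BLOCK_SIZE) {
        set_tag_block(addr, tag);
        addr += BLOCK_SIZE;
    }

    /* Tail: whatever is left, granule by granule. */
    while (addr < end) {
        set_tag_granule(addr, tag);
        addr += GRANULE_SIZE;
    }
}
```

With these assumed sizes, set_mem_tag_range(48, 160, 0x5) issues one head granule store, two block operations and one tail granule store. Whether such a C version performs within a few percent of hand-written assembly on real hardware is exactly the open question in the thread; the sketch says nothing about that.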