From: Nathan Huckleberry
Date: Mon, 4 Apr 2022 20:55:54 -0500
Subject: Re: [PATCH v3 7/8] crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL
To: Eric Biggers
Cc: linux-crypto@vger.kernel.org, Herbert Xu, "David S. Miller",
 linux-arm-kernel@lists.infradead.org, Paul Crowley, Sami Tolvanen,
 Ard Biesheuvel
References: <20220315230035.3792663-1-nhuck@google.com> <20220315230035.3792663-8-nhuck@google.com>

On Wed, Mar 23, 2022 at 8:37 PM Eric Biggers wrote:
>
> On Tue, Mar 15, 2022 at 11:00:34PM +0000, Nathan Huckleberry wrote:
> > Add hardware accelerated version of POLYVAL for ARM64 CPUs with
> > Crypto Extension support.
>
> Nit: It's "Crypto Extensions", not "Crypto Extension".
>
> > +config CRYPTO_POLYVAL_ARM64_CE
> > +	tristate "POLYVAL using ARMv8 Crypto Extensions (for HCTR2)"
> > +	depends on KERNEL_MODE_NEON
> > +	select CRYPTO_CRYPTD
> > +	select CRYPTO_HASH
> > +	select CRYPTO_POLYVAL
>
> CRYPTO_POLYVAL selects CRYPTO_HASH already, so there's no need to select it
> here.
>
> > +/*
> > + * Perform polynomial evaluation as specified by POLYVAL. This computes:
> > + *	h^n * accumulator + h^n * m_0 + ... + h^1 * m_{n-1}
> > + * where n=nblocks, h is the hash key, and m_i are the message blocks.
> > + *
> > + * x0 - pointer to message blocks
> > + * x1 - pointer to precomputed key powers h^8 ... h^1
> > + * x2 - number of blocks to hash
> > + * x3 - pointer to accumulator
> > + *
> > + * void pmull_polyval_update(const u8 *in, const struct polyval_ctx *ctx,
> > + *			     size_t nblocks, u8 *accumulator);
> > + */
> > +SYM_FUNC_START(pmull_polyval_update)
> > +	adr	TMP, .Lgstar
> > +	ld1	{GSTAR.2d}, [TMP]
> > +	ld1	{SUM.16b}, [x3]
> > +	ands	PARTIAL_LEFT, BLOCKS_LEFT, #7
> > +	beq	.LskipPartial
> > +	partial_stride
> > +.LskipPartial:
> > +	subs	BLOCKS_LEFT, BLOCKS_LEFT, #NUM_PRECOMPUTE_POWERS
> > +	blt	.LstrideLoopExit
> > +	ld1	{KEY8.16b, KEY7.16b, KEY6.16b, KEY5.16b}, [x1], #64
> > +	ld1	{KEY4.16b, KEY3.16b, KEY2.16b, KEY1.16b}, [x1], #64
> > +	full_stride 0
> > +	subs	BLOCKS_LEFT, BLOCKS_LEFT, #NUM_PRECOMPUTE_POWERS
> > +	blt	.LstrideLoopExitReduce
> > +.LstrideLoop:
> > +	full_stride 1
> > +	subs	BLOCKS_LEFT, BLOCKS_LEFT, #NUM_PRECOMPUTE_POWERS
> > +	bge	.LstrideLoop
> > +.LstrideLoopExitReduce:
> > +	montgomery_reduction
> > +	mov	SUM.16b, PH.16b
> > +.LstrideLoopExit:
> > +	st1	{SUM.16b}, [x3]
> > +	ret
> > +SYM_FUNC_END(pmull_polyval_update)
>
> Is there a reason why partial_stride is done first in the arm64
> implementation, but last in the x86 implementation? It would be nice if
> the implementations worked the same way. Probably last would be better?
> What is the advantage of doing it first?

It was so I could return early without loading keys into registers, since
I only need them if there's a full stride. I was able to rewrite it in the
same way that the x86 implementation works.

> Besides that, many of the comments I made on the x86 implementation
> apply to the arm64 implementation too.
>
> - Eric
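For anyone following along, the identity that makes the 8-block full_stride
equivalent to block-at-a-time processing (and why the placement of
partial_stride doesn't change the result, only when the key powers have to
be loaded) can be sketched in a few lines of Python. This is purely
illustrative and not the kernel code: it uses arithmetic modulo a prime as
a stand-in field for POLYVAL's GF(2^128) Montgomery arithmetic, and all
names here are made up for the sketch.

```python
# Stand-in sketch of the strided POLYVAL update. Integers mod a prime
# replace GF(2^128); the algebraic structure of the update is the same.

P = 2**61 - 1  # a prime standing in for the real field

def horner_update(acc, blocks, h):
    # Block-at-a-time form: acc = (acc + m_i) * h, which unrolls to
    # h^n*acc + h^n*m_0 + ... + h^1*m_{n-1} as in the comment above.
    for m in blocks:
        acc = (acc + m) * h % P
    return acc

def strided_update(acc, blocks, h, stride=8):
    # Full-stride form using precomputed powers h^8 .. h^1:
    #   acc' = h^8*acc + h^8*m_0 + h^7*m_1 + ... + h^1*m_7
    assert len(blocks) % stride == 0
    powers = [pow(h, stride - i, P) for i in range(stride)]  # h^8 .. h^1
    for i in range(0, len(blocks), stride):
        chunk = blocks[i:i + stride]
        acc = (acc * powers[0]
               + sum(m * p for m, p in zip(chunk, powers))) % P
    return acc

import random
random.seed(1)
h = random.randrange(1, P)
blocks = [random.randrange(P) for _ in range(16)]
acc0 = random.randrange(P)
assert horner_update(acc0, blocks, h) == strided_update(acc0, blocks, h)
```

Since both forms compute the same polynomial, a partial block count can be
folded in either before or after the full strides; doing it last just means
the key powers never need to be loaded when there is no full stride left.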