From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Jason A. Donenfeld"
Subject: Re: [PATCH net-next v6 07/23] zinc: ChaCha20 ARM and ARM64 implementations
Date: Thu, 27 Sep 2018 02:04:56 +0200
Message-ID:
References: <20180925145622.29959-1-Jason@zx2c4.com> <20180925145622.29959-8-Jason@zx2c4.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Cc: Herbert Xu, Thomas Gleixner, LKML, Netdev,
 Linux Crypto Mailing List, David Miller, Greg Kroah-Hartman,
 Samuel Neves, Andrew Lutomirski, Jean-Philippe Aumasson,
 Russell King - ARM Linux, linux-arm-kernel@lists.infradead.org
To: Ard Biesheuvel
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
List-Id: linux-crypto.vger.kernel.org

On Wed, Sep 26, 2018 at 5:52 PM Ard Biesheuvel wrote:
>
> On Wed, 26 Sep 2018 at 17:50, Jason A. Donenfeld wrote:
> >
> > On Wed, Sep 26, 2018 at 5:45 PM Jason A. Donenfeld wrote:
> > > So what you have in mind is something like calling simd_relax() every
> > > 4096 bytes or so?
> >
> > That was actually pretty easy, putting together both of your suggestions:
> >
> > static inline bool chacha20_arch(struct chacha20_ctx *state, u8 *dst,
> >                                  u8 *src, size_t len,
> >                                  simd_context_t *simd_context)
> > {
> >         while (len > PAGE_SIZE) {
> >                 chacha20_arch(state, dst, src, PAGE_SIZE, simd_context);
> >                 len -= PAGE_SIZE;
> >                 src += PAGE_SIZE;
> >                 dst += PAGE_SIZE;
> >                 simd_relax(simd_context);
> >         }
> >         if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && chacha20_use_neon &&
> >             len >= CHACHA20_BLOCK_SIZE * 3 && simd_use(simd_context))
> >                 chacha20_neon(dst, src, len, state->key, state->counter);
> >         else
> >                 chacha20_arm(dst, src, len, state->key, state->counter);
> >
> >         state->counter[0] += (len + 63) / 64;
> >         return true;
> > }
>
> Nice one :-)
>
> This works for me (but perhaps add a comment as well)

As elegant as my quick recursive solution was, gcc produced kind of bad
code from it, as you might expect.
So I've implemented this using a boring old loop that works the way it's
supposed to. This is marked for v7.
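
For readers following along, here is a minimal userspace sketch of what a
loop-based version of the quoted function might look like. It is not the
code that went into v7. Everything outside chacha20_arch() is a hypothetical
stand-in so the sketch compiles outside the kernel: the PAGE_SIZE value, the
simd_context_t typedef, the simd_use()/simd_relax() stubs, the relax_calls
counter, and the dummy chacha20_neon()/chacha20_arm() routines that only
copy bytes. The IS_ENABLED(CONFIG_KERNEL_MODE_NEON) check is left out
because it has no userspace equivalent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-ins (assumptions), not the kernel definitions. */
#define PAGE_SIZE 4096u
#define CHACHA20_BLOCK_SIZE 64u

typedef uint8_t u8;
typedef uint32_t u32;
typedef int simd_context_t;	/* hypothetical placeholder type */

struct chacha20_ctx {
	u32 key[8];
	u32 counter[4];
};

static bool chacha20_use_neon = true;	/* stand-in for the runtime check */
static unsigned int relax_calls;	/* instrumentation for this sketch only */

static bool simd_use(simd_context_t *ctx) { (void)ctx; return true; }
static void simd_relax(simd_context_t *ctx) { (void)ctx; relax_calls++; }

/* Dummy "ciphers": they only copy, so the chunking can be observed. */
static void chacha20_neon(u8 *dst, const u8 *src, size_t len,
			  const u32 *key, const u32 *counter)
{
	(void)key; (void)counter;
	memcpy(dst, src, len);
}

static void chacha20_arm(u8 *dst, const u8 *src, size_t len,
			 const u32 *key, const u32 *counter)
{
	(void)key; (void)counter;
	memcpy(dst, src, len);
}

/* Each chunk must be whole blocks so the per-chunk counter bump is exact. */
static_assert(PAGE_SIZE % CHACHA20_BLOCK_SIZE == 0,
	      "PAGE_SIZE must be a multiple of the block size");

static inline bool chacha20_arch(struct chacha20_ctx *state, u8 *dst,
				 const u8 *src, size_t len,
				 simd_context_t *simd_context)
{
	for (;;) {
		/* Process at most one page, then give simd_relax() a chance. */
		const size_t bytes = len < PAGE_SIZE ? len : PAGE_SIZE;

		if (chacha20_use_neon && bytes >= CHACHA20_BLOCK_SIZE * 3 &&
		    simd_use(simd_context))
			chacha20_neon(dst, src, bytes, state->key, state->counter);
		else
			chacha20_arm(dst, src, bytes, state->key, state->counter);

		state->counter[0] += (bytes + 63) / 64;
		len -= bytes;
		if (!len)
			break;
		dst += bytes;
		src += bytes;
		simd_relax(simd_context);
	}
	return true;
}
```

In this sketch the NEON threshold is checked per chunk, which matches what
the recursive version does (each inner call sees PAGE_SIZE, the tail sees
the remainder), and simd_relax() is only called between chunks, never after
the last one.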