From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9779DC4360C for ; Fri, 4 Oct 2019 14:23:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 681A82070B for ; Fri, 4 Oct 2019 14:23:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="oGZlViRm" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727587AbfJDOXs (ORCPT ); Fri, 4 Oct 2019 10:23:48 -0400 Received: from mail-wm1-f67.google.com ([209.85.128.67]:37131 "EHLO mail-wm1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725974AbfJDOXr (ORCPT ); Fri, 4 Oct 2019 10:23:47 -0400 Received: by mail-wm1-f67.google.com with SMTP id f22so6077457wmc.2 for ; Fri, 04 Oct 2019 07:23:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=3WGYmzEHvQw0B0cgz0N8JTWopZTEatCEZHu/1WAFZ6c=; b=oGZlViRmgdIfBGmC1QUMvQnLGE9gpOXsUbnzQx+JD7mONCVE/5u6797xn61VDlI0/o p2JATTspRYwQpt5gIFKKqnn5LqNpozXukH/um6DmpZWG6DfXVp/BCovlQc0mQf3oZktM sZ4umM2dOcvk8bae1/C+SOyhXj6VMGzZQPPQ6zH7x+UZXvO4ckGYqy+YhX+f2xRQ3Fem QLWwQuCfIxR3AWHzWNxwkDuf6yqBN1ztNe6KvMS1NyN8ImFJPivj6wEXyVEjWEGfdypy 5EsnPW37h6rK5B4l4PiAli6YJwPaT+akkDUOtC3ruAnLXa43UsJPr85HO7yj42mBLC5V 4NmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=3WGYmzEHvQw0B0cgz0N8JTWopZTEatCEZHu/1WAFZ6c=; b=X6zd7fIkEVf5X6X7nLPi/rph3+QFCsi4dex9bk3yz59YDM8d3IN3BjgPlnbcY4XPlP MCjQgghrd0T/usQW6P/h7xmIM15Xg1/5WVjYdFwoZqyDoOSv3rkjWnDwxLMr4dPzzSkS d9EcgtU3YJjlr8bXc56JLVxiFBprWSXo+OK1MPFwWCgQJAWpd9lNTaknt76MMCRhXliw 5w1jGXhc+NEowI/O9p5IA0CHUB2ZMupqWzCAUD04TC+DwRkuOSS+GphedhNgBeTR56wP 1jKO3Mu1PWCjb5oxSqLgPCqBdIBrtlcYtMugsWOzB3hn7HX+6MHWwXSX10yUBNJnCJ5q MJFA== X-Gm-Message-State: APjAAAUzRzYHv9IQsFVnOGbcld4zJTtQe9t6+F7b3o6EvBnCmyliBYoi lrnobZUJVbXaq4iRfB15IBZLtvOkgYaJsYtZXyp/4Q== X-Google-Smtp-Source: APXvYqziWplQoG134ULjmrlJWBnYLpq6X/QUhOV4Hj+Q8k/kxLMslUqfquaKYMDa7JqNjO8O7NvpOfGIVoGaPFAfIc4= X-Received: by 2002:a1c:e906:: with SMTP id q6mr10448718wmc.136.1570199024530; Fri, 04 Oct 2019 07:23:44 -0700 (PDT) MIME-Version: 1.0 References: <20191002141713.31189-1-ard.biesheuvel@linaro.org> <20191002141713.31189-5-ard.biesheuvel@linaro.org> In-Reply-To: From: Ard Biesheuvel Date: Fri, 4 Oct 2019 16:23:32 +0200 Message-ID: Subject: Re: [PATCH v2 04/20] crypto: arm/chacha - expose ARM ChaCha routine as library function To: "Jason A. Donenfeld" Cc: Linux Crypto Mailing List , Herbert Xu , David Miller , Greg KH , Linus Torvalds , Samuel Neves , Dan Carpenter , Arnd Bergmann , Eric Biggers , Andy Lutomirski , Will Deacon , Marc Zyngier , Catalin Marinas , Martin Willi , Peter Zijlstra , Josh Poimboeuf Content-Type: text/plain; charset="UTF-8" Sender: linux-crypto-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org On Fri, 4 Oct 2019 at 15:53, Jason A. Donenfeld wrote: > > On Wed, Oct 2, 2019 at 4:17 PM Ard Biesheuvel wrote: > > Expose the accelerated NEON ChaCha routine directly as a symbol > > export so that users of the ChaCha library can use it directly. > > Eric had some nice code for ChaCha for certain ARM cores that lived in > Zinc as chacha20-unrolled-arm.S. This code became active for certain > cores where NEON was bad and for cores with no NEON. The condition for > it was: > > switch (read_cpuid_part()) { > case ARM_CPU_PART_CORTEX_A7: > case ARM_CPU_PART_CORTEX_A5: > /* The Cortex-A7 and Cortex-A5 do not perform well with the NEON > * implementation but do incredibly with the scalar one and use > * less power. > */ > break; > default: > chacha20_use_neon = elf_hwcap & HWCAP_NEON; > } > > ... How is it relevant whether the boot CPU is A5 or A7? These are bL little cores that only implement NEON for feature parity with their bl big counterparts, but CPU intensive tasks are scheduled on big cores, where NEON performance is much better than scalar. If we need a policy for this in the kernel, I'd prefer it to be one at the arch/arm level where we disable kernel mode NEON entirely, either via a command line option, or via a policy based on the the types of all CPUs. > > for (;;) { > if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && chacha20_use_neon && > len >= CHACHA20_BLOCK_SIZE * 3 && simd_use(simd_context)) { > const size_t bytes = min_t(size_t, len, PAGE_SIZE); > > chacha20_neon(dst, src, bytes, ctx->key, ctx->counter); > ctx->counter[0] += (bytes + 63) / 64; > len -= bytes; > if (!len) > break; > dst += bytes; > src += bytes; > simd_relax(simd_context); > } else { > chacha20_arm(dst, src, len, ctx->key, ctx->counter); > ctx->counter[0] += (len + 63) / 64; > break; > } > } > > It's another instance in which the generic code was totally optimized > out of Zinc builds. > > Did these changes make it into the existing tree? I'd like to keep Eric's code, but if it is really that much faster, we might drop it in arch/arm/lib so it supersedes the builtin code that /dev/random uses as well.