linux-crypto.vger.kernel.org archive mirror
From: "René van Dorst" <opensource@vdorst.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	"open list:HARDWARE RANDOM NUMBER GENERATOR CORE" 
	<linux-crypto@vger.kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	David Miller <davem@davemloft.net>,
	Greg KH <gregkh@linuxfoundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Samuel Neves <sneves@dei.uc.pt>,
	Dan Carpenter <dan.carpenter@oracle.com>,
	Arnd Bergmann <arnd@arndb.de>, Eric Biggers <ebiggers@google.com>,
	Andy Lutomirski <luto@kernel.org>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Martin Willi <martin@strongswan.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH v2 05/20] crypto: mips/chacha - import accelerated 32r2 code from Zinc
Date: Sun, 06 Oct 2019 19:12:28 +0000
Message-ID: <20191006191228.Horde.E8aAava9O1UOhVnxdaZzfqw@www.vdorst.com>
In-Reply-To: <CAKv+Gu-84O9wo3-w7bYxW41g3gjwGk5tBJX54TGN53MUPNpdvQ@mail.gmail.com>

Quoting Ard Biesheuvel <ard.biesheuvel@linaro.org>:

<snip>

Hi Ard,

> Thanks a lot for taking the time to double check this. I think it
> would be nice to be able to expose xchacha12 like we do on other
> architectures.
>
> Note that for xchacha, I also added a hchacha_block() routine based on
> your code (with the round count as the third argument) [0]. Please let
> me know if you see anything wrong with that.
>
>
> +.globl hchacha_block
> +.ent hchacha_block
> +hchacha_block:
> + .frame $sp, STACK_SIZE, $ra
> +
> + addiu $sp, -STACK_SIZE
> +
> + /* Save s0-s7 */
> + sw $s0, 0($sp)
> + sw $s1, 4($sp)
> + sw $s2, 8($sp)
> + sw $s3, 12($sp)
> + sw $s4, 16($sp)
> + sw $s5, 20($sp)
> + sw $s6, 24($sp)
> + sw $s7, 28($sp)

We only have to preserve the s registers that are actually used.
Currently X11 to X15 use the registers s6 down to s2.

But by redefining the register assignments so that we use all the
registers that do not have to be preserved, I can reduce the number
of s registers used to one.

The registers we don't use and don't have to preserve are a3, at and v0.
STATE (a0) can also be reused, because we only need that pointer while
loading the values from memory.

So:

#undef X12
#undef X13
#undef X14
#undef X15

#define X12    $a3
#define X13    $at
#define X14    $v0
#define X15    STATE

And save only X11 (s6) on the stack.

See the full code here [0].

The rest of the code looks good!
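
For cross-checking the assembly on a host machine, here is a minimal
portable C sketch of what the quoted routine appears to compute. It
assumes that the AXR sequences in the loop are the standard ChaCha
column and diagonal rounds, and that the round count is even. The names
hchacha_block_ref, quarter_round and rotl32 are hypothetical and are
not taken from the patch.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on state words a, b, c, d. */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
	x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
	x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/*
 * Hypothetical reference routine (not from the patch).
 * Assumption: nrounds is even; each loop iteration performs one
 * column round and one diagonal round, like the assembly loop that
 * subtracts 2 from the round counter per pass.
 */
static void hchacha_block_ref(const uint32_t state[16], uint32_t out[8],
			      int nrounds)
{
	uint32_t x[16];
	int i;

	/* Work on a copy so the caller's state is left untouched. */
	for (i = 0; i < 16; i++)
		x[i] = state[i];

	for (i = 0; i < nrounds; i += 2) {
		/* Column round */
		quarter_round(x, 0, 4,  8, 12);
		quarter_round(x, 1, 5,  9, 13);
		quarter_round(x, 2, 6, 10, 14);
		quarter_round(x, 3, 7, 11, 15);
		/* Diagonal round */
		quarter_round(x, 0, 5, 10, 15);
		quarter_round(x, 1, 6, 11, 12);
		quarter_round(x, 2, 7,  8, 13);
		quarter_round(x, 3, 4,  9, 14);
	}

	/* As in the quoted stores: words 0-3 and 12-15, no feed-forward. */
	for (i = 0; i < 4; i++) {
		out[i] = x[i];
		out[4 + i] = x[12 + i];
	}
}
```

Calling hchacha_block_ref(state, out, 12) or hchacha_block_ref(state,
out, 20) on the same input as the assembly routine and comparing the
eight output words is one way to check that a register reshuffle did
not change the result.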

Greetings,

René

[0]: https://github.com/vDorst/wireguard/commit/562a516ae3b282b32f57d3239369360bc926df60


> +
> + lw X0, 0(STATE)
> + lw X1, 4(STATE)
> + lw X2, 8(STATE)
> + lw X3, 12(STATE)
> + lw X4, 16(STATE)
> + lw X5, 20(STATE)
> + lw X6, 24(STATE)
> + lw X7, 28(STATE)
> + lw X8, 32(STATE)
> + lw X9, 36(STATE)
> + lw X10, 40(STATE)
> + lw X11, 44(STATE)
> + lw X12, 48(STATE)
> + lw X13, 52(STATE)
> + lw X14, 56(STATE)
> + lw X15, 60(STATE)
> +
> +.Loop_hchacha_xor_rounds:
> + addiu $a2, -2
> + AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 16);
> + AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 12);
> + AXR( 0, 1, 2, 3, 4, 5, 6, 7, 12,13,14,15, 8);
> + AXR( 8, 9,10,11, 12,13,14,15, 4, 5, 6, 7, 7);
> + AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 16);
> + AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 12);
> + AXR( 0, 1, 2, 3, 5, 6, 7, 4, 15,12,13,14, 8);
> + AXR(10,11, 8, 9, 15,12,13,14, 5, 6, 7, 4, 7);
> + bnez $a2, .Loop_hchacha_xor_rounds
> +
> + sw X0, 0(OUT)
> + sw X1, 4(OUT)
> + sw X2, 8(OUT)
> + sw X3, 12(OUT)
> + sw X12, 16(OUT)
> + sw X13, 20(OUT)
> + sw X14, 24(OUT)
> + sw X15, 28(OUT)
> +
> + /* Restore used registers */
> + lw $s0, 0($sp)
> + lw $s1, 4($sp)
> + lw $s2, 8($sp)
> + lw $s3, 12($sp)
> + lw $s4, 16($sp)
> + lw $s5, 20($sp)
> + lw $s6, 24($sp)
> + lw $s7, 28($sp)
> +
> + addiu $sp, STACK_SIZE
> + jr $ra
> +.end hchacha_block
> +.set at
>
>
> [0]  
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=wireguard-crypto-library-api-v3&id=cc74a037f8152d52bd17feaf8d9142b61761484f