All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Biggers <ebiggers@kernel.org>
To: Nathan Huckleberry <nhuck@google.com>
Cc: linux-crypto@vger.kernel.org, linux-fscrypt@vger.kernel.org,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org,
	Paul Crowley <paulcrowley@google.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH v6 8/9] crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL
Date: Wed, 4 May 2022 22:56:49 -0700	[thread overview]
Message-ID: <YnNnIV0P9bFgTkQt@sol.localdomain> (raw)
In-Reply-To: <20220504001823.2483834-9-nhuck@google.com>

On Wed, May 04, 2022 at 12:18:22AM +0000, Nathan Huckleberry wrote:
> + * X = [X_1 : X_0]
> + * Y = [Y_1 : Y_0]
> + *
> + * The multiplication produces four parts:
> + *   LOW: The polynomial given by performing carryless multiplication of X_0 and
> + *   Y_0
> + *   MID: The polynomial given by performing carryless multiplication of (X_0 +
> + *   X_1) and (Y_0 + Y_1)
> + *   HIGH: The polynomial given by performing carryless multiplication of X_1
> + *   and Y_1
> + *
> + * We compute:
> + *  LO += LOW
> + *  MI += MID
> + *  HI += HIGH

Three parts, not four.  But why not write this as the much more concise:

 * Given:
 *	X = [X_1 : X_0]
 *	Y = [Y_1 : Y_0]
 *
 * We compute:
 *	LO += X_0 * Y_0
 *	MI += (X_0 + X_1) * (Y_0 + Y_1)
 *	HI += X_1 * Y_1

> + * So our final computation is: T = T_1 : T_0 = g*(x) * P_0 V = V_1 : V_0 =      
> + * g*(x) * (P_1 + T_0) p(x) / x^{128} mod g(x) = P_3 + P_1 + T_0 + V_1 : P_2 +   
> + * P_0 + T_1 + V_0                                                               

As on the x86 version, this part is now unreadable.  It was fine in v5.

> + *   [HI_1 : HI_0 + HI_1 + MI_1 + LO_1 : LO_1 + HI_0 + MI_0 + LO_0 : LO_0]
[...]
> + *   [HI_1 : HI_1 + HI_0 + MI_1 + LO_1 : HI_0 + MI_0 + LO_1 + LO_0 : LO_0]
[...]
> +	// TMP_V = T_1 : T_0 = P_0 * g*(x)
> +	pmull	TMP_V.1q, PL.1d, GSTAR.1d
[...]
> +	// TMP_V = V_1 : V_0 = (P_1 + T_0) * g*(x)
> +	pmull2	TMP_V.1q, GSTAR.2d, TMP_V.2d
> +	eor	DEST.16b, PH.16b, TMP_V.16b
[...]
> +	pmull	TMP_V.1q, GSTAR.1d, PL.1d
[...]
> +	pmull2	TMP_V.1q, GSTAR.2d, TMP_V.2d
[...]
> +	eor	SUM.16b, TMP_V.16b, PH.16b

It looks like you didn't fully address my comments on v5 about putting operands
in a consistent order.  Not a big deal, but assembly code is always hard to
read, and anything to make it easier would be greatly appreciated.

> +/*
> + * Handle any extra blocks afer full_stride loop.
> + */

Typo above.

> diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
[...]
> +struct polyval_tfm_ctx {
> +	u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE];
> +};

This is missing the comment about the order of the key powers that I had
suggested for readability.  It made it into the x86 version but not here.  This
file is very similar to arch/x86/crypto/polyval-clmulni_glue.c, so if you could
diff them and eliminate any unintended differences, that would be helpful.

Other than the above readability suggestions this patch looks good, nice job.

- Eric

WARNING: multiple messages have this Message-ID (diff)
From: Eric Biggers <ebiggers@kernel.org>
To: Nathan Huckleberry <nhuck@google.com>
Cc: linux-crypto@vger.kernel.org, linux-fscrypt@vger.kernel.org,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-arm-kernel@lists.infradead.org,
	Paul Crowley <paulcrowley@google.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH v6 8/9] crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL
Date: Wed, 4 May 2022 22:56:49 -0700	[thread overview]
Message-ID: <YnNnIV0P9bFgTkQt@sol.localdomain> (raw)
In-Reply-To: <20220504001823.2483834-9-nhuck@google.com>

On Wed, May 04, 2022 at 12:18:22AM +0000, Nathan Huckleberry wrote:
> + * X = [X_1 : X_0]
> + * Y = [Y_1 : Y_0]
> + *
> + * The multiplication produces four parts:
> + *   LOW: The polynomial given by performing carryless multiplication of X_0 and
> + *   Y_0
> + *   MID: The polynomial given by performing carryless multiplication of (X_0 +
> + *   X_1) and (Y_0 + Y_1)
> + *   HIGH: The polynomial given by performing carryless multiplication of X_1
> + *   and Y_1
> + *
> + * We compute:
> + *  LO += LOW
> + *  MI += MID
> + *  HI += HIGH

Three parts, not four.  But why not write this as the much more concise:

 * Given:
 *	X = [X_1 : X_0]
 *	Y = [Y_1 : Y_0]
 *
 * We compute:
 *	LO += X_0 * Y_0
 *	MI += (X_0 + X_1) * (Y_0 + Y_1)
 *	HI += X_1 * Y_1

> + * So our final computation is: T = T_1 : T_0 = g*(x) * P_0 V = V_1 : V_0 =      
> + * g*(x) * (P_1 + T_0) p(x) / x^{128} mod g(x) = P_3 + P_1 + T_0 + V_1 : P_2 +   
> + * P_0 + T_1 + V_0                                                               

As on the x86 version, this part is now unreadable.  It was fine in v5.

> + *   [HI_1 : HI_0 + HI_1 + MI_1 + LO_1 : LO_1 + HI_0 + MI_0 + LO_0 : LO_0]
[...]
> + *   [HI_1 : HI_1 + HI_0 + MI_1 + LO_1 : HI_0 + MI_0 + LO_1 + LO_0 : LO_0]
[...]
> +	// TMP_V = T_1 : T_0 = P_0 * g*(x)
> +	pmull	TMP_V.1q, PL.1d, GSTAR.1d
[...]
> +	// TMP_V = V_1 : V_0 = (P_1 + T_0) * g*(x)
> +	pmull2	TMP_V.1q, GSTAR.2d, TMP_V.2d
> +	eor	DEST.16b, PH.16b, TMP_V.16b
[...]
> +	pmull	TMP_V.1q, GSTAR.1d, PL.1d
[...]
> +	pmull2	TMP_V.1q, GSTAR.2d, TMP_V.2d
[...]
> +	eor	SUM.16b, TMP_V.16b, PH.16b

It looks like you didn't fully address my comments on v5 about putting operands
in a consistent order.  Not a big deal, but assembly code is always hard to
read, and anything to make it easier would be greatly appreciated.

> +/*
> + * Handle any extra blocks afer full_stride loop.
> + */

Typo above.

> diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
[...]
> +struct polyval_tfm_ctx {
> +	u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE];
> +};

This is missing the comment about the order of the key powers that I had
suggested for readability.  It made it into the x86 version but not here.  This
file is very similar to arch/x86/crypto/polyval-clmulni_glue.c, so if you could
diff them and eliminate any unintended differences, that would be helpful.

Other than the above readability suggestions this patch looks good, nice job.

- Eric

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-05-05  5:56 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-04  0:18 [PATCH v6 0/9] crypto: HCTR2 support Nathan Huckleberry
2022-05-04  0:18 ` Nathan Huckleberry
2022-05-04  0:18 ` [PATCH v6 1/9] crypto: xctr - Add XCTR support Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-04  0:18 ` [PATCH v6 2/9] crypto: polyval - Add POLYVAL support Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-04  0:18 ` [PATCH v6 3/9] crypto: hctr2 - Add HCTR2 support Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-04  0:18 ` [PATCH v6 4/9] crypto: x86/aesni-xctr: Add accelerated implementation of XCTR Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-05  4:45   ` Eric Biggers
2022-05-05  4:45     ` Eric Biggers
2022-05-04  0:18 ` [PATCH v6 5/9] crypto: arm64/aes-xctr: " Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-06  5:49   ` Eric Biggers
2022-05-06  5:49     ` Eric Biggers
2022-05-04  0:18 ` [PATCH v6 6/9] crypto: arm64/aes-xctr: Improve readability of XCTR and CTR modes Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-05  6:49   ` Ard Biesheuvel
2022-05-05  6:49     ` Ard Biesheuvel
2022-05-06  5:41   ` Eric Biggers
2022-05-06  5:41     ` Eric Biggers
2022-05-06 21:22     ` Nathan Huckleberry
2022-05-06 21:22       ` Nathan Huckleberry
2022-05-04  0:18 ` [PATCH v6 7/9] crypto: x86/polyval: Add PCLMULQDQ accelerated implementation of POLYVAL Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-05  5:08   ` Eric Biggers
2022-05-05  5:08     ` Eric Biggers
2022-05-04  0:18 ` [PATCH v6 8/9] crypto: arm64/polyval: Add PMULL " Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry
2022-05-05  5:56   ` Eric Biggers [this message]
2022-05-05  5:56     ` Eric Biggers
2022-05-04  0:18 ` [PATCH v6 9/9] fscrypt: Add HCTR2 support for filename encryption Nathan Huckleberry
2022-05-04  0:18   ` Nathan Huckleberry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YnNnIV0P9bFgTkQt@sol.localdomain \
    --to=ebiggers@kernel.org \
    --cc=ardb@kernel.org \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-fscrypt@vger.kernel.org \
    --cc=nhuck@google.com \
    --cc=paulcrowley@google.com \
    --cc=samitolvanen@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.