From: Herbert Xu <herbert@gondor.apana.org.au>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, will@kernel.org,
	mark.rutland@arm.com, catalin.marinas@arm.com,
	Dave Martin <dave.martin@arm.com>,
	Eric Biggers <ebiggers@google.com>
Subject: Re: [PATCH v2 0/9] arm64: rework NEON yielding to avoid scheduling from asm code
Date: Wed, 10 Feb 2021 18:23:07 +1100
Message-ID: <20210210072307.GA4617@gondor.apana.org.au>
In-Reply-To: <20210203113626.220151-1-ardb@kernel.org>

On Wed, Feb 03, 2021 at 12:36:17PM +0100, Ard Biesheuvel wrote:
> Given how kernel mode NEON code disables preemption (to ensure that the
> FP/SIMD register state is protected without having to context switch it),
> we need to take care not to let those algorithms operate on unbounded
> input data, or we may end up with excessive scheduling blackouts on
> CONFIG_PREEMPT kernels.
> 
> This is currently handled by the cond_yield_neon macros, which check the
> preempt count and the TIF_NEED_RESCHED flag from assembler code, and call
> into kernel_neon_end()+kernel_neon_begin(), triggering a reschedule.
> This works as expected, but is a bit messy, given how much of the state
> preserve/restore code in the algorithm needs to be duplicated, as well as
> causing the need to manage the stack frame explicitly. All of this is better
> handled by the compiler, especially now that we have enabled features such
> as the shadow call stack and BTI, and are working to improve call stack
> validation.
> 
> In some cases, yielding is not necessary at all: algorithms that implement
> skciphers and use the skcipher walk API will be invoked at page granularity,
> which is granular enough for our purpose.
> 
> In other cases, it is better to simply return early from the assembler
> routine if a reschedule is pending, and let the C code handle this, by
> retrying the call until it completes. This removes any voluntary schedule()
> calls from the call stack, making the code much easier to reason about in
> the context of stack validation, rcu_tasks synchronization, etc.
> 
> Practical note: assuming there are no objections to these changes, it may
> be most convenient to take patch #1 into the arm64 tree for v5.12,
> and postpone the rest for merging via the crypto tree. (Note that this
> series was created against the cryptodev tree, and so the arm64 maintainers
> are also welcome to take the whole set if it applies cleanly to the arm64
> tree)
> 
> Will: if you stick #1 on a separate branch, please base it on v5.11-rc1
> 
> Changes since v1:
> - use sub+cbz instead of cmp+b.eq to avoid clobbering the flags in cond_yield
>   (patch #1)
> 
> Cc: Dave Martin <dave.martin@arm.com>
> Cc: Eric Biggers <ebiggers@google.com>
> 
> Ard Biesheuvel (9):
>   arm64: assembler: add cond_yield macro
>   crypto: arm64/sha1-ce - simplify NEON yield
>   crypto: arm64/sha2-ce - simplify NEON yield
>   crypto: arm64/sha3-ce - simplify NEON yield
>   crypto: arm64/sha512-ce - simplify NEON yield
>   crypto: arm64/aes-neonbs - remove NEON yield calls
>   crypto: arm64/aes-ce-mac - simplify NEON yield
>   crypto: arm64/crc-t10dif - move NEON yield to C code
>   arm64: assembler: remove conditional NEON yield macros
> 
>  arch/arm64/crypto/aes-glue.c          | 21 +++--
>  arch/arm64/crypto/aes-modes.S         | 52 +++++--------
>  arch/arm64/crypto/aes-neonbs-core.S   |  8 +-
>  arch/arm64/crypto/crct10dif-ce-core.S | 43 +++--------
>  arch/arm64/crypto/crct10dif-ce-glue.c | 30 ++++++--
>  arch/arm64/crypto/sha1-ce-core.S      | 47 ++++--------
>  arch/arm64/crypto/sha1-ce-glue.c      | 22 +++---
>  arch/arm64/crypto/sha2-ce-core.S      | 38 ++++-----
>  arch/arm64/crypto/sha2-ce-glue.c      | 22 +++---
>  arch/arm64/crypto/sha3-ce-core.S      | 81 ++++++++------------
>  arch/arm64/crypto/sha3-ce-glue.c      | 14 ++--
>  arch/arm64/crypto/sha512-ce-core.S    | 29 ++-----
>  arch/arm64/crypto/sha512-ce-glue.c    | 53 +++++++------
>  arch/arm64/include/asm/assembler.h    | 78 +++----------------
>  14 files changed, 209 insertions(+), 329 deletions(-)

Patches 2-8 applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
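
The early-return scheme described in the quoted cover letter (return from
the assembler routine when a reschedule is pending, and let the C caller
retry until the input is consumed) can be sketched in plain C. This is a
minimal userspace illustration under stated assumptions, not the code from
the series: toy_transform(), toy_update(), fake_kernel_neon_begin(),
fake_kernel_neon_end(), fake_need_resched() and the three-block yield
interval are all hypothetical stand-ins, and the "transform" is just a byte
sum standing in for the real hash core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE     64 /* bytes per input block (assumption) */
#define YIELD_INTERVAL 3  /* pretend a reschedule is pending after 3 blocks */

/* Hypothetical stand-ins for kernel state; not the real kernel API. */
static bool neon_active;       /* true while inside a begin/end section */
static int neon_sections;      /* number of begin/end sections entered */
static int blocks_since_begin; /* blocks processed in the current section */

static void fake_kernel_neon_begin(void)
{
	assert(!neon_active);
	neon_active = true;
	neon_sections++;
	blocks_since_begin = 0;
}

static void fake_kernel_neon_end(void)
{
	assert(neon_active);
	neon_active = false;
}

/* Models the need-resched check the asm routine would perform. */
static bool fake_need_resched(void)
{
	return blocks_since_begin >= YIELD_INTERVAL;
}

/*
 * Stand-in for the assembler core: processes whole blocks and returns the
 * number of blocks it did NOT process. It always completes at least one
 * block, and bails out early only if more input remains and a reschedule
 * is pending. It never calls the scheduler itself.
 */
static int toy_transform(uint32_t *state, const uint8_t *src, int blocks)
{
	while (blocks > 0) {
		for (size_t i = 0; i < BLOCK_SIZE; i++)
			*state += src[i];
		src += BLOCK_SIZE;
		blocks--;
		blocks_since_begin++;
		if (blocks > 0 && fake_need_resched())
			break;
	}
	return blocks;
}

/*
 * C glue: retry until all blocks are consumed. Leaving the begin/end
 * section between calls is where preemption could happen in the kernel.
 */
void toy_update(uint32_t *state, const uint8_t *src, int blocks)
{
	while (blocks > 0) {
		int rem;

		fake_kernel_neon_begin();
		rem = toy_transform(state, src, blocks);
		fake_kernel_neon_end();

		assert(rem < blocks); /* forward progress on every call */
		src += (size_t)(blocks - rem) * BLOCK_SIZE;
		blocks = rem;
	}
}
```

With these assumptions, calling toy_update() on eight blocks runs three
begin/end sections (3 + 3 + 2 blocks). The point of the shape is that the
inner routine only reports how much work is left, so all stack frame and
state handling stays in compiler-generated C code.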
