linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Martin <Dave.Martin@arm.com>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org, Ingo Molnar <mingo@kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Peter Zijlstra <peterz@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	linux-kernel@vger.kernel.org, Eric Biggers <ebiggers@kernel.org>,
	Mark Brown <broonie@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH 4/5] arm64: fpsimd: run kernel mode NEON with softirqs disabled
Date: Tue, 19 Jan 2021 16:00:56 +0000	[thread overview]
Message-ID: <20210119160045.GA1684@arm.com> (raw)
In-Reply-To: <20201218170106.23280-5-ardb@kernel.org>

On Fri, Dec 18, 2020 at 06:01:05PM +0100, Ard Biesheuvel wrote:
> Kernel mode NEON can be used in task or softirq context, but only in
> a non-nesting manner, i.e., softirq context is only permitted if the
> interrupt was not taken at a point where the kernel was using the NEON
> in task context.
> 
> This means all users of kernel mode NEON have to be aware of this
> limitation, and either need to provide scalar fallbacks that may be much
> slower (up to 20x for AES instructions) and potentially less safe, or
> use an asynchronous interface that defers processing to a later time
> when the NEON is guaranteed to be available.
> 
> Given that grabbing and releasing the NEON is cheap, we can relax this
> restriction, by increasing the granularity of kernel mode NEON code, and
> always disabling softirq processing while the NEON is being used in task
> context.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Sorry for the slow reply on this...  it looks reasonable, but I have a
few comments below.

> ---
>  arch/arm64/include/asm/assembler.h | 19 +++++++++++++------
>  arch/arm64/kernel/asm-offsets.c    |  2 ++
>  arch/arm64/kernel/fpsimd.c         |  4 ++--
>  3 files changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index ddbe6bf00e33..74ce46ed55ac 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -15,6 +15,7 @@
>  #include <asm-generic/export.h>
>  
>  #include <asm/asm-offsets.h>
> +#include <asm/alternative.h>
>  #include <asm/cpufeature.h>
>  #include <asm/cputype.h>
>  #include <asm/debug-monitors.h>
> @@ -717,17 +718,23 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
>  	.endm
>  
>  	.macro		if_will_cond_yield_neon
> -#ifdef CONFIG_PREEMPTION
>  	get_current_task	x0
>  	ldr		x0, [x0, #TSK_TI_PREEMPT]
> -	sub		x0, x0, #PREEMPT_DISABLE_OFFSET
> -	cbz		x0, .Lyield_\@
> +#ifdef CONFIG_PREEMPTION
> +	cmp		x0, #PREEMPT_DISABLE_OFFSET
> +	beq		.Lyield_\@	// yield on need_resched in task context
> +#endif
> +	/* never yield while serving a softirq */
> +	tbnz		x0, #SOFTIRQ_SHIFT, .Lnoyield_\@

Can you explain the rationale here?

Using if_will_cond_yield_neon suggests the algo thinks it may run for
too long the stall preemption until completion, but we happily stall
preemption _and_ softirqs here.

Is it actually a bug to use the NEON conditional yield helpers in
softirq context?

Ideally, if processing in softirq context takes an unreasonable about of
time, the work would be handed off to an asynchronous worker, but that
does seem to conflict rather with the purpose of this series...

> +
> +	adr_l		x0, irq_stat + IRQ_CPUSTAT_SOFTIRQ_PENDING
> +	this_cpu_offset	x1
> +	ldr		w0, [x0, x1]
> +	cbnz		w0, .Lyield_\@	// yield on pending softirq in task context
> +.Lnoyield_\@:
>  	/* fall through to endif_yield_neon */
>  	.subsection	1
>  .Lyield_\@ :
> -#else
> -	.section	".discard.cond_yield_neon", "ax"
> -#endif
>  	.endm
>  
>  	.macro		do_cond_yield_neon
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index 7d32fc959b1a..34ef70877de4 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -93,6 +93,8 @@ int main(void)
>    DEFINE(DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
>    BLANK();
>    DEFINE(PREEMPT_DISABLE_OFFSET, PREEMPT_DISABLE_OFFSET);
> +  DEFINE(SOFTIRQ_SHIFT, SOFTIRQ_SHIFT);
> +  DEFINE(IRQ_CPUSTAT_SOFTIRQ_PENDING, offsetof(irq_cpustat_t, __softirq_pending));
>    BLANK();
>    DEFINE(CPU_BOOT_STACK,	offsetof(struct secondary_data, stack));
>    DEFINE(CPU_BOOT_TASK,		offsetof(struct secondary_data, task));
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 062b21f30f94..823e3a8a8871 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -180,7 +180,7 @@ static void __get_cpu_fpsimd_context(void)
>   */
>  static void get_cpu_fpsimd_context(void)
>  {
> -	preempt_disable();
> +	local_bh_disable();
>  	__get_cpu_fpsimd_context();
>  }
>  
> @@ -201,7 +201,7 @@ static void __put_cpu_fpsimd_context(void)
>  static void put_cpu_fpsimd_context(void)
>  {
>  	__put_cpu_fpsimd_context();
> -	preempt_enable();
> +	local_bh_enable();
>  }
>  
>  static bool have_cpu_fpsimd_context(void)

I was concerned about catching all the relevant preempt_disable()s, but
it had slipped my memory that Julien had factored these into one place.

I can't see off the top of my head any reason why this shouldn't work.


In threory, switching to local_bh_enable() here will add a check for
pending softirqs onto context handling fast paths.  I haven't dug into
how that works, so perhaps this is trivial on top of the preemption
check in preempt_enable().  Do you see any difference in hackbench or
similar benchmarks?

Cheers
---Dave

  reply	other threads:[~2021-01-19 16:02 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-18 17:01 [RFC PATCH 0/5] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
2020-12-18 17:01 ` [RFC PATCH 1/5] crypto: aead - disallow en/decrypt for non-task or non-softirq context Ard Biesheuvel
2020-12-18 17:01 ` [RFC PATCH 2/5] crypto: skcipher " Ard Biesheuvel
2020-12-18 17:01 ` [RFC PATCH 3/5] crypto: arm64/gcm-aes-ce - add NEON yield support Ard Biesheuvel
2020-12-18 17:01 ` [RFC PATCH 4/5] arm64: fpsimd: run kernel mode NEON with softirqs disabled Ard Biesheuvel
2021-01-19 16:00   ` Dave Martin [this message]
2021-01-19 16:29     ` Ard Biesheuvel
2021-01-20 15:44       ` Dave Martin
2021-02-15 18:30         ` Ard Biesheuvel
2020-12-18 17:01 ` [RFC PATCH 5/5] crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path Ard Biesheuvel
2020-12-19  2:04 ` [RFC PATCH 0/5] running kernel mode SIMD with softirqs disabled Herbert Xu
2021-01-14  8:22   ` Ard Biesheuvel
2021-02-16 10:09 ` Peter Zijlstra
2021-02-16 10:35   ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210119160045.GA1684@arm.com \
    --to=dave.martin@arm.com \
    --cc=ardb@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=broonie@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=ebiggers@kernel.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).