* [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled
@ 2021-03-02  9:01 Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros Ard Biesheuvel
                   ` (9 more replies)
  0 siblings, 10 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

[ TL;DR for the non-ARM folks on CC: disabling softirq processing when using
  SIMD in kernel mode could reduce complexity and improve performance, but we
  need to decide whether we can do this, and how much softirq processing
  latency we can tolerate. If we can find a satisfactory solution for this,
  we might do the same for x86 and 32-bit ARM as well.
  
  However, based on preliminary off-list discussions with peterz and luto, it
  seems that for x86, there is a preference for using per-CPU buffers to
  preserve/restore the task context's kernel mode SIMD state when the task is
  interrupted to perform kernel mode SIMD in softirq context. On arm64, we
  actually had this arrangement before, and removed it because it made
  reasoning about preserving/restoring userland SVE state (32 SIMD registers
  of up to 2 kbit in size) rather complex. ]

The crypto API provides two ways to invoke symmetric encryption algorithms:
- synchronously, where the transformation is guaranteed to be done by the
  time the function returns;
- asynchronously, where the function may return -EINPROGRESS, and a completion
  will be signalled when the transformation is done (a typical caller that
  waits synchronously for this completion is sketched below).
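
For reference, such a caller can wait for the completion using the standard
crypto_wait_req() helper from <linux/crypto.h>; a minimal sketch (allocation
and error handling abbreviated):

  static int encrypt_and_wait(struct crypto_skcipher *tfm,
			      struct scatterlist *src, struct scatterlist *dst,
			      unsigned int len, u8 *iv)
  {
	DECLARE_CRYPTO_WAIT(wait);
	struct skcipher_request *req;
	int err;

	req = skcipher_request_alloc(tfm, GFP_KERNEL);
	if (!req)
		return -ENOMEM;

	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
				      crypto_req_done, &wait);
	skcipher_request_set_crypt(req, src, dst, len, iv);

	/* turns -EINPROGRESS/-EBUSY into a sleep until completion */
	err = crypto_wait_req(crypto_skcipher_encrypt(req), &wait);

	skcipher_request_free(req);
	return err;
  }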

The latter is mainly intended for h/w accelerators, where throughput would
otherwise be severely limited by latency. However, it is also being used for
software algorithms based on SIMD instructions, which cannot be issued from
arbitrary contexts (the exact rules differ between architectures, but
typically, SIMD can be used in task context, or in softirq context provided
that the softirq did not interrupt kernel code that was already using the
SIMD unit).
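
Expressed as pseudocode, the pre-series arm64 rule looks roughly like the
below (a hedged sketch; kernel_neon_busy stands in for the real per-CPU
state and is not the actual symbol name):

  static bool neon_usable_here(void)
  {
	if (in_task())
		return true;
	if (in_serving_softirq())	/* ok unless task ctx holds the NEON */
		return !this_cpu_read(kernel_neon_busy);
	return false;			/* hardirq/NMI: never */
  }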

The kernel today has many crypto API users that opt out of this asynchronous
interface (802.11, macsec, kerberos, sw kTLS), or that use a library interface
which is fundamentally synchronous (wireguard). This means we end up with a
degraded mode in both cases: a scalar fallback in the contended case, and, in
the uncontended case, generic GCM/CCM/CTR chaining mode templates wrapped
around the SIMD cipher rather than accelerated implementations of the full
chaining modes in question. Note that scalar AES runs ~20x slower than the
SIMD instruction based version.

So let's address this for arm64 by reorganizing kernel mode SIMD support so
that the SIMD unit can always be assumed to be available. This means we need
to defer softirq processing when grabbing the NEON unit in task context, so
that any use of it in softirq context is guaranteed not to interrupt any code
that was already using the NEON.

This obviously impacts softirq processing latency, which is why the existing
conditional yield support is modified to take pending softirqs into account.
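
Schematically, after this series (see patch 3 for the actual code):

  kernel_neon_begin();	/* now also does local_bh_disable() */
  /*
   * NEON usable here; softirq processing on this CPU is deferred, so a
   * softirq can never observe the NEON in use by task context.
   */
  kernel_neon_end();	/* local_bh_enable() runs any pending softirqs */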

Changes since RFC/v1:
- add patch to remove obsolete cond_yield_neon macros
- rebased onto new, simplified cond_yield macro
- include patches to remove the async path from all arm64 crypto skciphers
  and AEADs

Previous RFC version:
[0] https://lore.kernel.org/linux-arm-kernel/20201218170106.23280-1-ardb@kernel.org/

The first 3 patches will need to go through the arm64 tree, so once this
series is reviewed, some coordination is required between the arm64 and
crypto trees to get this merged without conflicts.

Cc: Dave Martin <dave.martin@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>

Ard Biesheuvel (9):
  arm64: assembler: remove conditional NEON yield macros
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  crypto: aead - disallow en/decrypt for non-task or non-softirq context
  crypto: skcipher - disallow en/decrypt for non-task or non-softirq
    context
  crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path
  crypto: arm64/aes-ccm - remove non-SIMD fallback path
  crypto: arm64/aes-ce - stop using SIMD helper for skciphers
  crypto: arm64/aes-neonbs - stop using SIMD helper for skciphers

 arch/arm64/crypto/Kconfig           |   3 -
 arch/arm64/crypto/aes-ce-ccm-glue.c | 151 +++-----------
 arch/arm64/crypto/aes-glue.c        | 102 ++--------
 arch/arm64/crypto/aes-modes.S       |   2 +-
 arch/arm64/crypto/aes-neonbs-glue.c | 122 +-----------
 arch/arm64/crypto/ghash-ce-glue.c   | 209 +++++---------------
 arch/arm64/crypto/sha1-ce-core.S    |   2 +-
 arch/arm64/crypto/sha2-ce-core.S    |   2 +-
 arch/arm64/crypto/sha3-ce-core.S    |   4 +-
 arch/arm64/crypto/sha512-ce-core.S  |   2 +-
 arch/arm64/include/asm/assembler.h  | 106 +++-------
 arch/arm64/kernel/asm-offsets.c     |   2 +
 arch/arm64/kernel/fpsimd.c          |   4 +-
 crypto/aead.c                       |  10 +
 crypto/skcipher.c                   |  10 +
 15 files changed, 162 insertions(+), 569 deletions(-)

-- 
2.30.1



* [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-30  9:52   ` Will Deacon
  2021-03-02  9:01 ` [PATCH v2 2/9] arm64: assembler: introduce wxN aliases for wN registers Ard Biesheuvel
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

The users of the conditional NEON yield macros have all been switched to
the simplified cond_yield macro, and so the NEON specific ones can be
removed.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/include/asm/assembler.h | 70 --------------------
 1 file changed, 70 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index ca31594d3d6c..e0fc1d424f9b 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -692,76 +692,6 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
 	isb
 .endm
 
-/*
- * Check whether to yield to another runnable task from kernel mode NEON code
- * (which runs with preemption disabled).
- *
- * if_will_cond_yield_neon
- *        // pre-yield patchup code
- * do_cond_yield_neon
- *        // post-yield patchup code
- * endif_yield_neon    <label>
- *
- * where <label> is optional, and marks the point where execution will resume
- * after a yield has been performed. If omitted, execution resumes right after
- * the endif_yield_neon invocation. Note that the entire sequence, including
- * the provided patchup code, will be omitted from the image if
- * CONFIG_PREEMPTION is not defined.
- *
- * As a convenience, in the case where no patchup code is required, the above
- * sequence may be abbreviated to
- *
- * cond_yield_neon <label>
- *
- * Note that the patchup code does not support assembler directives that change
- * the output section, any use of such directives is undefined.
- *
- * The yield itself consists of the following:
- * - Check whether the preempt count is exactly 1 and a reschedule is also
- *   needed. If so, calling of preempt_enable() in kernel_neon_end() will
- *   trigger a reschedule. If it is not the case, yielding is pointless.
- * - Disable and re-enable kernel mode NEON, and branch to the yield fixup
- *   code.
- *
- * This macro sequence may clobber all CPU state that is not guaranteed by the
- * AAPCS to be preserved across an ordinary function call.
- */
-
-	.macro		cond_yield_neon, lbl
-	if_will_cond_yield_neon
-	do_cond_yield_neon
-	endif_yield_neon	\lbl
-	.endm
-
-	.macro		if_will_cond_yield_neon
-#ifdef CONFIG_PREEMPTION
-	get_current_task	x0
-	ldr		x0, [x0, #TSK_TI_PREEMPT]
-	sub		x0, x0, #PREEMPT_DISABLE_OFFSET
-	cbz		x0, .Lyield_\@
-	/* fall through to endif_yield_neon */
-	.subsection	1
-.Lyield_\@ :
-#else
-	.section	".discard.cond_yield_neon", "ax"
-#endif
-	.endm
-
-	.macro		do_cond_yield_neon
-	bl		kernel_neon_end
-	bl		kernel_neon_begin
-	.endm
-
-	.macro		endif_yield_neon, lbl
-	.ifnb		\lbl
-	b		\lbl
-	.else
-	b		.Lyield_out_\@
-	.endif
-	.previous
-.Lyield_out_\@ :
-	.endm
-
 	/*
 	 * Check whether preempt-disabled code should yield as soon as it
 	 * is able. This is the case if re-enabling preemption a single
-- 
2.30.1



* [PATCH v2 2/9] arm64: assembler: introduce wxN aliases for wN registers
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-30  9:59   ` Will Deacon
  2021-03-02  9:01 ` [PATCH v2 3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled Ard Biesheuvel
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

The AArch64 asm syntax has this slightly tedious property that the names
used in mnemonics to refer to registers depend on whether the opcode in
question targets the entire 64 bits (xN), or only the least significant
8, 16 or 32 bits (wN). When writing parameterized code such as macros,
this can be annoying, as macro arguments don't lend themselves to
indexed lookups, and so generating a reference to wN in a macro that
receives xN as an argument is problematic.

For instance, an upcoming patch that modifies the implementation of the
cond_yield macro to be able to refer to 32-bit registers would need to
modify invocations such as

  cond_yield	3f, x8

to

  cond_yield	3f, 8

so that the second argument can be token pasted after x or w to emit the
correct register reference. Unfortunately, this interferes with the self
documenting nature of the first example, where the second argument is
obviously a register, whereas in the second example, one would need to
go and look at the code to find out what '8' means.

So let's fix this by defining wxN aliases for all xN registers, which
resolve to the 32-bit alias of each respective 64-bit register. This
allows the macro implementation to paste the xN reference after a w to
obtain the correct register name.
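
To illustrate with a hypothetical macro (not part of this patch): given the
aliases, a macro that receives a 64-bit register name can derive its 32-bit
half directly:

  	.macro	clear_low32, reg:req	// \reg is passed as, e.g., x8
  	mov	w\reg, wzr		// pastes to 'mov wx8, wzr', and
  					// wx8 is a .req alias for w8
  	.endm

  	clear_low32	x8		// clears w8, i.e. bits [31:0] of x8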

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/include/asm/assembler.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index e0fc1d424f9b..7b076ccd1a54 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -23,6 +23,14 @@
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
+	/*
+	 * Provide a wxN alias for each wN register so that we can paste an xN
+	 * reference after a 'w' to obtain the 32-bit version.
+	 */
+	.irp	n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
+	wx\n	.req	w\n
+	.endr
+
 	.macro save_and_disable_daif, flags
 	mrs	\flags, daif
 	msr	daifset, #0xf
-- 
2.30.1



* [PATCH v2 3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 2/9] arm64: assembler: introduce wxN aliases for wN registers Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-30 10:36   ` Will Deacon
  2021-03-02  9:01 ` [PATCH v2 4/9] crypto: aead - disallow en/decrypt for non-task or non-softirq context Ard Biesheuvel
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

Kernel mode NEON can be used in task or softirq context, but only in
a non-nesting manner, i.e., softirq context is only permitted if the
interrupt was not taken at a point where the kernel was using the NEON
in task context.

This means all users of kernel mode NEON have to be aware of this
limitation, and either need to provide scalar fallbacks that may be much
slower (up to 20x for AES instructions) and potentially less safe, or
use an asynchronous interface that defers processing to a later time
when the NEON is guaranteed to be available.

Given that grabbing and releasing the NEON is cheap, we can relax this
restriction, by increasing the granularity of kernel mode NEON code, and
always disabling softirq processing while the NEON is being used in task
context.
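
Schematically, this allows glue code of the form

  	if (crypto_simd_usable()) {
  		kernel_neon_begin();
  		simd_fn(...);		/* simd_fn/scalar_fallback are */
  		kernel_neon_end();	/* placeholders, not real symbols */
  	} else {
  		scalar_fallback(...);
  	}

to be reduced to an unconditional

  	kernel_neon_begin();
  	simd_fn(...);
  	kernel_neon_end();

which is what the remaining patches in this series do for the arm64 crypto
drivers.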

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/crypto/aes-modes.S      |  2 +-
 arch/arm64/crypto/sha1-ce-core.S   |  2 +-
 arch/arm64/crypto/sha2-ce-core.S   |  2 +-
 arch/arm64/crypto/sha3-ce-core.S   |  4 +--
 arch/arm64/crypto/sha512-ce-core.S |  2 +-
 arch/arm64/include/asm/assembler.h | 28 +++++++++++++++-----
 arch/arm64/kernel/asm-offsets.c    |  2 ++
 arch/arm64/kernel/fpsimd.c         |  4 +--
 8 files changed, 31 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/crypto/aes-modes.S b/arch/arm64/crypto/aes-modes.S
index bbdb54702aa7..ab6c14ef9f4e 100644
--- a/arch/arm64/crypto/aes-modes.S
+++ b/arch/arm64/crypto/aes-modes.S
@@ -700,7 +700,7 @@ AES_FUNC_START(aes_mac_update)
 	cbz		w5, .Lmacout
 	encrypt_block	v0, w2, x1, x7, w8
 	st1		{v0.16b}, [x4]			/* return dg */
-	cond_yield	.Lmacout, x7
+	cond_yield	.Lmacout, x7, x8
 	b		.Lmacloop4x
 .Lmac1x:
 	add		w3, w3, #4
diff --git a/arch/arm64/crypto/sha1-ce-core.S b/arch/arm64/crypto/sha1-ce-core.S
index 8c02bbc2684e..889ca0f8972b 100644
--- a/arch/arm64/crypto/sha1-ce-core.S
+++ b/arch/arm64/crypto/sha1-ce-core.S
@@ -121,7 +121,7 @@ CPU_LE(	rev32		v11.16b, v11.16b	)
 	add		dgav.4s, dgav.4s, dg0v.4s
 
 	cbz		w2, 2f
-	cond_yield	3f, x5
+	cond_yield	3f, x5, x6
 	b		0b
 
 	/*
diff --git a/arch/arm64/crypto/sha2-ce-core.S b/arch/arm64/crypto/sha2-ce-core.S
index 6cdea7d56059..491179922f49 100644
--- a/arch/arm64/crypto/sha2-ce-core.S
+++ b/arch/arm64/crypto/sha2-ce-core.S
@@ -129,7 +129,7 @@ CPU_LE(	rev32		v19.16b, v19.16b	)
 
 	/* handled all input blocks? */
 	cbz		w2, 2f
-	cond_yield	3f, x5
+	cond_yield	3f, x5, x6
 	b		0b
 
 	/*
diff --git a/arch/arm64/crypto/sha3-ce-core.S b/arch/arm64/crypto/sha3-ce-core.S
index 6f5208414fe3..9c77313f5a60 100644
--- a/arch/arm64/crypto/sha3-ce-core.S
+++ b/arch/arm64/crypto/sha3-ce-core.S
@@ -184,11 +184,11 @@ SYM_FUNC_START(sha3_ce_transform)
 	eor	 v0.16b,  v0.16b, v31.16b
 
 	cbnz	w8, 3b
-	cond_yield 3f, x8
+	cond_yield 4f, x8, x9
 	cbnz	w2, 0b
 
 	/* save state */
-3:	st1	{ v0.1d- v3.1d}, [x0], #32
+4:	st1	{ v0.1d- v3.1d}, [x0], #32
 	st1	{ v4.1d- v7.1d}, [x0], #32
 	st1	{ v8.1d-v11.1d}, [x0], #32
 	st1	{v12.1d-v15.1d}, [x0], #32
diff --git a/arch/arm64/crypto/sha512-ce-core.S b/arch/arm64/crypto/sha512-ce-core.S
index d6e7f6c95fa6..b6a3a36e15f5 100644
--- a/arch/arm64/crypto/sha512-ce-core.S
+++ b/arch/arm64/crypto/sha512-ce-core.S
@@ -195,7 +195,7 @@ CPU_LE(	rev64		v19.16b, v19.16b	)
 	add		v10.2d, v10.2d, v2.2d
 	add		v11.2d, v11.2d, v3.2d
 
-	cond_yield	3f, x4
+	cond_yield	3f, x4, x5
 	/* handled all input blocks? */
 	cbnz		w2, 0b
 
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 7b076ccd1a54..6ac38f7cf824 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -15,6 +15,7 @@
 #include <asm-generic/export.h>
 
 #include <asm/asm-offsets.h>
+#include <asm/alternative.h>
 #include <asm/cpufeature.h>
 #include <asm/cputype.h>
 #include <asm/debug-monitors.h>
@@ -701,19 +702,32 @@ USER(\label, ic	ivau, \tmp2)			// invalidate I line PoU
 .endm
 
 	/*
-	 * Check whether preempt-disabled code should yield as soon as it
-	 * is able. This is the case if re-enabling preemption a single
-	 * time results in a preempt count of zero, and the TIF_NEED_RESCHED
-	 * flag is set. (Note that the latter is stored negated in the
-	 * top word of the thread_info::preempt_count field)
+	 * Check whether preempt/bh-disabled asm code should yield as soon as
+	 * it is able. This is the case if we are currently running in task
+	 * context, and either a softirq is pending, or the TIF_NEED_RESCHED
+	 * flag is set and re-enabling preemption a single time would result in
+	 * a preempt count of zero. (Note that the TIF_NEED_RESCHED flag is
+	 * stored negated in the top word of the thread_info::preempt_count
+	 * field)
 	 */
-	.macro		cond_yield, lbl:req, tmp:req
-#ifdef CONFIG_PREEMPTION
+	.macro		cond_yield, lbl:req, tmp:req, tmp2:req
 	get_current_task \tmp
 	ldr		\tmp, [\tmp, #TSK_TI_PREEMPT]
+	/*
+	 * If we are serving a softirq, there is no point in yielding: the
+	 * softirq will not be preempted no matter what we do, so we should
+	 * run to completion as quickly as we can.
+	 */
+	tbnz		\tmp, #SOFTIRQ_SHIFT, .Lnoyield_\@
+#ifdef CONFIG_PREEMPTION
 	sub		\tmp, \tmp, #PREEMPT_DISABLE_OFFSET
 	cbz		\tmp, \lbl
 #endif
+	adr_l		\tmp, irq_stat + IRQ_CPUSTAT_SOFTIRQ_PENDING
+	this_cpu_offset	\tmp2
+	ldr		w\tmp, [\tmp, \tmp2]
+	cbnz		w\tmp, \lbl	// yield on pending softirq in task context
+.Lnoyield_\@:
 	.endm
 
 /*
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index a36e2fc330d4..cc7267a24bf7 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -95,6 +95,8 @@ int main(void)
   DEFINE(DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
   BLANK();
   DEFINE(PREEMPT_DISABLE_OFFSET, PREEMPT_DISABLE_OFFSET);
+  DEFINE(SOFTIRQ_SHIFT, SOFTIRQ_SHIFT);
+  DEFINE(IRQ_CPUSTAT_SOFTIRQ_PENDING, offsetof(irq_cpustat_t, __softirq_pending));
   BLANK();
   DEFINE(CPU_BOOT_STACK,	offsetof(struct secondary_data, stack));
   DEFINE(CPU_BOOT_TASK,		offsetof(struct secondary_data, task));
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 062b21f30f94..823e3a8a8871 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -180,7 +180,7 @@ static void __get_cpu_fpsimd_context(void)
  */
 static void get_cpu_fpsimd_context(void)
 {
-	preempt_disable();
+	local_bh_disable();
 	__get_cpu_fpsimd_context();
 }
 
@@ -201,7 +201,7 @@ static void __put_cpu_fpsimd_context(void)
 static void put_cpu_fpsimd_context(void)
 {
 	__put_cpu_fpsimd_context();
-	preempt_enable();
+	local_bh_enable();
 }
 
 static bool have_cpu_fpsimd_context(void)
-- 
2.30.1



* [PATCH v2 4/9] crypto: aead - disallow en/decrypt for non-task or non-softirq context
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 5/9] crypto: skcipher " Ard Biesheuvel
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

In order to ensure that kernel mode SIMD routines will not need a scalar
fallback if they run with softirqs disabled, disallow any use of the
AEAD encrypt and decrypt routines from outside of task or softirq context.
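
For illustration, a hypothetical caller that the new check rejects (request
setup assumed to have happened elsewhere):

  	static irqreturn_t my_irq_handler(int irq, void *data)
  	{
  		struct aead_request *req = data;

  		/*
  		 * Hard IRQ context: in_task() and in_serving_softirq() are
  		 * both false, so a synchronous AEAD now warns once and
  		 * fails with -EBUSY instead of running while the SIMD unit
  		 * may be unusable.
  		 */
  		WARN_ON(crypto_aead_encrypt(req) != -EBUSY);
  		return IRQ_HANDLED;
  	}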

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 crypto/aead.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/crypto/aead.c b/crypto/aead.c
index 16991095270d..b5304b3d3314 100644
--- a/crypto/aead.c
+++ b/crypto/aead.c
@@ -87,6 +87,11 @@ int crypto_aead_encrypt(struct aead_request *req)
 	unsigned int cryptlen = req->cryptlen;
 	int ret;
 
+	if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+	    WARN_ONCE(!in_task() && !in_serving_softirq(),
+		      "synchronous call from invalid context\n"))
+		return -EBUSY;
+
 	crypto_stats_get(alg);
 	if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
 		ret = -ENOKEY;
@@ -104,6 +109,11 @@ int crypto_aead_decrypt(struct aead_request *req)
 	unsigned int cryptlen = req->cryptlen;
 	int ret;
 
+	if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+	    WARN_ONCE(!in_task() && !in_serving_softirq(),
+		      "synchronous call from invalid context\n"))
+		return -EBUSY;
+
 	crypto_stats_get(alg);
 	if (crypto_aead_get_flags(aead) & CRYPTO_TFM_NEED_KEY)
 		ret = -ENOKEY;
-- 
2.30.1



* [PATCH v2 5/9] crypto: skcipher - disallow en/decrypt for non-task or non-softirq context
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 4/9] crypto: aead - disallow en/decrypt for non-task or non-softirq context Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 6/9] crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path Ard Biesheuvel
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

In order to ensure that kernel mode SIMD routines will not need a scalar
fallback if they run with softirqs disabled, disallow any use of the
skcipher encrypt and decrypt routines from outside of task or softirq
context.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 crypto/skcipher.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/crypto/skcipher.c b/crypto/skcipher.c
index a15376245416..d1a2ba6eacbe 100644
--- a/crypto/skcipher.c
+++ b/crypto/skcipher.c
@@ -623,6 +623,11 @@ int crypto_skcipher_encrypt(struct skcipher_request *req)
 	unsigned int cryptlen = req->cryptlen;
 	int ret;
 
+	if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+	    WARN_ONCE(!in_task() && !in_serving_softirq(),
+		      "synchronous call from invalid context\n"))
+		return -EBUSY;
+
 	crypto_stats_get(alg);
 	if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
 		ret = -ENOKEY;
@@ -640,6 +645,11 @@ int crypto_skcipher_decrypt(struct skcipher_request *req)
 	unsigned int cryptlen = req->cryptlen;
 	int ret;
 
+	if (!(alg->cra_flags & CRYPTO_ALG_ASYNC) &&
+	    WARN_ONCE(!in_task() && !in_serving_softirq(),
+		      "synchronous call from invalid context\n"))
+		return -EBUSY;
+
 	crypto_stats_get(alg);
 	if (crypto_skcipher_get_flags(tfm) & CRYPTO_TFM_NEED_KEY)
 		ret = -ENOKEY;
-- 
2.30.1



* [PATCH v2 6/9] crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 5/9] crypto: skcipher " Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 7/9] crypto: arm64/aes-ccm " Ard Biesheuvel
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

Now that kernel mode SIMD is guaranteed to be available when executing
in task or softirq context, we no longer need scalar fallbacks to use
when the NEON is unavailable. So get rid of them.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/crypto/ghash-ce-glue.c | 209 +++++---------------
 1 file changed, 51 insertions(+), 158 deletions(-)

diff --git a/arch/arm64/crypto/ghash-ce-glue.c b/arch/arm64/crypto/ghash-ce-glue.c
index 720cd3a58da3..15794fe21a0b 100644
--- a/arch/arm64/crypto/ghash-ce-glue.c
+++ b/arch/arm64/crypto/ghash-ce-glue.c
@@ -362,84 +362,36 @@ static int gcm_encrypt(struct aead_request *req)
 
 	err = skcipher_walk_aead_encrypt(&walk, req, false);
 
-	if (likely(crypto_simd_usable())) {
-		do {
-			const u8 *src = walk.src.virt.addr;
-			u8 *dst = walk.dst.virt.addr;
-			int nbytes = walk.nbytes;
-
-			tag = (u8 *)&lengths;
-
-			if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
-				src = dst = memcpy(buf + sizeof(buf) - nbytes,
-						   src, nbytes);
-			} else if (nbytes < walk.total) {
-				nbytes &= ~(AES_BLOCK_SIZE - 1);
-				tag = NULL;
-			}
-
-			kernel_neon_begin();
-			pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
-					  dg, iv, ctx->aes_key.key_enc, nrounds,
-					  tag);
-			kernel_neon_end();
-
-			if (unlikely(!nbytes))
-				break;
-
-			if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
-				memcpy(walk.dst.virt.addr,
-				       buf + sizeof(buf) - nbytes, nbytes);
-
-			err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
-		} while (walk.nbytes);
-	} else {
-		while (walk.nbytes >= AES_BLOCK_SIZE) {
-			int blocks = walk.nbytes / AES_BLOCK_SIZE;
-			const u8 *src = walk.src.virt.addr;
-			u8 *dst = walk.dst.virt.addr;
-			int remaining = blocks;
-
-			do {
-				aes_encrypt(&ctx->aes_key, buf, iv);
-				crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE);
-				crypto_inc(iv, AES_BLOCK_SIZE);
-
-				dst += AES_BLOCK_SIZE;
-				src += AES_BLOCK_SIZE;
-			} while (--remaining > 0);
-
-			ghash_do_update(blocks, dg, walk.dst.virt.addr,
-					&ctx->ghash_key, NULL);
-
-			err = skcipher_walk_done(&walk,
-						 walk.nbytes % AES_BLOCK_SIZE);
-		}
-
-		/* handle the tail */
-		if (walk.nbytes) {
-			aes_encrypt(&ctx->aes_key, buf, iv);
+	do {
+		const u8 *src = walk.src.virt.addr;
+		u8 *dst = walk.dst.virt.addr;
+		int nbytes = walk.nbytes;
 
-			crypto_xor_cpy(walk.dst.virt.addr, walk.src.virt.addr,
-				       buf, walk.nbytes);
+		tag = (u8 *)&lengths;
 
-			memcpy(buf, walk.dst.virt.addr, walk.nbytes);
-			memset(buf + walk.nbytes, 0, sizeof(buf) - walk.nbytes);
+		if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
+			src = dst = memcpy(buf + sizeof(buf) - nbytes,
+					   src, nbytes);
+		} else if (nbytes < walk.total) {
+			nbytes &= ~(AES_BLOCK_SIZE - 1);
+			tag = NULL;
 		}
 
-		tag = (u8 *)&lengths;
-		ghash_do_update(1, dg, tag, &ctx->ghash_key,
-				walk.nbytes ? buf : NULL);
+		kernel_neon_begin();
+		pmull_gcm_encrypt(nbytes, dst, src, ctx->ghash_key.h,
+				  dg, iv, ctx->aes_key.key_enc, nrounds,
+				  tag);
+		kernel_neon_end();
 
-		if (walk.nbytes)
-			err = skcipher_walk_done(&walk, 0);
+		if (unlikely(!nbytes))
+			break;
 
-		put_unaligned_be64(dg[1], tag);
-		put_unaligned_be64(dg[0], tag + 8);
-		put_unaligned_be32(1, iv + GCM_IV_SIZE);
-		aes_encrypt(&ctx->aes_key, iv, iv);
-		crypto_xor(tag, iv, AES_BLOCK_SIZE);
-	}
+		if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
+			memcpy(walk.dst.virt.addr,
+			       buf + sizeof(buf) - nbytes, nbytes);
+
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	} while (walk.nbytes);
 
 	if (err)
 		return err;
@@ -464,6 +416,7 @@ static int gcm_decrypt(struct aead_request *req)
 	u64 dg[2] = {};
 	be128 lengths;
 	u8 *tag;
+	int ret;
 	int err;
 
 	lengths.a = cpu_to_be64(req->assoclen * 8);
@@ -481,101 +434,41 @@ static int gcm_decrypt(struct aead_request *req)
 
 	err = skcipher_walk_aead_decrypt(&walk, req, false);
 
-	if (likely(crypto_simd_usable())) {
-		int ret;
-
-		do {
-			const u8 *src = walk.src.virt.addr;
-			u8 *dst = walk.dst.virt.addr;
-			int nbytes = walk.nbytes;
-
-			tag = (u8 *)&lengths;
-
-			if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
-				src = dst = memcpy(buf + sizeof(buf) - nbytes,
-						   src, nbytes);
-			} else if (nbytes < walk.total) {
-				nbytes &= ~(AES_BLOCK_SIZE - 1);
-				tag = NULL;
-			}
-
-			kernel_neon_begin();
-			ret = pmull_gcm_decrypt(nbytes, dst, src,
-						ctx->ghash_key.h,
-						dg, iv, ctx->aes_key.key_enc,
-						nrounds, tag, otag, authsize);
-			kernel_neon_end();
-
-			if (unlikely(!nbytes))
-				break;
-
-			if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
-				memcpy(walk.dst.virt.addr,
-				       buf + sizeof(buf) - nbytes, nbytes);
-
-			err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
-		} while (walk.nbytes);
-
-		if (err)
-			return err;
-		if (ret)
-			return -EBADMSG;
-	} else {
-		while (walk.nbytes >= AES_BLOCK_SIZE) {
-			int blocks = walk.nbytes / AES_BLOCK_SIZE;
-			const u8 *src = walk.src.virt.addr;
-			u8 *dst = walk.dst.virt.addr;
-
-			ghash_do_update(blocks, dg, walk.src.virt.addr,
-					&ctx->ghash_key, NULL);
-
-			do {
-				aes_encrypt(&ctx->aes_key, buf, iv);
-				crypto_xor_cpy(dst, src, buf, AES_BLOCK_SIZE);
-				crypto_inc(iv, AES_BLOCK_SIZE);
-
-				dst += AES_BLOCK_SIZE;
-				src += AES_BLOCK_SIZE;
-			} while (--blocks > 0);
+	do {
+		const u8 *src = walk.src.virt.addr;
+		u8 *dst = walk.dst.virt.addr;
+		int nbytes = walk.nbytes;
 
-			err = skcipher_walk_done(&walk,
-						 walk.nbytes % AES_BLOCK_SIZE);
-		}
+		tag = (u8 *)&lengths;
 
-		/* handle the tail */
-		if (walk.nbytes) {
-			memcpy(buf, walk.src.virt.addr, walk.nbytes);
-			memset(buf + walk.nbytes, 0, sizeof(buf) - walk.nbytes);
+		if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE)) {
+			src = dst = memcpy(buf + sizeof(buf) - nbytes,
+					   src, nbytes);
+		} else if (nbytes < walk.total) {
+			nbytes &= ~(AES_BLOCK_SIZE - 1);
+			tag = NULL;
 		}
 
-		tag = (u8 *)&lengths;
-		ghash_do_update(1, dg, tag, &ctx->ghash_key,
-				walk.nbytes ? buf : NULL);
-
-		if (walk.nbytes) {
-			aes_encrypt(&ctx->aes_key, buf, iv);
+		kernel_neon_begin();
+		ret = pmull_gcm_decrypt(nbytes, dst, src, ctx->ghash_key.h,
+					dg, iv, ctx->aes_key.key_enc,
+					nrounds, tag, otag, authsize);
+		kernel_neon_end();
 
-			crypto_xor_cpy(walk.dst.virt.addr, walk.src.virt.addr,
-				       buf, walk.nbytes);
+		if (unlikely(!nbytes))
+			break;
 
-			err = skcipher_walk_done(&walk, 0);
-		}
+		if (unlikely(nbytes > 0 && nbytes < AES_BLOCK_SIZE))
+			memcpy(walk.dst.virt.addr,
+			       buf + sizeof(buf) - nbytes, nbytes);
 
-		if (err)
-			return err;
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
+	} while (walk.nbytes);
 
-		put_unaligned_be64(dg[1], tag);
-		put_unaligned_be64(dg[0], tag + 8);
-		put_unaligned_be32(1, iv + GCM_IV_SIZE);
-		aes_encrypt(&ctx->aes_key, iv, iv);
-		crypto_xor(tag, iv, AES_BLOCK_SIZE);
+	if (err)
+		return err;
 
-		if (crypto_memneq(tag, otag, authsize)) {
-			memzero_explicit(tag, AES_BLOCK_SIZE);
-			return -EBADMSG;
-		}
-	}
-	return 0;
+	return ret ? -EBADMSG : 0;
 }
 
 static struct aead_alg gcm_aes_alg = {
-- 
2.30.1



* [PATCH v2 7/9] crypto: arm64/aes-ccm - remove non-SIMD fallback path
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 6/9] crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 8/9] crypto: arm64/aes-ce - stop using SIMD helper for skciphers Ard Biesheuvel
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

AES/CCM on arm64 is implemented as a synchronous AEAD, and so it is
guaranteed by the API that it is only invoked in task or softirq
context. Since softirqs are now only handled when the SIMD is not
being used in the task context that was interrupted to service the
softirq, we no longer need a fallback path. Let's remove it.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/crypto/aes-ce-ccm-glue.c | 151 ++++----------------
 1 file changed, 30 insertions(+), 121 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index f6d19b0dc893..e6a7243825a2 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -99,36 +99,10 @@ static int ccm_init_mac(struct aead_request *req, u8 maciv[], u32 msglen)
 static void ccm_update_mac(struct crypto_aes_ctx *key, u8 mac[], u8 const in[],
 			   u32 abytes, u32 *macp)
 {
-	if (crypto_simd_usable()) {
-		kernel_neon_begin();
-		ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
-				     num_rounds(key));
-		kernel_neon_end();
-	} else {
-		if (*macp > 0 && *macp < AES_BLOCK_SIZE) {
-			int added = min(abytes, AES_BLOCK_SIZE - *macp);
-
-			crypto_xor(&mac[*macp], in, added);
-
-			*macp += added;
-			in += added;
-			abytes -= added;
-		}
-
-		while (abytes >= AES_BLOCK_SIZE) {
-			aes_encrypt(key, mac, mac);
-			crypto_xor(mac, in, AES_BLOCK_SIZE);
-
-			in += AES_BLOCK_SIZE;
-			abytes -= AES_BLOCK_SIZE;
-		}
-
-		if (abytes > 0) {
-			aes_encrypt(key, mac, mac);
-			crypto_xor(mac, in, abytes);
-			*macp = abytes;
-		}
-	}
+	kernel_neon_begin();
+	ce_aes_ccm_auth_data(mac, in, abytes, macp, key->key_enc,
+			     num_rounds(key));
+	kernel_neon_end();
 }
 
 static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
@@ -171,54 +145,6 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 	} while (len);
 }
 
-static int ccm_crypt_fallback(struct skcipher_walk *walk, u8 mac[], u8 iv0[],
-			      struct crypto_aes_ctx *ctx, bool enc)
-{
-	u8 buf[AES_BLOCK_SIZE];
-	int err = 0;
-
-	while (walk->nbytes) {
-		int blocks = walk->nbytes / AES_BLOCK_SIZE;
-		u32 tail = walk->nbytes % AES_BLOCK_SIZE;
-		u8 *dst = walk->dst.virt.addr;
-		u8 *src = walk->src.virt.addr;
-		u32 nbytes = walk->nbytes;
-
-		if (nbytes == walk->total && tail > 0) {
-			blocks++;
-			tail = 0;
-		}
-
-		do {
-			u32 bsize = AES_BLOCK_SIZE;
-
-			if (nbytes < AES_BLOCK_SIZE)
-				bsize = nbytes;
-
-			crypto_inc(walk->iv, AES_BLOCK_SIZE);
-			aes_encrypt(ctx, buf, walk->iv);
-			aes_encrypt(ctx, mac, mac);
-			if (enc)
-				crypto_xor(mac, src, bsize);
-			crypto_xor_cpy(dst, src, buf, bsize);
-			if (!enc)
-				crypto_xor(mac, dst, bsize);
-			dst += bsize;
-			src += bsize;
-			nbytes -= bsize;
-		} while (--blocks);
-
-		err = skcipher_walk_done(walk, tail);
-	}
-
-	if (!err) {
-		aes_encrypt(ctx, buf, iv0);
-		aes_encrypt(ctx, mac, mac);
-		crypto_xor(mac, buf, AES_BLOCK_SIZE);
-	}
-	return err;
-}
-
 static int ccm_encrypt(struct aead_request *req)
 {
 	struct crypto_aead *aead = crypto_aead_reqtfm(req);
@@ -241,30 +167,22 @@ static int ccm_encrypt(struct aead_request *req)
 
 	err = skcipher_walk_aead_encrypt(&walk, req, false);
 
-	if (crypto_simd_usable()) {
-		while (walk.nbytes) {
-			u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+	while (walk.nbytes) {
+		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
 
-			if (walk.nbytes == walk.total)
-				tail = 0;
+		if (walk.nbytes == walk.total)
+			tail = 0;
 
-			kernel_neon_begin();
-			ce_aes_ccm_encrypt(walk.dst.virt.addr,
-					   walk.src.virt.addr,
-					   walk.nbytes - tail, ctx->key_enc,
-					   num_rounds(ctx), mac, walk.iv);
-			kernel_neon_end();
+		kernel_neon_begin();
+		ce_aes_ccm_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
+				   walk.nbytes - tail, ctx->key_enc,
+				   num_rounds(ctx), mac, walk.iv);
 
-			err = skcipher_walk_done(&walk, tail);
-		}
-		if (!err) {
-			kernel_neon_begin();
-			ce_aes_ccm_final(mac, buf, ctx->key_enc,
-					 num_rounds(ctx));
-			kernel_neon_end();
-		}
-	} else {
-		err = ccm_crypt_fallback(&walk, mac, buf, ctx, true);
+		if (walk.nbytes == walk.total)
+			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+		kernel_neon_end();
+
+		err = skcipher_walk_done(&walk, tail);
 	}
 	if (err)
 		return err;
@@ -299,32 +217,23 @@ static int ccm_decrypt(struct aead_request *req)
 
 	err = skcipher_walk_aead_decrypt(&walk, req, false);
 
-	if (crypto_simd_usable()) {
-		while (walk.nbytes) {
-			u32 tail = walk.nbytes % AES_BLOCK_SIZE;
+	while (walk.nbytes) {
+		u32 tail = walk.nbytes % AES_BLOCK_SIZE;
 
-			if (walk.nbytes == walk.total)
-				tail = 0;
+		if (walk.nbytes == walk.total)
+			tail = 0;
 
-			kernel_neon_begin();
-			ce_aes_ccm_decrypt(walk.dst.virt.addr,
-					   walk.src.virt.addr,
-					   walk.nbytes - tail, ctx->key_enc,
-					   num_rounds(ctx), mac, walk.iv);
-			kernel_neon_end();
+		kernel_neon_begin();
+		ce_aes_ccm_decrypt(walk.dst.virt.addr, walk.src.virt.addr,
+				   walk.nbytes - tail, ctx->key_enc,
+				   num_rounds(ctx), mac, walk.iv);
 
-			err = skcipher_walk_done(&walk, tail);
-		}
-		if (!err) {
-			kernel_neon_begin();
-			ce_aes_ccm_final(mac, buf, ctx->key_enc,
-					 num_rounds(ctx));
-			kernel_neon_end();
-		}
-	} else {
-		err = ccm_crypt_fallback(&walk, mac, buf, ctx, false);
-	}
+		if (walk.nbytes == walk.total)
+			ce_aes_ccm_final(mac, buf, ctx->key_enc, num_rounds(ctx));
+		kernel_neon_end();
 
+		err = skcipher_walk_done(&walk, tail);
+	}
 	if (err)
 		return err;
 
-- 
2.30.1



* [PATCH v2 8/9] crypto: arm64/aes-ce - stop using SIMD helper for skciphers
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 7/9] crypto: arm64/aes-ccm " Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-03-02  9:01 ` [PATCH v2 9/9] crypto: arm64/aes-neonbs " Ard Biesheuvel
  2021-04-12 13:11 ` (subset) [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Catalin Marinas
  9 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

Calls into the skcipher API can only occur from contexts where the SIMD
unit is available, so there is no need for the SIMD helper.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/crypto/Kconfig    |   2 -
 arch/arm64/crypto/aes-glue.c | 102 +++-----------------
 2 files changed, 13 insertions(+), 91 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index b8eb0453123d..1a4374ed70fa 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -89,7 +89,6 @@ config CRYPTO_AES_ARM64_CE_BLK
 	select CRYPTO_SKCIPHER
 	select CRYPTO_AES_ARM64_CE
 	select CRYPTO_AES_ARM64
-	select CRYPTO_SIMD
 
 config CRYPTO_AES_ARM64_NEON_BLK
 	tristate "AES in ECB/CBC/CTR/XTS modes using NEON instructions"
@@ -97,7 +96,6 @@ config CRYPTO_AES_ARM64_NEON_BLK
 	select CRYPTO_SKCIPHER
 	select CRYPTO_AES_ARM64
 	select CRYPTO_LIB_AES
-	select CRYPTO_SIMD
 
 config CRYPTO_CHACHA20_NEON
 	tristate "ChaCha20, XChaCha20, and XChaCha12 stream ciphers using NEON instructions"
diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 17e735931a0c..30b7cc6a7079 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -444,7 +444,7 @@ static int __maybe_unused essiv_cbc_decrypt(struct skcipher_request *req)
 	return err ?: cbc_decrypt_walk(req, &walk);
 }
 
-static int ctr_encrypt(struct skcipher_request *req)
+static int __maybe_unused ctr_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
 	struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
@@ -485,29 +485,6 @@ static int ctr_encrypt(struct skcipher_request *req)
 	return err;
 }
 
-static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
-{
-	const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm);
-	unsigned long flags;
-
-	/*
-	 * Temporarily disable interrupts to avoid races where
-	 * cachelines are evicted when the CPU is interrupted
-	 * to do something else.
-	 */
-	local_irq_save(flags);
-	aes_encrypt(ctx, dst, src);
-	local_irq_restore(flags);
-}
-
-static int __maybe_unused ctr_encrypt_sync(struct skcipher_request *req)
-{
-	if (!crypto_simd_usable())
-		return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
-
-	return ctr_encrypt(req);
-}
-
 static int __maybe_unused xts_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -656,10 +633,9 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
 static struct skcipher_alg aes_algs[] = { {
 #if defined(USE_V8_CRYPTO_EXTENSIONS) || !IS_ENABLED(CONFIG_CRYPTO_AES_ARM64_BS)
 	.base = {
-		.cra_name		= "__ecb(aes)",
-		.cra_driver_name	= "__ecb-aes-" MODE,
+		.cra_name		= "ecb(aes)",
+		.cra_driver_name	= "ecb-aes-" MODE,
 		.cra_priority		= PRIO,
-		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
 		.cra_module		= THIS_MODULE,
@@ -671,10 +647,9 @@ static struct skcipher_alg aes_algs[] = { {
 	.decrypt	= ecb_decrypt,
 }, {
 	.base = {
-		.cra_name		= "__cbc(aes)",
-		.cra_driver_name	= "__cbc-aes-" MODE,
+		.cra_name		= "cbc(aes)",
+		.cra_driver_name	= "cbc-aes-" MODE,
 		.cra_priority		= PRIO,
-		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
 		.cra_module		= THIS_MODULE,
@@ -687,10 +662,9 @@ static struct skcipher_alg aes_algs[] = { {
 	.decrypt	= cbc_decrypt,
 }, {
 	.base = {
-		.cra_name		= "__ctr(aes)",
-		.cra_driver_name	= "__ctr-aes-" MODE,
+		.cra_name		= "ctr(aes)",
+		.cra_driver_name	= "ctr-aes-" MODE,
 		.cra_priority		= PRIO,
-		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= 1,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
 		.cra_module		= THIS_MODULE,
@@ -704,26 +678,9 @@ static struct skcipher_alg aes_algs[] = { {
 	.decrypt	= ctr_encrypt,
 }, {
 	.base = {
-		.cra_name		= "ctr(aes)",
-		.cra_driver_name	= "ctr-aes-" MODE,
-		.cra_priority		= PRIO - 1,
-		.cra_blocksize		= 1,
-		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
-		.cra_module		= THIS_MODULE,
-	},
-	.min_keysize	= AES_MIN_KEY_SIZE,
-	.max_keysize	= AES_MAX_KEY_SIZE,
-	.ivsize		= AES_BLOCK_SIZE,
-	.chunksize	= AES_BLOCK_SIZE,
-	.setkey		= skcipher_aes_setkey,
-	.encrypt	= ctr_encrypt_sync,
-	.decrypt	= ctr_encrypt_sync,
-}, {
-	.base = {
-		.cra_name		= "__xts(aes)",
-		.cra_driver_name	= "__xts-aes-" MODE,
+		.cra_name		= "xts(aes)",
+		.cra_driver_name	= "xts-aes-" MODE,
 		.cra_priority		= PRIO,
-		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_xts_ctx),
 		.cra_module		= THIS_MODULE,
@@ -738,10 +695,9 @@ static struct skcipher_alg aes_algs[] = { {
 }, {
 #endif
 	.base = {
-		.cra_name		= "__cts(cbc(aes))",
-		.cra_driver_name	= "__cts-cbc-aes-" MODE,
+		.cra_name		= "cts(cbc(aes))",
+		.cra_driver_name	= "cts-cbc-aes-" MODE,
 		.cra_priority		= PRIO,
-		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
 		.cra_module		= THIS_MODULE,
@@ -755,10 +711,9 @@ static struct skcipher_alg aes_algs[] = { {
 	.decrypt	= cts_cbc_decrypt,
 }, {
 	.base = {
-		.cra_name		= "__essiv(cbc(aes),sha256)",
-		.cra_driver_name	= "__essiv-cbc-aes-sha256-" MODE,
+		.cra_name		= "essiv(cbc(aes),sha256)",
+		.cra_driver_name	= "essiv-cbc-aes-sha256-" MODE,
 		.cra_priority		= PRIO + 1,
-		.cra_flags		= CRYPTO_ALG_INTERNAL,
 		.cra_blocksize		= AES_BLOCK_SIZE,
 		.cra_ctxsize		= sizeof(struct crypto_aes_essiv_cbc_ctx),
 		.cra_module		= THIS_MODULE,
@@ -997,28 +952,15 @@ static struct shash_alg mac_algs[] = { {
 	.descsize		= sizeof(struct mac_desc_ctx),
 } };
 
-static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)];
-
 static void aes_exit(void)
 {
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(aes_simd_algs); i++)
-		if (aes_simd_algs[i])
-			simd_skcipher_free(aes_simd_algs[i]);
-
 	crypto_unregister_shashes(mac_algs, ARRAY_SIZE(mac_algs));
 	crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
 }
 
 static int __init aes_init(void)
 {
-	struct simd_skcipher_alg *simd;
-	const char *basename;
-	const char *algname;
-	const char *drvname;
 	int err;
-	int i;
 
 	err = crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
 	if (err)
@@ -1028,26 +970,8 @@ static int __init aes_init(void)
 	if (err)
 		goto unregister_ciphers;
 
-	for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
-		if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
-			continue;
-
-		algname = aes_algs[i].base.cra_name + 2;
-		drvname = aes_algs[i].base.cra_driver_name + 2;
-		basename = aes_algs[i].base.cra_driver_name;
-		simd = simd_skcipher_create_compat(algname, drvname, basename);
-		err = PTR_ERR(simd);
-		if (IS_ERR(simd))
-			goto unregister_simds;
-
-		aes_simd_algs[i] = simd;
-	}
-
 	return 0;
 
-unregister_simds:
-	aes_exit();
-	return err;
 unregister_ciphers:
 	crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
 	return err;
-- 
2.30.1



* [PATCH v2 9/9] crypto: arm64/aes-neonbs - stop using SIMD helper for skciphers
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (7 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 8/9] crypto: arm64/aes-ce - stop using SIMD helper for skciphers Ard Biesheuvel
@ 2021-03-02  9:01 ` Ard Biesheuvel
  2021-04-12 13:11 ` (subset) [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Catalin Marinas
  9 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-03-02  9:01 UTC (permalink / raw)
  To: linux-crypto
  Cc: linux-arm-kernel, Ard Biesheuvel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Will Deacon, Catalin Marinas,
	Thomas Gleixner, Peter Zijlstra, Sebastian Andrzej Siewior,
	Ingo Molnar, Andy Lutomirski

Calls into the skcipher API can only occur from contexts where the SIMD
unit is available, so there is no need for the SIMD helper.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/crypto/Kconfig           |   1 -
 arch/arm64/crypto/aes-neonbs-glue.c | 122 ++------------------
 2 files changed, 9 insertions(+), 114 deletions(-)

diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
index 1a4374ed70fa..dbbbc9a0a9b4 100644
--- a/arch/arm64/crypto/Kconfig
+++ b/arch/arm64/crypto/Kconfig
@@ -122,6 +122,5 @@ config CRYPTO_AES_ARM64_BS
 	select CRYPTO_AES_ARM64_NEON_BLK
 	select CRYPTO_AES_ARM64
 	select CRYPTO_LIB_AES
-	select CRYPTO_SIMD
 
 endif
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index fb507d569922..8df6ad8cb09d 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -63,11 +63,6 @@ struct aesbs_cbc_ctx {
 	u32			enc[AES_MAX_KEYLENGTH_U32];
 };
 
-struct aesbs_ctr_ctx {
-	struct aesbs_ctx	key;		/* must be first member */
-	struct crypto_aes_ctx	fallback;
-};
-
 struct aesbs_xts_ctx {
 	struct aesbs_ctx	key;
 	u32			twkey[AES_MAX_KEYLENGTH_U32];
@@ -207,25 +202,6 @@ static int cbc_decrypt(struct skcipher_request *req)
 	return err;
 }
 
-static int aesbs_ctr_setkey_sync(struct crypto_skcipher *tfm, const u8 *in_key,
-				 unsigned int key_len)
-{
-	struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
-	int err;
-
-	err = aes_expandkey(&ctx->fallback, in_key, key_len);
-	if (err)
-		return err;
-
-	ctx->key.rounds = 6 + key_len / 4;
-
-	kernel_neon_begin();
-	aesbs_convert_key(ctx->key.rk, ctx->fallback.key_enc, ctx->key.rounds);
-	kernel_neon_end();
-
-	return 0;
-}
-
 static int ctr_encrypt(struct skcipher_request *req)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
@@ -292,29 +268,6 @@ static int aesbs_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key,
 	return aesbs_setkey(tfm, in_key, key_len);
 }
 
-static void ctr_encrypt_one(struct crypto_skcipher *tfm, const u8 *src, u8 *dst)
-{
-	struct aesbs_ctr_ctx *ctx = crypto_skcipher_ctx(tfm);
-	unsigned long flags;
-
-	/*
-	 * Temporarily disable interrupts to avoid races where
-	 * cachelines are evicted when the CPU is interrupted
-	 * to do something else.
-	 */
-	local_irq_save(flags);
-	aes_encrypt(&ctx->fallback, dst, src);
-	local_irq_restore(flags);
-}
-
-static int ctr_encrypt_sync(struct skcipher_request *req)
-{
-	if (!crypto_simd_usable())
-		return crypto_ctr_encrypt_walk(req, ctr_encrypt_one);
-
-	return ctr_encrypt(req);
-}
-
 static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 		       void (*fn)(u8 out[], u8 const in[], u8 const rk[],
 				  int rounds, int blocks, u8 iv[]))
@@ -431,13 +384,12 @@ static int xts_decrypt(struct skcipher_request *req)
 }
 
 static struct skcipher_alg aes_algs[] = { {
-	.base.cra_name		= "__ecb(aes)",
-	.base.cra_driver_name	= "__ecb-aes-neonbs",
+	.base.cra_name		= "ecb(aes)",
+	.base.cra_driver_name	= "ecb-aes-neonbs",
 	.base.cra_priority	= 250,
 	.base.cra_blocksize	= AES_BLOCK_SIZE,
 	.base.cra_ctxsize	= sizeof(struct aesbs_ctx),
 	.base.cra_module	= THIS_MODULE,
-	.base.cra_flags		= CRYPTO_ALG_INTERNAL,
 
 	.min_keysize		= AES_MIN_KEY_SIZE,
 	.max_keysize		= AES_MAX_KEY_SIZE,
@@ -446,13 +398,12 @@ static struct skcipher_alg aes_algs[] = { {
 	.encrypt		= ecb_encrypt,
 	.decrypt		= ecb_decrypt,
 }, {
-	.base.cra_name		= "__cbc(aes)",
-	.base.cra_driver_name	= "__cbc-aes-neonbs",
+	.base.cra_name		= "cbc(aes)",
+	.base.cra_driver_name	= "cbc-aes-neonbs",
 	.base.cra_priority	= 250,
 	.base.cra_blocksize	= AES_BLOCK_SIZE,
 	.base.cra_ctxsize	= sizeof(struct aesbs_cbc_ctx),
 	.base.cra_module	= THIS_MODULE,
-	.base.cra_flags		= CRYPTO_ALG_INTERNAL,
 
 	.min_keysize		= AES_MIN_KEY_SIZE,
 	.max_keysize		= AES_MAX_KEY_SIZE,
@@ -462,13 +413,12 @@ static struct skcipher_alg aes_algs[] = { {
 	.encrypt		= cbc_encrypt,
 	.decrypt		= cbc_decrypt,
 }, {
-	.base.cra_name		= "__ctr(aes)",
-	.base.cra_driver_name	= "__ctr-aes-neonbs",
+	.base.cra_name		= "ctr(aes)",
+	.base.cra_driver_name	= "ctr-aes-neonbs",
 	.base.cra_priority	= 250,
 	.base.cra_blocksize	= 1,
 	.base.cra_ctxsize	= sizeof(struct aesbs_ctx),
 	.base.cra_module	= THIS_MODULE,
-	.base.cra_flags		= CRYPTO_ALG_INTERNAL,
 
 	.min_keysize		= AES_MIN_KEY_SIZE,
 	.max_keysize		= AES_MAX_KEY_SIZE,
@@ -479,29 +429,12 @@ static struct skcipher_alg aes_algs[] = { {
 	.encrypt		= ctr_encrypt,
 	.decrypt		= ctr_encrypt,
 }, {
-	.base.cra_name		= "ctr(aes)",
-	.base.cra_driver_name	= "ctr-aes-neonbs",
-	.base.cra_priority	= 250 - 1,
-	.base.cra_blocksize	= 1,
-	.base.cra_ctxsize	= sizeof(struct aesbs_ctr_ctx),
-	.base.cra_module	= THIS_MODULE,
-
-	.min_keysize		= AES_MIN_KEY_SIZE,
-	.max_keysize		= AES_MAX_KEY_SIZE,
-	.chunksize		= AES_BLOCK_SIZE,
-	.walksize		= 8 * AES_BLOCK_SIZE,
-	.ivsize			= AES_BLOCK_SIZE,
-	.setkey			= aesbs_ctr_setkey_sync,
-	.encrypt		= ctr_encrypt_sync,
-	.decrypt		= ctr_encrypt_sync,
-}, {
-	.base.cra_name		= "__xts(aes)",
-	.base.cra_driver_name	= "__xts-aes-neonbs",
+	.base.cra_name		= "xts(aes)",
+	.base.cra_driver_name	= "xts-aes-neonbs",
 	.base.cra_priority	= 250,
 	.base.cra_blocksize	= AES_BLOCK_SIZE,
 	.base.cra_ctxsize	= sizeof(struct aesbs_xts_ctx),
 	.base.cra_module	= THIS_MODULE,
-	.base.cra_flags		= CRYPTO_ALG_INTERNAL,
 
 	.min_keysize		= 2 * AES_MIN_KEY_SIZE,
 	.max_keysize		= 2 * AES_MAX_KEY_SIZE,
@@ -512,54 +445,17 @@ static struct skcipher_alg aes_algs[] = { {
 	.decrypt		= xts_decrypt,
 } };
 
-static struct simd_skcipher_alg *aes_simd_algs[ARRAY_SIZE(aes_algs)];
-
 static void aes_exit(void)
 {
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(aes_simd_algs); i++)
-		if (aes_simd_algs[i])
-			simd_skcipher_free(aes_simd_algs[i]);
-
 	crypto_unregister_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
 }
 
 static int __init aes_init(void)
 {
-	struct simd_skcipher_alg *simd;
-	const char *basename;
-	const char *algname;
-	const char *drvname;
-	int err;
-	int i;
-
 	if (!cpu_have_named_feature(ASIMD))
 		return -ENODEV;
 
-	err = crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
-	if (err)
-		return err;
-
-	for (i = 0; i < ARRAY_SIZE(aes_algs); i++) {
-		if (!(aes_algs[i].base.cra_flags & CRYPTO_ALG_INTERNAL))
-			continue;
-
-		algname = aes_algs[i].base.cra_name + 2;
-		drvname = aes_algs[i].base.cra_driver_name + 2;
-		basename = aes_algs[i].base.cra_driver_name;
-		simd = simd_skcipher_create_compat(algname, drvname, basename);
-		err = PTR_ERR(simd);
-		if (IS_ERR(simd))
-			goto unregister_simds;
-
-		aes_simd_algs[i] = simd;
-	}
-	return 0;
-
-unregister_simds:
-	aes_exit();
-	return err;
+	return crypto_register_skciphers(aes_algs, ARRAY_SIZE(aes_algs));
 }
 
 module_init(aes_init);
-- 
2.30.1



* Re: [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros
  2021-03-02  9:01 ` [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros Ard Biesheuvel
@ 2021-03-30  9:52   ` Will Deacon
  2021-04-12  8:39     ` Ard Biesheuvel
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2021-03-30  9:52 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-crypto, linux-arm-kernel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Catalin Marinas, Thomas Gleixner,
	Peter Zijlstra, Sebastian Andrzej Siewior, Ingo Molnar,
	Andy Lutomirski

On Tue, Mar 02, 2021 at 10:01:10AM +0100, Ard Biesheuvel wrote:
> The users of the conditional NEON yield macros have all been switched to
> the simplified cond_yield macro, and so the NEON specific ones can be
> removed.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/include/asm/assembler.h | 70 --------------------
>  1 file changed, 70 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will


* Re: [PATCH v2 2/9] arm64: assembler: introduce wxN aliases for wN registers
  2021-03-02  9:01 ` [PATCH v2 2/9] arm64: assembler: introduce wxN aliases for wN registers Ard Biesheuvel
@ 2021-03-30  9:59   ` Will Deacon
  0 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2021-03-30  9:59 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-crypto, linux-arm-kernel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Catalin Marinas, Thomas Gleixner,
	Peter Zijlstra, Sebastian Andrzej Siewior, Ingo Molnar,
	Andy Lutomirski

On Tue, Mar 02, 2021 at 10:01:11AM +0100, Ard Biesheuvel wrote:
> The AArch64 asm syntax has this slightly tedious property that the names
> used in mnemonics to refer to registers depend on whether the opcode in
> question targets the entire 64 bits (xN), or only the least significant
> 8, 16 or 32 bits (wN). When writing parameterized code such as macros,
> this can be annoying, as macro arguments don't lend themselves to
> indexed lookups, and so generating a reference to wN in a macro that
> receives xN as an argument is problematic.
> 
> For instance, an upcoming patch that modifies the implementation of the
> cond_yield macro to be able to refer to 32-bit registers would need to
> modify invocations such as
> 
>   cond_yield	3f, x8
> 
> to
> 
>   cond_yield	3f, 8
> 
> so that the second argument can be token pasted after x or w to emit the
> correct register reference. Unfortunately, this interferes with the self
> documenting nature of the first example, where the second argument is
> obviously a register, whereas in the second example, one would need to
> go and look at the code to find out what '8' means.
> 
> So let's fix this by defining wxN aliases for all xN registers, which
> resolve to the 32-bit alias of each respective 64-bit register. This
> allows the macro implementation to paste the xN reference after a w to
> obtain the correct register name.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/include/asm/assembler.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index e0fc1d424f9b..7b076ccd1a54 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -23,6 +23,14 @@
>  #include <asm/ptrace.h>
>  #include <asm/thread_info.h>
>  
> +	/*
> +	 * Provide a wxN alias for each wN register so that we can paste an xN
> +	 * reference after a 'w' to obtain the 32-bit version.
> +	 */
> +	.irp	n,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30
> +	wx\n	.req	w\n
> +	.endr

That's a pretty neat hack! I remember seeing code elsewhere that would
benefit from this, so it might be worth a look at our other macros, as I'm
sure I got annoyed by one the other day... ah yes, the SVE macros in
fpsimdmacros.h.
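
To make the trick concrete, here is a hypothetical macro (not from the
series) that relies on the aliases: invoked with x5, the token paste
'w\reg' yields wx5, which the .req directives above resolve to w5:

	/* Hypothetical example: write the 32-bit view of a register
	 * named by the caller in its 64-bit xN form. 'mov wN, wN'
	 * also zeroes bits [63:32] of xN as a side effect.
	 */
	.macro	zero_upper, reg
	mov	w\reg, w\reg	// reg == x5: 'mov wx5, wx5' -> 'mov w5, w5'
	.endm

	zero_upper	x5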

Acked-by: Will Deacon <will@kernel.org>

Will


* Re: [PATCH v2 3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled
  2021-03-02  9:01 ` [PATCH v2 3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled Ard Biesheuvel
@ 2021-03-30 10:36   ` Will Deacon
  0 siblings, 0 replies; 15+ messages in thread
From: Will Deacon @ 2021-03-30 10:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-crypto, linux-arm-kernel, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Catalin Marinas, Thomas Gleixner,
	Peter Zijlstra, Sebastian Andrzej Siewior, Ingo Molnar,
	Andy Lutomirski

On Tue, Mar 02, 2021 at 10:01:12AM +0100, Ard Biesheuvel wrote:
> Kernel mode NEON can be used in task or softirq context, but only in
> a non-nesting manner, i.e., softirq context is only permitted if the
> interrupt was not taken at a point where the kernel was using the NEON
> in task context.
> 
> This means all users of kernel mode NEON have to be aware of this
> limitation, and either need to provide scalar fallbacks that may be much
> slower (up to 20x for AES instructions) and potentially less safe, or
> use an asynchronous interface that defers processing to a later time
> when the NEON is guaranteed to be available.
> 
> Given that grabbing and releasing the NEON is cheap, we can relax this
> restriction, by increasing the granularity of kernel mode NEON code, and
> always disabling softirq processing while the NEON is being used in task
> context.
> 
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm64/crypto/aes-modes.S      |  2 +-
>  arch/arm64/crypto/sha1-ce-core.S   |  2 +-
>  arch/arm64/crypto/sha2-ce-core.S   |  2 +-
>  arch/arm64/crypto/sha3-ce-core.S   |  4 +--
>  arch/arm64/crypto/sha512-ce-core.S |  2 +-
>  arch/arm64/include/asm/assembler.h | 28 +++++++++++++++-----
>  arch/arm64/kernel/asm-offsets.c    |  2 ++
>  arch/arm64/kernel/fpsimd.c         |  4 +--
>  8 files changed, 31 insertions(+), 15 deletions(-)
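
As a hedged sketch of the central idea (the fpsimd.c hunk is not quoted
above, so the shape below is from reading the commit message, not
verbatim): the helpers that claim the per-CPU FPSIMD state stop relying
on disabling preemption alone and instead defer softirq processing for
as long as the NEON is held in task context:

	/* Sketch only: local_bh_disable() keeps softirqs, and thus
	 * softirq-context NEON users, away while the task context
	 * owns the NEON registers.
	 */
	static void get_cpu_fpsimd_context(void)
	{
		local_bh_disable();
		/* ... mark per-CPU FPSIMD state as busy ... */
	}

	static void put_cpu_fpsimd_context(void)
	{
		/* ... mark per-CPU FPSIMD state as free ... */
		local_bh_enable();	/* pending softirqs may run now */
	}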

Acked-by: Will Deacon <will@kernel.org>

Will


* Re: [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros
  2021-03-30  9:52   ` Will Deacon
@ 2021-04-12  8:39     ` Ard Biesheuvel
  0 siblings, 0 replies; 15+ messages in thread
From: Ard Biesheuvel @ 2021-04-12  8:39 UTC (permalink / raw)
  To: Will Deacon, Catalin Marinas
  Cc: Linux Crypto Mailing List, Linux ARM, Dave Martin, Mark Brown,
	Herbert Xu, Eric Biggers, Thomas Gleixner, Peter Zijlstra,
	Sebastian Andrzej Siewior, Ingo Molnar, Andy Lutomirski

On Tue, 30 Mar 2021 at 11:52, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Mar 02, 2021 at 10:01:10AM +0100, Ard Biesheuvel wrote:
> > The users of the conditional NEON yield macros have all been switched to
> > the simplified cond_yield macro, and so the NEON specific ones can be
> > removed.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> > ---
> >  arch/arm64/include/asm/assembler.h | 70 --------------------
> >  1 file changed, 70 deletions(-)
>
> Acked-by: Will Deacon <will@kernel.org>
>

Thanks.

Catalin,

Please consider taking the first 3 patches of this series through the
arm64 tree. I will then resubmit the rest to be taken via the crypto
tree for the next cycle.

Thanks,
Ard.


* Re: (subset) [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled
  2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
                   ` (8 preceding siblings ...)
  2021-03-02  9:01 ` [PATCH v2 9/9] crypto: arm64/aes-neonbs " Ard Biesheuvel
@ 2021-04-12 13:11 ` Catalin Marinas
  9 siblings, 0 replies; 15+ messages in thread
From: Catalin Marinas @ 2021-04-12 13:11 UTC (permalink / raw)
  To: linux-crypto, Ard Biesheuvel
  Cc: Will Deacon, Catalin Marinas, Sebastian Andrzej Siewior,
	Dave Martin, Peter Zijlstra, Eric Biggers, Ingo Molnar,
	Mark Brown, Herbert Xu, Thomas Gleixner, linux-arm-kernel,
	Andy Lutomirski

On Tue, 2 Mar 2021 10:01:09 +0100, Ard Biesheuvel wrote:
> [ TL;DR for the non-ARM folks on CC: disabling softirq processing when using
>   SIMD in kernel mode could reduce complexity and improve performance, but we
>   need to decide whether we can do this, and how much softirq processing
>   latency we can tolerate. If we can find a satisfactory solution for this,
>   we might do the same for x86 and 32-bit ARM as well.
> 
>   However, based on preliminary off-list discussions with peterz and luto, it
>   seems that for x86, there is a preference for using per-CPU buffers to
>   preserve/restore the task context's kernel mode SIMD state when the task is
>   interrupted to perform kernel mode SIMD in softirq context. On arm64, we
>   actually had this arrangement before, and removed it because it made
>   reasoning about preserving/restoring userland SVE state (32 SIMD registers
>   of up to 2 kbit in size) rather complex. ]
> 
> [...]

Applied to arm64 (for-next/neon-softirqs-disabled), thanks!

[1/9] arm64: assembler: remove conditional NEON yield macros
      https://git.kernel.org/arm64/c/27248fe1abb2
[2/9] arm64: assembler: introduce wxN aliases for wN registers
      https://git.kernel.org/arm64/c/4c4dcd3541f8
[3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled
      https://git.kernel.org/arm64/c/13150149aa6d

-- 
Catalin



Thread overview: 15+ messages
2021-03-02  9:01 [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 1/9] arm64: assembler: remove conditional NEON yield macros Ard Biesheuvel
2021-03-30  9:52   ` Will Deacon
2021-04-12  8:39     ` Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 2/9] arm64: assembler: introduce wxN aliases for wN registers Ard Biesheuvel
2021-03-30  9:59   ` Will Deacon
2021-03-02  9:01 ` [PATCH v2 3/9] arm64: fpsimd: run kernel mode NEON with softirqs disabled Ard Biesheuvel
2021-03-30 10:36   ` Will Deacon
2021-03-02  9:01 ` [PATCH v2 4/9] crypto: aead - disallow en/decrypt for non-task or non-softirq context Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 5/9] crypto: skcipher " Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 6/9] crypto: arm64/gcm-aes-ce - remove non-SIMD fallback path Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 7/9] crypto: arm64/aes-ccm " Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 8/9] crypto: arm64/aes-ce - stop using SIMD helper for skciphers Ard Biesheuvel
2021-03-02  9:01 ` [PATCH v2 9/9] crypto: arm64/aes-neonbs " Ard Biesheuvel
2021-04-12 13:11 ` (subset) [PATCH v2 0/9] running kernel mode SIMD with softirqs disabled Catalin Marinas
