All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] Simplify kernel-mode NEON
@ 2017-08-03 16:23 Dave Martin
  2017-08-03 16:23 ` [PATCH 1/5] arm64: neon: replace generic definition of may_use_simd() Dave Martin
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Dave Martin @ 2017-08-03 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

This series aims to simplify kernel-mode NEON.

This series is motivated by SVE, which adds cost and complexity to
management of kernel-mode NEON.  Since the most problematic features of
KMN don't bring a substantial benefit even without SVE, it makes sense
to simplify.

This series depends on [2] to provide the needed crypto algo C
fallbacks, which Herbert Xu already accepted into the crypto tree.

Boot- and boot-bisect tested on Juno r0.


Changes since RFC v4:

 * Flip confusing sense-inverted use of the "efi_fpsimd_state_used"
   percpu variable.

 * Restore bisectability by swapping patches 4 and 5.


Previous blurb:

The key changes are:

 * Kernel-mode NEON is no longer nestable.

 * Kernel-mode NEON is no longer permitted in hardirq or nmi context.

 * Users of kernel-mode NEON must now call may_use_simd(), to determine
   whether NEON may be used.  Since this function is no longer trivial
   and can return false, the caller must typically provide a fallback
   implementation of the optimised algoritm for this situation, or must
   otherwise defer the algorithm execution to another context where it
   can execute.

 * EFI runtime servies save/restore NEON to/from a dedicated percpu
   buffer if called in hardirq or nmi context.


This series depends on Ard's v4.14 crypto rework [2].

AFAIK, that series makes all the required changes to the arm64 crypto
library code.


Changes since RFC v3:

 * Rebased onto Ard's new series [2].

 * Added EFI helpers so that EFI runtime service use still works from
   interrupt context.

Changes since RFC v2:

 * Re-enable softirqs between kernel_neon_begin() and kernel_mode_end()

   (This was the original intention, but I was overtaken by irrational
   paranoia when drafting the previous version of the patch.)

 * softirq disable/enable is removed from fpsimd_thread_switch() (which
   runs with hardirqs disabled in any case, thus local_bh_enable() is
   treated as an error by the core code).  The comment block is updated
   to clarify this situation.

   This also means that no cost is added to the context switch critical
   path after all.

 * Move the this_cpu_write(fpsimd_last_state, NULL) back outside the
   conditional in kernel_neon_begin() (i.e., where it was originally
   before this series).  RFC v2 accidentally moved this invalidation
   inside the conditional, which leads to state corruption if switching
   back to a user task previously running on the same CPU, after
   kernel_mode_neon() is used by a kernel thread.

 * Minor formatting and spurious #include tidyups.

Testing:

 * For now, I've hacked linux/kernel/time/timer.c:run_timer_softirq() to
   call kernel_mode_neon_begin() and erase the FPSIMD registers, and
   jacked CONFIG_HZ to 1000.  I also added a test syscall that userspace
   can hammer to trigger the nested kernel_neon_begin() scenario.

   This is not hugely realistic, but was sufficient to diagnose the
   corruption problem described above, when running on a model.


Original blurb:

The main motivation for these changes is that supporting kernel-mode
NEON alongside SVE is tricky with the current framework: the current
partial save/restore mechanisms would need additional porting to
save/restore the extended SVE vector bits, and this renders the cost
saving of partial save/restore rather doubtful -- even if not all vector
registers are saved, the save/restore cost will still grow with
increasing vector length.  We could get the mechanics of this to work in
principle, but it doesn't feel like the right thing to do.

If we withdraw kernel-mode NEON support for hardirq context, accept some
extra softirq latency and disable nesting, then we can simplify the code
by always saving the task context directly into task_struct and
deferring all restore to ret_to_user.  Previous discussions with Ard
Biesheuvel suggest that this is a feasible approach and reasonably
aligned with other architectures.

The main impact on client code is that callers must check that
kernel-mode NEON is usable in the current context and fall back to a
non-NEON when necessary.  Ard has already done some work on this. [1]

The interesting changes are all in patch 2: the first patch just adds a
header inclusion guard that I noted was missing.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git kernel-mode-neon

[2] [PATCH resend 00/18] crypto: ARM/arm64 roundup for v4.14
lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html


Ard Biesheuvel (1):
  arm64: neon: replace generic definition of may_use_simd()

Dave Martin (4):
  arm64: neon: Add missing header guard in <asm/neon.h>
  arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
  arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
  arm64: neon: Remove support for nested or hardirq kernel-mode NEON

 arch/arm64/include/asm/Kbuild         |   1 -
 arch/arm64/include/asm/efi.h          |   5 +-
 arch/arm64/include/asm/fpsimd.h       |  16 +---
 arch/arm64/include/asm/fpsimdmacros.h |  56 -----------
 arch/arm64/include/asm/neon.h         |   9 +-
 arch/arm64/include/asm/simd.h         |  54 +++++++++++
 arch/arm64/kernel/entry-fpsimd.S      |  24 -----
 arch/arm64/kernel/fpsimd.c            | 169 ++++++++++++++++++++++++++--------
 8 files changed, 197 insertions(+), 137 deletions(-)
 create mode 100644 arch/arm64/include/asm/simd.h

-- 
2.1.4

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/5] arm64: neon: replace generic definition of may_use_simd()
  2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
@ 2017-08-03 16:23 ` Dave Martin
  2017-08-03 16:23 ` [PATCH 2/5] arm64: neon: Add missing header guard in <asm/neon.h> Dave Martin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2017-08-03 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

In preparation of modifying the logic that decides whether kernel mode
NEON is allowable, which is required for SVE support, introduce an
implementation of may_use_simd() that reflects the current reality, i.e.,
that SIMD is allowed in any context.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/include/asm/Kbuild |  1 -
 arch/arm64/include/asm/simd.h | 23 +++++++++++++++++++++++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/simd.h

diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
index f81c7b6..2326e39 100644
--- a/arch/arm64/include/asm/Kbuild
+++ b/arch/arm64/include/asm/Kbuild
@@ -20,7 +20,6 @@ generic-y += rwsem.h
 generic-y += segment.h
 generic-y += serial.h
 generic-y += set_memory.h
-generic-y += simd.h
 generic-y += sizes.h
 generic-y += switch_to.h
 generic-y += trace_clock.h
diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
new file mode 100644
index 0000000..96959b5
--- /dev/null
+++ b/arch/arm64/include/asm/simd.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2017 Linaro Ltd. <ard.biesheuvel@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#ifndef __ASM_SIMD_H
+#define __ASM_SIMD_H
+
+#include <linux/types.h>
+
+/*
+ * may_use_simd - whether it is allowable at this time to issue SIMD
+ *                instructions or access the SIMD register file
+ */
+static __must_check inline bool may_use_simd(void)
+{
+	return true;
+}
+
+#endif
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/5] arm64: neon: Add missing header guard in <asm/neon.h>
  2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
  2017-08-03 16:23 ` [PATCH 1/5] arm64: neon: replace generic definition of may_use_simd() Dave Martin
@ 2017-08-03 16:23 ` Dave Martin
  2017-08-03 16:23 ` [PATCH 3/5] arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate Dave Martin
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2017-08-03 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

asm/neon.h doesn't have a header inclusion guard, but it should
have one for consistency with other headers.

This patch adds a suitable guard.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/neon.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h
index ad4cdc9..5368bd0 100644
--- a/arch/arm64/include/asm/neon.h
+++ b/arch/arm64/include/asm/neon.h
@@ -8,6 +8,9 @@
  * published by the Free Software Foundation.
  */
 
+#ifndef __ASM_NEON_H
+#define __ASM_NEON_H
+
 #include <linux/types.h>
 #include <asm/fpsimd.h>
 
@@ -17,3 +20,5 @@
 
 void kernel_neon_begin_partial(u32 num_regs);
 void kernel_neon_end(void);
+
+#endif /* ! __ASM_NEON_H */
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/5] arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
  2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
  2017-08-03 16:23 ` [PATCH 1/5] arm64: neon: replace generic definition of may_use_simd() Dave Martin
  2017-08-03 16:23 ` [PATCH 2/5] arm64: neon: Add missing header guard in <asm/neon.h> Dave Martin
@ 2017-08-03 16:23 ` Dave Martin
  2017-08-03 16:23 ` [PATCH 4/5] arm64: neon: Allow EFI runtime services to use FPSIMD in irq context Dave Martin
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2017-08-03 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

__this_cpu_ ops are not used consistently with regard to this_cpu_
ops in a couple of places in fpsimd.c.

Since preemption is explicitly disabled in
fpsimd_restore_current_state() and fpsimd_update_current_state(),
this patch converts this_cpu_ ops in those functions to __this_cpu_
ops.  This doesn't save cost on arm64, but benefits from additional
assertions in the core code.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/fpsimd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 06da8ea..d7e5f8a 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -194,7 +194,7 @@ void fpsimd_restore_current_state(void)
 		struct fpsimd_state *st = &current->thread.fpsimd_state;
 
 		fpsimd_load_state(st);
-		this_cpu_write(fpsimd_last_state, st);
+		__this_cpu_write(fpsimd_last_state, st);
 		st->cpu = smp_processor_id();
 	}
 	preempt_enable();
@@ -214,7 +214,7 @@ void fpsimd_update_current_state(struct fpsimd_state *state)
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		struct fpsimd_state *st = &current->thread.fpsimd_state;
 
-		this_cpu_write(fpsimd_last_state, st);
+		__this_cpu_write(fpsimd_last_state, st);
 		st->cpu = smp_processor_id();
 	}
 	preempt_enable();
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/5] arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
  2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
                   ` (2 preceding siblings ...)
  2017-08-03 16:23 ` [PATCH 3/5] arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate Dave Martin
@ 2017-08-03 16:23 ` Dave Martin
  2017-08-03 16:23 ` [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON Dave Martin
  2017-08-04 12:08 ` [PATCH 0/5] Simplify " Catalin Marinas
  5 siblings, 0 replies; 22+ messages in thread
From: Dave Martin @ 2017-08-03 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

In order to be able to cope with kernel-mode NEON being unavailable
in hardirq/nmi context and non-nestable, we need special handling
for EFI runtime service calls that may be made during an interrupt
that interrupted a kernel_neon_begin()..._end() block.  This will
occur if the kernel tries to write diagnostic data to EFI
persistent storage during a panic triggered by an NMI for example.

EFI runtime services specify an ABI that clobbers the FPSIMD state,
rather than being able to use it optionally as an accelerator.
This means that EFI is really a special case and can be handled
specially.

To enable EFI calls from interrupts, this patch creates dedicated
__efi_fpsimd_{begin,end}() helpers solely for this purpose, which
save/restore to a separate percpu buffer if called in a context
where kernel_neon_begin() is not usable.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/efi.h    |  5 ++--
 arch/arm64/include/asm/fpsimd.h |  4 ++++
 arch/arm64/kernel/fpsimd.c      | 52 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 59 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index 8f3043a..8358222 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -3,6 +3,7 @@
 
 #include <asm/boot.h>
 #include <asm/cpufeature.h>
+#include <asm/fpsimd.h>
 #include <asm/io.h>
 #include <asm/mmu_context.h>
 #include <asm/neon.h>
@@ -20,8 +21,8 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
 
 #define arch_efi_call_virt_setup()					\
 ({									\
-	kernel_neon_begin();						\
 	efi_virtmap_load();						\
+	__efi_fpsimd_begin();						\
 })
 
 #define arch_efi_call_virt(p, f, args...)				\
@@ -33,8 +34,8 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
 
 #define arch_efi_call_virt_teardown()					\
 ({									\
+	__efi_fpsimd_end();						\
 	efi_virtmap_unload();						\
-	kernel_neon_end();						\
 })
 
 #define ARCH_EFI_IRQ_FLAGS_MASK (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50f559f..5155f21 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -81,6 +81,10 @@ extern void fpsimd_save_partial_state(struct fpsimd_partial_state *state,
 				      u32 num_regs);
 extern void fpsimd_load_partial_state(struct fpsimd_partial_state *state);
 
+/* For use by EFI runtime services calls only */
+extern void __efi_fpsimd_begin(void);
+extern void __efi_fpsimd_end(void);
+
 #endif
 
 #endif
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index d7e5f8a..bcde88e 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -21,12 +21,15 @@
 #include <linux/cpu_pm.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/preempt.h>
 #include <linux/sched/signal.h>
 #include <linux/signal.h>
 #include <linux/hardirq.h>
 
 #include <asm/fpsimd.h>
 #include <asm/cputype.h>
+#include <asm/neon.h>
+#include <asm/simd.h>
 
 #define FPEXC_IOF	(1 << 0)
 #define FPEXC_DZF	(1 << 1)
@@ -276,6 +279,55 @@ void kernel_neon_end(void)
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
+DEFINE_PER_CPU(struct fpsimd_state, efi_fpsimd_state);
+DEFINE_PER_CPU(bool, efi_fpsimd_state_used);
+
+/*
+ * EFI runtime services support functions
+ *
+ * The ABI for EFI runtime services allows EFI to use FPSIMD during the call.
+ * This means that for EFI (and only for EFI), we have to assume that FPSIMD
+ * is always used rather than being an optional accelerator.
+ *
+ * These functions provide the necessary support for ensuring FPSIMD
+ * save/restore in the contexts from which EFI is used.
+ *
+ * Do not use them for any other purpose -- if tempted to do so, you are
+ * either doing something wrong or you need to propose some refactoring.
+ */
+
+/*
+ * __efi_fpsimd_begin(): prepare FPSIMD for making an EFI runtime services call
+ */
+void __efi_fpsimd_begin(void)
+{
+	if (!system_supports_fpsimd())
+		return;
+
+	WARN_ON(preemptible());
+
+	if (may_use_simd())
+		kernel_neon_begin();
+	else {
+		fpsimd_save_state(this_cpu_ptr(&efi_fpsimd_state));
+		__this_cpu_write(efi_fpsimd_state_used, true);
+	}
+}
+
+/*
+ * __efi_fpsimd_end(): clean up FPSIMD after an EFI runtime services call
+ */
+void __efi_fpsimd_end(void)
+{
+	if (!system_supports_fpsimd())
+		return;
+
+	if (__this_cpu_xchg(efi_fpsimd_state_used, false))
+		fpsimd_load_state(this_cpu_ptr(&efi_fpsimd_state));
+	else
+		kernel_neon_end();
+}
+
 #endif /* CONFIG_KERNEL_MODE_NEON */
 
 #ifdef CONFIG_CPU_PM
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON
  2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
                   ` (3 preceding siblings ...)
  2017-08-03 16:23 ` [PATCH 4/5] arm64: neon: Allow EFI runtime services to use FPSIMD in irq context Dave Martin
@ 2017-08-03 16:23 ` Dave Martin
  2017-08-07 11:23   ` Catalin Marinas
  2017-08-04 12:08 ` [PATCH 0/5] Simplify " Catalin Marinas
  5 siblings, 1 reply; 22+ messages in thread
From: Dave Martin @ 2017-08-03 16:23 UTC (permalink / raw)
  To: linux-arm-kernel

Support for kernel-mode NEON to be nested and/or used in hardirq
context adds significant complexity, and the benefits may be
marginal.  In practice, kernel-mode NEON is not used in hardirq
context, and is rarely used in softirq context (by certain mac80211
drivers).

This patch implements an arm64 may_use_simd() function to allow
clients to check whether kernel-mode NEON is usable in the current
context, and simplifies kernel_neon_{begin,end}() to handle only
saving of the task FPSIMD state (if any).  Without nesting, there
is no other state to save.

The partial fpsimd save/restore functions become redundant as a
result of these changes, so they are removed too.

The save/restore model is changed to operate directly on
task_struct without additional percpu storage.  This simplifies the
code and saves a bit of memory, but means that softirqs must now be
disabled when manipulating the task fpsimd state from task context:
correspondingly, preempt_{en,dis}sable() calls are upgraded to
local_bh_{en,dis}able() as appropriate.  fpsimd_thread_switch()
already runs with hardirqs disabled and so is already protected
from softirqs.

These changes should make it easier to support kernel-mode NEON in
the presence of the Scalable Vector extension in the future.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/include/asm/fpsimd.h       |  14 -----
 arch/arm64/include/asm/fpsimdmacros.h |  56 -----------------
 arch/arm64/include/asm/neon.h         |   4 +-
 arch/arm64/include/asm/simd.h         |  33 +++++++++-
 arch/arm64/kernel/entry-fpsimd.S      |  24 -------
 arch/arm64/kernel/fpsimd.c            | 115 +++++++++++++++++++++++-----------
 6 files changed, 111 insertions(+), 135 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 5155f21..410c481 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -41,16 +41,6 @@ struct fpsimd_state {
 	unsigned int cpu;
 };
 
-/*
- * Struct for stacking the bottom 'n' FP/SIMD registers.
- */
-struct fpsimd_partial_state {
-	u32		fpsr;
-	u32		fpcr;
-	u32		num_regs;
-	__uint128_t	vregs[32];
-};
-
 
 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
@@ -77,10 +67,6 @@ extern void fpsimd_update_current_state(struct fpsimd_state *state);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 
-extern void fpsimd_save_partial_state(struct fpsimd_partial_state *state,
-				      u32 num_regs);
-extern void fpsimd_load_partial_state(struct fpsimd_partial_state *state);
-
 /* For use by EFI runtime services calls only */
 extern void __efi_fpsimd_begin(void);
 extern void __efi_fpsimd_end(void);
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index a2daf12..0f5fdd3 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -75,59 +75,3 @@
 	ldr	w\tmpnr, [\state, #16 * 2 + 4]
 	fpsimd_restore_fpcr x\tmpnr, \state
 .endm
-
-.macro fpsimd_save_partial state, numnr, tmpnr1, tmpnr2
-	mrs	x\tmpnr1, fpsr
-	str	w\numnr, [\state, #8]
-	mrs	x\tmpnr2, fpcr
-	stp	w\tmpnr1, w\tmpnr2, [\state]
-	adr	x\tmpnr1, 0f
-	add	\state, \state, x\numnr, lsl #4
-	sub	x\tmpnr1, x\tmpnr1, x\numnr, lsl #1
-	br	x\tmpnr1
-	stp	q30, q31, [\state, #-16 * 30 - 16]
-	stp	q28, q29, [\state, #-16 * 28 - 16]
-	stp	q26, q27, [\state, #-16 * 26 - 16]
-	stp	q24, q25, [\state, #-16 * 24 - 16]
-	stp	q22, q23, [\state, #-16 * 22 - 16]
-	stp	q20, q21, [\state, #-16 * 20 - 16]
-	stp	q18, q19, [\state, #-16 * 18 - 16]
-	stp	q16, q17, [\state, #-16 * 16 - 16]
-	stp	q14, q15, [\state, #-16 * 14 - 16]
-	stp	q12, q13, [\state, #-16 * 12 - 16]
-	stp	q10, q11, [\state, #-16 * 10 - 16]
-	stp	q8, q9, [\state, #-16 * 8 - 16]
-	stp	q6, q7, [\state, #-16 * 6 - 16]
-	stp	q4, q5, [\state, #-16 * 4 - 16]
-	stp	q2, q3, [\state, #-16 * 2 - 16]
-	stp	q0, q1, [\state, #-16 * 0 - 16]
-0:
-.endm
-
-.macro fpsimd_restore_partial state, tmpnr1, tmpnr2
-	ldp	w\tmpnr1, w\tmpnr2, [\state]
-	msr	fpsr, x\tmpnr1
-	fpsimd_restore_fpcr x\tmpnr2, x\tmpnr1
-	adr	x\tmpnr1, 0f
-	ldr	w\tmpnr2, [\state, #8]
-	add	\state, \state, x\tmpnr2, lsl #4
-	sub	x\tmpnr1, x\tmpnr1, x\tmpnr2, lsl #1
-	br	x\tmpnr1
-	ldp	q30, q31, [\state, #-16 * 30 - 16]
-	ldp	q28, q29, [\state, #-16 * 28 - 16]
-	ldp	q26, q27, [\state, #-16 * 26 - 16]
-	ldp	q24, q25, [\state, #-16 * 24 - 16]
-	ldp	q22, q23, [\state, #-16 * 22 - 16]
-	ldp	q20, q21, [\state, #-16 * 20 - 16]
-	ldp	q18, q19, [\state, #-16 * 18 - 16]
-	ldp	q16, q17, [\state, #-16 * 16 - 16]
-	ldp	q14, q15, [\state, #-16 * 14 - 16]
-	ldp	q12, q13, [\state, #-16 * 12 - 16]
-	ldp	q10, q11, [\state, #-16 * 10 - 16]
-	ldp	q8, q9, [\state, #-16 * 8 - 16]
-	ldp	q6, q7, [\state, #-16 * 6 - 16]
-	ldp	q4, q5, [\state, #-16 * 4 - 16]
-	ldp	q2, q3, [\state, #-16 * 2 - 16]
-	ldp	q0, q1, [\state, #-16 * 0 - 16]
-0:
-.endm
diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h
index 5368bd0..fb9d137 100644
--- a/arch/arm64/include/asm/neon.h
+++ b/arch/arm64/include/asm/neon.h
@@ -16,9 +16,7 @@
 
 #define cpu_has_neon()		system_supports_fpsimd()
 
-#define kernel_neon_begin()	kernel_neon_begin_partial(32)
-
-void kernel_neon_begin_partial(u32 num_regs);
+void kernel_neon_begin(void);
 void kernel_neon_end(void);
 
 #endif /* ! __ASM_NEON_H */
diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index 96959b5..5a1a927 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -9,15 +9,46 @@
 #ifndef __ASM_SIMD_H
 #define __ASM_SIMD_H
 
+#include <linux/compiler.h>
+#include <linux/percpu.h>
+#include <linux/preempt.h>
 #include <linux/types.h>
 
+#ifdef CONFIG_KERNEL_MODE_NEON
+
+DECLARE_PER_CPU(bool, kernel_neon_busy);
+
 /*
  * may_use_simd - whether it is allowable at this time to issue SIMD
  *                instructions or access the SIMD register file
+ *
+ * Callers must not assume that the result remains true beyond the next
+ * preempt_enable() or return from softirq context.
  */
 static __must_check inline bool may_use_simd(void)
 {
-	return true;
+	/*
+	 * The raw_cpu_read() is racy if called with preemption enabled.
+	 * This is not a bug: kernel_neon_busy is only set when
+	 * preemption is disabled, so we cannot migrate to another CPU
+	 * while it is set, nor can we migrate to a CPU where it is set.
+	 * So, if we find it clear on some CPU then we're guaranteed to
+	 * find it clear on any CPU we could migrate to.
+	 *
+	 * If we are in between kernel_neon_begin()...kernel_neon_end(),
+	 * the flag will be set, but preemption is also disabled, so we
+	 * can't migrate to another CPU and spuriously see it become
+	 * false.
+	 */
+	return !in_irq() && !in_nmi() && !raw_cpu_read(kernel_neon_busy);
 }
 
+#else /* ! CONFIG_KERNEL_MODE_NEON */
+
+static __must_check inline bool may_use_simd(void) {
+	return false;
+}
+
+#endif /* ! CONFIG_KERNEL_MODE_NEON */
+
 #endif
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index c44a82f..6a27cd6 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -41,27 +41,3 @@ ENTRY(fpsimd_load_state)
 	fpsimd_restore x0, 8
 	ret
 ENDPROC(fpsimd_load_state)
-
-#ifdef CONFIG_KERNEL_MODE_NEON
-
-/*
- * Save the bottom n FP registers.
- *
- * x0 - pointer to struct fpsimd_partial_state
- */
-ENTRY(fpsimd_save_partial_state)
-	fpsimd_save_partial x0, 1, 8, 9
-	ret
-ENDPROC(fpsimd_save_partial_state)
-
-/*
- * Load the bottom n FP registers.
- *
- * x0 - pointer to struct fpsimd_partial_state
- */
-ENTRY(fpsimd_load_partial_state)
-	fpsimd_restore_partial x0, 8, 9
-	ret
-ENDPROC(fpsimd_load_partial_state)
-
-#endif
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index bcde88e..138fcfa 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -17,18 +17,18 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/bottom_half.h>
 #include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/percpu.h>
 #include <linux/preempt.h>
 #include <linux/sched/signal.h>
 #include <linux/signal.h>
-#include <linux/hardirq.h>
 
 #include <asm/fpsimd.h>
 #include <asm/cputype.h>
-#include <asm/neon.h>
 #include <asm/simd.h>
 
 #define FPEXC_IOF	(1 << 0)
@@ -65,6 +65,13 @@
  * CPU currently contain the most recent userland FPSIMD state of the current
  * task.
  *
+ * In order to allow softirq handlers to use FPSIMD, kernel_neon_begin() may
+ * save the task's FPSIMD context back to task_struct from softirq context.
+ * To prevent this from racing with the manipulation of the task's FPSIMD state
+ * from task context and thereby corrupting the state, it is necessary to
+ * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
+ * flag with local_bh_disable() unless softirqs are already masked.
+ *
  * For a certain task, the sequence may look something like this:
  * - the task gets scheduled in; if both the task's fpsimd_state.cpu field
  *   contains the id of the current CPU, and the CPU's fpsimd_last_state per-cpu
@@ -164,9 +171,14 @@ void fpsimd_flush_thread(void)
 {
 	if (!system_supports_fpsimd())
 		return;
+
+	local_bh_disable();
+
 	memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
 	fpsimd_flush_task_state(current);
 	set_thread_flag(TIF_FOREIGN_FPSTATE);
+
+	local_bh_enable();
 }
 
 /*
@@ -177,10 +189,13 @@ void fpsimd_preserve_current_state(void)
 {
 	if (!system_supports_fpsimd())
 		return;
-	preempt_disable();
+
+	local_bh_disable();
+
 	if (!test_thread_flag(TIF_FOREIGN_FPSTATE))
 		fpsimd_save_state(&current->thread.fpsimd_state);
-	preempt_enable();
+
+	local_bh_enable();
 }
 
 /*
@@ -192,7 +207,9 @@ void fpsimd_restore_current_state(void)
 {
 	if (!system_supports_fpsimd())
 		return;
-	preempt_disable();
+
+	local_bh_disable();
+
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		struct fpsimd_state *st = &current->thread.fpsimd_state;
 
@@ -200,7 +217,8 @@ void fpsimd_restore_current_state(void)
 		__this_cpu_write(fpsimd_last_state, st);
 		st->cpu = smp_processor_id();
 	}
-	preempt_enable();
+
+	local_bh_enable();
 }
 
 /*
@@ -212,7 +230,9 @@ void fpsimd_update_current_state(struct fpsimd_state *state)
 {
 	if (!system_supports_fpsimd())
 		return;
-	preempt_disable();
+
+	local_bh_disable();
+
 	fpsimd_load_state(state);
 	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
 		struct fpsimd_state *st = &current->thread.fpsimd_state;
@@ -220,7 +240,8 @@ void fpsimd_update_current_state(struct fpsimd_state *state)
 		__this_cpu_write(fpsimd_last_state, st);
 		st->cpu = smp_processor_id();
 	}
-	preempt_enable();
+
+	local_bh_enable();
 }
 
 /*
@@ -233,49 +254,69 @@ void fpsimd_flush_task_state(struct task_struct *t)
 
 #ifdef CONFIG_KERNEL_MODE_NEON
 
-static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
-static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+DEFINE_PER_CPU(bool, kernel_neon_busy);
 
 /*
  * Kernel-side NEON support functions
  */
-void kernel_neon_begin_partial(u32 num_regs)
+
+/*
+ * kernel_neon_begin(): obtain the CPU FPSIMD registers for use by the calling
+ * context
+ *
+ * Must not be called unless may_use_simd() returns true.
+ * Task context in the FPSIMD registers is saved back to memory as necessary.
+ *
+ * A matching call to kernel_neon_end() must be made before returning from the
+ * calling context.
+ *
+ * The caller may freely use the FPSIMD registers until kernel_neon_end() is
+ * called.
+ */
+void kernel_neon_begin(void)
 {
 	if (WARN_ON(!system_supports_fpsimd()))
 		return;
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
 
-		BUG_ON(num_regs > 32);
-		fpsimd_save_partial_state(s, roundup(num_regs, 2));
-	} else {
-		/*
-		 * Save the userland FPSIMD state if we have one and if we
-		 * haven't done so already. Clear fpsimd_last_state to indicate
-		 * that there is no longer userland FPSIMD state in the
-		 * registers.
-		 */
-		preempt_disable();
-		if (current->mm &&
-		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
-			fpsimd_save_state(&current->thread.fpsimd_state);
-		this_cpu_write(fpsimd_last_state, NULL);
-	}
+	BUG_ON(!may_use_simd());
+
+	local_bh_disable();
+
+	__this_cpu_write(kernel_neon_busy, true);
+
+	/* Save unsaved task fpsimd state, if any: */
+	if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
+		fpsimd_save_state(&current->thread.fpsimd_state);
+
+	/* Invalidate any task state remaining in the fpsimd regs: */
+	__this_cpu_write(fpsimd_last_state, NULL);
+
+	preempt_disable();
+
+	local_bh_enable();
 }
-EXPORT_SYMBOL(kernel_neon_begin_partial);
+EXPORT_SYMBOL(kernel_neon_begin);
 
+/*
+ * kernel_neon_end(): give the CPU FPSIMD registers back to the current task
+ *
+ * Must be called from a context in which kernel_neon_begin() was previously
+ * called, with no call to kernel_neon_end() in the meantime.
+ *
+ * The caller must not use the FPSIMD registers after this function is called,
+ * unless kernel_neon_begin() is called again in the meantime.
+ */
 void kernel_neon_end(void)
 {
+	bool busy;
+
 	if (!system_supports_fpsimd())
 		return;
-	if (in_interrupt()) {
-		struct fpsimd_partial_state *s = this_cpu_ptr(
-			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
-		fpsimd_load_partial_state(s);
-	} else {
-		preempt_enable();
-	}
+
+	busy = __this_cpu_xchg(kernel_neon_busy, false);
+	WARN_ON(!busy);	/* No matching kernel_neon_begin()? */
+
+	preempt_enable();
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
                   ` (4 preceding siblings ...)
  2017-08-03 16:23 ` [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON Dave Martin
@ 2017-08-04 12:08 ` Catalin Marinas
  2017-08-04 12:22   ` Catalin Marinas
  5 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2017-08-04 12:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> Ard Biesheuvel (1):
>   arm64: neon: replace generic definition of may_use_simd()
> 
> Dave Martin (4):
>   arm64: neon: Add missing header guard in <asm/neon.h>
>   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
>   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
>   arm64: neon: Remove support for nested or hardirq kernel-mode NEON

Queued for 4.14. Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 12:08 ` [PATCH 0/5] Simplify " Catalin Marinas
@ 2017-08-04 12:22   ` Catalin Marinas
  2017-08-04 12:25     ` Ard Biesheuvel
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2017-08-04 12:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> > This series depends on Ard's v4.14 crypto rework [2].
[...]
> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
[...]
> > Ard Biesheuvel (1):
> >   arm64: neon: replace generic definition of may_use_simd()
> > 
> > Dave Martin (4):
> >   arm64: neon: Add missing header guard in <asm/neon.h>
> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> 
> Queued for 4.14. Thanks.

... and reverted.

I now realised that it doesn't build without Ard's crypto series (I had
the wrong impression it is just a performance impact). I get errors
like:

arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
of function ?kernel_neon_begin_partial?

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 12:22   ` Catalin Marinas
@ 2017-08-04 12:25     ` Ard Biesheuvel
  2017-08-04 12:41       ` Dave P Martin
  2017-08-04 13:12       ` Catalin Marinas
  0 siblings, 2 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2017-08-04 12:25 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
>> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
>> > This series depends on Ard's v4.14 crypto rework [2].
> [...]
>> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> [...]
>> > Ard Biesheuvel (1):
>> >   arm64: neon: replace generic definition of may_use_simd()
>> >
>> > Dave Martin (4):
>> >   arm64: neon: Add missing header guard in <asm/neon.h>
>> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
>> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
>> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
>>
>> Queued for 4.14. Thanks.
>
> ... and reverted.
>
> I now realised that it doesn't build without Ard's crypto series (I had
> the wrong impression it is just a performance impact). I get errors
> like:
>
> arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> of function ?kernel_neon_begin_partial?
>

Ah yes. Apologies for failing to mention that. Apparently, there are
in fact build time cross-dependencies that I forgot about.

To avoid delaying this unnecessarily, we just alias
kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
argument. We can remove that again when everything has gone upstream.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 12:25     ` Ard Biesheuvel
@ 2017-08-04 12:41       ` Dave P Martin
  2017-08-04 13:12       ` Catalin Marinas
  1 sibling, 0 replies; 22+ messages in thread
From: Dave P Martin @ 2017-08-04 12:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> >> > This series depends on Ard's v4.14 crypto rework [2].
> > [...]
> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> > [...]
> >> > Ard Biesheuvel (1):
> >> >   arm64: neon: replace generic definition of may_use_simd()
> >> >
> >> > Dave Martin (4):
> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> >>
> >> Queued for 4.14. Thanks.
> >
> > ... and reverted.
> >
> > I now realised that it doesn't build without Ard's crypto series (I had
> > the wrong impression it is just a performance impact). I get errors
> > like:
> >
> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> > of function ?kernel_neon_begin_partial?
> >
>
> Ah yes. Apologies for failing to mention that. Apparently, there are
> in fact build time cross-dependencies that I forgot about.

This was what I meant by "depends on", but I could have been clearer.
Apologies for that.

> To avoid delaying this unnecessarily, we just alias
> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
> argument. We can remove that again when everything has gone upstream.

We can do this to keep things buildable and mostly bootable.

Catalin, do you think that's a sufficient workaround, or do we need to
rethink this?

Do you need me to repost anything?

Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 12:25     ` Ard Biesheuvel
  2017-08-04 12:41       ` Dave P Martin
@ 2017-08-04 13:12       ` Catalin Marinas
  2017-08-04 13:21         ` Ard Biesheuvel
  2017-08-04 13:22         ` Dave Martin
  1 sibling, 2 replies; 22+ messages in thread
From: Catalin Marinas @ 2017-08-04 13:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> >> > This series depends on Ard's v4.14 crypto rework [2].
> > [...]
> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> > [...]
> >> > Ard Biesheuvel (1):
> >> >   arm64: neon: replace generic definition of may_use_simd()
> >> >
> >> > Dave Martin (4):
> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> >>
> >> Queued for 4.14. Thanks.
> >
> > ... and reverted.
> >
> > I now realised that it doesn't build without Ard's crypto series (I had
> > the wrong impression it is just a performance impact). I get errors
> > like:
> >
> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> > of function ?kernel_neon_begin_partial?
> 
> Ah yes. Apologies for failing to mention that. Apparently, there are
> in fact build time cross-dependencies that I forgot about.
> 
> To avoid delaying this unnecessarily, we just alias
> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
> argument. We can remove that again when everything has gone upstream.

But would kernel_neon_begin() be called in an interrupt context from the
crypto code? I'd like the arm64 for-next/core branch (to be based on
-rc4 next week) to still be fully working.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 13:12       ` Catalin Marinas
@ 2017-08-04 13:21         ` Ard Biesheuvel
  2017-08-04 13:26           ` Dave P Martin
  2017-08-04 14:00           ` Catalin Marinas
  2017-08-04 13:22         ` Dave Martin
  1 sibling, 2 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2017-08-04 13:21 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 August 2017 at 14:12, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
>> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
>> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
>> >> > This series depends on Ard's v4.14 crypto rework [2].
>> > [...]
>> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
>> > [...]
>> >> > Ard Biesheuvel (1):
>> >> >   arm64: neon: replace generic definition of may_use_simd()
>> >> >
>> >> > Dave Martin (4):
>> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
>> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
>> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
>> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
>> >>
>> >> Queued for 4.14. Thanks.
>> >
>> > ... and reverted.
>> >
>> > I now realised that it doesn't build without Ard's crypto series (I had
>> > the wrong impression it is just a performance impact). I get errors
>> > like:
>> >
>> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
>> > of function ?kernel_neon_begin_partial?
>>
>> Ah yes. Apologies for failing to mention that. Apparently, there are
>> in fact build time cross-dependencies that I forgot about.
>>
>> To avoid delaying this unnecessarily, we just alias
>> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
>> argument. We can remove that again when everything has gone upstream.
>
> But would kernel_neon_begin() be called in an interrupt context from the
> crypto code? I'd like the arm64 for-next/core branch (to be based on
> -rc4 next week) to still be fully working.
>

Well, theoretically. In the new situation we allow KMN from process
and softirq context, but the latter only if no process context KMN is
in progress. This last rule could be violated by the old crypto code,
but the chances of actually hitting that are very low: if your wifi
chip lacks a h/w implementation of CCMP (which is uncommon these
days), and it gets invoked while KMN crypto is being used on the same
core, i.e., for IPsec or block encryption, you may get a splat. It
wouldn't corrupt your encrypted file system.

Perhaps we could stick it on a separate branch for this cycle (with
the suggested tweak) and get it pulled it into -next separately? That
way, you can forward it as a late pull, after Herbert's stuff gets in.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 13:12       ` Catalin Marinas
  2017-08-04 13:21         ` Ard Biesheuvel
@ 2017-08-04 13:22         ` Dave Martin
  1 sibling, 0 replies; 22+ messages in thread
From: Dave Martin @ 2017-08-04 13:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 02:12:12PM +0100, Catalin Marinas wrote:
> On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
> > On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> > >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> > >> > This series depends on Ard's v4.14 crypto rework [2].
> > > [...]
> > >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> > > [...]
> > >> > Ard Biesheuvel (1):
> > >> >   arm64: neon: replace generic definition of may_use_simd()
> > >> >
> > >> > Dave Martin (4):
> > >> >   arm64: neon: Add missing header guard in <asm/neon.h>
> > >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> > >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> > >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> > >>
> > >> Queued for 4.14. Thanks.
> > >
> > > ... and reverted.
> > >
> > > I now realised that it doesn't build without Ard's crypto series (I had
> > > the wrong impression it is just a performance impact). I get errors
> > > like:
> > >
> > > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> > > of function ?kernel_neon_begin_partial?
> > 
> > Ah yes. Apologies for failing to mention that. Apparently, there are
> > in fact build time cross-dependencies that I forgot about.
> > 
> > To avoid delaying this unnecessarily, we just alias
> > kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
> > argument. We can remove that again when everything has gone upstream.
> 
> But would kernel_neon_begin() be called in an interrupt context from the
> crypto code? I'd like the arm64 for-next/core branch (to be based on

I think that this can happen (unless Ard knows otherwise).

However, I think the two affected cases would be

 * crypto in irq (which mostly doesn't happen due to generic
   may_use_simd() returning false)

 * EFI in irq (which doesn't exist yet)

> -rc4 next week) to still be fully working.

This is why I was wondering whether everything should go through the
same tree.


If we merged patch 1 into patch 5, there will be no window where
may_use_simd() unconditionally returns true, so code that checks
for this will defer crypto to a non-irq context.

Ard can comment if the gotchas here I've missed.

In particular, I don't know whether any crypto gets unconditionally
executed in irq context prior to Ard's patches.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 13:21         ` Ard Biesheuvel
@ 2017-08-04 13:26           ` Dave P Martin
  2017-08-04 14:05             ` Catalin Marinas
  2017-08-04 14:00           ` Catalin Marinas
  1 sibling, 1 reply; 22+ messages in thread
From: Dave P Martin @ 2017-08-04 13:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 02:21:52PM +0100, Ard Biesheuvel wrote:
> On 4 August 2017 at 14:12, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
> >> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> >> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> >> >> > This series depends on Ard's v4.14 crypto rework [2].
> >> > [...]
> >> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> >> > [...]
> >> >> > Ard Biesheuvel (1):
> >> >> >   arm64: neon: replace generic definition of may_use_simd()
> >> >> >
> >> >> > Dave Martin (4):
> >> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
> >> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> >> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> >> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> >> >>
> >> >> Queued for 4.14. Thanks.
> >> >
> >> > ... and reverted.
> >> >
> >> > I now realised that it doesn't build without Ard's crypto series (I had
> >> > the wrong impression it is just a performance impact). I get errors
> >> > like:
> >> >
> >> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> >> > of function ?kernel_neon_begin_partial?
> >>
> >> Ah yes. Apologies for failing to mention that. Apparently, there are
> >> in fact build time cross-dependencies that I forgot about.
> >>
> >> To avoid delaying this unnecessarily, we just alias
> >> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
> >> argument. We can remove that again when everything has gone upstream.
> >
> > But would kernel_neon_begin() be called in an interrupt context from the
> > crypto code? I'd like the arm64 for-next/core branch (to be based on
> > -rc4 next week) to still be fully working.
> >
>
> Well, theoretically. In the new situation we allow KMN from process
> and softirq context, but the latter only if no process context KMN is
> in progress. This last rule could be violated by the old crypto code,
> but the chances of actually hitting that are very low: if your wifi
> chip lacks a h/w implementation of CCMP (which is uncommon these
> days), and it gets invoked while KMN crypto is being used on the same
> core, i.e., for IPsec or block encryption, you may get a splat. It
> wouldn't corrupt your encrypted file system.
>
> Perhaps we could stick it on a separate branch for this cycle (with
> the suggested tweak) and get it pulled it into -next separately? That
> way, you can forward it as a late pull, after Herbert's stuff gets in.

Or I could prepend the patches to the SVE series, and we wait for Linux
to pull the crypto branch before merging that.  (I still need to repost
and the SVE patches need to be reviewed, so that isn't imminent.)

Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 13:21         ` Ard Biesheuvel
  2017-08-04 13:26           ` Dave P Martin
@ 2017-08-04 14:00           ` Catalin Marinas
  2017-08-04 14:04             ` Ard Biesheuvel
  1 sibling, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2017-08-04 14:00 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 02:21:52PM +0100, Ard Biesheuvel wrote:
> On 4 August 2017 at 14:12, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
> >> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> >> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> >> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> >> >> > This series depends on Ard's v4.14 crypto rework [2].
> >> > [...]
> >> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> >> > [...]
> >> >> > Ard Biesheuvel (1):
> >> >> >   arm64: neon: replace generic definition of may_use_simd()
> >> >> >
> >> >> > Dave Martin (4):
> >> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
> >> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> >> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> >> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> >> >>
> >> >> Queued for 4.14. Thanks.
> >> >
> >> > ... and reverted.
> >> >
> >> > I now realised that it doesn't build without Ard's crypto series (I had
> >> > the wrong impression it is just a performance impact). I get errors
> >> > like:
> >> >
> >> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> >> > of function ?kernel_neon_begin_partial?
> >>
> >> Ah yes. Apologies for failing to mention that. Apparently, there are
> >> in fact build time cross-dependencies that I forgot about.
> >>
> >> To avoid delaying this unnecessarily, we just alias
> >> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
> >> argument. We can remove that again when everything has gone upstream.
> >
> > But would kernel_neon_begin() be called in an interrupt context from the
> > crypto code? I'd like the arm64 for-next/core branch (to be based on
> > -rc4 next week) to still be fully working.
> 
> Well, theoretically. In the new situation we allow KMN from process
> and softirq context, but the latter only if no process context KMN is
> in progress. This last rule could be violated by the old crypto code,
> but the chances of actually hitting that are very low: if your wifi
> chip lacks a h/w implementation of CCMP (which is uncommon these
> days), and it gets invoked while KMN crypto is being used on the same
> core, i.e., for IPsec or block encryption, you may get a splat. It
> wouldn't corrupt your encrypted file system.

Just to make sure I understand correctly, even if may_use_simd() returns
false, the crypto code could still call kernel_neon_begin_partial() in
an interrupt context. I guess this is still fine with this series until
the last patch when we would get a BUG_ON (if we define
kernel_neon_begin_partial to kernel_neon_begin).

> Perhaps we could stick it on a separate branch for this cycle (with
> the suggested tweak) and get it pulled it into -next separately? That
> way, you can forward it as a late pull, after Herbert's stuff gets in.

I'll push it to a branch for now (with a fix to be able to build) and
look at the merge later (I don't think I'll ever hit the BUG_ON with my
hardware but you never know; linux-next should be fine since the other
crypto patches are merged). In hindsight, we should have put your crypto
patches on a common branch but Herbert already merged them on top of
others.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 14:00           ` Catalin Marinas
@ 2017-08-04 14:04             ` Ard Biesheuvel
  0 siblings, 0 replies; 22+ messages in thread
From: Ard Biesheuvel @ 2017-08-04 14:04 UTC (permalink / raw)
  To: linux-arm-kernel

On 4 August 2017 at 15:00, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Fri, Aug 04, 2017 at 02:21:52PM +0100, Ard Biesheuvel wrote:
>> On 4 August 2017 at 14:12, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> > On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
>> >> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
>> >> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
>> >> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
>> >> >> > This series depends on Ard's v4.14 crypto rework [2].
>> >> > [...]
>> >> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
>> >> > [...]
>> >> >> > Ard Biesheuvel (1):
>> >> >> >   arm64: neon: replace generic definition of may_use_simd()
>> >> >> >
>> >> >> > Dave Martin (4):
>> >> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
>> >> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
>> >> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
>> >> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
>> >> >>
>> >> >> Queued for 4.14. Thanks.
>> >> >
>> >> > ... and reverted.
>> >> >
>> >> > I now realised that it doesn't build without Ard's crypto series (I had
>> >> > the wrong impression it is just a performance impact). I get errors
>> >> > like:
>> >> >
>> >> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
>> >> > of function ?kernel_neon_begin_partial?
>> >>
>> >> Ah yes. Apologies for failing to mention that. Apparently, there are
>> >> in fact build time cross-dependencies that I forgot about.
>> >>
>> >> To avoid delaying this unnecessarily, we just alias
>> >> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
>> >> argument. We can remove that again when everything has gone upstream.
>> >
>> > But would kernel_neon_begin() be called in an interrupt context from the
>> > crypto code? I'd like the arm64 for-next/core branch (to be based on
>> > -rc4 next week) to still be fully working.
>>
>> Well, theoretically. In the new situation we allow KMN from process
>> and softirq context, but the latter only if no process context KMN is
>> in progress. This last rule could be violated by the old crypto code,
>> but the chances of actually hitting that are very low: if your wifi
>> chip lacks a h/w implementation of CCMP (which is uncommon these
>> days), and it gets invoked while KMN crypto is being used on the same
>> core, i.e., for IPsec or block encryption, you may get a splat. It
>> wouldn't corrupt your encrypted file system.
>
> Just to make sure I understand correctly, even if may_use_simd() returns
> false, the crypto code could still call kernel_neon_begin_partial() in
> an interrupt context.

Yes, it almost completely ignores it, given that SIMD is allowed in any context.

> I guess this is still fine with this series until
> the last patch when we would get a BUG_ON (if we define
> kernel_neon_begin_partial to kernel_neon_begin).
>
>> Perhaps we could stick it on a separate branch for this cycle (with
>> the suggested tweak) and get it pulled it into -next separately? That
>> way, you can forward it as a late pull, after Herbert's stuff gets in.
>
> I'll push it to a branch for now (with a fix to be able to build) and
> look at the merge later (I don't think I'll ever hit the BUG_ON with my
> hardware but you never know; linux-next should be fine since the other
> crypto patches are merged). In hindsight, we should have put your crypto
> patches on a common branch but Herbert already merged them on top of
> others.
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 13:26           ` Dave P Martin
@ 2017-08-04 14:05             ` Catalin Marinas
  2017-08-04 14:25               ` Dave Martin
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2017-08-04 14:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 02:26:06PM +0100, Dave P Martin wrote:
> On Fri, Aug 04, 2017 at 02:21:52PM +0100, Ard Biesheuvel wrote:
> > On 4 August 2017 at 14:12, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Fri, Aug 04, 2017 at 01:25:03PM +0100, Ard Biesheuvel wrote:
> > >> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > >> > On Fri, Aug 04, 2017 at 01:08:58PM +0100, Catalin Marinas wrote:
> > >> >> On Thu, Aug 03, 2017 at 05:23:18PM +0100, Dave P Martin wrote:
> > >> >> > This series depends on Ard's v4.14 crypto rework [2].
> > >> > [...]
> > >> >> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2017-July/520664.html
> > >> > [...]
> > >> >> > Ard Biesheuvel (1):
> > >> >> >   arm64: neon: replace generic definition of may_use_simd()
> > >> >> >
> > >> >> > Dave Martin (4):
> > >> >> >   arm64: neon: Add missing header guard in <asm/neon.h>
> > >> >> >   arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate
> > >> >> >   arm64: neon: Allow EFI runtime services to use FPSIMD in irq context
> > >> >> >   arm64: neon: Remove support for nested or hardirq kernel-mode NEON
> > >> >>
> > >> >> Queued for 4.14. Thanks.
> > >> >
> > >> > ... and reverted.
> > >> >
> > >> > I now realised that it doesn't build without Ard's crypto series (I had
> > >> > the wrong impression it is just a performance impact). I get errors
> > >> > like:
> > >> >
> > >> > arch/arm64/crypto/sha1-ce-glue.c|41 col 2| error: implicit declaration
> > >> > of function ?kernel_neon_begin_partial?
> > >>
> > >> Ah yes. Apologies for failing to mention that. Apparently, there are
> > >> in fact build time cross-dependencies that I forgot about.
> > >>
> > >> To avoid delaying this unnecessarily, we just alias
> > >> kernel_neon_begin_partial() to kernel_neon_begin() and ignore the
> > >> argument. We can remove that again when everything has gone upstream.
> > >
> > > But would kernel_neon_begin() be called in an interrupt context from the
> > > crypto code? I'd like the arm64 for-next/core branch (to be based on
> > > -rc4 next week) to still be fully working.
> > >
> >
> > Well, theoretically. In the new situation we allow KMN from process
> > and softirq context, but the latter only if no process context KMN is
> > in progress. This last rule could be violated by the old crypto code,
> > but the chances of actually hitting that are very low: if your wifi
> > chip lacks a h/w implementation of CCMP (which is uncommon these
> > days), and it gets invoked while KMN crypto is being used on the same
> > core, i.e., for IPsec or block encryption, you may get a splat. It
> > wouldn't corrupt your encrypted file system.
> >
> > Perhaps we could stick it on a separate branch for this cycle (with
> > the suggested tweak) and get it pulled it into -next separately? That
> > way, you can forward it as a late pull, after Herbert's stuff gets in.
> 
> Or I could prepend the patches to the SVE series, and we wait for Linux
> to pull the crypto branch before merging that.  (I still need to repost
> and the SVE patches need to be reviewed, so that isn't imminent.)

I'll publish a branch early next week with the kernel mode neon patches
and you can base the SVE on top. We can test them together but I'll
wait for the crypto stuff to go in before sending this branch to Linus.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 14:05             ` Catalin Marinas
@ 2017-08-04 14:25               ` Dave Martin
  2017-08-04 14:50                 ` Catalin Marinas
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Martin @ 2017-08-04 14:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 03:05:26PM +0100, Catalin Marinas wrote:
> On Fri, Aug 04, 2017 at 02:26:06PM +0100, Dave P Martin wrote:
> > On Fri, Aug 04, 2017 at 02:21:52PM +0100, Ard Biesheuvel wrote:

[...]

> > > >> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:

[...]

> > > >> > ... and reverted.

[...]

> > > Perhaps we could stick it on a separate branch for this cycle (with
> > > the suggested tweak) and get it pulled it into -next separately? That
> > > way, you can forward it as a late pull, after Herbert's stuff gets in.
> > 
> > Or I could prepend the patches to the SVE series, and we wait for Linux
> > to pull the crypto branch before merging that.  (I still need to repost
> > and the SVE patches need to be reviewed, so that isn't imminent.)
> 
> I'll publish a branch early next week with the kernel mode neon patches
> and you can base the SVE on top. We can test them together but I'll
> wait for the crypto stuff to go in before sending this branch to Linus.

OK, this sounds like a good approach.

Ping me when you have a branch, but anyway I'll drop those patches from
the SVE series.

Thanks
---Dave

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/5] Simplify kernel-mode NEON
  2017-08-04 14:25               ` Dave Martin
@ 2017-08-04 14:50                 ` Catalin Marinas
  0 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2017-08-04 14:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Aug 04, 2017 at 03:25:05PM +0100, Dave P Martin wrote:
> On Fri, Aug 04, 2017 at 03:05:26PM +0100, Catalin Marinas wrote:
> > On Fri, Aug 04, 2017 at 02:26:06PM +0100, Dave P Martin wrote:
> > > On Fri, Aug 04, 2017 at 02:21:52PM +0100, Ard Biesheuvel wrote:
> 
> [...]
> 
> > > > >> On 4 August 2017 at 13:22, Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> [...]
> 
> > > > >> > ... and reverted.
> 
> [...]
> 
> > > > Perhaps we could stick it on a separate branch for this cycle (with
> > > > the suggested tweak) and get it pulled it into -next separately? That
> > > > way, you can forward it as a late pull, after Herbert's stuff gets in.
> > > 
> > > Or I could prepend the patches to the SVE series, and we wait for Linux
> > > to pull the crypto branch before merging that.  (I still need to repost
> > > and the SVE patches need to be reviewed, so that isn't imminent.)
> > 
> > I'll publish a branch early next week with the kernel mode neon patches
> > and you can base the SVE on top. We can test them together but I'll
> > wait for the crypto stuff to go in before sending this branch to Linus.
> 
> OK, this sounds like a good approach.
> 
> Ping me when you have a branch, but anyway I'll drop those patches from
> the SVE series.

Here it is:

  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux for-next/kernel-mode-neon

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON
  2017-08-03 16:23 ` [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON Dave Martin
@ 2017-08-07 11:23   ` Catalin Marinas
  2017-08-07 11:31     ` Ard Biesheuvel
  0 siblings, 1 reply; 22+ messages in thread
From: Catalin Marinas @ 2017-08-07 11:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Aug 03, 2017 at 05:23:23PM +0100, Dave P Martin wrote:
> diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> index 96959b5..5a1a927 100644
> --- a/arch/arm64/include/asm/simd.h
> +++ b/arch/arm64/include/asm/simd.h
> @@ -9,15 +9,46 @@
>  #ifndef __ASM_SIMD_H
>  #define __ASM_SIMD_H
>  
> +#include <linux/compiler.h>
> +#include <linux/percpu.h>
> +#include <linux/preempt.h>
>  #include <linux/types.h>
>  
> +#ifdef CONFIG_KERNEL_MODE_NEON
> +
> +DECLARE_PER_CPU(bool, kernel_neon_busy);
[...]
> @@ -233,49 +254,69 @@ void fpsimd_flush_task_state(struct task_struct *t)
>  
>  #ifdef CONFIG_KERNEL_MODE_NEON
>  
> -static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
> -static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
> +DEFINE_PER_CPU(bool, kernel_neon_busy);

This variable needs to be exported to modules (allmodconfig fails to
build with these patches). Any preference for GPL vs non-GPL export?

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 138fcfaeadc1..5dde6f5961a1 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -255,6 +255,7 @@ void fpsimd_flush_task_state(struct task_struct *t)
 #ifdef CONFIG_KERNEL_MODE_NEON
 
 DEFINE_PER_CPU(bool, kernel_neon_busy);
+EXPORT_PER_CPU_SYMBOL_GPL(kernel_neon_busy);
 
 /*
  * Kernel-side NEON support functions

-- 
Catalin

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON
  2017-08-07 11:23   ` Catalin Marinas
@ 2017-08-07 11:31     ` Ard Biesheuvel
  2017-08-07 11:35       ` Catalin Marinas
  0 siblings, 1 reply; 22+ messages in thread
From: Ard Biesheuvel @ 2017-08-07 11:31 UTC (permalink / raw)
  To: linux-arm-kernel

On 7 August 2017 at 12:23, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Thu, Aug 03, 2017 at 05:23:23PM +0100, Dave P Martin wrote:
>> diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
>> index 96959b5..5a1a927 100644
>> --- a/arch/arm64/include/asm/simd.h
>> +++ b/arch/arm64/include/asm/simd.h
>> @@ -9,15 +9,46 @@
>>  #ifndef __ASM_SIMD_H
>>  #define __ASM_SIMD_H
>>
>> +#include <linux/compiler.h>
>> +#include <linux/percpu.h>
>> +#include <linux/preempt.h>
>>  #include <linux/types.h>
>>
>> +#ifdef CONFIG_KERNEL_MODE_NEON
>> +
>> +DECLARE_PER_CPU(bool, kernel_neon_busy);
> [...]
>> @@ -233,49 +254,69 @@ void fpsimd_flush_task_state(struct task_struct *t)
>>
>>  #ifdef CONFIG_KERNEL_MODE_NEON
>>
>> -static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
>> -static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
>> +DEFINE_PER_CPU(bool, kernel_neon_busy);
>
> This variable needs to be exported to modules (allmodconfig fails to
> build with these patches). Any preference for GPL vs non-GPL export?
>

It should match whatever kernel_neon_begin|end use.


> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 138fcfaeadc1..5dde6f5961a1 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -255,6 +255,7 @@ void fpsimd_flush_task_state(struct task_struct *t)
>  #ifdef CONFIG_KERNEL_MODE_NEON
>
>  DEFINE_PER_CPU(bool, kernel_neon_busy);
> +EXPORT_PER_CPU_SYMBOL_GPL(kernel_neon_busy);
>
>  /*
>   * Kernel-side NEON support functions
>
> --
> Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON
  2017-08-07 11:31     ` Ard Biesheuvel
@ 2017-08-07 11:35       ` Catalin Marinas
  0 siblings, 0 replies; 22+ messages in thread
From: Catalin Marinas @ 2017-08-07 11:35 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Aug 07, 2017 at 12:31:47PM +0100, Ard Biesheuvel wrote:
> On 7 August 2017 at 12:23, Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Thu, Aug 03, 2017 at 05:23:23PM +0100, Dave P Martin wrote:
> >> diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
> >> index 96959b5..5a1a927 100644
> >> --- a/arch/arm64/include/asm/simd.h
> >> +++ b/arch/arm64/include/asm/simd.h
> >> @@ -9,15 +9,46 @@
> >>  #ifndef __ASM_SIMD_H
> >>  #define __ASM_SIMD_H
> >>
> >> +#include <linux/compiler.h>
> >> +#include <linux/percpu.h>
> >> +#include <linux/preempt.h>
> >>  #include <linux/types.h>
> >>
> >> +#ifdef CONFIG_KERNEL_MODE_NEON
> >> +
> >> +DECLARE_PER_CPU(bool, kernel_neon_busy);
> > [...]
> >> @@ -233,49 +254,69 @@ void fpsimd_flush_task_state(struct task_struct *t)
> >>
> >>  #ifdef CONFIG_KERNEL_MODE_NEON
> >>
> >> -static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
> >> -static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
> >> +DEFINE_PER_CPU(bool, kernel_neon_busy);
> >
> > This variable needs to be exported to modules (allmodconfig fails to
> > build with these patches). Any preference for GPL vs non-GPL export?
> 
> It should match whatever kernel_neon_begin|end use.

Ah, good point. I'll drop the _GPL part.

-- 
Catalin

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-08-07 11:35 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-03 16:23 [PATCH 0/5] Simplify kernel-mode NEON Dave Martin
2017-08-03 16:23 ` [PATCH 1/5] arm64: neon: replace generic definition of may_use_simd() Dave Martin
2017-08-03 16:23 ` [PATCH 2/5] arm64: neon: Add missing header guard in <asm/neon.h> Dave Martin
2017-08-03 16:23 ` [PATCH 3/5] arm64: fpsimd: Consistently use __this_cpu_ ops where appropriate Dave Martin
2017-08-03 16:23 ` [PATCH 4/5] arm64: neon: Allow EFI runtime services to use FPSIMD in irq context Dave Martin
2017-08-03 16:23 ` [PATCH 5/5] arm64: neon: Remove support for nested or hardirq kernel-mode NEON Dave Martin
2017-08-07 11:23   ` Catalin Marinas
2017-08-07 11:31     ` Ard Biesheuvel
2017-08-07 11:35       ` Catalin Marinas
2017-08-04 12:08 ` [PATCH 0/5] Simplify " Catalin Marinas
2017-08-04 12:22   ` Catalin Marinas
2017-08-04 12:25     ` Ard Biesheuvel
2017-08-04 12:41       ` Dave P Martin
2017-08-04 13:12       ` Catalin Marinas
2017-08-04 13:21         ` Ard Biesheuvel
2017-08-04 13:26           ` Dave P Martin
2017-08-04 14:05             ` Catalin Marinas
2017-08-04 14:25               ` Dave Martin
2017-08-04 14:50                 ` Catalin Marinas
2017-08-04 14:00           ` Catalin Marinas
2017-08-04 14:04             ` Ard Biesheuvel
2017-08-04 13:22         ` Dave Martin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.