linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] arm64: Fix support for systems without FP/SIMD
@ 2019-10-10 17:15 Suzuki K Poulose
  2019-10-10 17:15 ` [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability Suzuki K Poulose
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-10 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, will, mark.rutland, catalin.marinas, dave.martin,
	Suzuki K Poulose

This series fixes the original support for systems without FP/SIMD.
We have three set of issues with the current code :

1) We detect the absence of FP/SIMD after the SMP cpus are brought
online (i.e, SYSTEM scope). This means, some of the early kernel threads
could run with their TIF_FOREIGN_FPSTATE flag set which might be
inherited by applications forked by them (e.g, modprobe from initramfs).

Also we allow a hotplugged CPU to boot successfully even if it doesn't
have the FP/SIMD when we have decided otherwise at boot and have now
advertised the ELF HWCAP for the userspace.
Fix this by turning this to a BOOT_RESTRICTED_CPU_LOCAL feature to
allow the detection of the feature the very moment a CPU turns up
without FP/SIMD and also prevent a conflict after SMP boot.

2) As mentioned above, some tasks could have the TIF flag set,
which will never be cleared after we detect the capability.
Thus they could get stuck indefinitely in do_notfiy_resume().
Fix this by clearing the TIF flag for such tasks but continuing
to avoid the save/restore of the FP state.

3) The compat ELF_HWCAP bits are statically initialised to indicate
that the FP/SIMD support is available. This must be updated dynamically
to provide the correct flags to the userspace.

Tested with a 32bit Debian Jessie fs on Fast model (with and without
FP support).

Suzuki K Poulose (3):
  arm64: cpufeature: Fix the type of no FP/SIMD capability
  arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks
  arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly

 arch/arm64/kernel/cpufeature.c | 39 ++++++++++++++++++++++++++++++----
 arch/arm64/kernel/fpsimd.c     | 26 ++++++++++++++---------
 2 files changed, 51 insertions(+), 14 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-10 17:15 [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Suzuki K Poulose
@ 2019-10-10 17:15 ` Suzuki K Poulose
  2019-10-11 11:36   ` Dave Martin
  2019-10-10 17:15 ` [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks Suzuki K Poulose
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-10 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, will, mark.rutland, catalin.marinas, dave.martin,
	Suzuki K Poulose

The NO_FPSIMD capability is defined with scope SYSTEM, which implies
that the "absence" of FP/SIMD on at least one CPU is detected only
after all the SMP CPUs are brought up. However, we use the status
of this capability for every context switch. So, let us change
the scop to LOCAL_CPU to allow the detection of this capability
as and when the first CPU without FP is brought up.

Also, the current type allows hotplugged CPU to be brought up without
FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
up. Fix both of these issues by changing the capability to
BOOT_RESTRICTED_LOCAL_CPU_FEATURE.

Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/cpufeature.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9323bcc40a58..0f9eace6c64b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		/* FP/SIMD is not implemented */
 		.capability = ARM64_HAS_NO_FPSIMD,
-		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
 		.min_field_value = 0,
 		.matches = has_no_fpsimd,
 	},
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks
  2019-10-10 17:15 [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Suzuki K Poulose
  2019-10-10 17:15 ` [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability Suzuki K Poulose
@ 2019-10-10 17:15 ` Suzuki K Poulose
  2019-10-11 11:26   ` Dave Martin
  2019-10-10 17:15 ` [PATCH 3/3] arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly Suzuki K Poulose
  2019-10-17  0:06 ` [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Will Deacon
  3 siblings, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-10 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, will, mark.rutland, catalin.marinas, dave.martin,
	Suzuki K Poulose

We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
could be inherited by early userspace applications (e.g, modprobe triggered
from initramfs). This could end up in the applications stuck in
do_nofity_resume() as we never clear the TIF flag, once we now know that
we don't support FP.

Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
for tasks which may have them set, as we would have done in the normal
case, but avoiding touching the hardware state (since we don't support any).

Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/fpsimd.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 37d3912cfe06..dfcdd077aeca 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
  */
 void fpsimd_restore_current_state(void)
 {
-	if (!system_supports_fpsimd())
-		return;
-
 	get_cpu_fpsimd_context();
-
-	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
+	/*
+	 * For the tasks that were created before we detected the absence of
+	 * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via fpsimd_thread_switch()
+	 * and/or could be inherited from the parent(init_task has this set). Even
+	 * though userspace has not run yet, this could be inherited by the
+	 * processes forked from one of those tasks (e.g, modprobe from initramfs).
+	 * If the system doesn't support FP/SIMD, we must clear the flag for the
+	 * tasks mentioned above, to indicate that the FPSTATE is clean (as we
+	 * can't have one) to avoid looping for ever to clear the flag.
+	 */
+	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
+	    system_supports_fpsimd()) {
 		task_fpsimd_load();
 		fpsimd_bind_task_to_cpu();
 	}
@@ -1148,17 +1155,16 @@ void fpsimd_restore_current_state(void)
  */
 void fpsimd_update_current_state(struct user_fpsimd_state const *state)
 {
-	if (!system_supports_fpsimd())
-		return;
-
 	get_cpu_fpsimd_context();
 
 	current->thread.uw.fpsimd_state = *state;
 	if (system_supports_sve() && test_thread_flag(TIF_SVE))
 		fpsimd_to_sve(current);
 
-	task_fpsimd_load();
-	fpsimd_bind_task_to_cpu();
+	if (system_supports_fpsimd()) {
+		task_fpsimd_load();
+		fpsimd_bind_task_to_cpu();
+	}
 
 	clear_thread_flag(TIF_FOREIGN_FPSTATE);
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/3] arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly
  2019-10-10 17:15 [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Suzuki K Poulose
  2019-10-10 17:15 ` [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability Suzuki K Poulose
  2019-10-10 17:15 ` [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks Suzuki K Poulose
@ 2019-10-10 17:15 ` Suzuki K Poulose
  2019-10-17  0:06 ` [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Will Deacon
  3 siblings, 0 replies; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-10 17:15 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, will, mark.rutland, catalin.marinas, dave.martin,
	Suzuki K Poulose

We set the compat_elf_hwcap bits unconditionally on arm64 to
include the VFP and NEON support. However, the FP/SIMD unit
is optional on Arm v8 and thus could be missing. We already
handle this properly in the kernel, but still advertise to
the COMPAT applications that the VFP is available. Fix this
to make sure we only advertise when we really have them.

Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since v1:
  - Switch to using cpuid_feature_extract_unsigned_field() rather than
    hard coding field extraction.
---
 arch/arm64/kernel/cpufeature.c | 37 +++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0f9eace6c64b..d260e3bdf07b 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -32,9 +32,7 @@ static unsigned long elf_hwcap __read_mostly;
 #define COMPAT_ELF_HWCAP_DEFAULT	\
 				(COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\
 				 COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\
-				 COMPAT_HWCAP_TLS|COMPAT_HWCAP_VFP|\
-				 COMPAT_HWCAP_VFPv3|COMPAT_HWCAP_VFPv4|\
-				 COMPAT_HWCAP_NEON|COMPAT_HWCAP_IDIV|\
+				 COMPAT_HWCAP_TLS|COMPAT_HWCAP_IDIV|\
 				 COMPAT_HWCAP_LPAE)
 unsigned int compat_elf_hwcap __read_mostly = COMPAT_ELF_HWCAP_DEFAULT;
 unsigned int compat_elf_hwcap2 __read_mostly;
@@ -1589,6 +1587,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.match_list = list,						\
 	}
 
+#define HWCAP_CAP_MATCH(match, cap_type, cap)					\
+	{									\
+		__HWCAP_CAP(#cap, cap_type, cap)				\
+		.matches = match,						\
+	}
+
 #ifdef CONFIG_ARM64_PTR_AUTH
 static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
 	{
@@ -1662,8 +1666,35 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	{},
 };
 
+#ifdef CONFIG_COMPAT
+static bool compat_has_neon(const struct arm64_cpu_capabilities *cap, int scope)
+{
+	/*
+	 * Check that all of MVFR1_EL1.{SIMDSP, SIMDInt, SIMDLS} are available,
+	 * in line with that of arm32 as in vfp_init(). We make sure that the
+	 * check is future proof, by making sure value is non-zero.
+	 */
+	u32 mvfr1;
+
+	WARN_ON(scope == SCOPE_LOCAL_CPU && preemptible());
+	if (scope == SCOPE_SYSTEM)
+		mvfr1 = read_sanitised_ftr_reg(SYS_MVFR1_EL1);
+	else
+		mvfr1 = read_sysreg_s(SYS_MVFR1_EL1);
+
+	return cpuid_feature_extract_unsigned_field(mvfr1, MVFR1_SIMDSP_SHIFT) &&
+		cpuid_feature_extract_unsigned_field(mvfr1, MVFR1_SIMDINT_SHIFT) &&
+		cpuid_feature_extract_unsigned_field(mvfr1, MVFR1_SIMDLS_SHIFT);
+}
+#endif
+
 static const struct arm64_cpu_capabilities compat_elf_hwcaps[] = {
 #ifdef CONFIG_COMPAT
+	HWCAP_CAP_MATCH(compat_has_neon, CAP_COMPAT_HWCAP, COMPAT_HWCAP_NEON),
+	HWCAP_CAP(SYS_MVFR1_EL1, MVFR1_SIMDFMAC_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv4),
+	/* Arm v8 mandates MVFR0.FPDP == {0, 2}. So, piggy back on this for the presence of VFP support */
+	HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFP),
+	HWCAP_CAP(SYS_MVFR0_EL1, MVFR0_FPDP_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP, COMPAT_HWCAP_VFPv3),
 	HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 2, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_PMULL),
 	HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_AES_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_AES),
 	HWCAP_CAP(SYS_ID_ISAR5_EL1, ID_ISAR5_SHA1_SHIFT, FTR_UNSIGNED, 1, CAP_COMPAT_HWCAP2, COMPAT_HWCAP2_SHA1),
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks
  2019-10-10 17:15 ` [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks Suzuki K Poulose
@ 2019-10-11 11:26   ` Dave Martin
  2019-10-17 12:42     ` Suzuki K Poulose
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Martin @ 2019-10-11 11:26 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, mark.rutland, catalin.marinas, linux-kernel, will

On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote:
> We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
> we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
> could be inherited by early userspace applications (e.g, modprobe triggered
> from initramfs). This could end up in the applications stuck in
> do_nofity_resume() as we never clear the TIF flag, once we now know that
> we don't support FP.
> 
> Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
> for tasks which may have them set, as we would have done in the normal
> case, but avoiding touching the hardware state (since we don't support any).
> 
> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/kernel/fpsimd.c | 26 ++++++++++++++++----------
>  1 file changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 37d3912cfe06..dfcdd077aeca 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
>   */
>  void fpsimd_restore_current_state(void)
>  {
> -	if (!system_supports_fpsimd())
> -		return;
> -
>  	get_cpu_fpsimd_context();
> -
> -	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
> +	/*
> +	 * For the tasks that were created before we detected the absence of
> +	 * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via fpsimd_thread_switch()
> +	 * and/or could be inherited from the parent(init_task has this set). Even
> +	 * though userspace has not run yet, this could be inherited by the
> +	 * processes forked from one of those tasks (e.g, modprobe from initramfs).
> +	 * If the system doesn't support FP/SIMD, we must clear the flag for the
> +	 * tasks mentioned above, to indicate that the FPSTATE is clean (as we
> +	 * can't have one) to avoid looping for ever to clear the flag.
> +	 */
> +	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
> +	    system_supports_fpsimd()) {

I'm not too keen on this approach: elsewhere we just stub out all the
FPSIMD handling logic if !system_supports_fpsimd() -- I think we should
be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE.

Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task
and this task is current() and the system supports FPSIMD at all, this
task's FPSIMD state is not loaded in the cpu".

I think we should ensure that any check on TIF_FOREIGN_FPSTATE is
shadowed by a check on system_supports_fpsimd() somewhere.  This already
exists in many places -- we just need to fill in the missing ones.

fpsimd_save() is a backend function that should only be called if
system_supports_fpsimd(), so that should not need any check internally,
but we should make sure that calls to this function are appropriately
protected with in if (system_supports_fpsimd()).

For other maintenance functions intended for outside callers:

 * fpsimd_bind_task_to_cpu()
 * fpsimd_bind_state_to_cpu()
 * fpsimd_flush_task_state()
 * fpsimd_save_and_flush_cpu_state()

the situation is less clear.  Does is make sense to call these at all
if !system_supports_fpsimd()?  I'm not currently sure.  We could at
least drop some WARN_ON() into these to check, after revieweing their
callsites.

>  		task_fpsimd_load();
>  		fpsimd_bind_task_to_cpu();
>  	}
> @@ -1148,17 +1155,16 @@ void fpsimd_restore_current_state(void)
>   */
>  void fpsimd_update_current_state(struct user_fpsimd_state const *state)
>  {
> -	if (!system_supports_fpsimd())
> -		return;
> -
>  	get_cpu_fpsimd_context();
>  
>  	current->thread.uw.fpsimd_state = *state;
>  	if (system_supports_sve() && test_thread_flag(TIF_SVE))
>  		fpsimd_to_sve(current);

Why should we do this stuff on a system that doesn't support FP?

> -	task_fpsimd_load();
> -	fpsimd_bind_task_to_cpu();
> +	if (system_supports_fpsimd()) {
> +		task_fpsimd_load();
> +		fpsimd_bind_task_to_cpu();
> +	}
>  
>  	clear_thread_flag(TIF_FOREIGN_FPSTATE);

[...]

Not in scope for a stable fix, but:

It would be interesting to try to strip out TIF_FOREIGN_FPSTATE
entirely and do some benchmarks and irq latency measurements:

TIF_FOREIGN_FPSTATE is just a cached copy of the wrong_task || wrong_cpu
condition defined in fpsimd_thread_switch() --

That means we have to do maintenance on it all over the place to keep
it in sync with the condition it represents -- this has proven to be
a source of complexity and subtle bugs, as well as making the code
fragile to maintain.

The only point of all this is so that there is a thread flag for
do_notify_resume() to check.  Now that do_notify_resume() is C it would
be trivial to check the real condition -- there would be a cost
increase and interrupt latency increase here, but maybe not that much.

This wouldn't solve the whole problem, but it might remove a layer of
complexity.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-10 17:15 ` [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability Suzuki K Poulose
@ 2019-10-11 11:36   ` Dave Martin
  2019-10-11 12:13     ` Suzuki K Poulose
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Martin @ 2019-10-11 11:36 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, mark.rutland, catalin.marinas, linux-kernel, will

On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> that the "absence" of FP/SIMD on at least one CPU is detected only
> after all the SMP CPUs are brought up. However, we use the status
> of this capability for every context switch. So, let us change
> the scop to LOCAL_CPU to allow the detection of this capability
> as and when the first CPU without FP is brought up.
> 
> Also, the current type allows hotplugged CPU to be brought up without
> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> up. Fix both of these issues by changing the capability to
> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> 
> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/kernel/cpufeature.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index 9323bcc40a58..0f9eace6c64b 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  	{
>  		/* FP/SIMD is not implemented */
>  		.capability = ARM64_HAS_NO_FPSIMD,
> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,

ARM64_HAS_NO_FPSIMD is really a disability, not a capability.

Although we have other things that smell like this (CPU errata for
example), I wonder whether inverting the meaning in the case would
make the situation easier to understand.

So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
IIUC.  We'd just need to invert the sense of the check in
system_supports_fpsimd().

>  		.min_field_value = 0,

(Does .min_field_value == 0 make sense, or is it even used?  I thought
only the default has_cpuid_feature() match logic uses that.)

>  		.matches = has_no_fpsimd,
>  	},

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-11 11:36   ` Dave Martin
@ 2019-10-11 12:13     ` Suzuki K Poulose
  2019-10-11 14:21       ` Dave Martin
  0 siblings, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-11 12:13 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, mark.rutland, catalin.marinas, linux-kernel, will

Hi Dave

On 11/10/2019 12:36, Dave Martin wrote:
> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
>> that the "absence" of FP/SIMD on at least one CPU is detected only
>> after all the SMP CPUs are brought up. However, we use the status
>> of this capability for every context switch. So, let us change
>> the scop to LOCAL_CPU to allow the detection of this capability
>> as and when the first CPU without FP is brought up.
>>
>> Also, the current type allows hotplugged CPU to be brought up without
>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
>> up. Fix both of these issues by changing the capability to
>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
>>
>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   arch/arm64/kernel/cpufeature.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>> index 9323bcc40a58..0f9eace6c64b 100644
>> --- a/arch/arm64/kernel/cpufeature.c
>> +++ b/arch/arm64/kernel/cpufeature.c
>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>   	{
>>   		/* FP/SIMD is not implemented */
>>   		.capability = ARM64_HAS_NO_FPSIMD,
>> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
>> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> 
> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> 
> Although we have other things that smell like this (CPU errata for
> example), I wonder whether inverting the meaning in the case would
> make the situation easier to understand.

Yes, it is indeed a disability, more on that below.

> 
> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> IIUC.  We'd just need to invert the sense of the check in
> system_supports_fpsimd().

This is particularly something we want to avoid with this patch. We want
to make sure that we have the up-to-date status of the disability right
when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
you must wait until all the CPUs are up, unlike the negated capability.

> 
>>   		.min_field_value = 0,
> 
> (Does .min_field_value == 0 make sense, or is it even used?  I thought
> only the default has_cpuid_feature() match logic uses that.)

True, it is not used for this particular case.

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-11 12:13     ` Suzuki K Poulose
@ 2019-10-11 14:21       ` Dave Martin
  2019-10-11 17:28         ` Suzuki K Poulose
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Martin @ 2019-10-11 14:21 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mark.rutland, catalin.marinas, will, linux-kernel, linux-arm-kernel

On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> 
> On 11/10/2019 12:36, Dave Martin wrote:
> >On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> >>The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> >>that the "absence" of FP/SIMD on at least one CPU is detected only
> >>after all the SMP CPUs are brought up. However, we use the status
> >>of this capability for every context switch. So, let us change
> >>the scop to LOCAL_CPU to allow the detection of this capability
> >>as and when the first CPU without FP is brought up.
> >>
> >>Also, the current type allows hotplugged CPU to be brought up without
> >>FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> >>up. Fix both of these issues by changing the capability to
> >>BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> >>
> >>Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>Cc: Will Deacon <will@kernel.org>
> >>Cc: Mark Rutland <mark.rutland@arm.com>
> >>Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>---
> >>  arch/arm64/kernel/cpufeature.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >>diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >>index 9323bcc40a58..0f9eace6c64b 100644
> >>--- a/arch/arm64/kernel/cpufeature.c
> >>+++ b/arch/arm64/kernel/cpufeature.c
> >>@@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >>  	{
> >>  		/* FP/SIMD is not implemented */
> >>  		.capability = ARM64_HAS_NO_FPSIMD,
> >>-		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> >>+		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> >
> >ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> >
> >Although we have other things that smell like this (CPU errata for
> >example), I wonder whether inverting the meaning in the case would
> >make the situation easier to understand.
> 
> Yes, it is indeed a disability, more on that below.
> 
> >
> >So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> >value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> >IIUC.  We'd just need to invert the sense of the check in
> >system_supports_fpsimd().
> 
> This is particularly something we want to avoid with this patch. We want
> to make sure that we have the up-to-date status of the disability right
> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> you must wait until all the CPUs are up, unlike the negated capability.

I don't see why waiting for the random defective early CPU to come up is
better than waiting for all the early CPUs to come up and then deciding.

Kernel-mode NEON aside, the status of this cap should not matter until
we enter userspace for the first time.

The only issue is if e.g., crypto drivers that can use kernel-mode NEON
probe for it before all early CPUs are up, and so cache the wrong
decision.  The current approach doesn't cope with that anyway AFAICT.

> >>  		.min_field_value = 0,
> >
> >(Does .min_field_value == 0 make sense, or is it even used?  I thought
> >only the default has_cpuid_feature() match logic uses that.)
> 
> True, it is not used for this particular case.

Ok, just wondering.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-11 14:21       ` Dave Martin
@ 2019-10-11 17:28         ` Suzuki K Poulose
  2019-10-14 14:52           ` Dave Martin
  0 siblings, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-11 17:28 UTC (permalink / raw)
  To: Dave Martin
  Cc: mark.rutland, catalin.marinas, will, linux-kernel, linux-arm-kernel



On 11/10/2019 15:21, Dave Martin wrote:
> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
>>
>> On 11/10/2019 12:36, Dave Martin wrote:
>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
>>>> after all the SMP CPUs are brought up. However, we use the status
>>>> of this capability for every context switch. So, let us change
>>>> the scop to LOCAL_CPU to allow the detection of this capability
>>>> as and when the first CPU without FP is brought up.
>>>>
>>>> Also, the current type allows hotplugged CPU to be brought up without
>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
>>>> up. Fix both of these issues by changing the capability to
>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
>>>>
>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
>>>> Cc: Will Deacon <will@kernel.org>
>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> ---
>>>>   arch/arm64/kernel/cpufeature.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>> index 9323bcc40a58..0f9eace6c64b 100644
>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>>>   	{
>>>>   		/* FP/SIMD is not implemented */
>>>>   		.capability = ARM64_HAS_NO_FPSIMD,
>>>> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
>>>> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
>>>
>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
>>>
>>> Although we have other things that smell like this (CPU errata for
>>> example), I wonder whether inverting the meaning in the case would
>>> make the situation easier to understand.
>>
>> Yes, it is indeed a disability, more on that below.
>>
>>>
>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
>>> IIUC.  We'd just need to invert the sense of the check in
>>> system_supports_fpsimd().
>>
>> This is particularly something we want to avoid with this patch. We want
>> to make sure that we have the up-to-date status of the disability right
>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
>> you must wait until all the CPUs are up, unlike the negated capability.
> 
> I don't see why waiting for the random defective early CPU to come up is
> better than waiting for all the early CPUs to come up and then deciding.
> 
> Kernel-mode NEON aside, the status of this cap should not matter until
> we enter userspace for the first time.
> 
> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> probe for it before all early CPUs are up, and so cache the wrong
> decision.  The current approach doesn't cope with that anyway AFAICT.

This approach does in fact. With LOCAL_CPU scope, the moment a defective
CPU turns up, we mark the "capability" and thus the kernel cannot use
the neon then onwards, unlike the existing case where we have time till
we boot all the CPUs (even when the boot CPU may be defective).

Cheers
Suzuki

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-11 17:28         ` Suzuki K Poulose
@ 2019-10-14 14:52           ` Dave Martin
  2019-10-14 15:45             ` Suzuki K Poulose
  0 siblings, 1 reply; 23+ messages in thread
From: Dave Martin @ 2019-10-14 14:52 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mark.rutland, catalin.marinas, will, linux-kernel, linux-arm-kernel

On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> 
> 
> On 11/10/2019 15:21, Dave Martin wrote:
> >On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> >>
> >>On 11/10/2019 12:36, Dave Martin wrote:
> >>>On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> >>>>The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> >>>>that the "absence" of FP/SIMD on at least one CPU is detected only
> >>>>after all the SMP CPUs are brought up. However, we use the status
> >>>>of this capability for every context switch. So, let us change
> >>>>the scop to LOCAL_CPU to allow the detection of this capability
> >>>>as and when the first CPU without FP is brought up.
> >>>>
> >>>>Also, the current type allows hotplugged CPU to be brought up without
> >>>>FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> >>>>up. Fix both of these issues by changing the capability to
> >>>>BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> >>>>
> >>>>Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>>>Cc: Will Deacon <will@kernel.org>
> >>>>Cc: Mark Rutland <mark.rutland@arm.com>
> >>>>Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>>>Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>>>---
> >>>>  arch/arm64/kernel/cpufeature.c | 2 +-
> >>>>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>>diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >>>>index 9323bcc40a58..0f9eace6c64b 100644
> >>>>--- a/arch/arm64/kernel/cpufeature.c
> >>>>+++ b/arch/arm64/kernel/cpufeature.c
> >>>>@@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >>>>  	{
> >>>>  		/* FP/SIMD is not implemented */
> >>>>  		.capability = ARM64_HAS_NO_FPSIMD,
> >>>>-		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> >>>>+		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> >>>
> >>>ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> >>>
> >>>Although we have other things that smell like this (CPU errata for
> >>>example), I wonder whether inverting the meaning in the case would
> >>>make the situation easier to understand.
> >>
> >>Yes, it is indeed a disability, more on that below.
> >>
> >>>
> >>>So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> >>>value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> >>>IIUC.  We'd just need to invert the sense of the check in
> >>>system_supports_fpsimd().
> >>
> >>This is particularly something we want to avoid with this patch. We want
> >>to make sure that we have the up-to-date status of the disability right
> >>when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> >>you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> >>you must wait until all the CPUs are up, unlike the negated capability.
> >
> >I don't see why waiting for the random defective early CPU to come up is
> >better than waiting for all the early CPUs to come up and then deciding.
> >
> >Kernel-mode NEON aside, the status of this cap should not matter until
> >we enter userspace for the first time.
> >
> >The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> >probe for it before all early CPUs are up, and so cache the wrong
> >decision.  The current approach doesn't cope with that anyway AFAICT.
> 
> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> CPU turns up, we mark the "capability" and thus the kernel cannot use
> the neon then onwards, unlike the existing case where we have time till
> we boot all the CPUs (even when the boot CPU may be defective).

I guess that makes sense.

I'm now wondering what happens if anything tries to use kernel-mode NEON
before SVE is initialised -- which doesn't happen until cpufeatures
configures the system features.

I don't think your proposed change makes anything worse here, but it may
need looking into.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-14 14:52           ` Dave Martin
@ 2019-10-14 15:45             ` Suzuki K Poulose
  2019-10-14 15:50               ` Dave P Martin
  0 siblings, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-14 15:45 UTC (permalink / raw)
  To: Dave Martin
  Cc: mark.rutland, catalin.marinas, will, linux-kernel, linux-arm-kernel



On 14/10/2019 15:52, Dave Martin wrote:
> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
>>
>>
>> On 11/10/2019 15:21, Dave Martin wrote:
>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
>>>>
>>>> On 11/10/2019 12:36, Dave Martin wrote:
>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
>>>>>> after all the SMP CPUs are brought up. However, we use the status
>>>>>> of this capability for every context switch. So, let us change
>>>>>> the scop to LOCAL_CPU to allow the detection of this capability
>>>>>> as and when the first CPU without FP is brought up.
>>>>>>
>>>>>> Also, the current type allows hotplugged CPU to be brought up without
>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
>>>>>> up. Fix both of these issues by changing the capability to
>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
>>>>>>
>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
>>>>>> Cc: Will Deacon <will@kernel.org>
>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>> ---
>>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-
>>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>>>> index 9323bcc40a58..0f9eace6c64b 100644
>>>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>>>>>   	{
>>>>>>   		/* FP/SIMD is not implemented */
>>>>>>   		.capability = ARM64_HAS_NO_FPSIMD,
>>>>>> -		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
>>>>>> +		.type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
>>>>>
>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
>>>>>
>>>>> Although we have other things that smell like this (CPU errata for
>>>>> example), I wonder whether inverting the meaning in the case would
>>>>> make the situation easier to understand.
>>>>
>>>> Yes, it is indeed a disability, more on that below.
>>>>
>>>>>
>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
>>>>> IIUC.  We'd just need to invert the sense of the check in
>>>>> system_supports_fpsimd().
>>>>
>>>> This is particularly something we want to avoid with this patch. We want
>>>> to make sure that we have the up-to-date status of the disability right
>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
>>>> you must wait until all the CPUs are up, unlike the negated capability.
>>>
>>> I don't see why waiting for the random defective early CPU to come up is
>>> better than waiting for all the early CPUs to come up and then deciding.
>>>
>>> Kernel-mode NEON aside, the status of this cap should not matter until
>>> we enter userspace for the first time.
>>>
>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
>>> probe for it before all early CPUs are up, and so cache the wrong
>>> decision.  The current approach doesn't cope with that anyway AFAICT.
>>
>> This approach does in fact. With LOCAL_CPU scope, the moment a defective
>> CPU turns up, we mark the "capability" and thus the kernel cannot use
>> the neon then onwards, unlike the existing case where we have time till
>> we boot all the CPUs (even when the boot CPU may be defective).
> 
> I guess that makes sense.
> 
> I'm now wondering what happens if anything tries to use kernel-mode NEON
> before SVE is initialised -- which doesn't happen until cpufeatures
> configures the system features.
> 
> I don't think your proposed change makes anything worse here, but it may
> need looking into.

We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
is initialised ?

Suzuki

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-14 15:45             ` Suzuki K Poulose
@ 2019-10-14 15:50               ` Dave P Martin
  2019-10-14 16:57                 ` Ard Biesheuvel
  0 siblings, 1 reply; 23+ messages in thread
From: Dave P Martin @ 2019-10-14 15:50 UTC (permalink / raw)
  To: Suzuki Poulose
  Cc: Mark Rutland, Catalin Marinas, will, linux-kernel,
	linux-arm-kernel, Ard Biesheuvel

On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
>
>
> On 14/10/2019 15:52, Dave Martin wrote:
> > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> >>
> >>
> >> On 11/10/2019 15:21, Dave Martin wrote:
> >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> >>>>
> >>>> On 11/10/2019 12:36, Dave Martin wrote:
> >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> >>>>>> after all the SMP CPUs are brought up. However, we use the status
> >>>>>> of this capability for every context switch. So, let us change
> >>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> >>>>>> as and when the first CPU without FP is brought up.
> >>>>>>
> >>>>>> Also, the current type allows hotplugged CPU to be brought up without
> >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> >>>>>> up. Fix both of these issues by changing the capability to
> >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> >>>>>>
> >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>>>>> Cc: Will Deacon <will@kernel.org>
> >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
> >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>>>>> ---
> >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-
> >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> >>>>>> --- a/arch/arm64/kernel/cpufeature.c
> >>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >>>>>>        {
> >>>>>>                /* FP/SIMD is not implemented */
> >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,
> >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> >>>>>
> >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> >>>>>
> >>>>> Although we have other things that smell like this (CPU errata for
> >>>>> example), I wonder whether inverting the meaning in the case would
> >>>>> make the situation easier to understand.
> >>>>
> >>>> Yes, it is indeed a disability, more on that below.
> >>>>
> >>>>>
> >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> >>>>> IIUC.  We'd just need to invert the sense of the check in
> >>>>> system_supports_fpsimd().
> >>>>
> >>>> This is particularly something we want to avoid with this patch. We want
> >>>> to make sure that we have the up-to-date status of the disability right
> >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> >>>> you must wait until all the CPUs are up, unlike the negated capability.
> >>>
> >>> I don't see why waiting for the random defective early CPU to come up is
> >>> better than waiting for all the early CPUs to come up and then deciding.
> >>>
> >>> Kernel-mode NEON aside, the status of this cap should not matter until
> >>> we enter userspace for the first time.
> >>>
> >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> >>> probe for it before all early CPUs are up, and so cache the wrong
> >>> decision.  The current approach doesn't cope with that anyway AFAICT.
> >>
> >> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> >> CPU turns up, we mark the "capability" and thus the kernel cannot use
> >> the neon then onwards, unlike the existing case where we have time till
> >> we boot all the CPUs (even when the boot CPU may be defective).
> >
> > I guess that makes sense.
> >
> > I'm now wondering what happens if anything tries to use kernel-mode NEON
> > before SVE is initialised -- which doesn't happen until cpufeatures
> > configures the system features.
> >
> > I don't think your proposed change makes anything worse here, but it may
> > need looking into.
>
> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> is initialised ?

Could do, at least as an experiment.

Ard, do you have any thoughts on this?

Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-14 15:50               ` Dave P Martin
@ 2019-10-14 16:57                 ` Ard Biesheuvel
  2019-10-15  9:44                   ` Suzuki K Poulose
  2019-10-15 10:24                   ` Dave Martin
  0 siblings, 2 replies; 23+ messages in thread
From: Ard Biesheuvel @ 2019-10-14 16:57 UTC (permalink / raw)
  To: Dave P Martin
  Cc: Suzuki Poulose, Mark Rutland, Catalin Marinas, will,
	linux-kernel, linux-arm-kernel

On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
>
> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> >
> >
> > On 14/10/2019 15:52, Dave Martin wrote:
> > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> > >>
> > >>
> > >> On 11/10/2019 15:21, Dave Martin wrote:
> > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> > >>>>
> > >>>> On 11/10/2019 12:36, Dave Martin wrote:
> > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> > >>>>>> after all the SMP CPUs are brought up. However, we use the status
> > >>>>>> of this capability for every context switch. So, let us change
> > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> > >>>>>> as and when the first CPU without FP is brought up.
> > >>>>>>
> > >>>>>> Also, the current type allows hotplugged CPU to be brought up without
> > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> > >>>>>> up. Fix both of these issues by changing the capability to
> > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> > >>>>>>
> > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> > >>>>>> Cc: Will Deacon <will@kernel.org>
> > >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
> > >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > >>>>>> ---
> > >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-
> > >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>>>>
> > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> > >>>>>> --- a/arch/arm64/kernel/cpufeature.c
> > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> > >>>>>>        {
> > >>>>>>                /* FP/SIMD is not implemented */
> > >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,
> > >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> > >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> > >>>>>
> > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> > >>>>>
> > >>>>> Although we have other things that smell like this (CPU errata for
> > >>>>> example), I wonder whether inverting the meaning in the case would
> > >>>>> make the situation easier to understand.
> > >>>>
> > >>>> Yes, it is indeed a disability, more on that below.
> > >>>>
> > >>>>>
> > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> > >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> > >>>>> IIUC.  We'd just need to invert the sense of the check in
> > >>>>> system_supports_fpsimd().
> > >>>>
> > >>>> This is particularly something we want to avoid with this patch. We want
> > >>>> to make sure that we have the up-to-date status of the disability right
> > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> > >>>> you must wait until all the CPUs are up, unlike the negated capability.
> > >>>
> > >>> I don't see why waiting for the random defective early CPU to come up is
> > >>> better than waiting for all the early CPUs to come up and then deciding.
> > >>>
> > >>> Kernel-mode NEON aside, the status of this cap should not matter until
> > >>> we enter userspace for the first time.
> > >>>
> > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> > >>> probe for it before all early CPUs are up, and so cache the wrong
> > >>> decision.  The current approach doesn't cope with that anyway AFAICT.
> > >>
> > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> > >> CPU turns up, we mark the "capability" and thus the kernel cannot use
> > >> the neon then onwards, unlike the existing case where we have time till
> > >> we boot all the CPUs (even when the boot CPU may be defective).
> > >
> > > I guess that makes sense.
> > >
> > > I'm now wondering what happens if anything tries to use kernel-mode NEON
> > > before SVE is initialised -- which doesn't happen until cpufeatures
> > > configures the system features.
> > >
> > > I don't think your proposed change makes anything worse here, but it may
> > > need looking into.
> >
> > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> > is initialised ?
>
> Could do, at least as an experiment.
>
> Ard, do you have any thoughts on this?
>

All in-kernel NEON code checks whether the NEON is usable, so I'd
expect that check to return 'false' if it is too early in the boot for
the NEON to be used at all.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-14 16:57                 ` Ard Biesheuvel
@ 2019-10-15  9:44                   ` Suzuki K Poulose
  2019-10-15  9:52                     ` Ard Biesheuvel
  2019-10-15 10:24                   ` Dave Martin
  1 sibling, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-15  9:44 UTC (permalink / raw)
  To: Ard Biesheuvel, Dave P Martin
  Cc: Mark Rutland, Catalin Marinas, will, linux-kernel, linux-arm-kernel



On 14/10/2019 17:57, Ard Biesheuvel wrote:
> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
>>
>> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
>>>
>>>
>>> On 14/10/2019 15:52, Dave Martin wrote:
>>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
>>>>>
>>>>>
>>>>> On 11/10/2019 15:21, Dave Martin wrote:
>>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
>>>>>>>
>>>>>>> On 11/10/2019 12:36, Dave Martin wrote:
>>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
>>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
>>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
>>>>>>>>> after all the SMP CPUs are brought up. However, we use the status
>>>>>>>>> of this capability for every context switch. So, let us change
>>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability
>>>>>>>>> as and when the first CPU without FP is brought up.
>>>>>>>>>
>>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without
>>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
>>>>>>>>> up. Fix both of these issues by changing the capability to
>>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
>>>>>>>>>
>>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
>>>>>>>>> Cc: Will Deacon <will@kernel.org>
>>>>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>>>>> ---
>>>>>>>>>    arch/arm64/kernel/cpufeature.c | 2 +-
>>>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644
>>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>>>>>>>>         {
>>>>>>>>>                 /* FP/SIMD is not implemented */
>>>>>>>>>                 .capability = ARM64_HAS_NO_FPSIMD,
>>>>>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
>>>>>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
>>>>>>>>
>>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
>>>>>>>>
>>>>>>>> Although we have other things that smell like this (CPU errata for
>>>>>>>> example), I wonder whether inverting the meaning in the case would
>>>>>>>> make the situation easier to understand.
>>>>>>>
>>>>>>> Yes, it is indeed a disability, more on that below.
>>>>>>>
>>>>>>>>
>>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
>>>>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
>>>>>>>> IIUC.  We'd just need to invert the sense of the check in
>>>>>>>> system_supports_fpsimd().
>>>>>>>
>>>>>>> This is particularly something we want to avoid with this patch. We want
>>>>>>> to make sure that we have the up-to-date status of the disability right
>>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
>>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
>>>>>>> you must wait until all the CPUs are up, unlike the negated capability.
>>>>>>
>>>>>> I don't see why waiting for the random defective early CPU to come up is
>>>>>> better than waiting for all the early CPUs to come up and then deciding.
>>>>>>
>>>>>> Kernel-mode NEON aside, the status of this cap should not matter until
>>>>>> we enter userspace for the first time.
>>>>>>
>>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
>>>>>> probe for it before all early CPUs are up, and so cache the wrong
>>>>>> decision.  The current approach doesn't cope with that anyway AFAICT.
>>>>>
>>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective
>>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use
>>>>> the neon then onwards, unlike the existing case where we have time till
>>>>> we boot all the CPUs (even when the boot CPU may be defective).
>>>>
>>>> I guess that makes sense.
>>>>
>>>> I'm now wondering what happens if anything tries to use kernel-mode NEON
>>>> before SVE is initialised -- which doesn't happen until cpufeatures
>>>> configures the system features.
>>>>
>>>> I don't think your proposed change makes anything worse here, but it may
>>>> need looking into.
>>>
>>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
>>> is initialised ?
>>
>> Could do, at least as an experiment.
>>
>> Ard, do you have any thoughts on this?
>>
> 
> All in-kernel NEON code checks whether the NEON is usable, so I'd
> expect that check to return 'false' if it is too early in the boot for
> the NEON to be used at all.

Ok. That implies, we need a check to make sure SVE set up is complete,
which we don't at the moment, as we default to assume FP/SIMD is available.

"system_can_use_fpsimd()" instead of the "system_supports_fpsimd() where
the former should indicate:

  system_supports_fpsimd() && sve_setup_complete()

Where the sve_setup_complete() can itself be a static key, initialized
very early if we have !CONFIG_SVE. Otherwise, set from sve_setup().


Thoughts ?
Suzuki


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-15  9:44                   ` Suzuki K Poulose
@ 2019-10-15  9:52                     ` Ard Biesheuvel
  0 siblings, 0 replies; 23+ messages in thread
From: Ard Biesheuvel @ 2019-10-15  9:52 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Dave P Martin, Mark Rutland, Catalin Marinas, will, linux-kernel,
	linux-arm-kernel

On Tue, 15 Oct 2019 at 11:44, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
>
>
> On 14/10/2019 17:57, Ard Biesheuvel wrote:
> > On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
> >>
> >> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> >>>
> >>>
> >>> On 14/10/2019 15:52, Dave Martin wrote:
> >>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> >>>>>
> >>>>>
> >>>>> On 11/10/2019 15:21, Dave Martin wrote:
> >>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> >>>>>>>
> >>>>>>> On 11/10/2019 12:36, Dave Martin wrote:
> >>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> >>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> >>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> >>>>>>>>> after all the SMP CPUs are brought up. However, we use the status
> >>>>>>>>> of this capability for every context switch. So, let us change
> >>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> >>>>>>>>> as and when the first CPU without FP is brought up.
> >>>>>>>>>
> >>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without
> >>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> >>>>>>>>> up. Fix both of these issues by changing the capability to
> >>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> >>>>>>>>>
> >>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>>>>>>>> Cc: Will Deacon <will@kernel.org>
> >>>>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
> >>>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>>>>>>>> ---
> >>>>>>>>>    arch/arm64/kernel/cpufeature.c | 2 +-
> >>>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> >>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> >>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c
> >>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> >>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >>>>>>>>>         {
> >>>>>>>>>                 /* FP/SIMD is not implemented */
> >>>>>>>>>                 .capability = ARM64_HAS_NO_FPSIMD,
> >>>>>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> >>>>>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> >>>>>>>>
> >>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> >>>>>>>>
> >>>>>>>> Although we have other things that smell like this (CPU errata for
> >>>>>>>> example), I wonder whether inverting the meaning in the case would
> >>>>>>>> make the situation easier to understand.
> >>>>>>>
> >>>>>>> Yes, it is indeed a disability, more on that below.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> >>>>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> >>>>>>>> IIUC.  We'd just need to invert the sense of the check in
> >>>>>>>> system_supports_fpsimd().
> >>>>>>>
> >>>>>>> This is particularly something we want to avoid with this patch. We want
> >>>>>>> to make sure that we have the up-to-date status of the disability right
> >>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> >>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> >>>>>>> you must wait until all the CPUs are up, unlike the negated capability.
> >>>>>>
> >>>>>> I don't see why waiting for the random defective early CPU to come up is
> >>>>>> better than waiting for all the early CPUs to come up and then deciding.
> >>>>>>
> >>>>>> Kernel-mode NEON aside, the status of this cap should not matter until
> >>>>>> we enter userspace for the first time.
> >>>>>>
> >>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> >>>>>> probe for it before all early CPUs are up, and so cache the wrong
> >>>>>> decision.  The current approach doesn't cope with that anyway AFAICT.
> >>>>>
> >>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> >>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use
> >>>>> the neon then onwards, unlike the existing case where we have time till
> >>>>> we boot all the CPUs (even when the boot CPU may be defective).
> >>>>
> >>>> I guess that makes sense.
> >>>>
> >>>> I'm now wondering what happens if anything tries to use kernel-mode NEON
> >>>> before SVE is initialised -- which doesn't happen until cpufeatures
> >>>> configures the system features.
> >>>>
> >>>> I don't think your proposed change makes anything worse here, but it may
> >>>> need looking into.
> >>>
> >>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> >>> is initialised ?
> >>
> >> Could do, at least as an experiment.
> >>
> >> Ard, do you have any thoughts on this?
> >>
> >
> > All in-kernel NEON code checks whether the NEON is usable, so I'd
> > expect that check to return 'false' if it is too early in the boot for
> > the NEON to be used at all.
>
> Ok. That implies, we need a check to make sure SVE set up is complete,
> which we don't at the moment, as we default to assume FP/SIMD is available.
>
> "system_can_use_fpsimd()" instead of the "system_supports_fpsimd() where
> the former should indicate:
>
>   system_supports_fpsimd() && sve_setup_complete()
>
> Where the sve_setup_complete() can itself be a static key, initialized
> very early if we have !CONFIG_SVE. Otherwise, set from sve_setup().
>
>
> Thoughts ?

Yes, that sounds reasonable. If we fold that into the implementation
of may_use_simd(), we shouldn't need any other changes to the clients
AFAICT

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-14 16:57                 ` Ard Biesheuvel
  2019-10-15  9:44                   ` Suzuki K Poulose
@ 2019-10-15 10:24                   ` Dave Martin
  2019-10-15 10:30                     ` Ard Biesheuvel
  1 sibling, 1 reply; 23+ messages in thread
From: Dave Martin @ 2019-10-15 10:24 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, Suzuki Poulose, Catalin Marinas, linux-kernel,
	will, linux-arm-kernel

On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:
> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
> >
> > On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> > >
> > >
> > > On 14/10/2019 15:52, Dave Martin wrote:
> > > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> > > >>
> > > >>
> > > >> On 11/10/2019 15:21, Dave Martin wrote:
> > > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> > > >>>>
> > > >>>> On 11/10/2019 12:36, Dave Martin wrote:
> > > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> > > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> > > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> > > >>>>>> after all the SMP CPUs are brought up. However, we use the status
> > > >>>>>> of this capability for every context switch. So, let us change
> > > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> > > >>>>>> as and when the first CPU without FP is brought up.
> > > >>>>>>
> > > >>>>>> Also, the current type allows hotplugged CPU to be brought up without
> > > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> > > >>>>>> up. Fix both of these issues by changing the capability to
> > > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> > > >>>>>>
> > > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> > > >>>>>> Cc: Will Deacon <will@kernel.org>
> > > >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
> > > >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > >>>>>> ---
> > > >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-
> > > >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >>>>>>
> > > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> > > >>>>>> --- a/arch/arm64/kernel/cpufeature.c
> > > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> > > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> > > >>>>>>        {
> > > >>>>>>                /* FP/SIMD is not implemented */
> > > >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,
> > > >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> > > >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> > > >>>>>
> > > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> > > >>>>>
> > > >>>>> Although we have other things that smell like this (CPU errata for
> > > >>>>> example), I wonder whether inverting the meaning in the case would
> > > >>>>> make the situation easier to understand.
> > > >>>>
> > > >>>> Yes, it is indeed a disability, more on that below.
> > > >>>>
> > > >>>>>
> > > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> > > >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> > > >>>>> IIUC.  We'd just need to invert the sense of the check in
> > > >>>>> system_supports_fpsimd().
> > > >>>>
> > > >>>> This is particularly something we want to avoid with this patch. We want
> > > >>>> to make sure that we have the up-to-date status of the disability right
> > > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> > > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> > > >>>> you must wait until all the CPUs are up, unlike the negated capability.
> > > >>>
> > > >>> I don't see why waiting for the random defective early CPU to come up is
> > > >>> better than waiting for all the early CPUs to come up and then deciding.
> > > >>>
> > > >>> Kernel-mode NEON aside, the status of this cap should not matter until
> > > >>> we enter userspace for the first time.
> > > >>>
> > > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> > > >>> probe for it before all early CPUs are up, and so cache the wrong
> > > >>> decision.  The current approach doesn't cope with that anyway AFAICT.
> > > >>
> > > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> > > >> CPU turns up, we mark the "capability" and thus the kernel cannot use
> > > >> the neon then onwards, unlike the existing case where we have time till
> > > >> we boot all the CPUs (even when the boot CPU may be defective).
> > > >
> > > > I guess that makes sense.
> > > >
> > > > I'm now wondering what happens if anything tries to use kernel-mode NEON
> > > > before SVE is initialised -- which doesn't happen until cpufeatures
> > > > configures the system features.
> > > >
> > > > I don't think your proposed change makes anything worse here, but it may
> > > > need looking into.
> > >
> > > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> > > is initialised ?
> >
> > Could do, at least as an experiment.
> >
> > Ard, do you have any thoughts on this?
> >
> 
> All in-kernel NEON code checks whether the NEON is usable, so I'd
> expect that check to return 'false' if it is too early in the boot for
> the NEON to be used at all.

My concern is that the check may be done once, at probe time, for crypto
drivers.  If probing happens before system_supports_fpsimd() has
stabilised, we may be stuck with the wrong probe decision.

So: are crypto drivers and kernel_mode_neon() users definitely only
probed _after_ all early CPUs are up?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-15 10:24                   ` Dave Martin
@ 2019-10-15 10:30                     ` Ard Biesheuvel
  2019-10-15 13:03                       ` Suzuki K Poulose
  2019-10-15 14:05                       ` Dave Martin
  0 siblings, 2 replies; 23+ messages in thread
From: Ard Biesheuvel @ 2019-10-15 10:30 UTC (permalink / raw)
  To: Dave Martin
  Cc: Mark Rutland, Suzuki Poulose, Catalin Marinas, linux-kernel,
	will, linux-arm-kernel

On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:
> > On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
> > >
> > > On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> > > >
> > > >
> > > > On 14/10/2019 15:52, Dave Martin wrote:
> > > > > On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
> > > > >>
> > > > >>
> > > > >> On 11/10/2019 15:21, Dave Martin wrote:
> > > > >>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
> > > > >>>>
> > > > >>>> On 11/10/2019 12:36, Dave Martin wrote:
> > > > >>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
> > > > >>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
> > > > >>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
> > > > >>>>>> after all the SMP CPUs are brought up. However, we use the status
> > > > >>>>>> of this capability for every context switch. So, let us change
> > > > >>>>>> the scop to LOCAL_CPU to allow the detection of this capability
> > > > >>>>>> as and when the first CPU without FP is brought up.
> > > > >>>>>>
> > > > >>>>>> Also, the current type allows hotplugged CPU to be brought up without
> > > > >>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
> > > > >>>>>> up. Fix both of these issues by changing the capability to
> > > > >>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
> > > > >>>>>>
> > > > >>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> > > > >>>>>> Cc: Will Deacon <will@kernel.org>
> > > > >>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
> > > > >>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > > >>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > > > >>>>>> ---
> > > > >>>>>>   arch/arm64/kernel/cpufeature.c | 2 +-
> > > > >>>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >>>>>>
> > > > >>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > > > >>>>>> index 9323bcc40a58..0f9eace6c64b 100644
> > > > >>>>>> --- a/arch/arm64/kernel/cpufeature.c
> > > > >>>>>> +++ b/arch/arm64/kernel/cpufeature.c
> > > > >>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> > > > >>>>>>        {
> > > > >>>>>>                /* FP/SIMD is not implemented */
> > > > >>>>>>                .capability = ARM64_HAS_NO_FPSIMD,
> > > > >>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> > > > >>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
> > > > >>>>>
> > > > >>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
> > > > >>>>>
> > > > >>>>> Although we have other things that smell like this (CPU errata for
> > > > >>>>> example), I wonder whether inverting the meaning in the case would
> > > > >>>>> make the situation easier to understand.
> > > > >>>>
> > > > >>>> Yes, it is indeed a disability, more on that below.
> > > > >>>>
> > > > >>>>>
> > > > >>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
> > > > >>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
> > > > >>>>> IIUC.  We'd just need to invert the sense of the check in
> > > > >>>>> system_supports_fpsimd().
> > > > >>>>
> > > > >>>> This is particularly something we want to avoid with this patch. We want
> > > > >>>> to make sure that we have the up-to-date status of the disability right
> > > > >>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
> > > > >>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
> > > > >>>> you must wait until all the CPUs are up, unlike the negated capability.
> > > > >>>
> > > > >>> I don't see why waiting for the random defective early CPU to come up is
> > > > >>> better than waiting for all the early CPUs to come up and then deciding.
> > > > >>>
> > > > >>> Kernel-mode NEON aside, the status of this cap should not matter until
> > > > >>> we enter userspace for the first time.
> > > > >>>
> > > > >>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
> > > > >>> probe for it before all early CPUs are up, and so cache the wrong
> > > > >>> decision.  The current approach doesn't cope with that anyway AFAICT.
> > > > >>
> > > > >> This approach does in fact. With LOCAL_CPU scope, the moment a defective
> > > > >> CPU turns up, we mark the "capability" and thus the kernel cannot use
> > > > >> the neon then onwards, unlike the existing case where we have time till
> > > > >> we boot all the CPUs (even when the boot CPU may be defective).
> > > > >
> > > > > I guess that makes sense.
> > > > >
> > > > > I'm now wondering what happens if anything tries to use kernel-mode NEON
> > > > > before SVE is initialised -- which doesn't happen until cpufeatures
> > > > > configures the system features.
> > > > >
> > > > > I don't think your proposed change makes anything worse here, but it may
> > > > > need looking into.
> > > >
> > > > We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> > > > is initialised ?
> > >
> > > Could do, at least as an experiment.
> > >
> > > Ard, do you have any thoughts on this?
> > >
> >
> > All in-kernel NEON code checks whether the NEON is usable, so I'd
> > expect that check to return 'false' if it is too early in the boot for
> > the NEON to be used at all.
>
> My concern is that the check may be done once, at probe time, for crypto
> drivers.  If probing happens before system_supports_fpsimd() has
> stabilised, we may be stuck with the wrong probe decision.
>
> So: are crypto drivers and kernel_mode_neon() users definitely only
> probed _after_ all early CPUs are up?
>

Isn't SMP already up when initcalls are processed?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-15 10:30                     ` Ard Biesheuvel
@ 2019-10-15 13:03                       ` Suzuki K Poulose
  2019-10-15 13:11                         ` Ard Biesheuvel
  2019-10-15 14:05                       ` Dave Martin
  1 sibling, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-15 13:03 UTC (permalink / raw)
  To: Ard Biesheuvel, Dave Martin
  Cc: Mark Rutland, Catalin Marinas, linux-kernel, will, linux-arm-kernel



On 15/10/2019 11:30, Ard Biesheuvel wrote:
> On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:
>>
>> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:
>>> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
>>>>
>>>> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
>>>>>
>>>>>
>>>>> On 14/10/2019 15:52, Dave Martin wrote:
>>>>>> On Fri, Oct 11, 2019 at 06:28:43PM +0100, Suzuki K Poulose wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 11/10/2019 15:21, Dave Martin wrote:
>>>>>>>> On Fri, Oct 11, 2019 at 01:13:18PM +0100, Suzuki K Poulose wrote: > Hi Dave
>>>>>>>>>
>>>>>>>>> On 11/10/2019 12:36, Dave Martin wrote:
>>>>>>>>>> On Thu, Oct 10, 2019 at 06:15:15PM +0100, Suzuki K Poulose wrote:
>>>>>>>>>>> The NO_FPSIMD capability is defined with scope SYSTEM, which implies
>>>>>>>>>>> that the "absence" of FP/SIMD on at least one CPU is detected only
>>>>>>>>>>> after all the SMP CPUs are brought up. However, we use the status
>>>>>>>>>>> of this capability for every context switch. So, let us change
>>>>>>>>>>> the scop to LOCAL_CPU to allow the detection of this capability
>>>>>>>>>>> as and when the first CPU without FP is brought up.
>>>>>>>>>>>
>>>>>>>>>>> Also, the current type allows hotplugged CPU to be brought up without
>>>>>>>>>>> FP/SIMD when all the current CPUs have FP/SIMD and we have the userspace
>>>>>>>>>>> up. Fix both of these issues by changing the capability to
>>>>>>>>>>> BOOT_RESTRICTED_LOCAL_CPU_FEATURE.
>>>>>>>>>>>
>>>>>>>>>>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
>>>>>>>>>>> Cc: Will Deacon <will@kernel.org>
>>>>>>>>>>> Cc: Mark Rutland <mark.rutland@arm.com>
>>>>>>>>>>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>>>>>>>>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>>>>>>>>> ---
>>>>>>>>>>>    arch/arm64/kernel/cpufeature.c | 2 +-
>>>>>>>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
>>>>>>>>>>> index 9323bcc40a58..0f9eace6c64b 100644
>>>>>>>>>>> --- a/arch/arm64/kernel/cpufeature.c
>>>>>>>>>>> +++ b/arch/arm64/kernel/cpufeature.c
>>>>>>>>>>> @@ -1361,7 +1361,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>>>>>>>>>>>         {
>>>>>>>>>>>                 /* FP/SIMD is not implemented */
>>>>>>>>>>>                 .capability = ARM64_HAS_NO_FPSIMD,
>>>>>>>>>>> -              .type = ARM64_CPUCAP_SYSTEM_FEATURE,
>>>>>>>>>>> +              .type = ARM64_CPUCAP_BOOT_RESTRICTED_CPU_LOCAL_FEATURE,
>>>>>>>>>>
>>>>>>>>>> ARM64_HAS_NO_FPSIMD is really a disability, not a capability.
>>>>>>>>>>
>>>>>>>>>> Although we have other things that smell like this (CPU errata for
>>>>>>>>>> example), I wonder whether inverting the meaning in the case would
>>>>>>>>>> make the situation easier to understand.
>>>>>>>>>
>>>>>>>>> Yes, it is indeed a disability, more on that below.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So, we'd have ARM64_HAS_FPSIMD, with a minimum (signed) feature field
>>>>>>>>>> value of 0.  Then this just looks like an ARM64_CPUCAP_SYSTEM_FEATURE
>>>>>>>>>> IIUC.  We'd just need to invert the sense of the check in
>>>>>>>>>> system_supports_fpsimd().
>>>>>>>>>
>>>>>>>>> This is particularly something we want to avoid with this patch. We want
>>>>>>>>> to make sure that we have the up-to-date status of the disability right
>>>>>>>>> when it happens. i.e, a CPU without FP/SIMD is brought up. With SYSTEM_FEATURE
>>>>>>>>> you have to wait until we bring all the CPUs up. Also, for HAS_FPSIMD,
>>>>>>>>> you must wait until all the CPUs are up, unlike the negated capability.
>>>>>>>>
>>>>>>>> I don't see why waiting for the random defective early CPU to come up is
>>>>>>>> better than waiting for all the early CPUs to come up and then deciding.
>>>>>>>>
>>>>>>>> Kernel-mode NEON aside, the status of this cap should not matter until
>>>>>>>> we enter userspace for the first time.
>>>>>>>>
>>>>>>>> The only issue is if e.g., crypto drivers that can use kernel-mode NEON
>>>>>>>> probe for it before all early CPUs are up, and so cache the wrong
>>>>>>>> decision.  The current approach doesn't cope with that anyway AFAICT.
>>>>>>>
>>>>>>> This approach does in fact. With LOCAL_CPU scope, the moment a defective
>>>>>>> CPU turns up, we mark the "capability" and thus the kernel cannot use
>>>>>>> the neon then onwards, unlike the existing case where we have time till
>>>>>>> we boot all the CPUs (even when the boot CPU may be defective).
>>>>>>
>>>>>> I guess that makes sense.
>>>>>>
>>>>>> I'm now wondering what happens if anything tries to use kernel-mode NEON
>>>>>> before SVE is initialised -- which doesn't happen until cpufeatures
>>>>>> configures the system features.
>>>>>>
>>>>>> I don't think your proposed change makes anything worse here, but it may
>>>>>> need looking into.
>>>>>
>>>>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
>>>>> is initialised ?
>>>>
>>>> Could do, at least as an experiment.
>>>>
>>>> Ard, do you have any thoughts on this?
>>>>
>>>
>>> All in-kernel NEON code checks whether the NEON is usable, so I'd
>>> expect that check to return 'false' if it is too early in the boot for
>>> the NEON to be used at all.
>>
>> My concern is that the check may be done once, at probe time, for crypto
>> drivers.  If probing happens before system_supports_fpsimd() has
>> stabilised, we may be stuck with the wrong probe decision.
>>
>> So: are crypto drivers and kernel_mode_neon() users definitely only
>> probed _after_ all early CPUs are up?
>>
> 
> Isn't SMP already up when initcalls are processed?

Not all of them. Booting with initcall_debug=1 shows the following :

--

  // trimmed //

[    0.000000] NR_IRQS:64 nr_irqs:64 0 
 
 

[    0.000000] GIC: Using split EOI/Deactivate mode 
 
 

[    0.000000] CPU0: found redistributor 0 region 0:0x000000002f100000 
 
 

[    0.000000] Architected cp15 timer(s) running at 100.00MHz (phys). 
 
 

[    0.000000] clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 
0x171024e7e0, max_idle_ns: 440795205315 ns
[    0.000029] sched_clock: 56 bits at 100MHz, resolution 10ns, wraps every 
4398046511100ns
[    0.000989] Console: colour dummy device 80x25 

[    0.001049] Calibrating delay loop (skipped), value calculated using timer 
frequency.. 200.00 BogoMIPS (lpj=400000)
[    0.001149] pid_max: default: 32768 minimum: 301 

[    0.001549] Security Framework initialized 

[    0.001802] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes) 

[    0.001849] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes) 

[    0.004949] Initializing cgroup subsys io
[    0.005042] Initializing cgroup subsys memory
[    0.005079] Initializing cgroup subsys devices
[    0.005149] Initializing cgroup subsys perf_event
[    0.005255] Initializing cgroup subsys hugetlb
[    0.005255] Initializing cgroup subsys pids

[    0.006002] calling  cpu_suspend_init+0x0/0x78 @ 1 

[    0.006062] initcall cpu_suspend_init+0x0/0x78 returned 0 after 0 usecs
[    0.006149] calling  arm64_enable_runtime_services+0x0/0x200 @ 1
[    0.006225] EFI services will not be available. 

[    0.006249] initcall arm64_enable_runtime_services+0x0/0x200 returned 0 after 
0 usecs
[    0.006389] calling  asids_init+0x0/0xf8 @ 1
[    0.006449] ASID allocator initialised with 65536 entries 

[    0.006535] initcall asids_init+0x0/0xf8 returned 0 after 0 usecs
[    0.006553] calling  xen_guest_init+0x0/0x1d8 @ 1 

[    0.006649] initcall xen_guest_init+0x0/0x1d8 returned 0 after 0 usecs
[    0.006749] calling  spawn_ksoftirqd+0x0/0x40 @ 1 

[    0.007749] initcall spawn_ksoftirqd+0x0/0x40 returned 0 after 3906 usecs
[    0.007864] calling  init_workqueues+0x0/0x3ec @ 1 

[    0.019869] initcall init_workqueues+0x0/0x3ec returned 0 after 11718 usecs
[    0.019988] calling  migration_init+0x0/0x84 @ 1 

[    0.020082] initcall migration_init+0x0/0x84 returned 0 after 0 usecs
[    0.020189] calling  check_cpu_stall_init+0x0/0x28 @ 1 

[    0.020316] initcall check_cpu_stall_init+0x0/0x28 returned 0 after 0 usecs
[    0.020449] calling  rcu_spawn_gp_kthread+0x0/0x12c @ 1 

[    0.020971] initcall rcu_spawn_gp_kthread+0x0/0x12c returned 0 after 0 usecs
[    0.021049] calling  cpu_stop_init+0x0/0xe0 @ 1 

[    0.023815] initcall cpu_stop_init+0x0/0xe0 returned 0 after 3906 usecs
[    0.023922] calling  jump_label_init_module+0x0/0x20 @ 1 

[    0.023949] initcall jump_label_init_module+0x0/0x20 returned 0 after 0 usecs
[    0.024084] calling  its_pci_msi_init+0x0/0xec @ 1 

[    0.024249] /interrupt-controller@2f000000/its@2f020000: unable to locate ITS 
domain
[    0.024349] initcall its_pci_msi_init+0x0/0xec returned 0 after 0 usecs
[    0.024455] calling  its_pmsi_init+0x0/0xec @ 1 

[    0.024576] /interrupt-controller@2f000000/its@2f020000: unable to locate ITS 
domain
[    0.024669] initcall its_pmsi_init+0x0/0xec returned 0 after 0 usecs
[    0.024849] calling  tegra_init_fuse+0x0/0x150 @ 1 

[    0.025095] initcall tegra_init_fuse+0x0/0x150 returned 0 after 0 usecs
[    0.025231] calling  tegra_pmc_early_init+0x0/0xfc @ 1 

[    0.025749] initcall tegra_pmc_early_init+0x0/0xfc returned 0 after 0 usecs
[    0.025886] calling  rand_initialize+0x0/0x40 @ 1 

[    0.026849] initcall rand_initialize+0x0/0x40 returned 0 after 0 usecs
[    0.026949] calling  dummy_timer_register+0x0/0x54 @ 1 

[    0.027033] initcall dummy_timer_register+0x0/0x54 returned 0 after 0 usecs


[    0.035949] Detected PIPT I-cache on CPU1 

[    0.036049] CPU1: found redistributor 1 region 0:0x000000002f120000
[    0.036082] CPU1: Booted secondary processor [410fd0f0]
[    0.048049] Detected PIPT I-cache on CPU2 

[    0.048149] CPU2: found redistributor 2 region 0:0x000000002f140000
[    0.048168] CPU2: Booted secondary processor [410fd0f0]
[    0.060249] Detected PIPT I-cache on CPU3 

[    0.060349] CPU3: found redistributor 3 region 0:0x000000002f160000
[    0.060402] CPU3: Booted secondary processor [410fd0f0]
[    0.060620] Brought up 4 CPUs
[    0.060949] SMP: Total of 4 processors activated.


Cheers
Suzuki

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-15 13:03                       ` Suzuki K Poulose
@ 2019-10-15 13:11                         ` Ard Biesheuvel
  0 siblings, 0 replies; 23+ messages in thread
From: Ard Biesheuvel @ 2019-10-15 13:11 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Dave Martin, Mark Rutland, Catalin Marinas, linux-kernel, will,
	linux-arm-kernel

On Tue, 15 Oct 2019 at 15:03, Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>
>
>
> On 15/10/2019 11:30, Ard Biesheuvel wrote:
> > On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:
> >>
> >> On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:
> >>> On Mon, 14 Oct 2019 at 17:50, Dave P Martin <Dave.Martin@arm.com> wrote:
> >>>>
> >>>> On Mon, Oct 14, 2019 at 04:45:40PM +0100, Suzuki K Poulose wrote:
> >>>>>
> >>>>>
> >>>>> On 14/10/2019 15:52, Dave Martin wrote:
> >>>>>>
...
> >>>>>> I'm now wondering what happens if anything tries to use kernel-mode NEON
> >>>>>> before SVE is initialised -- which doesn't happen until cpufeatures
> >>>>>> configures the system features.
> >>>>>>
> >>>>>> I don't think your proposed change makes anything worse here, but it may
> >>>>>> need looking into.
> >>>>>
> >>>>> We could throw in a WARN_ON() in kernel_neon() to make sure that the SVE
> >>>>> is initialised ?
> >>>>
> >>>> Could do, at least as an experiment.
> >>>>
> >>>> Ard, do you have any thoughts on this?
> >>>>
> >>>
> >>> All in-kernel NEON code checks whether the NEON is usable, so I'd
> >>> expect that check to return 'false' if it is too early in the boot for
> >>> the NEON to be used at all.
> >>
> >> My concern is that the check may be done once, at probe time, for crypto
> >> drivers.  If probing happens before system_supports_fpsimd() has
> >> stabilised, we may be stuck with the wrong probe decision.
> >>
> >> So: are crypto drivers and kernel_mode_neon() users definitely only
> >> probed _after_ all early CPUs are up?
> >>
> >
> > Isn't SMP already up when initcalls are processed?
>
> Not all of them. Booting with initcall_debug=1 shows the following :
>
> --
>
>   // trimmed //
>
...
> [    0.027033] initcall dummy_timer_register+0x0/0x54 returned 0 after 0 usecs
>
>
> [    0.035949] Detected PIPT I-cache on CPU1
>
> [    0.036049] CPU1: found redistributor 1 region 0:0x000000002f120000
> [    0.036082] CPU1: Booted secondary processor [410fd0f0]
> [    0.048049] Detected PIPT I-cache on CPU2
>
> [    0.048149] CPU2: found redistributor 2 region 0:0x000000002f140000
> [    0.048168] CPU2: Booted secondary processor [410fd0f0]
> [    0.060249] Detected PIPT I-cache on CPU3
>
> [    0.060349] CPU3: found redistributor 3 region 0:0x000000002f160000
> [    0.060402] CPU3: Booted secondary processor [410fd0f0]
> [    0.060620] Brought up 4 CPUs
> [    0.060949] SMP: Total of 4 processors activated.
>
>

These are all early initcalls, which are actually documented as
running before SMP, and before 'pure' initcalls, which should only be
used to initialize global variables that cannot be initialized
statically. So I think we can safely disregard these as uses of kernel
mode NEON we should care about.

But I would still expect may_use_simd() to return the right value
here, independently of the logic that reasons about whether we have a
NEON in the first place.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability
  2019-10-15 10:30                     ` Ard Biesheuvel
  2019-10-15 13:03                       ` Suzuki K Poulose
@ 2019-10-15 14:05                       ` Dave Martin
  1 sibling, 0 replies; 23+ messages in thread
From: Dave Martin @ 2019-10-15 14:05 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, Suzuki Poulose, Catalin Marinas, linux-kernel,
	will, linux-arm-kernel

On Tue, Oct 15, 2019 at 12:30:15PM +0200, Ard Biesheuvel wrote:
> On Tue, 15 Oct 2019 at 12:25, Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Mon, Oct 14, 2019 at 06:57:30PM +0200, Ard Biesheuvel wrote:

[...]

> > > All in-kernel NEON code checks whether the NEON is usable, so I'd
> > > expect that check to return 'false' if it is too early in the boot for
> > > the NEON to be used at all.
> >
> > My concern is that the check may be done once, at probe time, for crypto
> > drivers.  If probing happens before system_supports_fpsimd() has
> > stabilised, we may be stuck with the wrong probe decision.
> >
> > So: are crypto drivers and kernel_mode_neon() users definitely only
> > probed _after_ all early CPUs are up?
> >
> 
> Isn't SMP already up when initcalls are processed?

That was my original assumption when developing SVE.  I think I
convinced myself that it was valid, but it sounds worth reinvestigating.

Assuming the assumption _is_ valid, then dropping a suitable WARN() into
system_supports_fpsimd() or cpu_has_neon() or similar may be a good idea.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/3] arm64: Fix support for systems without FP/SIMD
  2019-10-10 17:15 [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Suzuki K Poulose
                   ` (2 preceding siblings ...)
  2019-10-10 17:15 ` [PATCH 3/3] arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly Suzuki K Poulose
@ 2019-10-17  0:06 ` Will Deacon
  3 siblings, 0 replies; 23+ messages in thread
From: Will Deacon @ 2019-10-17  0:06 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, mark.rutland, catalin.marinas,
	dave.martin

On Thu, Oct 10, 2019 at 06:15:14PM +0100, Suzuki K Poulose wrote:
> This series fixes the original support for systems without FP/SIMD.
> We have three set of issues with the current code :
> 
> 1) We detect the absence of FP/SIMD after the SMP cpus are brought
> online (i.e, SYSTEM scope). This means, some of the early kernel threads
> could run with their TIF_FOREIGN_FPSTATE flag set which might be
> inherited by applications forked by them (e.g, modprobe from initramfs).
> 
> Also we allow a hotplugged CPU to boot successfully even if it doesn't
> have the FP/SIMD when we have decided otherwise at boot and have now
> advertised the ELF HWCAP for the userspace.
> Fix this by turning this to a BOOT_RESTRICTED_CPU_LOCAL feature to
> allow the detection of the feature the very moment a CPU turns up
> without FP/SIMD and also prevent a conflict after SMP boot.
> 
> 2) As mentioned above, some tasks could have the TIF flag set,
> which will never be cleared after we detect the capability.
> Thus they could get stuck indefinitely in do_notfiy_resume().
> Fix this by clearing the TIF flag for such tasks but continuing
> to avoid the save/restore of the FP state.
> 
> 3) The compat ELF_HWCAP bits are statically initialised to indicate
> that the FP/SIMD support is available. This must be updated dynamically
> to provide the correct flags to the userspace.
> 
> Tested with a 32bit Debian Jessie fs on Fast model (with and without
> FP support).
> 
> Suzuki K Poulose (3):
>   arm64: cpufeature: Fix the type of no FP/SIMD capability
>   arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks
>   arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly

This looks alright apart from Dave's comments on patch 2. Do you plan to
address those at all? Some of it looks like scope for future work, but there
are also some questions about the mechanics of the patch.

Will

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks
  2019-10-11 11:26   ` Dave Martin
@ 2019-10-17 12:42     ` Suzuki K Poulose
  2019-10-17 16:09       ` Dave Martin
  0 siblings, 1 reply; 23+ messages in thread
From: Suzuki K Poulose @ 2019-10-17 12:42 UTC (permalink / raw)
  To: Dave Martin
  Cc: linux-arm-kernel, mark.rutland, catalin.marinas, linux-kernel, will

Hi Dave

Thanks for the comments.

On 11/10/2019 12:26, Dave Martin wrote:
> On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote:
>> We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
>> we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
>> could be inherited by early userspace applications (e.g, modprobe triggered
>> from initramfs). This could end up in the applications stuck in
>> do_nofity_resume() as we never clear the TIF flag, once we now know that
>> we don't support FP.
>>
>> Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
>> for tasks which may have them set, as we would have done in the normal
>> case, but avoiding touching the hardware state (since we don't support any).
>>
>> Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>   arch/arm64/kernel/fpsimd.c | 26 ++++++++++++++++----------
>>   1 file changed, 16 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
>> index 37d3912cfe06..dfcdd077aeca 100644
>> --- a/arch/arm64/kernel/fpsimd.c
>> +++ b/arch/arm64/kernel/fpsimd.c
>> @@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
>>    */
>>   void fpsimd_restore_current_state(void)
>>   {
>> -	if (!system_supports_fpsimd())
>> -		return;
>> -
>>   	get_cpu_fpsimd_context();
>> -
>> -	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
>> +	/*
>> +	 * For the tasks that were created before we detected the absence of
>> +	 * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via fpsimd_thread_switch()
>> +	 * and/or could be inherited from the parent(init_task has this set). Even
>> +	 * though userspace has not run yet, this could be inherited by the
>> +	 * processes forked from one of those tasks (e.g, modprobe from initramfs).
>> +	 * If the system doesn't support FP/SIMD, we must clear the flag for the
>> +	 * tasks mentioned above, to indicate that the FPSTATE is clean (as we
>> +	 * can't have one) to avoid looping for ever to clear the flag.
>> +	 */
>> +	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
>> +	    system_supports_fpsimd()) {
> 
> I'm not too keen on this approach: elsewhere we just stub out all the
> FPSIMD handling logic if !system_supports_fpsimd() -- I think we should
> be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE.

We used to do this. But the flag is not cleared anymore once we detect
the absence of FP/SIMD.

> Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task
> and this task is current() and the system supports FPSIMD at all, this
> task's FPSIMD state is not loaded in the cpu".

Yes, that is  correct. However, we ran some tasks, even before we detected
that the FPSIMD is missing. So, we need to clear the flag for those tasks
to make sure the flag state is consistent, as explained in the comment.

Another option is to clear this flag in fpsimd_thread_switch(), something like,
rather than sprinkling the "flag fixup" everywhere:

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index dfcdd077aeca..2d8091b6ebfb 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -982,9 +982,14 @@ void fpsimd_thread_switch(struct task_struct *next)
  {
         bool wrong_task, wrong_cpu;

-       if (!system_supports_fpsimd())
+       if (!system_supports_fpsimd()) {
+               /*
+                * Clear any TIF flags which may have been set, before we
+                * detected the absense of FPSIMD.
+                */
+               clear_task_thread_flag(next, TIF_FOREIGN_FPSTATE);
                 return;
-
+       }
         __get_cpu_fpsimd_context();


And also at :

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index a47462def04b..cd8e94d5dc8d 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -374,7 +374,10 @@ int copy_thread(unsigned long clone_flags, unsigned long 
stack_start,
          * Otherwise we could erroneously skip reloading the FPSIMD
          * registers for p.
          */
-       fpsimd_flush_task_state(p);
+       if (system_supports_fpsimd())
+               fpsimd_flush_task_state(p);
+       else
+               clear_tsk_thread_flag(p, TIF_FOREIGN_FPSTATE);

         if (likely(!(p->flags & PF_KTHREAD))) {
                 *childregs = *current_pt_regs();


That way we make sure the flag doesn't violate our assumption and we can
bail out calling into the stubs with the existing checks.

> 
> I think we should ensure that any check on TIF_FOREIGN_FPSTATE is
> shadowed by a check on system_supports_fpsimd() somewhere.  This already
> exists in many places -- we just need to fill in the missing ones.
> 
> fpsimd_save() is a backend function that should only be called if
> system_supports_fpsimd(), so that should not need any check internally,
> but we should make sure that calls to this function are appropriately
> protected with in if (system_supports_fpsimd()).

Agree.

> 
> For other maintenance functions intended for outside callers:
> 
>   * fpsimd_bind_task_to_cpu()
This was/is called from fpsimd_{update,restore}_current_state()
which are protected with system_supports_fpsimd() check already.

>   * fpsimd_bind_state_to_cpu()

This is only used by the KVM code and will only be used after we
have finalized the capabilities and thus we are covered by the
system_supports_fpsimd() check in __hyp_handle_fpsimd() which
sets the FP_ENABLED flag.

>   * fpsimd_flush_task_state()

This seemed rather innocent, but looking at the callers, I realise
that we need the check in fpr_{get/set} for NT_REGS and return errors
if we are asked to deal with FP regs.

>   * fpsimd_save_and_flush_cpu_state()

This must not be called and is only triggered from KVM (covered) and
the PM notifier (which is not registered if fp/simd is missing).

> 
> the situation is less clear.  Does is make sense to call these at all
> if !system_supports_fpsimd()?  I'm not currently sure.  We could at
> least drop some WARN_ON() into these to check, after revieweing their
> callsites.

Sure, I agree.

> 
>>   		task_fpsimd_load();
>>   		fpsimd_bind_task_to_cpu();
>>   	}
>> @@ -1148,17 +1155,16 @@ void fpsimd_restore_current_state(void)
>>    */
>>   void fpsimd_update_current_state(struct user_fpsimd_state const *state)
>>   {
>> -	if (!system_supports_fpsimd())
>> -		return;
>> -
>>   	get_cpu_fpsimd_context();
>>   
>>   	current->thread.uw.fpsimd_state = *state;
>>   	if (system_supports_sve() && test_thread_flag(TIF_SVE))
>>   		fpsimd_to_sve(current);
> 
> Why should we do this stuff on a system that doesn't support FP?

I was under the assumption that, !sve => !fpsimd. Otherwise, we have
bigger problems with the code where we transfer sve state to fpsimd state
without proper checks.

>> -	task_fpsimd_load();
>> -	fpsimd_bind_task_to_cpu();
>> +	if (system_supports_fpsimd()) {
>> +		task_fpsimd_load();
>> +		fpsimd_bind_task_to_cpu();
>> +	}
>>   
>>   	clear_thread_flag(TIF_FOREIGN_FPSTATE);


> 
> [...]
> 
> Not in scope for a stable fix, but:
> 
> It would be interesting to try to strip out TIF_FOREIGN_FPSTATE
> entirely and do some benchmarks and irq latency measurements:
> 
> TIF_FOREIGN_FPSTATE is just a cached copy of the wrong_task || wrong_cpu
> condition defined in fpsimd_thread_switch() --
> 
> That means we have to do maintenance on it all over the place to keep
> it in sync with the condition it represents -- this has proven to be
> a source of complexity and subtle bugs, as well as making the code
> fragile to maintain.
> 
> The only point of all this is so that there is a thread flag for
> do_notify_resume() to check.  Now that do_notify_resume() is C it would
> be trivial to check the real condition -- there would be a cost
> increase and interrupt latency increase here, but maybe not that much.
> 
> This wouldn't solve the whole problem, but it might remove a layer of
> complexity.

That is something I can take a look at.

Cheers
Suzuki

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks
  2019-10-17 12:42     ` Suzuki K Poulose
@ 2019-10-17 16:09       ` Dave Martin
  0 siblings, 0 replies; 23+ messages in thread
From: Dave Martin @ 2019-10-17 16:09 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mark.rutland, catalin.marinas, will, linux-kernel, linux-arm-kernel

On Thu, Oct 17, 2019 at 01:42:37PM +0100, Suzuki K Poulose wrote:
> Hi Dave
> 
> Thanks for the comments.
> 
> On 11/10/2019 12:26, Dave Martin wrote:
> >On Thu, Oct 10, 2019 at 06:15:16PM +0100, Suzuki K Poulose wrote:
> >>We detect the absence of FP/SIMD after we boot the SMP CPUs, and by then
> >>we have kernel threads running already with TIF_FOREIGN_FPSTATE set which
> >>could be inherited by early userspace applications (e.g, modprobe triggered
> >>from initramfs). This could end up in the applications stuck in
> >>do_nofity_resume() as we never clear the TIF flag, once we now know that
> >>we don't support FP.
> >>
> >>Fix this by making sure that we clear the TIF_FOREIGN_FPSTATE flag
> >>for tasks which may have them set, as we would have done in the normal
> >>case, but avoiding touching the hardware state (since we don't support any).
> >>
> >>Fixes: 82e0191a1aa11abf ("arm64: Support systems without FP/ASIMD")
> >>Cc: Will Deacon <will@kernel.org>
> >>Cc: Mark Rutland <mark.rutland@arm.com>
> >>Cc: Catalin Marinas <catalin.marinas@arm.com>
> >>Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>---
> >>  arch/arm64/kernel/fpsimd.c | 26 ++++++++++++++++----------
> >>  1 file changed, 16 insertions(+), 10 deletions(-)
> >>
> >>diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> >>index 37d3912cfe06..dfcdd077aeca 100644
> >>--- a/arch/arm64/kernel/fpsimd.c
> >>+++ b/arch/arm64/kernel/fpsimd.c
> >>@@ -1128,12 +1128,19 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_state,
> >>   */
> >>  void fpsimd_restore_current_state(void)
> >>  {
> >>-	if (!system_supports_fpsimd())
> >>-		return;
> >>-
> >>  	get_cpu_fpsimd_context();
> >>-
> >>-	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {
> >>+	/*
> >>+	 * For the tasks that were created before we detected the absence of
> >>+	 * FP/SIMD, the TIF_FOREIGN_FPSTATE could be set via fpsimd_thread_switch()
> >>+	 * and/or could be inherited from the parent(init_task has this set). Even
> >>+	 * though userspace has not run yet, this could be inherited by the
> >>+	 * processes forked from one of those tasks (e.g, modprobe from initramfs).
> >>+	 * If the system doesn't support FP/SIMD, we must clear the flag for the
> >>+	 * tasks mentioned above, to indicate that the FPSTATE is clean (as we
> >>+	 * can't have one) to avoid looping for ever to clear the flag.
> >>+	 */
> >>+	if (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE) &&
> >>+	    system_supports_fpsimd()) {
> >
> >I'm not too keen on this approach: elsewhere we just stub out all the
> >FPSIMD handling logic if !system_supports_fpsimd() -- I think we should
> >be using this test everywhere rather than relying on TIF_FOREIGN_FPSTATE.
> 
> We used to do this. But the flag is not cleared anymore once we detect
> the absence of FP/SIMD.
> 
> >Rather, I feel that TIF_FOREIGN_FPSTATE means "if this is a user task
> >and this task is current() and the system supports FPSIMD at all, this
> >task's FPSIMD state is not loaded in the cpu".
> 
> Yes, that is  correct. However, we ran some tasks, even before we detected
> that the FPSIMD is missing. So, we need to clear the flag for those tasks
> to make sure the flag state is consistent, as explained in the comment.

I think there's a misunderstanding here somewhere.

What I'm saying it that we shouldn't even look at TIF_FOREIGN_FPSTATE if
!system_supports_fpsimd() -- i.e., when checking whether there is
any FPSIMD context handling work to do, !system_supports_fpsimd()
should take precedence.

Firstly, this replaces the runtime TIF_FOREIGN_FPSTATE check with a
static key check in the !system_supprts_fpsimd() case, and second, the
"work to do" condition is never wrong, even temporarily.

The "work to do" condition is now

	system_supports_fpsimd() && test_thread_flag(TIF_FOREIGN_FPSTATE)

instead of

	test_thread_flag(TIF_FOREIGN_FPSTATE).

Code that _only writes_ the TIF_FORGIEN_FPSTATE flag can continue to do
so harmlessly if we do things this way.

In do_notify_resume() this doesn't quite work, but it's acceptable to
fall spuriously into fpsimd_restore_current_state() provided that we
check for !system_supports_fpsimd() in there.  Which we already do.
In this one case, we should clear TIF_FOREIGN_FPSTATE so this backwards
checking doesn't send do_notify_resume() into a spin waiting for the
flag to go clear.

Another option is to clear TIF_FOREIGN_FPSTATE from _TIF_WORK_MASK
when checking for pending work in do_notify_resume(), so that we
effectively ignore TIF_FOREIGN_FPSTATE here too.  This would be a static
key based check again, so should be fast.

I think this is a cleaner approach than trying to clean up
TIF_FOREIGN_FPSTATE for each task lazily in some random places --
otherwise we may need to do the cleanup anywhere that the flag is
accessed, and that happens all over the place.

Does that make sense?  More below.


> Another option is to clear this flag in fpsimd_thread_switch(), something like,
> rather than sprinkling the "flag fixup" everywhere:
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index dfcdd077aeca..2d8091b6ebfb 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -982,9 +982,14 @@ void fpsimd_thread_switch(struct task_struct *next)
>  {
>         bool wrong_task, wrong_cpu;
> 
> -       if (!system_supports_fpsimd())
> +       if (!system_supports_fpsimd()) {
> +               /*
> +                * Clear any TIF flags which may have been set, before we
> +                * detected the absense of FPSIMD.
> +                */
> +               clear_task_thread_flag(next, TIF_FOREIGN_FPSTATE);
>                 return;
> -
> +       }
>         __get_cpu_fpsimd_context();
> 
> 
> And also at :
> 
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index a47462def04b..cd8e94d5dc8d 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -374,7 +374,10 @@ int copy_thread(unsigned long clone_flags, unsigned
> long stack_start,
>          * Otherwise we could erroneously skip reloading the FPSIMD
>          * registers for p.
>          */
> -       fpsimd_flush_task_state(p);
> +       if (system_supports_fpsimd())
> +               fpsimd_flush_task_state(p);
> +       else
> +               clear_tsk_thread_flag(p, TIF_FOREIGN_FPSTATE);
> 
>         if (likely(!(p->flags & PF_KTHREAD))) {
>                 *childregs = *current_pt_regs();
> 
> 
> That way we make sure the flag doesn't violate our assumption and we can
> bail out calling into the stubs with the existing checks.

But we may still go wrong where this flag is checked in signal handling /
ptrace / KVM, no?

> >I think we should ensure that any check on TIF_FOREIGN_FPSTATE is
> >shadowed by a check on system_supports_fpsimd() somewhere.  This already
> >exists in many places -- we just need to fill in the missing ones.
> >
> >fpsimd_save() is a backend function that should only be called if
> >system_supports_fpsimd(), so that should not need any check internally,
> >but we should make sure that calls to this function are appropriately
> >protected with in if (system_supports_fpsimd()).
> 
> Agree.
> 
> >
> >For other maintenance functions intended for outside callers:
> >
> >  * fpsimd_bind_task_to_cpu()
> This was/is called from fpsimd_{update,restore}_current_state()
> which are protected with system_supports_fpsimd() check already.
> 
> >  * fpsimd_bind_state_to_cpu()
> 
> This is only used by the KVM code and will only be used after we
> have finalized the capabilities and thus we are covered by the
> system_supports_fpsimd() check in __hyp_handle_fpsimd() which
> sets the FP_ENABLED flag.
> 
> >  * fpsimd_flush_task_state()
> 
> This seemed rather innocent, but looking at the callers, I realise
> that we need the check in fpr_{get/set} for NT_REGS and return errors
> if we are asked to deal with FP regs.
> 
> >  * fpsimd_save_and_flush_cpu_state()
> 
> This must not be called and is only triggered from KVM (covered) and
> the PM notifier (which is not registered if fp/simd is missing).
> 
> >
> >the situation is less clear.  Does is make sense to call these at all
> >if !system_supports_fpsimd()?  I'm not currently sure.  We could at
> >least drop some WARN_ON() into these to check, after revieweing their
> >callsites.
> 
> Sure, I agree.

My point is that we shouldn't knowingly introduce fragility, because this
code is hard enough to understand already -- I've spent literally months
of my life fighting it ;)

Hacking TIF_FOREIGN_FPSTATE at a few random sites feels inherently
more fragile than simply ignoring the flag when
!system_supports_fpsimd().

If there's a simple approach that can never go wrong, we should go for
that unless it introduces unacceptable overheads.

[...]

Cheers
---Dave

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-10-17 16:09 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-10 17:15 [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Suzuki K Poulose
2019-10-10 17:15 ` [PATCH 1/3] arm64: cpufeature: Fix the type of no FP/SIMD capability Suzuki K Poulose
2019-10-11 11:36   ` Dave Martin
2019-10-11 12:13     ` Suzuki K Poulose
2019-10-11 14:21       ` Dave Martin
2019-10-11 17:28         ` Suzuki K Poulose
2019-10-14 14:52           ` Dave Martin
2019-10-14 15:45             ` Suzuki K Poulose
2019-10-14 15:50               ` Dave P Martin
2019-10-14 16:57                 ` Ard Biesheuvel
2019-10-15  9:44                   ` Suzuki K Poulose
2019-10-15  9:52                     ` Ard Biesheuvel
2019-10-15 10:24                   ` Dave Martin
2019-10-15 10:30                     ` Ard Biesheuvel
2019-10-15 13:03                       ` Suzuki K Poulose
2019-10-15 13:11                         ` Ard Biesheuvel
2019-10-15 14:05                       ` Dave Martin
2019-10-10 17:15 ` [PATCH 2/3] arm64: nofpsmid: Clear TIF_FOREIGN_FPSTATE flag for early tasks Suzuki K Poulose
2019-10-11 11:26   ` Dave Martin
2019-10-17 12:42     ` Suzuki K Poulose
2019-10-17 16:09       ` Dave Martin
2019-10-10 17:15 ` [PATCH 3/3] arm64: cpufeature: Set the FP/SIMD compat HWCAP bits properly Suzuki K Poulose
2019-10-17  0:06 ` [PATCH 0/3] arm64: Fix support for systems without FP/SIMD Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).