linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/3] VFP fixes
@ 2014-10-14 13:48 Stephen Boyd
  2014-10-14 13:48 ` [PATCH v2 1/3] ARM: vfp: Workaround bad MVFR1 register on some Kraits Stephen Boyd
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Stephen Boyd @ 2014-10-14 13:48 UTC (permalink / raw)
  To: Russell King
  Cc: linux-kernel, linux-arm-msm, linux-arm-kernel, Will Deacon, Rob Clark

These changes allow us to detect VFP correctly on Krait processors.
They also fix short vector emulation for Cortex-A15 and Krait.

Changes since v1:
 * Move to use CPUID and MVFR0 in patch 2
 * Patches 1 and 3 unchanged

Stepan Moskovchenko (1):
  arm: vfp: Bounce undefined instructions in vectored mode

Stephen Boyd (2):
  ARM: vfp: Workaround bad MVFR1 register on some Kraits
  ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus

 arch/arm/include/asm/vfp.h |  5 +++
 arch/arm/mm/proc-v7.S      |  5 ++-
 arch/arm/vfp/vfphw.S       |  6 +++
 arch/arm/vfp/vfpmodule.c   | 93 ++++++++++++++++++++++++++--------------------
 4 files changed, 66 insertions(+), 43 deletions(-)

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project



* [PATCH v2 1/3] ARM: vfp: Workaround bad MVFR1 register on some Kraits
  2014-10-14 13:48 [PATCH v2 0/3] VFP fixes Stephen Boyd
@ 2014-10-14 13:48 ` Stephen Boyd
  2014-10-14 13:48 ` [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus Stephen Boyd
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Stephen Boyd @ 2014-10-14 13:48 UTC (permalink / raw)
  To: Russell King
  Cc: linux-kernel, linux-arm-msm, linux-arm-kernel, Will Deacon, Rob Clark

Certain versions of the Krait processor don't report that they
support the fused multiply accumulate instruction via the MVFR1
register despite the fact that they actually do. Unfortunately we
use this register to identify support for VFPv4. Override the
hwcap on all Krait processors to indicate support for VFPv4 to
work around this.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 arch/arm/mm/proc-v7.S | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
index b5d67db20897..2e78b4b2538f 100644
--- a/arch/arm/mm/proc-v7.S
+++ b/arch/arm/mm/proc-v7.S
@@ -593,9 +593,10 @@ __krait_proc_info:
 	/*
 	 * Some Krait processors don't indicate support for SDIV and UDIV
 	 * instructions in the ARM instruction set, even though they actually
-	 * do support them.
+	 * do support them. They also don't indicate support for fused multiply
+	 * instructions even though they actually do support them.
 	 */
-	__v7_proc __v7_setup, hwcaps = HWCAP_IDIV
+	__v7_proc __v7_setup, hwcaps = HWCAP_IDIV | HWCAP_VFPv4
 	.size	__krait_proc_info, . - __krait_proc_info
 
 	/*
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
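
To illustrate the effect of the override, here is a minimal userspace C
sketch, not kernel code. It assumes the per-processor hwcaps and the bits
probed from MVFR1 are simply ORed together. The helper names
apply_proc_hwcaps() and mvfr1_reports_fma() are hypothetical; the HWCAP_*
bit values are those from arch/arm/include/uapi/asm/hwcap.h, and the MVFR1
test is the one used for VFPv4 in patch 2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in arch/arm/include/uapi/asm/hwcap.h. */
#define HWCAP_VFPv4	(1u << 16)
#define HWCAP_IDIVA	(1u << 17)
#define HWCAP_IDIVT	(1u << 18)
#define HWCAP_IDIV	(HWCAP_IDIVA | HWCAP_IDIVT)

/*
 * Hypothetical helper: the VFPv4 test from vfp_init(), i.e. MVFR1
 * bits [31:28] must read as 1 for fused multiply accumulate.
 */
static bool mvfr1_reports_fma(uint32_t mvfr1)
{
	return (mvfr1 & 0xf0000000u) == 0x10000000u;
}

/*
 * Hypothetical helper: combine the hwcaps probed from the ID registers
 * with the per-processor override (assumed here to be a plain OR).
 */
static uint32_t apply_proc_hwcaps(uint32_t probed, uint32_t proc_override)
{
	return probed | proc_override;
}
```

With an MVFR1 value whose top nibble reads 0, mvfr1_reports_fma() returns
false, yet apply_proc_hwcaps(0, HWCAP_IDIV | HWCAP_VFPv4) still has
HWCAP_VFPv4 set, which is the behaviour the patch is after on Krait.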



* [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
  2014-10-14 13:48 [PATCH v2 0/3] VFP fixes Stephen Boyd
  2014-10-14 13:48 ` [PATCH v2 1/3] ARM: vfp: Workaround bad MVFR1 register on some Kraits Stephen Boyd
@ 2014-10-14 13:48 ` Stephen Boyd
  2014-10-27 10:31   ` Will Deacon
  2014-10-14 13:48 ` [PATCH v2 3/3] arm: vfp: Bounce undefined instructions in vectored mode Stephen Boyd
  2014-10-16 13:14 ` [PATCH v2 0/3] VFP fixes Rob Clark
  3 siblings, 1 reply; 10+ messages in thread
From: Stephen Boyd @ 2014-10-14 13:48 UTC (permalink / raw)
  To: Russell King
  Cc: linux-kernel, linux-arm-msm, linux-arm-kernel, Will Deacon, Rob Clark

The subarchitecture field in the fpsid register is 7 bits wide on
ARM CPUs using the CPUID identification scheme, spanning bits 22
to 16. The topmost bit is used to designate that the
subarchitecture designer is not ARM when it is set to 1. On
non-CPUID scheme CPUs the subarchitecture field is only 4 bits
wide and the higher bits are used to indicate no double precision
support (bit 20) and the FTSMX/FLDMX format (bits 21-22).

The VFP support code only looks at bits 19-16 to determine the
VFP version. On Qualcomm's processors (Krait and Scorpion) we
should see that we have HWCAP_VFPv3 but we don't because bit 22
is set to 1 to indicate that the subarchitecture is not
implemented by ARM and the rest of the bits are left as 0 because
this is the first subarchitecture that Qualcomm has designed.
Unfortunately we can't just widen the FPSID subarchitecture
bitmask to consider all the bits on a CPUID scheme because there
may be CPUs without the CPUID scheme that have VFP without double
precision support and then the version would be a very wrong and
large number. Instead, update the version detection logic to
consider if the CPU is using the CPUID scheme.

If the CPU is using CPUID scheme, use the MVFR registers to
determine what version of VFP is supported. We already do this
for VFPv4, so do something similar for VFPv3 and look for single
or double precision support in MVFR0. Otherwise fall back to
using FPSID to detect VFP support on non-CPUID scheme CPUs. We
know that VFPv3 is only present in CPUs that have support for the
CPUID scheme so this should be equivalent.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 arch/arm/include/asm/vfp.h |  5 +++
 arch/arm/vfp/vfpmodule.c   | 93 ++++++++++++++++++++++++++--------------------
 2 files changed, 57 insertions(+), 41 deletions(-)

diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
index f4ab34fd4f72..ee5f3084243c 100644
--- a/arch/arm/include/asm/vfp.h
+++ b/arch/arm/include/asm/vfp.h
@@ -22,6 +22,7 @@
 #define FPSID_NODOUBLE		(1<<20)
 #define FPSID_ARCH_BIT		(16)
 #define FPSID_ARCH_MASK		(0xF  << FPSID_ARCH_BIT)
+#define FPSID_CPUID_ARCH_MASK	(0x7F  << FPSID_ARCH_BIT)
 #define FPSID_PART_BIT		(8)
 #define FPSID_PART_MASK		(0xFF << FPSID_PART_BIT)
 #define FPSID_VARIANT_BIT	(4)
@@ -75,6 +76,10 @@
 /* MVFR0 bits */
 #define MVFR0_A_SIMD_BIT	(0)
 #define MVFR0_A_SIMD_MASK	(0xf << MVFR0_A_SIMD_BIT)
+#define MVFR0_SP_BIT		(4)
+#define MVFR0_SP_MASK		(0xf << MVFR0_SP_BIT)
+#define MVFR0_DP_BIT		(8)
+#define MVFR0_DP_MASK		(0xf << MVFR0_DP_BIT)
 
 /* Bit patterns for decoding the packaged operation descriptors */
 #define VFPOPDESC_LENGTH_BIT	(9)
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 2f37e1d6cb45..f901242dee98 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -722,6 +722,7 @@ static int __init vfp_init(void)
 {
 	unsigned int vfpsid;
 	unsigned int cpu_arch = cpu_architecture();
+	u32 mvfr0;
 
 	if (cpu_arch >= CPU_ARCH_ARMv6)
 		on_each_cpu(vfp_enable, NULL, 1);
@@ -738,63 +739,73 @@ static int __init vfp_init(void)
 	vfp_vector = vfp_null_entry;
 
 	pr_info("VFP support v0.3: ");
-	if (VFP_arch)
+	if (VFP_arch) {
 		pr_cont("not present\n");
-	else if (vfpsid & FPSID_NODOUBLE) {
-		pr_cont("no double precision support\n");
-	} else {
-		hotcpu_notifier(vfp_hotplug, 0);
-
-		VFP_arch = (vfpsid & FPSID_ARCH_MASK) >> FPSID_ARCH_BIT;  /* Extract the architecture version */
-		pr_cont("implementor %02x architecture %d part %02x variant %x rev %x\n",
-			(vfpsid & FPSID_IMPLEMENTER_MASK) >> FPSID_IMPLEMENTER_BIT,
-			(vfpsid & FPSID_ARCH_MASK) >> FPSID_ARCH_BIT,
-			(vfpsid & FPSID_PART_MASK) >> FPSID_PART_BIT,
-			(vfpsid & FPSID_VARIANT_MASK) >> FPSID_VARIANT_BIT,
-			(vfpsid & FPSID_REV_MASK) >> FPSID_REV_BIT);
-
-		vfp_vector = vfp_support_entry;
-
-		thread_register_notifier(&vfp_notifier_block);
-		vfp_pm_init();
-
+		return 0;
+	/* Extract the arhitecture on CPUID scheme */
+	} else if ((read_cpuid_id() & 0x000f0000) == 0x000f0000) {
+		VFP_arch = vfpsid & FPSID_CPUID_ARCH_MASK;
+		VFP_arch >>= FPSID_ARCH_BIT;
 		/*
-		 * We detected VFP, and the support code is
-		 * in place; report VFP support to userspace.
+		 * Check for the presence of the Advanced SIMD
+		 * load/store instructions, integer and single
+		 * precision floating point operations. Only check
+		 * for NEON if the hardware has the MVFR registers.
 		 */
-		elf_hwcap |= HWCAP_VFP;
+#ifdef CONFIG_NEON
+		if ((fmrx(MVFR1) & 0x000fff00) == 0x00011100)
+			elf_hwcap |= HWCAP_NEON;
+#endif
 #ifdef CONFIG_VFPv3
-		if (VFP_arch >= 2) {
+		mvfr0 = fmrx(MVFR0);
+		if (((mvfr0 & MVFR0_DP_MASK) >> MVFR0_DP_BIT) == 0x2 ||
+		    ((mvfr0 & MVFR0_SP_MASK) >> MVFR0_SP_BIT) == 0x2) {
 			elf_hwcap |= HWCAP_VFPv3;
-
 			/*
 			 * Check for VFPv3 D16 and VFPv4 D16.  CPUs in
 			 * this configuration only have 16 x 64bit
 			 * registers.
 			 */
-			if (((fmrx(MVFR0) & MVFR0_A_SIMD_MASK)) == 1)
-				elf_hwcap |= HWCAP_VFPv3D16; /* also v4-D16 */
+			if ((mvfr0 & MVFR0_A_SIMD_MASK) == 1)
+				/* also v4-D16 */
+				elf_hwcap |= HWCAP_VFPv3D16;
 			else
 				elf_hwcap |= HWCAP_VFPD32;
 		}
+
+		if ((fmrx(MVFR1) & 0xf0000000) == 0x10000000)
+			elf_hwcap |= HWCAP_VFPv4;
 #endif
-		/*
-		 * Check for the presence of the Advanced SIMD
-		 * load/store instructions, integer and single
-		 * precision floating point operations. Only check
-		 * for NEON if the hardware has the MVFR registers.
-		 */
-		if ((read_cpuid_id() & 0x000f0000) == 0x000f0000) {
-#ifdef CONFIG_NEON
-			if ((fmrx(MVFR1) & 0x000fff00) == 0x00011100)
-				elf_hwcap |= HWCAP_NEON;
-#endif
-#ifdef CONFIG_VFPv3
-			if ((fmrx(MVFR1) & 0xf0000000) == 0x10000000)
-				elf_hwcap |= HWCAP_VFPv4;
-#endif
+	/* Extract the architecture version on pre-cpuid scheme */
+	} else {
+		if (vfpsid & FPSID_NODOUBLE) {
+			pr_cont("no double precision support\n");
+			return 0;
 		}
+
+		VFP_arch = (vfpsid & FPSID_ARCH_MASK) >> FPSID_ARCH_BIT;
 	}
+
+	hotcpu_notifier(vfp_hotplug, 0);
+
+	vfp_vector = vfp_support_entry;
+
+	thread_register_notifier(&vfp_notifier_block);
+	vfp_pm_init();
+
+	/*
+	 * We detected VFP, and the support code is
+	 * in place; report VFP support to userspace.
+	 */
+	elf_hwcap |= HWCAP_VFP;
+
+	pr_cont("implementor %02x architecture %d part %02x variant %x rev %x\n",
+		(vfpsid & FPSID_IMPLEMENTER_MASK) >> FPSID_IMPLEMENTER_BIT,
+		VFP_arch,
+		(vfpsid & FPSID_PART_MASK) >> FPSID_PART_BIT,
+		(vfpsid & FPSID_VARIANT_MASK) >> FPSID_VARIANT_BIT,
+		(vfpsid & FPSID_REV_MASK) >> FPSID_REV_BIT);
+
 	return 0;
 }
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
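
The field decoding described in the commit message can be sketched in
standalone C as follows. This is an illustration under stated assumptions,
not the kernel code: the helper names are hypothetical, and the FPSID value
used in the note below (0x51400000, bit 22 set and the remaining
subarchitecture bits zero) is made up to match the description rather than
read from hardware. The masks and the CPUID-scheme test are copied from
the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks as added or used by the patch in asm/vfp.h. */
#define FPSID_ARCH_BIT		16
#define FPSID_ARCH_MASK		(0xFu  << FPSID_ARCH_BIT)
#define FPSID_CPUID_ARCH_MASK	(0x7Fu << FPSID_ARCH_BIT)
#define MVFR0_A_SIMD_MASK	0xfu
#define MVFR0_SP_BIT		4
#define MVFR0_SP_MASK		(0xfu << MVFR0_SP_BIT)
#define MVFR0_DP_BIT		8
#define MVFR0_DP_MASK		(0xfu << MVFR0_DP_BIT)

/* Main ID register architecture field reads 0xF on CPUID scheme CPUs. */
static bool uses_cpuid_scheme(uint32_t midr)
{
	return (midr & 0x000f0000u) == 0x000f0000u;
}

/* 4-bit subarchitecture, as decoded on pre-CPUID scheme CPUs. */
static uint32_t fpsid_arch_legacy(uint32_t fpsid)
{
	return (fpsid & FPSID_ARCH_MASK) >> FPSID_ARCH_BIT;
}

/* 7-bit subarchitecture, bits 22 to 16, on CPUID scheme CPUs. */
static uint32_t fpsid_arch_cpuid(uint32_t fpsid)
{
	return (fpsid & FPSID_CPUID_ARCH_MASK) >> FPSID_ARCH_BIT;
}

/* VFPv3 test from the patch: SP or DP field of MVFR0 equals 2. */
static bool mvfr0_has_vfpv3(uint32_t mvfr0)
{
	return ((mvfr0 & MVFR0_DP_MASK) >> MVFR0_DP_BIT) == 0x2 ||
	       ((mvfr0 & MVFR0_SP_MASK) >> MVFR0_SP_BIT) == 0x2;
}

/* D16 test from the patch: only 16 x 64-bit registers are present. */
static bool mvfr0_is_d16(uint32_t mvfr0)
{
	return (mvfr0 & MVFR0_A_SIMD_MASK) == 1;
}
```

For the made-up FPSID value above, fpsid_arch_legacy() yields 0 while
fpsid_arch_cpuid() yields 0x40, which shows why the old "VFP_arch >= 2"
test could never set HWCAP_VFPv3 on such a part and why the patch keys
off MVFR0 instead.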



* [PATCH v2 3/3] arm: vfp: Bounce undefined instructions in vectored mode
  2014-10-14 13:48 [PATCH v2 0/3] VFP fixes Stephen Boyd
  2014-10-14 13:48 ` [PATCH v2 1/3] ARM: vfp: Workaround bad MVFR1 register on some Kraits Stephen Boyd
  2014-10-14 13:48 ` [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus Stephen Boyd
@ 2014-10-14 13:48 ` Stephen Boyd
  2014-10-16 13:14 ` [PATCH v2 0/3] VFP fixes Rob Clark
  3 siblings, 0 replies; 10+ messages in thread
From: Stephen Boyd @ 2014-10-14 13:48 UTC (permalink / raw)
  To: Russell King
  Cc: Stepan Moskovchenko, linux-kernel, linux-arm-msm,
	linux-arm-kernel, Will Deacon, Rob Clark

From: Stepan Moskovchenko <stepanm@codeaurora.org>

Certain ARM CPU implementations (e.g. Cortex-A15) may not raise a
floating-point exception whenever deprecated short-vector VFP
instructions are executed. Instead these instructions are treated
as UNALLOCATED. Change the VFP exception handling code to emulate
short-vector instructions even if FPEXC exception bits are not
set.

Signed-off-by: Stepan Moskovchenko <stepanm@codeaurora.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
---
 arch/arm/vfp/vfphw.S | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index cda654cbf2c2..f74a8f7e5f84 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -197,6 +197,12 @@ look_for_VFP_exceptions:
 	tst	r5, #FPSCR_IXE
 	bne	process_exception
 
+	tst	r5, #FPSCR_LENGTH_MASK
+	beq	skip
+	orr	r1, r1, #FPEXC_DEX
+	b	process_exception
+skip:
+
 	@ Fall into hand on to next handler - appropriate coproc instr
 	@ not recognised by VFP
 
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
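
The control flow around the added instructions can be modelled in C
roughly as below. This is only a sketch of the decision, assuming r5 holds
FPSCR and r1 holds FPEXC at this point in vfphw.S, as the bit names in the
patch suggest. The function name should_process_exception() is
hypothetical; the bit definitions follow asm/vfp.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as in arch/arm/include/asm/vfp.h. */
#define FPSCR_IXE		(1u << 12)
#define FPSCR_LENGTH_BIT	16
#define FPSCR_LENGTH_MASK	(7u << FPSCR_LENGTH_BIT)
#define FPEXC_DEX		(1u << 29)

/*
 * Hypothetical C model of the tail of look_for_VFP_exceptions.
 * Returns true when the instruction should go to process_exception,
 * false when it should fall through to the next handler.
 */
static bool should_process_exception(uint32_t fpscr, uint32_t *fpexc)
{
	/* Existing check: IXE set means CDP instructions may bounce. */
	if (fpscr & FPSCR_IXE)
		return true;

	/*
	 * New check: a non-zero vector length means a short-vector
	 * instruction may have been treated as undefined without any
	 * FPEXC exception bit being set. Flag it as a synchronous
	 * exception so the emulation code handles it.
	 */
	if (fpscr & FPSCR_LENGTH_MASK) {
		*fpexc |= FPEXC_DEX;
		return true;
	}

	return false;
}
```

With a zero length field and IXE clear the model returns false and leaves
FPEXC untouched; with any non-zero length field it sets FPEXC_DEX and
returns true.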



* Re: [PATCH v2 0/3] VFP fixes
  2014-10-14 13:48 [PATCH v2 0/3] VFP fixes Stephen Boyd
                   ` (2 preceding siblings ...)
  2014-10-14 13:48 ` [PATCH v2 3/3] arm: vfp: Bounce undefined instructions in vectored mode Stephen Boyd
@ 2014-10-16 13:14 ` Rob Clark
  3 siblings, 0 replies; 10+ messages in thread
From: Rob Clark @ 2014-10-16 13:14 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Russell King, Linux Kernel Mailing List, linux-arm-msm,
	linux-arm-kernel, Will Deacon

On Tue, Oct 14, 2014 at 9:48 AM, Stephen Boyd <sboyd@codeaurora.org> wrote:
> These changes allow us to detect VFP correctly on Krait processors.
> They also fix short vector emulation for Cortex-A15 and Krait.
>
> Changes since v1:
>  * Move to use CPUID and MVFR0 in patch 2
>  * Patches 1 and 3 unchanged
>
> Stepan Moskovchenko (1):
>   arm: vfp: Bounce undefined instructions in vectored mode
>
> Stephen Boyd (2):
>   ARM: vfp: Workaround bad MVFR1 register on some Kraits
>   ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
>

for the series,

Tested-by: Rob Clark <robdclark@gmail.com>

>  arch/arm/include/asm/vfp.h |  5 +++
>  arch/arm/mm/proc-v7.S      |  5 ++-
>  arch/arm/vfp/vfphw.S       |  6 +++
>  arch/arm/vfp/vfpmodule.c   | 93 ++++++++++++++++++++++++++--------------------
>  4 files changed, 66 insertions(+), 43 deletions(-)
>
> --
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project
>


* Re: [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
  2014-10-14 13:48 ` [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus Stephen Boyd
@ 2014-10-27 10:31   ` Will Deacon
  2014-10-27 11:49     ` Måns Rullgård
  2014-10-27 19:50     ` Stephen Boyd
  0 siblings, 2 replies; 10+ messages in thread
From: Will Deacon @ 2014-10-27 10:31 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Russell King, linux-kernel, linux-arm-msm, linux-arm-kernel, Rob Clark

Hi Stephen,

On Tue, Oct 14, 2014 at 02:48:58PM +0100, Stephen Boyd wrote:
> The subarchitecture field in the fpsid register is 7 bits wide on
> ARM CPUs using the CPUID identification scheme, spanning bits 22
> to 16. The topmost bit is used to designate that the
> subarchitecture designer is not ARM when it is set to 1. On
> non-CPUID scheme CPUs the subarchitecture field is only 4 bits
> wide and the higher bits are used to indicate no double precision
> support (bit 20) and the FTSMX/FLDMX format (bits 21-22).
> 
> The VFP support code only looks at bits 19-16 to determine the
> VFP version. On Qualcomm's processors (Krait and Scorpion) we
> should see that we have HWCAP_VFPv3 but we don't because bit 22
> is set to 1 to indicate that the subarchitecture is not
> implemented by ARM and the rest of the bits are left as 0 because
> this is the first subarchitecture that Qualcomm has designed.
> Unfortunately we can't just widen the FPSID subarchitecture
> bitmask to consider all the bits on a CPUID scheme because there
> may be CPUs without the CPUID scheme that have VFP without double
> precision support and then the version would be a very wrong and
> large number. Instead, update the version detection logic to
> consider if the CPU is using the CPUID scheme.
> 
> If the CPU is using CPUID scheme, use the MVFR registers to
> determine what version of VFP is supported. We already do this
> for VFPv4, so do something similar for VFPv3 and look for single
> or double precision support in MVFR0. Otherwise fall back to
> using FPSID to detect VFP support on non-CPUID scheme CPUs. We
> know that VFPv3 is only present in CPUs that have support for the
> CPUID scheme so this should be equivalent.

This looks correct to me, but it raises a bigger question about the
suitability of hwcaps for describing features of the instruction set.

With the extended CPUID scheme, there are a whole bunch of different
instruction set features that are reported and bundling arbitrary subsets of
them into hwcaps such as `VFPv4' doesn't feel like the right thing to do in
the long run. It also doesn't seem to match where the architecture is going.

Perhaps it would be better to consider exposing the ID registers to
userspace in some manner? This could be done either via an undef handler, or
using the vdso. We would add a (final) hwcap advertising this cpuid support.
For big/little systems, the kernel would need to expose a suitable subset of
the features (we already have the sanity checking code from Rutland).

I'd certainly like to explore that route for arm64, before we start adding a
bunch of fine-grained capabilities.

Will
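
One possible reading of "a suitable subset" for big/little systems is
sketched below in C. It assumes, purely for illustration, that every
4-bit field of an ID register is an unsigned value where a larger number
means more capability, so the common view is the per-field minimum. That
assumption does not hold for every ID register field, and this is not a
description of the existing sanity checking code. The helper names
id_field() and id_reg_common() are hypothetical.

```c
#include <stdint.h>

/* Extract the 4-bit ID field starting at bit 'shift'. */
static uint32_t id_field(uint32_t reg, unsigned int shift)
{
	return (reg >> shift) & 0xfu;
}

/*
 * Hypothetical: build the register value to expose to userspace from
 * the values read on two different CPU types, taking the smaller of
 * each 4-bit field (assumes "larger means more capable").
 */
static uint32_t id_reg_common(uint32_t a, uint32_t b)
{
	uint32_t out = 0;
	unsigned int shift;

	for (shift = 0; shift < 32; shift += 4) {
		uint32_t fa = id_field(a, shift);
		uint32_t fb = id_field(b, shift);

		out |= (fa < fb ? fa : fb) << shift;
	}

	return out;
}
```

For example, if one cluster reports fused multiply accumulate in MVFR1
bits [31:28] and the other does not, the combined value has that field
cleared, so userspace would only see features usable on every CPU.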


* Re: [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
  2014-10-27 10:31   ` Will Deacon
@ 2014-10-27 11:49     ` Måns Rullgård
  2014-10-27 19:50     ` Stephen Boyd
  1 sibling, 0 replies; 10+ messages in thread
From: Måns Rullgård @ 2014-10-27 11:49 UTC (permalink / raw)
  To: Will Deacon
  Cc: Stephen Boyd, Russell King, linux-kernel, linux-arm-msm,
	linux-arm-kernel, Rob Clark

Will Deacon <will.deacon@arm.com> writes:

> Perhaps it would be better to consider exposing the ID registers to
> userspace in some manner? This could be done either via an undef handler, or
> using the vdso. We would add a (final) hwcap advertising this cpuid support.
> For big/little systems, the kernel would need to expose a suitable subset of
> the features (we already have the sanity checking code from Rutland).

This was discussed a few years ago, and some people raised various
objections.  Off the top of my head:

- Some features (e.g. VFP/NEON) need kernel support, and if this is not
  enabled, the actual system capabilities will not match the raw
  register value.  Fudging the values exposed to userspace would be
  fragile.
  (This argument has some merit.)

- Only v7 and newer CPUs have the CPUID registers.  Ridiculously old
  CPUs don't even have a CP15.  Providing synthetic values might be
  tricky.  Software thus needs to support alternate feature detection
  methods for such hardware.
  (This is true enough.)

- It would only be available on new kernels, so software would still
  need a fallback to another method for the foreseeable future.
  (This is a rather lazy argument.)

- It would be specific to Linux, so software can't rely on it anyway.
  (This is an even lazier argument.)

-- 
Måns Rullgård
mans@mansr.com
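
The fallback requirement raised in the objections above could look like
this in userspace C. It is a sketch only: HWCAP_HYPOTHETICAL_IDREGS is an
invented bit standing in for the "(final) hwcap advertising this cpuid
support", no such bit is defined for 32-bit ARM in this thread, and
have_vfpv4() is a hypothetical helper. HWCAP_VFPv4 uses the value from
arch/arm/include/uapi/asm/hwcap.h, and the MVFR1 test matches the one in
patch 2.

```c
#include <stdbool.h>
#include <stdint.h>

#define HWCAP_VFPv4			(1u << 16)
/* Invented for this sketch; not a real hwcap bit. */
#define HWCAP_HYPOTHETICAL_IDREGS	(1u << 31)

/*
 * Hypothetical: prefer the ID register view when the kernel advertises
 * it, otherwise fall back to the classic hwcap bit. 'mvfr1' is only
 * meaningful when HWCAP_HYPOTHETICAL_IDREGS is set.
 */
static bool have_vfpv4(uint32_t hwcap, uint32_t mvfr1)
{
	if (hwcap & HWCAP_HYPOTHETICAL_IDREGS)
		return (mvfr1 & 0xf0000000u) == 0x10000000u;

	/* Old kernel or pre-v7 CPU: the hwcap bit is all we have. */
	return (hwcap & HWCAP_VFPv4) != 0;
}
```

On a kernel without the invented bit, have_vfpv4(HWCAP_VFPv4, 0) is true
and the register value is ignored, so the same binary keeps working on
older kernels and on CPUs without the CPUID registers.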


* Re: [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
  2014-10-27 10:31   ` Will Deacon
  2014-10-27 11:49     ` Måns Rullgård
@ 2014-10-27 19:50     ` Stephen Boyd
  2014-10-28 12:11       ` Will Deacon
  1 sibling, 1 reply; 10+ messages in thread
From: Stephen Boyd @ 2014-10-27 19:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King, linux-kernel, linux-arm-msm, linux-arm-kernel, Rob Clark

On 10/27/2014 03:31 AM, Will Deacon wrote:
> Hi Stephen,
>
> On Tue, Oct 14, 2014 at 02:48:58PM +0100, Stephen Boyd wrote:
>> The subarchitecture field in the fpsid register is 7 bits wide on
>> ARM CPUs using the CPUID identification scheme, spanning bits 22
>> to 16. The topmost bit is used to designate that the
>> subarchitecture designer is not ARM when it is set to 1. On
>> non-CPUID scheme CPUs the subarchitecture field is only 4 bits
>> wide and the higher bits are used to indicate no double precision
>> support (bit 20) and the FTSMX/FLDMX format (bits 21-22).
>>
>> The VFP support code only looks at bits 19-16 to determine the
>> VFP version. On Qualcomm's processors (Krait and Scorpion) we
>> should see that we have HWCAP_VFPv3 but we don't because bit 22
>> is set to 1 to indicate that the subarchitecture is not
>> implemented by ARM and the rest of the bits are left as 0 because
>> this is the first subarchitecture that Qualcomm has designed.
>> Unfortunately we can't just widen the FPSID subarchitecture
>> bitmask to consider all the bits on a CPUID scheme because there
>> may be CPUs without the CPUID scheme that have VFP without double
>> precision support and then the version would be a very wrong and
>> large number. Instead, update the version detection logic to
>> consider if the CPU is using the CPUID scheme.
>>
>> If the CPU is using CPUID scheme, use the MVFR registers to
>> determine what version of VFP is supported. We already do this
>> for VFPv4, so do something similar for VFPv3 and look for single
>> or double precision support in MVFR0. Otherwise fall back to
>> using FPSID to detect VFP support on non-CPUID scheme CPUs. We
>> know that VFPv3 is only present in CPUs that have support for the
>> CPUID scheme so this should be equivalent.
> This looks correct to me, but it raises a bigger question about the
> suitability of hwcaps for describing features of the instruction set.

Great. Can I get your reviewed-by on this patch please?

>
> With the extended CPUID scheme, there are a whole bunch of different
> instruction set features that are reported and bundling arbitrary subsets of
> them into hwcaps such as `VFPv4' doesn't feel like the right thing to do in
> the long run. It also doesn't seem to match where the architecture is going.
>
> Perhaps it would be better to consider exposing the ID registers to
> userspace in some manner? This could be done either via an undef handler, or
> using the vdso. We would add a (final) hwcap advertising this cpuid support.
> For big/little systems, the kernel would need to expose a suitable subset of
> the features (we already have the sanity checking code from Rutland).
>
> I'd certainly like to explore that route for arm64, before we start adding a
> bunch of fine-grained capabilities.

I have an RFC for the undef handler written up, except for the
big/little thing. Let me post it. Is there anyone from the userspace
side that can be on Cc?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



* Re: [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
  2014-10-27 19:50     ` Stephen Boyd
@ 2014-10-28 12:11       ` Will Deacon
  2014-10-28 17:54         ` Stephen Boyd
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2014-10-28 12:11 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Russell King, linux-kernel, linux-arm-msm, linux-arm-kernel, Rob Clark

On Mon, Oct 27, 2014 at 07:50:42PM +0000, Stephen Boyd wrote:
> On 10/27/2014 03:31 AM, Will Deacon wrote:
> > On Tue, Oct 14, 2014 at 02:48:58PM +0100, Stephen Boyd wrote:
> >> If the CPU is using CPUID scheme, use the MVFR registers to
> >> determine what version of VFP is supported. We already do this
> >> for VFPv4, so do something similar for VFPv3 and look for single
> >> or double precision support in MVFR0. Otherwise fall back to
> >> using FPSID to detect VFP support on non-CPUID scheme CPUs. We
> >> know that VFPv3 is only present in CPUs that have support for the
> >> CPUID scheme so this should be equivalent.
> > This looks correct to me, but it raises a bigger question about the
> > suitability of hwcaps for describing features of the instruction set.
> 
> Great. Can I get your reviewed-by on this patch please?

Sure. There's a spelling mistake ("arhitecture") which you should fix,
but the code looks ok.

> > With the extended CPUID scheme, there are a whole bunch of different
> > instruction set features that are reported and bundling arbitrary subsets of
> > them into hwcaps such as `VFPv4' doesn't feel like the right thing to do in
> > the long run. It also doesn't seem to match where the architecture is going.
> >
> > Perhaps it would be better to consider exposing the ID registers to
> > userspace in some manner? This could be done either via an undef handler, or
> > using the vdso. We would add a (final) hwcap advertising this cpuid support.
> > For big/little systems, the kernel would need to expose a suitable subset of
> > the features (we already have the sanity checking code from Rutland).
> >
> > I'd certainly like to explore that route for arm64, before we start adding a
> > bunch of fine-grained capabilities.
> 
> I have an RFC for the undef handler written up, except for the
> big/little thing. Let me post it. Is there anyone from the userspace
> side that can be on Cc?

Off the top of my head:

  Mans Rullgard (already replied to this thread)
  Peter Maydell <peter.maydell@linaro.org> [QEMU]
  Jacob Bramley <jacob.bramley@arm.com> [JITs]
  Kyrylo Tkachov <kyrylo.tkachov@arm.com> [GCC]

(CC Rutland for the big/little bits too)

Will


* Re: [PATCH v2 2/3] ARM: vfp: Fix VFPv3 hwcap detection on CPUID based cpus
  2014-10-28 12:11       ` Will Deacon
@ 2014-10-28 17:54         ` Stephen Boyd
  0 siblings, 0 replies; 10+ messages in thread
From: Stephen Boyd @ 2014-10-28 17:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: Russell King, linux-kernel, linux-arm-msm, linux-arm-kernel, Rob Clark

On 10/28, Will Deacon wrote:
> 
> Sure. There's a spelling mistake ("arhitecture") which you should fix,
> but the code looks ok.

Ok I'll fix it up and send it off to the patch tracker if I don't
hear anything else.

> > 
> > I have an RFC for the undef handler written up, except for the
> > big/little thing. Let me post it. Is there anyone from the userspace
> > side that can be on Cc?
> 
> Off the top of my head:
> 
>   Mans Rullgard (already replied to this thread)
>   Peter Maydell <peter.maydell@linaro.org> [QEMU]
>   Jacob Bramley <jacob.bramley@arm.com> [JITs]
>   Kyrylo Tkachov <kyrylo.tkachov@arm.com> [GCC]
> 

Thanks. I'll add them to the patch for next round and add them on
Cc for the current patch on the list.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

