linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
@ 2023-07-31 13:58 Mark Brown
  2023-08-03 16:39 ` Catalin Marinas
  2023-08-11 11:44 ` Will Deacon
  0 siblings, 2 replies; 7+ messages in thread
From: Mark Brown @ 2023-07-31 13:58 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Dave Martin, Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, Mark Brown

For both SVE and SME we abuse the generic register field comparison
support in the cpufeature code as part of our detection of unsupported
variations in the vector lengths available to PEs, reporting the maximum
vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN.  Since these are
configuration registers rather than identification registers the
assumptions the cpufeature code makes about how unknown bitfields behave
are invalid, leading to warnings when SME features like FA64 are enabled
and we hotplug a CPU:

  CPU features: SANITY CHECK: Unexpected variation in SYS_SMCR_EL1. Boot CPU: 0x0000000000000f, CPU3: 0x0000008000000f
  CPU features: Unsupported CPU feature variation detected.

SVE has no controls other than the vector length so is not yet impacted
but the same issue will apply there if any are defined.

Since the only field we are interested in having the cpufeature code
handle is the length field and we use a custom read function to obtain
the value we can avoid these warnings by filtering out all other bits
when we return the register value, if we're doing that we don't need to
bother reading the register at all and can simply use the RDVL/RDSVL
value we were filling in instead.

Fixes: 2e0f2478ea37eb ("arm64/sve: Probe SVE capabilities and usable vector lengths")
FixeS: b42990d3bf77cc ("arm64/sme: Identify supported SME vector lengths at boot")
Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v2:
- Skip the munging into the register step and just return the value we
  would munge in.
- Link to v1: https://lore.kernel.org/r/20230727-arm64-sme-fa64-hotplug-v1-1-34ae93afc05b@kernel.org
---
 arch/arm64/kernel/fpsimd.c | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 520b681a07bb..aa4194c646b2 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1178,9 +1178,6 @@ void sve_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
  */
 u64 read_zcr_features(void)
 {
-	u64 zcr;
-	unsigned int vq_max;
-
 	/*
 	 * Set the maximum possible VL, and write zeroes to all other
 	 * bits to see if they stick.
@@ -1188,12 +1185,8 @@ u64 read_zcr_features(void)
 	sve_kernel_enable(NULL);
 	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
 
-	zcr = read_sysreg_s(SYS_ZCR_EL1);
-	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
-	vq_max = sve_vq_from_vl(sve_get_vl());
-	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
-
-	return zcr;
+	/* Return LEN value that would be written to get the maximum VL */
+	return sve_vq_from_vl(sve_get_vl()) - 1;
 }
 
 void __init sve_setup(void)
@@ -1348,9 +1341,6 @@ void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
  */
 u64 read_smcr_features(void)
 {
-	u64 smcr;
-	unsigned int vq_max;
-
 	sme_kernel_enable(NULL);
 
 	/*
@@ -1359,12 +1349,8 @@ u64 read_smcr_features(void)
 	write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_LEN_MASK,
 		       SYS_SMCR_EL1);
 
-	smcr = read_sysreg_s(SYS_SMCR_EL1);
-	smcr &= ~(u64)SMCR_ELx_LEN_MASK; /* Only the LEN field */
-	vq_max = sve_vq_from_vl(sme_get_vl());
-	smcr |= vq_max - 1; /* set LEN field to maximum effective value */
-
-	return smcr;
+	/* Return LEN value that would be written to get the maximum VL */
+	return sve_vq_from_vl(sme_get_vl()) - 1;
 }
 
 void __init sme_setup(void)

---
base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
change-id: 20230727-arm64-sme-fa64-hotplug-1e6896746f97

Best regards,
-- 
Mark Brown <broonie@kernel.org>


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
  2023-07-31 13:58 [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers Mark Brown
@ 2023-08-03 16:39 ` Catalin Marinas
  2023-08-03 17:44   ` Mark Brown
  2023-08-11 11:44 ` Will Deacon
  1 sibling, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2023-08-03 16:39 UTC (permalink / raw)
  To: Mark Brown; +Cc: Will Deacon, Suzuki K Poulose, linux-arm-kernel, linux-kernel

On Mon, Jul 31, 2023 at 02:58:48PM +0100, Mark Brown wrote:
> For both SVE and SME we abuse the generic register field comparison
> support in the cpufeature code as part of our detection of unsupported
> variations in the vector lengths available to PEs, reporting the maximum
> vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN.  Since these are
> configuration registers rather than identification registers the
> assumptions the cpufeature code makes about how unknown bitfields behave
> are invalid, leading to warnings when SME features like FA64 are enabled
> and we hotplug a CPU:
> 
>   CPU features: SANITY CHECK: Unexpected variation in SYS_SMCR_EL1. Boot CPU: 0x0000000000000f, CPU3: 0x0000008000000f
>   CPU features: Unsupported CPU feature variation detected.
> 
> SVE has no controls other than the vector length so is not yet impacted
> but the same issue will apply there if any are defined.
> 
> Since the only field we are interested in having the cpufeature code
> handle is the length field and we use a custom read function to obtain
> the value we can avoid these warnings by filtering out all other bits
> when we return the register value, if we're doing that we don't need to
> bother reading the register at all and can simply use the RDVL/RDSVL
> value we were filling in instead.

Maybe that's the simplest fix, especially if you want it in stable, but
I wonder why we even bother with with treating ZCR_EL1 and SMCR_EL1 as
feature registers. We already have verify_sme_features() to check for
the mismatch. BTW, is vec_verify_vq_map() sufficient so that we can skip
the maximum vector length check?

-- 
Catalin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
  2023-08-03 16:39 ` Catalin Marinas
@ 2023-08-03 17:44   ` Mark Brown
  2023-08-04 16:20     ` Catalin Marinas
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Brown @ 2023-08-03 17:44 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Suzuki K Poulose, linux-arm-kernel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1968 bytes --]

On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:
> On Mon, Jul 31, 2023 at 02:58:48PM +0100, Mark Brown wrote:

> > Since the only field we are interested in having the cpufeature code
> > handle is the length field and we use a custom read function to obtain
> > the value we can avoid these warnings by filtering out all other bits
> > when we return the register value, if we're doing that we don't need to
> > bother reading the register at all and can simply use the RDVL/RDSVL
> > value we were filling in instead.

> Maybe that's the simplest fix, especially if you want it in stable, but

Yeah, it's definitely the sort of change we want as a fix - anything
more invasive would be inappropriate.

> I wonder why we even bother with with treating ZCR_EL1 and SMCR_EL1 as
> feature registers. We already have verify_sme_features() to check for
> the mismatch. BTW, is vec_verify_vq_map() sufficient so that we can skip
> the maximum vector length check?

Both enumeration mechanisms were added in the initial series supporting
SVE for reasons that are not entirely obvious to me.  The changelogs
explain what we're doing with the pseudo ID register stuff but do not
comment on why.  There is a cross check between the answers the two give
which appears to be geared towards detecting systems with asymmetric
maximum VLs for some reason but I'm not sure why that's done given that
we can't cope if *any* VL in the committed set is missing, not just the
maximum.  It's not KVM because that has even stricter requirements and
can't cope with any extra supported VLs.  I'm not sure if the inclusion
in the cpufeature stuff is supposed to expose the registers via
emulation of mrs, though a brief test appears to indicate that this
doesn't actually work if it was ever intended to.

The whole thing is very suspect but given that we don't currently have
any ability to emulate systems with asymmetric vector lengths I'm a bit
reluctant to poke at it.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
  2023-08-03 17:44   ` Mark Brown
@ 2023-08-04 16:20     ` Catalin Marinas
  2023-08-04 16:37       ` Mark Brown
  0 siblings, 1 reply; 7+ messages in thread
From: Catalin Marinas @ 2023-08-04 16:20 UTC (permalink / raw)
  To: Mark Brown; +Cc: Will Deacon, Suzuki K Poulose, linux-arm-kernel, linux-kernel

On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:
> > On Mon, Jul 31, 2023 at 02:58:48PM +0100, Mark Brown wrote:
> > > Since the only field we are interested in having the cpufeature code
> > > handle is the length field and we use a custom read function to obtain
> > > the value we can avoid these warnings by filtering out all other bits
> > > when we return the register value, if we're doing that we don't need to
> > > bother reading the register at all and can simply use the RDVL/RDSVL
> > > value we were filling in instead.
> 
> > Maybe that's the simplest fix, especially if you want it in stable, but
> 
> Yeah, it's definitely the sort of change we want as a fix - anything
> more invasive would be inappropriate.

I'd say it's still ok if we can just rip come code out safely (the fake
ID reg).

> > I wonder why we even bother with with treating ZCR_EL1 and SMCR_EL1 as
> > feature registers. We already have verify_sme_features() to check for
> > the mismatch. BTW, is vec_verify_vq_map() sufficient so that we can skip
> > the maximum vector length check?
> 
> Both enumeration mechanisms were added in the initial series supporting
> SVE for reasons that are not entirely obvious to me.  The changelogs
> explain what we're doing with the pseudo ID register stuff but do not
> comment on why.  There is a cross check between the answers the two give
> which appears to be geared towards detecting systems with asymmetric
> maximum VLs for some reason but I'm not sure why that's done given that
> we can't cope if *any* VL in the committed set is missing, not just the
> maximum.

We can cope with different VLs if the committed map is built during boot
(early secondary CPU bring-up). For any late/hotplugged CPUs, if they
don't fit the map, they'll be rejected. Not sure where the actual
maximum length matters in this process though (or later for user space).
I assume the user will only be allowed to set the common VLs across all
the early CPUs.

> The whole thing is very suspect but given that we don't currently have
> any ability to emulate systems with asymmetric vector lengths I'm a bit
> reluctant to poke at it.

The Arm fast models should allow such configuration, though I haven't
tried.

-- 
Catalin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
  2023-08-04 16:20     ` Catalin Marinas
@ 2023-08-04 16:37       ` Mark Brown
  2023-08-09 16:11         ` Catalin Marinas
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Brown @ 2023-08-04 16:37 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Will Deacon, Suzuki K Poulose, linux-arm-kernel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2743 bytes --]

On Fri, Aug 04, 2023 at 05:20:21PM +0100, Catalin Marinas wrote:
> On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> > On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:

> > > Maybe that's the simplest fix, especially if you want it in stable, but

> > Yeah, it's definitely the sort of change we want as a fix - anything
> > more invasive would be inappropriate.

> I'd say it's still ok if we can just rip come code out safely (the fake
> ID reg).

It's the safely bit that concerns me here - it feels like a great way to
discover why the code was there, possibly including a use that was there
in the past but has subsequently been removed so bites a stable version.

> > Both enumeration mechanisms were added in the initial series supporting
> > SVE for reasons that are not entirely obvious to me.  The changelogs
> > explain what we're doing with the pseudo ID register stuff but do not
> > comment on why.  There is a cross check between the answers the two give
> > which appears to be geared towards detecting systems with asymmetric
> > maximum VLs for some reason but I'm not sure why that's done given that
> > we can't cope if *any* VL in the committed set is missing, not just the
> > maximum.

> We can cope with different VLs if the committed map is built during boot
> (early secondary CPU bring-up). For any late/hotplugged CPUs, if they
> don't fit the map, they'll be rejected. Not sure where the actual
> maximum length matters in this process though (or later for user space).
> I assume the user will only be allowed to set the common VLs across all
> the early CPUs.

Indeed, since we need to check each VL in the set we expose to userspace
individually that will include the maximum VL which should mean that
having a separate check for the maximum VL is redundant.  That's always
been the case though which makes me worried about changing it for a fix
rather than a cleanup.

For KVM we need the stricter requirement that no additional VLs are
supported in the subset that KVM exports to clients since guests can
directly enumerate VLs from the hardware and we don't want the answer
changing depending on what physical CPU we schedule a vCPU on.  That
should similarly not need a distinct check for the maximum VL.

> > The whole thing is very suspect but given that we don't currently have
> > any ability to emulate systems with asymmetric vector lengths I'm a bit
> > reluctant to poke at it.

> The Arm fast models should allow such configuration, though I haven't
> tried.

They don't, SVE and SME are provided as a plugin and all their
configuration is done at the plugin level so there's no per PE or per
cluster options like there are for features implemented in the model
itself.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
  2023-08-04 16:37       ` Mark Brown
@ 2023-08-09 16:11         ` Catalin Marinas
  0 siblings, 0 replies; 7+ messages in thread
From: Catalin Marinas @ 2023-08-09 16:11 UTC (permalink / raw)
  To: Mark Brown; +Cc: Will Deacon, Suzuki K Poulose, linux-arm-kernel, linux-kernel

On Fri, Aug 04, 2023 at 05:37:25PM +0100, Mark Brown wrote:
> On Fri, Aug 04, 2023 at 05:20:21PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> > > On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:
> > > > Maybe that's the simplest fix, especially if you want it in stable, but
> > > 
> > > Yeah, it's definitely the sort of change we want as a fix - anything
> > > more invasive would be inappropriate.
> > 
> > I'd say it's still ok if we can just rip come code out safely (the fake
> > ID reg).
> 
> It's the safely bit that concerns me here - it feels like a great way to
> discover why the code was there, possibly including a use that was there
> in the past but has subsequently been removed so bites a stable version.

As discussed earlier, let's queue this patch as is (for 6.6 I'd say)
with cc stable and post a new patch on top that removes the fake CPUID
register going forward, in case we missed anything.

For this patch:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
  2023-07-31 13:58 [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers Mark Brown
  2023-08-03 16:39 ` Catalin Marinas
@ 2023-08-11 11:44 ` Will Deacon
  1 sibling, 0 replies; 7+ messages in thread
From: Will Deacon @ 2023-08-11 11:44 UTC (permalink / raw)
  To: Dave Martin, Mark Brown, Suzuki K Poulose, Catalin Marinas
  Cc: kernel-team, Will Deacon, linux-arm-kernel, linux-kernel

On Mon, 31 Jul 2023 14:58:48 +0100, Mark Brown wrote:
> For both SVE and SME we abuse the generic register field comparison
> support in the cpufeature code as part of our detection of unsupported
> variations in the vector lengths available to PEs, reporting the maximum
> vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN.  Since these are
> configuration registers rather than identification registers the
> assumptions the cpufeature code makes about how unknown bitfields behave
> are invalid, leading to warnings when SME features like FA64 are enabled
> and we hotplug a CPU:
> 
> [...]

Applied to arm64 (for-next/cpufeature), thanks!

[1/1] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
      https://git.kernel.org/arm64/c/01948b09edc3

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-08-11 11:45 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-31 13:58 [PATCH v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers Mark Brown
2023-08-03 16:39 ` Catalin Marinas
2023-08-03 17:44   ` Mark Brown
2023-08-04 16:20     ` Catalin Marinas
2023-08-04 16:37       ` Mark Brown
2023-08-09 16:11         ` Catalin Marinas
2023-08-11 11:44 ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).