Re: [PATCH v2 2/6] arm64: HWCAP: add support for AT_HWCAP2

From: Andrew Murray <andrew.murray@arm.com>
To: Dave P Martin <Dave.Martin@arm.com>
Cc: Szabolcs Nagy <Szabolcs.Nagy@arm.com>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <Will.Deacon@arm.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v2 2/6] arm64: HWCAP: add support for AT_HWCAP2
Date: Wed, 27 Mar 2019 14:53:03 +0000	[thread overview]
Message-ID: <20190327145303.GD43527@e119886-lin.cambridge.arm.com> (raw)
In-Reply-To: <20190221184500.GO16031@e103592.cambridge.arm.com>

On Thu, Feb 21, 2019 at 06:45:03PM +0000, Dave P Martin wrote:
> On Thu, Feb 21, 2019 at 12:20:53PM +0000, Andrew Murray wrote:
> > As we will exhaust the first 32 bits of AT_HWCAP let's start
> > exposing AT_HWCAP2 to userspace to give us up to 64 caps.
> > 
> > Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
> > prefer to expand into AT_HWCAP2 in order to provide a consistent
> > view to userspace between ILP32 and LP64. However internal to the
> > kernel we prefer to continue to use the full space of elf_hwcap.
> > 
> > To reduce complexity and allow for future expansion, we now
> > represent hwcaps in the kernel as ordinals and use a
> > KERNEL_HWCAP_ prefix. This allows us to support automatic feature
> > based module loading for all our hwcaps.
> > 
> > We introduce cpu_set_feature to set hwcaps which compliments the
> > existing cpu_have_feature helper. These helpers allow us to clean
> > up existing direct uses of elf_hwcap and reduce any future effort
> > required to move beyond 64 caps.
> > 
> > For convenience we also introduce cpu_{have,set}_feature_name which
> > makes use of the cpu_feature macro to allow providing a hwcap name
> > without a {KERNEL_}HWCAP_ prefix.
> > 
> > Signed-off-by: Andrew Murray <andrew.murray@arm.com>
> > ---
> >  arch/arm64/crypto/aes-ce-ccm-glue.c      |  2 +-
> >  arch/arm64/crypto/aes-neonbs-glue.c      |  2 +-
> >  arch/arm64/crypto/chacha-neon-glue.c     |  2 +-
> >  arch/arm64/crypto/crct10dif-ce-glue.c    |  2 +-
> >  arch/arm64/crypto/ghash-ce-glue.c        |  6 +--
> >  arch/arm64/crypto/nhpoly1305-neon-glue.c |  2 +-
> >  arch/arm64/crypto/sha256-glue.c          |  4 +-
> >  arch/arm64/include/asm/cpufeature.h      | 19 ++++++---
> >  arch/arm64/include/asm/hwcap.h           | 40 ++++++++++++++++++-
> >  arch/arm64/include/uapi/asm/hwcap.h      |  2 +-
> >  arch/arm64/kernel/cpufeature.c           | 66 ++++++++++++++++----------------
> >  arch/arm64/kernel/cpuinfo.c              |  2 +-
> >  arch/arm64/kernel/fpsimd.c               |  4 +-
> >  drivers/clocksource/arm_arch_timer.c     |  8 ++++
> >  14 files changed, 108 insertions(+), 53 deletions(-)
> 
> [...]
> 
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index dfcfba7..dd21a32 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -17,12 +17,12 @@
> >  /*
> >   * In the arm64 world (as in the ARM world), elf_hwcap is used both internally
> >   * in the kernel and for user space to keep track of which optional features
> > - * are supported by the current system. So let's map feature 'x' to HWCAP_x.
> > - * Note that HWCAP_x constants are bit fields so we need to take the log.
> > + * are supported by the current system. So let's map feature 'x' to
> > + * KERNEL_HWCAP_x.
> 
> This doesn't read quite right now.
> 
> The purpose of this paragraph seems to be that the kernel and user
> views have the same encoding, so we can just map x to UAPI HWCAP_x
> definition.
> 
> This isn't true any more: we now consider the kernel (in elf_hwcap) and
> user encodings (in AT_HWCAP) distinct, but for backwards compatibility
> reasons HWCAP_x == BIT(KERNEL_HWCAP_x) for the first 32 hwcaps.
> 
> So it would be worth rewriting this whole comment to describe the new
> situation, rather than just tweaking it.
> 
> 
> I'm not sure whether it is more natural to put the comment in
> <asm/hwcap.h> or here: you could put it in one place with a brief
> pointer in the other.

How about the following description in asm/hwcap.h (subject to outcome of the
helpers discussion)? (non-uapi) hwcap.h seems the right place for this as it's
where you'd go to find out what the HWCAP's are called.

 /*
- * KERNEL_HWCAP flags - for elf_hwcap (in kernel)
+ * For userspace we represent hwcaps as a collection of HWCAP{,2}_x bitfields
+ * as described in uapi/asm/hwcap.h. For the kernel we represent hwcaps as
+ * natural numbers (in a single range of size MAX_CPU_FEATURES) defined here
+ * with prefix KERNEL_HWCAP_ mapped to their HWCAP{,2}_x counterpart.
+ *
+ * Hwcaps should be set and tested within the kernel via the
+ * cpu_{set,have}_named_feature(feature) where feature is the unique suffix
+ * of KERNEL_HWCAP_{feature}.
  */

And I'd be tempted to complete remove the comment in cpufeature.h - this relates
to the mapping of cpu_feature(x) to KERNEL_HWCAP_ which is linear and self
explanatory?

> 
> >   */
> >  
> > -#define MAX_CPU_FEATURES	(8 * sizeof(elf_hwcap))
> > -#define cpu_feature(x)		ilog2(HWCAP_ ## x)
> > +#define MAX_CPU_FEATURES	64
> > +#define cpu_feature(x)		(KERNEL_HWCAP_ ## x)
> >  
> >  #ifndef __ASSEMBLY__
> >  
> > @@ -396,10 +396,19 @@ extern struct static_key_false arm64_const_caps_ready;
> >  
> >  bool this_cpu_has_cap(unsigned int cap);
> >  
> > +static inline void cpu_set_feature(unsigned int num)
> > +{
> > +	WARN_ON(num >= MAX_CPU_FEATURES);
> > +	elf_hwcap |= BIT(num);
> > +}
> > +#define cpu_set_feature_name(name) cpu_set_feature(cpu_feature(name))
> > +
> >  static inline bool cpu_have_feature(unsigned int num)
> >  {
> > -	return elf_hwcap & (1UL << num);
> > +	WARN_ON(num >= MAX_CPU_FEATURES);
> > +	return elf_hwcap & BIT(num);
> >  }
> > +#define cpu_have_feature_name(name) cpu_have_feature(cpu_feature(name))
> 
> This is bikeshedding, but I'd say that CPUs don't have (feature) names;
> they have features (which have names).
> 
> So I might prefer to call these something like:
> 
> 	cpu_set_named_feature()
> 	cpu_have_named_feature()
> 
> (May not be worth it unless you respin the series for another reason
> though.)

I'll make this change, thanks for the suggestion.

> 
> >  /* System capability check for constant caps */
> >  static inline bool __cpus_have_const_cap(int num)
> > diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
> > index 400b80b..7549c72 100644
> > --- a/arch/arm64/include/asm/hwcap.h
> > +++ b/arch/arm64/include/asm/hwcap.h
> > @@ -39,12 +39,50 @@
> >  #define COMPAT_HWCAP2_SHA2	(1 << 3)
> >  #define COMPAT_HWCAP2_CRC32	(1 << 4)
> >  
> > +/*
> > + * KERNEL_HWCAP flags - for elf_hwcap (in kernel)
> > + */
> > +#define KERNEL_HWCAP_FP			ilog2(HWCAP_FP)
> > +#define KERNEL_HWCAP_ASIMD		ilog2(HWCAP_ASIMD)
> 
> [...]
> 
> > +#define KERNEL_HWCAP_SB			ilog2(HWCAP_SB)
> > +#define KERNEL_HWCAP_PACA		ilog2(HWCAP_PACA)
> > +#define KERNEL_HWCAP_PACG		ilog2(HWCAP_PACG)
> > +#define KERNEL_HWCAP_DCPODP		(ilog2(HWCAP2_DCPODP) + 32)
> 
> For ABI purposes, we should take the opportunity to document the status
> of the currently unused bits.
> 
> For interoperation with the glibc ifunc resolver ABI, we may want to
> reserve a bit among AT_HWCAP [63:32] or AT_HWCAP2 [31:0] that will
> never be used by the kernel and always passed to userspace as 0.
> 
> I'm envisaging code such as
> 
> 	foo resolver(unsigned long hwcaps, unsigned int num_at_hwcaps,
> 			unsigned long const *at_hwcaps)
> 	{
> 		if ((hwcaps & _GLIBC_EXTRA_HWCAPS) &&
> 				num_at_hwcaps >= 2 &&
> 				at_hwcaps[1] && HWCAP2_FOO)
> 			/* feature present */
> 	}
> 
> We would need that _GLIBC_EXTRA_HWCAPS to distinguish the second and
> third arguments from uninitialised junk that would be passed by older
> glibc versions.
> 
> Glibc might or might not choose to try and wegde AT_HWCAP2 in the top
> bits of the first argument instead of bits [63:32] of AT_HWCAP (which
> we expect to be zero for now, but could still be made reachable via the
> at_hwcaps pointer).
> 
> Coordination would be needed if glibc carries on using the
> <uapi/asm/hwcap.h> HWCAP{,2}_foo defines for here while doing tricks
> of this sort.
> 
> Szabolcs may have a view on whether this is needed / useful.
> 
> 
> If so, we should document any required guarantees now so that we don't
> accidentally violate them during future maintenance.  For the benefit
> of userspace folks, it may be a good idea to have some clear statement
> in Documentation/arm64/ also.
> 
> Because of the ABI implications here, if would also be good idea to copy
> the libc-alpha mailing list, and possibly also linux-api.
> 
> > +
> >  #ifndef __ASSEMBLY__
> >  /*
> >   * This yields a mask that user programs can use to figure out what
> >   * instruction set this cpu supports.
> >   */
> > -#define ELF_HWCAP		(elf_hwcap)
> > +#define ELF_HWCAP		lower_32_bits(elf_hwcap)
> > +#define ELF_HWCAP2		upper_32_bits(elf_hwcap)
> 
> Should we have #include <linux/kernel.h> here somewhere?

Yes, I'll have to add it after the #ifndef __ASSEMBLY__ otherwise it seems
to break anything that includes kernel.h from .S files.

> 
> [...]
> 
> > diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
> > index 9a7d4dc..4e8d3b4 100644
> > --- a/drivers/clocksource/arm_arch_timer.c
> > +++ b/drivers/clocksource/arm_arch_timer.c
> > @@ -778,7 +778,11 @@ static void arch_timer_evtstrm_enable(int divider)
> >  	cntkctl |= (divider << ARCH_TIMER_EVT_TRIGGER_SHIFT)
> >  			| ARCH_TIMER_VIRT_EVT_EN;
> >  	arch_timer_set_cntkctl(cntkctl);
> > +#ifdef CONFIG_ARM64
> > +	cpu_set_feature_name(EVTSTRM);
> > +#else
> >  	elf_hwcap |= HWCAP_EVTSTRM;
> > +#endif
> 
> This is a little nasty.
> 
> You could give arch/arm its own cpu_set_feature_name() (or whatever it's
> called), depending on how keen Russell is to pick it up.
> 
> Or for this particular case, stick with the old code (which we now get
> away with because elf_hwcap is still exported).  The case of an
> HWCAP_foo flag name that happens to have the same semantics on both arm
> and arm64 is a pretty esoteric one, so it's not the end of the world if
> the standard helpers don't deal with it.

Assuming we don't drop the encapsulate patch, I guess we could leave this
as it is and look at adding the ARM32 helper a later time?

> 
> To avoid future accidents you could replace the #ifdef with a comment at
> each site (which will also allow people to track down this patch when
> looking at that code).
> 
> >  #ifdef CONFIG_COMPAT
> >  	compat_elf_hwcap |= COMPAT_HWCAP_EVTSTRM;
> >  #endif
> > @@ -1000,7 +1004,11 @@ static int arch_timer_cpu_pm_notify(struct notifier_block *self,
> >  	} else if (action == CPU_PM_ENTER_FAILED || action == CPU_PM_EXIT) {
> >  		arch_timer_set_cntkctl(__this_cpu_read(saved_cntkctl));
> >  
> > +#ifdef CONFIG_ARM64
> > +		if (cpu_have_feature_name(EVTSTRM))
> > +#else
> >  		if (elf_hwcap & HWCAP_EVTSTRM)
> > +#endif
> 
> Ditto.

Thanks,

Andrew Murray

> 
> [...]
> 
> Cheers
> ---Dave

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel