linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 00/10] arm64: Expose CPU feature registers
@ 2015-07-24  9:43 Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 01/10] arm64: feature registers: Documentation Suzuki K. Poulose
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

This is an early RFC prototype for an API to export ARMv8 CPU
feature registers to AArch64 userspace. The series also
consolidates the CPU info, HWCAPs and the sanity check
infrastructure.

The ARM architecture exposes the system/cpu capabilities via a set
of CPU feature Registers. Currently, we relay some of this information
to userspace via the following mechanisms:

1)  ELF HWCAPS auxiliary vector
 * Only a limited number of bits are available in the HWCAPs, and
   we may soon run out of them.
 * The auxv is not available at all times (e.g. before libc is
   initialised at startup).
 * HWCAPs cannot represent non-boolean information effectively.

2)  /proc/cpuinfo
 Provides CPU identification information along with the hwcaps.
 However, parsing the information is complex and prone to errors.
 Also, this method cannot be used during early application startup
 (e.g. at ld/libc load time).

This proposal emulates the 'MRS' instruction and exposes a limited set
of feature values (see Patch 1 for the detailed list and documentation)
that are safe across all the CPUs (e.g. on a system with heterogeneous
CPUs). The feature bits that are not exposed are set to the 'safe value',
which implies 'not supported'.

Apart from the selected feature registers, we expose MIDR_EL1 (Main
ID Register). Users should be aware that reading MIDR_EL1 can be
tricky on a heterogeneous system (just like getcpu()): we export the
value of the CPU on which the 'MRS' is executed.

This infrastructure is useful for toolchains (e.g. gcc, the dynamic linker,
JITs) to make better runtime decisions based on what is available.
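
To illustrate how a consumer might interpret a value returned by the
emulated 'MRS', here is a minimal sketch that decodes the 4-bit signed
fields of ID_AA64ISAR0_EL1, using the field offsets listed in Patch 1.
The helper names (id_field, has_feature) and the ISAR0_*_SHIFT macros are
made up for this example and are not part of the series. The register
value is passed in as a parameter so that the decoding logic can be
exercised on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Field offsets in ID_AA64ISAR0_EL1, as listed in Patch 1. */
#define ISAR0_AES_SHIFT		4
#define ISAR0_SHA1_SHIFT	8
#define ISAR0_SHA2_SHIFT	12
#define ISAR0_CRC32_SHIFT	16

/*
 * Extract one 4-bit feature field and sign-extend it.
 * Hypothetical helper: the name and signature are illustrative only.
 */
static int id_field(uint64_t reg, unsigned int shift)
{
	int val;

	assert(shift <= 60 && (shift % 4) == 0);
	val = (int)((reg >> shift) & 0xf);
	/* Bit 3 set means a negative (reserved) value. */
	return (val & 0x8) ? val - 16 : val;
}

/*
 * A feature is treated as usable only when its field is positive.
 * Zero means 'not supported'; negative values are reserved.
 */
static int has_feature(uint64_t isar0, unsigned int shift)
{
	return id_field(isar0, shift) > 0;
}
```

On an arm64 system running a kernel with this series applied, the value
would presumably come from something like
"mrs %0, ID_AA64ISAR0_EL1" in inline assembly, after which
has_feature(val, ISAR0_CRC32_SHIFT) tells the caller whether CRC32
instructions may be used on every CPU.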

This patch series is based on: 4.2.0-rc3 + the patch
"arm64: Generalise msr_s/mrs_s operations"
	http://lists.infradead.org/pipermail/linux-arm-kernel/2015-July/358462.html

Suzuki K. Poulose (10):
  arm64: feature registers: Documentation
  arm64: Make the CPU information more clear
  arm64: Delay ELF HWCAP initialisation until all CPUs are up
  arm64: Consolidate cpuinfo handling
  arm64: Keep track of CPU feature registers
  arm64: Add helper to decode register from instruction
  arm64: Expose feature registers by emulating MRS
  arm64: Emulate ID registers
  arm64: Read system wide CPUID value
  arm64: Use system-wide safe value of CPU feature register

 Documentation/arm64/cpu-feature-registers.txt |  185 +++++++
 arch/arm64/include/asm/cpu.h                  |  165 ++++++
 arch/arm64/include/asm/insn.h                 |    2 +
 arch/arm64/kernel/cpuinfo.c                   |  720 +++++++++++++++++++++++--
 arch/arm64/kernel/debug-monitors.c            |    6 +-
 arch/arm64/kernel/fpsimd.c                    |    5 +-
 arch/arm64/kernel/hw_breakpoint.c             |    5 +-
 arch/arm64/kernel/insn.c                      |   29 +
 arch/arm64/kernel/setup.c                     |  209 +------
 arch/arm64/kernel/smp.c                       |    3 +-
 arch/arm64/kvm/reset.c                        |    3 +-
 arch/arm64/kvm/sys_regs.c                     |    5 +-
 12 files changed, 1076 insertions(+), 261 deletions(-)
 create mode 100644 Documentation/arm64/cpu-feature-registers.txt

-- 
1.7.9.5



* [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-08-10 16:06   ` Catalin Marinas
  2015-07-24  9:43 ` [RFC PATCH 02/10] arm64: Make the CPU information more clear Suzuki K. Poulose
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

Documentation of the infrastructure

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 Documentation/arm64/cpu-feature-registers.txt |  185 +++++++++++++++++++++++++
 1 file changed, 185 insertions(+)
 create mode 100644 Documentation/arm64/cpu-feature-registers.txt

diff --git a/Documentation/arm64/cpu-feature-registers.txt b/Documentation/arm64/cpu-feature-registers.txt
new file mode 100644
index 0000000..08030be
--- /dev/null
+++ b/Documentation/arm64/cpu-feature-registers.txt
@@ -0,0 +1,185 @@
+		ARM64 CPU Feature Registers
+		===========================
+
+Author: Suzuki K. Poulose <suzuki.poulose@arm.com>
+
+
+This file describes the API for exporting the AArch64 CPU ID/feature registers
+to userspace.
+
+1. Motivation
+---------------
+
+The ARM architecture defines a set of feature registers, which describe
+the capabilities of the CPU/system. Access to these system registers is
+restricted from EL0 and there is no reliable way for an application to
+extract this information to make better decisions at runtime. There is
+limited information available to the application via ELF_HWCAPs, however
+there are some issues with their usage.
+
+ a) Any change to the HWCAPs requires an update to userspace (e.g libc)
+    to detect the new changes, which can take a long time to appear in
+    distributions. Exposing the registers allows applications to get the
+    information without requiring other userspace components to be updated.
+
+ b) Access to HWCAPs is sometimes restricted (e.g. before libc is initialised,
+    or while ld is running at startup time).
+
+ c) HWCAPs cannot represent non-boolean information effectively. The
+    architecture defines a canonical format for representing features
+    in the ID registers; this is well defined and is capable of
+    representing all valid architecture variations. Exposing the ID
+    registers avoids having to come up with HWCAP representations
+    and parsing code.
+
+
+2. Requirements
+-----------------
+
+ a) Safety :
+    Applications should be able to use the information provided by the
+    infrastructure to run both optimally and safely across the system.
+    This has greater implications on a system with heterogeneous CPUs.
+    The infrastructure exports a value that is safe across all the
+    available CPUs on the system.
+
+    e.g, If at least one CPU doesn't implement CRC32 instructions, while others
+    do, we should report that the CRC32 is not implemented. Otherwise an
+    application could crash when scheduled on the CPU which doesn't support
+    CRC32.
+
+ b) Security :
+    Applications should only be able to receive information that is relevant
+    to the normal operation in userspace. Hence, some of the fields
+    are masked out and the values of the fields are set to indicate the
+    feature is 'not supported' (See the 'visible' field in the
+    table in Section 4). Also, the kernel may manipulate the fields based on what
+    it supports. e.g, If FP is not supported by the kernel, the values
+    could indicate that the FP is not available (even when the CPU provides
+    it).
+
+ c) Implementation Defined Features
+    The infrastructure doesn't expose any register which is IMPLEMENTATION
+    DEFINED as per the ARMv8-A Architecture; such registers read as 0.
+
+ d) CPU Identification :
+    MIDR_EL1 is exposed to help identify the processor. On a heterogeneous
+    system, this could be racy (just like getcpu()). The process could be
+    migrated to another CPU by the time we use the register value. Hence,
+    there is no guarantee that the value reflects the processor that it is
+    currently executing on.
+
+The list of supported registers and the attributes of individual
+feature bits are listed in Section 4. Unless absolutely necessary,
+we discourage the addition of new feature registers to the list.
+In any case, additions should comply with the requirements listed above.
+
+3. Implementation
+--------------------
+
+The infrastructure is built on the emulation of the 'MRS' instruction.
+Accessing a restricted system register from an application generates an
+exception and ends up in SIGILL being delivered to the process.
+The infrastructure hooks into the exception handler and emulates the
+operation if the source belongs to the supported system register space.
+
+The infrastructure emulates only the following system register space:
+	Op0=3, Op1=0, CRn=0
+
+(See Table C5-6 'System instruction encodings for System register accesses'
+ in ARMv8 ARM, for the list of registers).
+
+
+The following rules are applied to the value returned by the infrastructure:
+
+ a) The value of an 'IMPLEMENTATION DEFINED' field is set to 0.
+ b) The value of a reserved field is set to the reserved value (as
+    defined by the architecture).
+ c) The value of a field marked as not 'visible' is set to indicate
+    the feature is missing (as defined by the architecture).
+ d) The value of a 'visible' field holds the system wide safe value
+    for the particular feature (except for MIDR_EL1, see Section 4).
+
+There are only a few registers visible to userspace. See Section 4
+for the list of 'visible' registers.
+
+The registers which are either reserved RAZ or IMPLEMENTATION DEFINED are
+emulated as 0.
+
+All others are emulated as having 'invisible' features.
+
+4. List of exposed registers
+-----------------------------
+
+  1) ID_AA64ISAR0_EL1 - Instruction Set Attribute Register 0
+     x--------------------------------------------------x
+     | Name                         |  bits   | visible |
+     |--------------------------------------------------|
+     | RAZ                          | [63-20] |    n    |
+     |--------------------------------------------------|
+     | CRC32                        | [19-16] |    y    |
+     |--------------------------------------------------|
+     | SHA2                         | [15-12] |    y    |
+     |--------------------------------------------------|
+     | SHA1                         | [11-8]  |    y    |
+     |--------------------------------------------------|
+     | AES                          | [7-4]   |    y    |
+     |--------------------------------------------------|
+     | RAZ                          | [3-0]   |    n    |
+     x--------------------------------------------------x
+
+  2) ID_AA64ISAR1_EL1 - Instruction Set Attribute Register 1
+     x--------------------------------------------------x
+     | Name                         |  bits   | visible |
+     |--------------------------------------------------|
+     | RAZ                          | [63-0]  |    y    |
+     x--------------------------------------------------x
+
+  3) ID_AA64PFR0_EL1 - Processor Feature Register 0
+     x--------------------------------------------------x
+     | Name                         |  bits   | visible |
+     |--------------------------------------------------|
+     | RAZ                          | [63-28] |    n    |
+     |--------------------------------------------------|
+     | GIC                          | [27-24] |    n    |
+     |--------------------------------------------------|
+     | AdvSIMD                      | [23-20] |    y    |
+     |--------------------------------------------------|
+     | FP                           | [19-16] |    y    |
+     |--------------------------------------------------|
+     | EL3                          | [15-12] |    n    |
+     |--------------------------------------------------|
+     | EL2                          | [11-8]  |    n    |
+     |--------------------------------------------------|
+     | EL1                          | [7-4]   |    n    |
+     |--------------------------------------------------|
+     | EL0                          | [3-0]   |    n    |
+     x--------------------------------------------------x
+
+  4) ID_AA64PFR1_EL1 - Processor Feature Register 1
+     x--------------------------------------------------x
+     | Name                         |  bits   | visible |
+     |--------------------------------------------------|
+     | RAZ                          | [63-0]  |    y    |
+     x--------------------------------------------------x
+
+  5) MIDR_EL1 - Main ID Register
+     x--------------------------------------------------x
+     | Name                         |  bits   | visible |
+     |--------------------------------------------------|
+     | RAZ                          | [63-32] |    n    |
+     |--------------------------------------------------|
+     | Implementer                  | [31-24] |    y    |
+     |--------------------------------------------------|
+     | Variant                      | [23-20] |    y    |
+     |--------------------------------------------------|
+     | Architecture                 | [19-16] |    y    |
+     |--------------------------------------------------|
+     | PartNum                      | [15-4]  |    y    |
+     |--------------------------------------------------|
+     | Revision                     | [3-0]   |    y    |
+     x--------------------------------------------------x
+
+   NOTE: The 'visible' fields of MIDR_EL1 will contain the value
+   as available on the CPU where it is fetched and is not a system
+   wide safe value.
-- 
1.7.9.5



* [RFC PATCH 02/10] arm64: Make the CPU information more clear
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 01/10] arm64: feature registers: Documentation Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 03/10] arm64: Delay ELF HWCAP initialisation until all CPUs are up Suzuki K. Poulose
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

At early boot, we print the CPU version/revision. On a heterogeneous
system, we could have different types of CPUs, so print the CPU info for
all active CPUs.

Also, remove the redundant 'revision' information, which doesn't
make any sense without the 'variant' field.
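
For readers decoding the [%08x] value printed above, here is a minimal
sketch that splits a MIDR_EL1 value into the fields listed in the Patch 1
documentation (Implementer [31-24], Variant [23-20], Architecture [19-16],
PartNum [15-4], Revision [3-0]). The struct and function names are
hypothetical and exist only for this example; the kernel has its own
MIDR_* accessor macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the decoded MIDR_EL1 fields. */
struct midr_fields {
	unsigned int implementer;	/* bits [31-24] */
	unsigned int variant;		/* bits [23-20] */
	unsigned int architecture;	/* bits [19-16] */
	unsigned int partnum;		/* bits [15-4]  */
	unsigned int revision;		/* bits [3-0]   */
};

/* Split a raw MIDR_EL1 value into its architectural fields. */
static struct midr_fields midr_decode(uint32_t midr)
{
	struct midr_fields f;

	f.implementer  = (midr >> 24) & 0xff;
	f.variant      = (midr >> 20) & 0xf;
	f.architecture = (midr >> 16) & 0xf;
	f.partnum      = (midr >> 4)  & 0xfff;
	f.revision     = midr & 0xf;

	/* Sanity check: every field must fit its architectural width. */
	assert(f.implementer <= 0xff && f.partnum <= 0xfff);
	return f;
}
```

The variant and revision are conventionally read together as rNpM
(variant N, revision M), which is why printing the revision alone says
little. For example, a raw value of 0x410fd034 decodes to implementer
0x41, part 0xd03, variant 0 and revision 4, i.e. r0p4.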

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/setup.c |    3 +--
 arch/arm64/kernel/smp.c   |    3 ++-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index f3067d4..a30cf1d 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -229,8 +229,7 @@ static void __init setup_processor(void)
 	u32 cwg;
 	int cls;
 
-	printk("CPU: AArch64 Processor [%08x] revision %d\n",
-	       read_cpuid_id(), read_cpuid_id() & 15);
+	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
 
 	sprintf(init_utsname()->machine, ELF_PLATFORM);
 	elf_hwcap = 0;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 50fb469..a121c67 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -144,7 +144,8 @@ asmlinkage void secondary_start_kernel(void)
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
 
 	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
-	printk("CPU%u: Booted secondary processor\n", cpu);
+	pr_info("CPU%u: Booted secondary processor [%08x]\n",
+					 cpu, read_cpuid_id());
 
 	/*
 	 * TTBR0 is only used for the identity mapping at this stage. Make it
-- 
1.7.9.5



* [RFC PATCH 03/10] arm64: Delay ELF HWCAP initialisation until all CPUs are up
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 01/10] arm64: feature registers: Documentation Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 02/10] arm64: Make the CPU information more clear Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 04/10] arm64: Consolidate cpuinfo handling Suzuki K. Poulose
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

Delay the ELF HWCAP initialisation until all the CPUs are up.
This is in preparation for detecting the common features across
the CPUs and creating a consistent ELF HWCAP for the system.

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/setup.c |   19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index a30cf1d..2a36d27 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -103,6 +103,8 @@ static struct resource mem_res[] = {
 	}
 };
 
+static void setup_processor_features(void);
+
 #define kernel_code mem_res[0]
 #define kernel_data mem_res[1]
 
@@ -212,6 +214,7 @@ static void __init hyp_mode_check(void)
 
 void __init do_post_cpus_up_work(void)
 {
+	setup_processor_features();
 	hyp_mode_check();
 	apply_alternatives_all();
 }
@@ -223,19 +226,12 @@ void __init up_late_init(void)
 }
 #endif /* CONFIG_UP_LATE_INIT */
 
-static void __init setup_processor(void)
+static void __init setup_processor_features(void)
 {
 	u64 features, block;
 	u32 cwg;
 	int cls;
 
-	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
-
-	sprintf(init_utsname()->machine, ELF_PLATFORM);
-	elf_hwcap = 0;
-
-	cpuinfo_store_boot_cpu();
-
 	/*
 	 * Check for sane CTR_EL0.CWG value.
 	 */
@@ -312,6 +308,13 @@ static void __init setup_processor(void)
 #endif
 }
 
+static void __init setup_processor(void)
+{
+	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
+	sprintf(init_utsname()->machine, ELF_PLATFORM);
+	cpuinfo_store_boot_cpu();
+}
+
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
 {
 	void *dt_virt = fixmap_remap_fdt(dt_phys);
-- 
1.7.9.5



* [RFC PATCH 04/10] arm64: Consolidate cpuinfo handling
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (2 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 03/10] arm64: Delay ELF HWCAP initialisation until all CPUs are up Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 05/10] arm64: Keep track of CPU feature registers Suzuki K. Poulose
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

This patch rearranges the code slightly to consolidate
the /proc/cpuinfo handling code into arch/arm64/kernel/cpuinfo.c.

No functional changes.

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpu.h |    1 +
 arch/arm64/kernel/cpuinfo.c  |  208 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c    |  207 -----------------------------------------
 3 files changed, 209 insertions(+), 207 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 8e797b2..a34de72 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -62,5 +62,6 @@ DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
 
 void cpuinfo_store_cpu(void);
 void __init cpuinfo_store_boot_cpu(void);
+void __init setup_processor_features(void);
 
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 75d5a86..a13468b 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -26,6 +26,9 @@
 #include <linux/kernel.h>
 #include <linux/preempt.h>
 #include <linux/printk.h>
+#include <linux/personality.h>
+#include <linux/seq_file.h>
+#include <linux/sched.h>
 #include <linux/smp.h>
 
 /*
@@ -254,3 +257,208 @@ void __init cpuinfo_store_boot_cpu(void)
 
 	boot_cpu_data = *info;
 }
+
+void __init setup_processor_features(void)
+{
+	u64 features, block;
+	u32 cwg;
+	int cls;
+
+	/*
+	 * Check for sane CTR_EL0.CWG value.
+	 */
+	cwg = cache_type_cwg();
+	cls = cache_line_size();
+	if (!cwg)
+		pr_warn("No Cache Writeback Granule information, assuming cache line size %d\n",
+			cls);
+	if (L1_CACHE_BYTES < cls)
+		pr_warn("L1_CACHE_BYTES smaller than the Cache Writeback Granule (%d < %d)\n",
+			L1_CACHE_BYTES, cls);
+
+	/*
+	 * ID_AA64ISAR0_EL1 contains 4-bit wide signed feature blocks.
+	 * The blocks we test below represent incremental functionality
+	 * for non-negative values. Negative values are reserved.
+	 */
+	features = read_cpuid(ID_AA64ISAR0_EL1);
+	block = (features >> 4) & 0xf;
+	if (!(block & 0x8)) {
+		switch (block) {
+		default:
+		case 2:
+			elf_hwcap |= HWCAP_PMULL;
+		case 1:
+			elf_hwcap |= HWCAP_AES;
+		case 0:
+			break;
+		}
+	}
+
+	block = (features >> 8) & 0xf;
+	if (block && !(block & 0x8))
+		elf_hwcap |= HWCAP_SHA1;
+
+	block = (features >> 12) & 0xf;
+	if (block && !(block & 0x8))
+		elf_hwcap |= HWCAP_SHA2;
+
+	block = (features >> 16) & 0xf;
+	if (block && !(block & 0x8))
+		elf_hwcap |= HWCAP_CRC32;
+
+#ifdef CONFIG_COMPAT
+	/*
+	 * ID_ISAR5_EL1 carries similar information as above, but pertaining to
+	 * the Aarch32 32-bit execution state.
+	 */
+	features = read_cpuid(ID_ISAR5_EL1);
+	block = (features >> 4) & 0xf;
+	if (!(block & 0x8)) {
+		switch (block) {
+		default:
+		case 2:
+			compat_elf_hwcap2 |= COMPAT_HWCAP2_PMULL;
+		case 1:
+			compat_elf_hwcap2 |= COMPAT_HWCAP2_AES;
+		case 0:
+			break;
+		}
+	}
+
+	block = (features >> 8) & 0xf;
+	if (block && !(block & 0x8))
+		compat_elf_hwcap2 |= COMPAT_HWCAP2_SHA1;
+
+	block = (features >> 12) & 0xf;
+	if (block && !(block & 0x8))
+		compat_elf_hwcap2 |= COMPAT_HWCAP2_SHA2;
+
+	block = (features >> 16) & 0xf;
+	if (block && !(block & 0x8))
+		compat_elf_hwcap2 |= COMPAT_HWCAP2_CRC32;
+#endif
+}
+
+static const char *hwcap_str[] = {
+	"fp",
+	"asimd",
+	"evtstrm",
+	"aes",
+	"pmull",
+	"sha1",
+	"sha2",
+	"crc32",
+	NULL
+};
+
+#ifdef CONFIG_COMPAT
+static const char *compat_hwcap_str[] = {
+	"swp",
+	"half",
+	"thumb",
+	"26bit",
+	"fastmult",
+	"fpa",
+	"vfp",
+	"edsp",
+	"java",
+	"iwmmxt",
+	"crunch",
+	"thumbee",
+	"neon",
+	"vfpv3",
+	"vfpv3d16",
+	"tls",
+	"vfpv4",
+	"idiva",
+	"idivt",
+	"vfpd32",
+	"lpae",
+	"evtstrm"
+};
+
+static const char *compat_hwcap2_str[] = {
+	"aes",
+	"pmull",
+	"sha1",
+	"sha2",
+	"crc32",
+	NULL
+};
+#endif /* CONFIG_COMPAT */
+
+static int c_show(struct seq_file *m, void *v)
+{
+	int i, j;
+
+	for_each_online_cpu(i) {
+		struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i);
+		u32 midr = cpuinfo->reg_midr;
+
+		/*
+		 * glibc reads /proc/cpuinfo to determine the number of
+		 * online processors, looking for lines beginning with
+		 * "processor".  Give glibc what it expects.
+		 */
+#ifdef CONFIG_SMP
+		seq_printf(m, "processor\t: %d\n", i);
+#endif
+
+		/*
+		 * Dump out the common processor features in a single line.
+		 * Userspace should read the hwcaps with getauxval(AT_HWCAP)
+		 * rather than attempting to parse this, but there's a body of
+		 * software which does already (at least for 32-bit).
+		 */
+		seq_puts(m, "Features\t:");
+		if (personality(current->personality) == PER_LINUX32) {
+#ifdef CONFIG_COMPAT
+			for (j = 0; compat_hwcap_str[j]; j++)
+				if (compat_elf_hwcap & (1 << j))
+					seq_printf(m, " %s", compat_hwcap_str[j]);
+
+			for (j = 0; compat_hwcap2_str[j]; j++)
+				if (compat_elf_hwcap2 & (1 << j))
+					seq_printf(m, " %s", compat_hwcap2_str[j]);
+#endif /* CONFIG_COMPAT */
+		} else {
+			for (j = 0; hwcap_str[j]; j++)
+				if (elf_hwcap & (1 << j))
+					seq_printf(m, " %s", hwcap_str[j]);
+		}
+		seq_puts(m, "\n");
+
+		seq_printf(m, "CPU implementer\t: 0x%02x\n",
+			   MIDR_IMPLEMENTOR(midr));
+		seq_printf(m, "CPU architecture: 8\n");
+		seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr));
+		seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr));
+		seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));
+	}
+
+	return 0;
+}
+
+static void *c_start(struct seq_file *m, loff_t *pos)
+{
+	return *pos < 1 ? (void *)1 : NULL;
+}
+
+static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	++*pos;
+	return NULL;
+}
+
+static void c_stop(struct seq_file *m, void *v)
+{
+}
+
+const struct seq_operations cpuinfo_op = {
+	.start	= c_start,
+	.next	= c_next,
+	.stop	= c_stop,
+	.show	= c_show
+};
+
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 2a36d27..239e478 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -28,7 +28,6 @@
 #include <linux/console.h>
 #include <linux/cache.h>
 #include <linux/bootmem.h>
-#include <linux/seq_file.h>
 #include <linux/screen_info.h>
 #include <linux/init.h>
 #include <linux/kexec.h>
@@ -45,7 +44,6 @@
 #include <linux/of_fdt.h>
 #include <linux/of_platform.h>
 #include <linux/efi.h>
-#include <linux/personality.h>
 
 #include <asm/acpi.h>
 #include <asm/fixmap.h>
@@ -103,8 +101,6 @@ static struct resource mem_res[] = {
 	}
 };
 
-static void setup_processor_features(void);
-
 #define kernel_code mem_res[0]
 #define kernel_data mem_res[1]
 
@@ -226,88 +222,6 @@ void __init up_late_init(void)
 }
 #endif /* CONFIG_UP_LATE_INIT */
 
-static void __init setup_processor_features(void)
-{
-	u64 features, block;
-	u32 cwg;
-	int cls;
-
-	/*
-	 * Check for sane CTR_EL0.CWG value.
-	 */
-	cwg = cache_type_cwg();
-	cls = cache_line_size();
-	if (!cwg)
-		pr_warn("No Cache Writeback Granule information, assuming cache line size %d\n",
-			cls);
-	if (L1_CACHE_BYTES < cls)
-		pr_warn("L1_CACHE_BYTES smaller than the Cache Writeback Granule (%d < %d)\n",
-			L1_CACHE_BYTES, cls);
-
-	/*
-	 * ID_AA64ISAR0_EL1 contains 4-bit wide signed feature blocks.
-	 * The blocks we test below represent incremental functionality
-	 * for non-negative values. Negative values are reserved.
-	 */
-	features = read_cpuid(ID_AA64ISAR0_EL1);
-	block = (features >> 4) & 0xf;
-	if (!(block & 0x8)) {
-		switch (block) {
-		default:
-		case 2:
-			elf_hwcap |= HWCAP_PMULL;
-		case 1:
-			elf_hwcap |= HWCAP_AES;
-		case 0:
-			break;
-		}
-	}
-
-	block = (features >> 8) & 0xf;
-	if (block && !(block & 0x8))
-		elf_hwcap |= HWCAP_SHA1;
-
-	block = (features >> 12) & 0xf;
-	if (block && !(block & 0x8))
-		elf_hwcap |= HWCAP_SHA2;
-
-	block = (features >> 16) & 0xf;
-	if (block && !(block & 0x8))
-		elf_hwcap |= HWCAP_CRC32;
-
-#ifdef CONFIG_COMPAT
-	/*
-	 * ID_ISAR5_EL1 carries similar information as above, but pertaining to
-	 * the Aarch32 32-bit execution state.
-	 */
-	features = read_cpuid(ID_ISAR5_EL1);
-	block = (features >> 4) & 0xf;
-	if (!(block & 0x8)) {
-		switch (block) {
-		default:
-		case 2:
-			compat_elf_hwcap2 |= COMPAT_HWCAP2_PMULL;
-		case 1:
-			compat_elf_hwcap2 |= COMPAT_HWCAP2_AES;
-		case 0:
-			break;
-		}
-	}
-
-	block = (features >> 8) & 0xf;
-	if (block && !(block & 0x8))
-		compat_elf_hwcap2 |= COMPAT_HWCAP2_SHA1;
-
-	block = (features >> 12) & 0xf;
-	if (block && !(block & 0x8))
-		compat_elf_hwcap2 |= COMPAT_HWCAP2_SHA2;
-
-	block = (features >> 16) & 0xf;
-	if (block && !(block & 0x8))
-		compat_elf_hwcap2 |= COMPAT_HWCAP2_CRC32;
-#endif
-}
-
 static void __init setup_processor(void)
 {
 	pr_info("Boot CPU: AArch64 Processor [%08x]\n", read_cpuid_id());
@@ -449,124 +363,3 @@ static int __init topology_init(void)
 }
 subsys_initcall(topology_init);
 
-static const char *hwcap_str[] = {
-	"fp",
-	"asimd",
-	"evtstrm",
-	"aes",
-	"pmull",
-	"sha1",
-	"sha2",
-	"crc32",
-	NULL
-};
-
-#ifdef CONFIG_COMPAT
-static const char *compat_hwcap_str[] = {
-	"swp",
-	"half",
-	"thumb",
-	"26bit",
-	"fastmult",
-	"fpa",
-	"vfp",
-	"edsp",
-	"java",
-	"iwmmxt",
-	"crunch",
-	"thumbee",
-	"neon",
-	"vfpv3",
-	"vfpv3d16",
-	"tls",
-	"vfpv4",
-	"idiva",
-	"idivt",
-	"vfpd32",
-	"lpae",
-	"evtstrm"
-};
-
-static const char *compat_hwcap2_str[] = {
-	"aes",
-	"pmull",
-	"sha1",
-	"sha2",
-	"crc32",
-	NULL
-};
-#endif /* CONFIG_COMPAT */
-
-static int c_show(struct seq_file *m, void *v)
-{
-	int i, j;
-
-	for_each_online_cpu(i) {
-		struct cpuinfo_arm64 *cpuinfo = &per_cpu(cpu_data, i);
-		u32 midr = cpuinfo->reg_midr;
-
-		/*
-		 * glibc reads /proc/cpuinfo to determine the number of
-		 * online processors, looking for lines beginning with
-		 * "processor".  Give glibc what it expects.
-		 */
-#ifdef CONFIG_SMP
-		seq_printf(m, "processor\t: %d\n", i);
-#endif
-
-		/*
-		 * Dump out the common processor features in a single line.
-		 * Userspace should read the hwcaps with getauxval(AT_HWCAP)
-		 * rather than attempting to parse this, but there's a body of
-		 * software which does already (at least for 32-bit).
-		 */
-		seq_puts(m, "Features\t:");
-		if (personality(current->personality) == PER_LINUX32) {
-#ifdef CONFIG_COMPAT
-			for (j = 0; compat_hwcap_str[j]; j++)
-				if (compat_elf_hwcap & (1 << j))
-					seq_printf(m, " %s", compat_hwcap_str[j]);
-
-			for (j = 0; compat_hwcap2_str[j]; j++)
-				if (compat_elf_hwcap2 & (1 << j))
-					seq_printf(m, " %s", compat_hwcap2_str[j]);
-#endif /* CONFIG_COMPAT */
-		} else {
-			for (j = 0; hwcap_str[j]; j++)
-				if (elf_hwcap & (1 << j))
-					seq_printf(m, " %s", hwcap_str[j]);
-		}
-		seq_puts(m, "\n");
-
-		seq_printf(m, "CPU implementer\t: 0x%02x\n",
-			   MIDR_IMPLEMENTOR(midr));
-		seq_printf(m, "CPU architecture: 8\n");
-		seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr));
-		seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr));
-		seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));
-	}
-
-	return 0;
-}
-
-static void *c_start(struct seq_file *m, loff_t *pos)
-{
-	return *pos < 1 ? (void *)1 : NULL;
-}
-
-static void *c_next(struct seq_file *m, void *v, loff_t *pos)
-{
-	++*pos;
-	return NULL;
-}
-
-static void c_stop(struct seq_file *m, void *v)
-{
-}
-
-const struct seq_operations cpuinfo_op = {
-	.start	= c_start,
-	.next	= c_next,
-	.stop	= c_stop,
-	.show	= c_show
-};
-- 
1.7.9.5



* [RFC PATCH 05/10] arm64: Keep track of CPU feature registers
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (3 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 04/10] arm64: Consolidate cpuinfo handling Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-08-05 14:58   ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 06/10] arm64: Add helper to decode register from instruction Suzuki K. Poulose
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

This patch adds infrastructure to keep track of the CPU feature
registers on the system. It also consolidates the cpuinfo
SANITY checks, which ensure that we don't have conflicting feature
support across the CPUs.

Each register has a set of feature bits defined by the architecture.
We define the following attributes:

 1) strict - Whether the field must match exactly across all the
    CPUs for the SANITY checks.
 2) visible - Whether the field is exposed to userspace (see the
    documentation for more details).
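
As a rough illustration of how these two attributes turn into per-register
masks, here is a minimal userspace sketch. The names struct ftr_bits,
struct ftr_masks, derive_masks and example_fields are hypothetical stand-ins
and not part of the patch; only the zero-mask terminator and the
visible/strict semantics are taken from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's struct arm64_ftr_bits. */
struct ftr_bits {
	bool visible;	/* exposed to userspace? */
	bool strict;	/* must match across all CPUs? */
	uint8_t shift;	/* position of the field's lowest bit */
	uint64_t mask;	/* unshifted field mask; 0 terminates the table */
};

struct ftr_masks {
	uint64_t user_mask;	/* bits of all visible fields */
	uint64_t strict_mask;	/* bits that must match across CPUs */
};

/* Walk a zero-mask-terminated table and accumulate both masks. */
static struct ftr_masks derive_masks(const struct ftr_bits *f)
{
	struct ftr_masks m = { .user_mask = 0, .strict_mask = ~UINT64_C(0) };

	assert(f != NULL);
	for (; f->mask; f++) {
		uint64_t field = f->mask << f->shift;

		if (f->visible)
			m.user_mask |= field;
		if (!f->strict)
			m.strict_mask &= ~field;
	}
	return m;
}

/* Made-up layout for illustration only, not a real register. */
static const struct ftr_bits example_fields[] = {
	{ .visible = true,  .strict = true,  .shift = 4,  .mask = 0xf },
	{ .visible = false, .strict = false, .shift = 12, .mask = 0xf },
	{ .visible = false, .strict = true,  .shift = 0,  .mask = 0xf },
	{ .mask = 0 },
};
```

With this table, derive_masks(example_fields) marks only bits [7:4] as
user visible and excludes only bits [15:12] from the strict comparison.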

A default 'safe' value is also defined for each feature. It is
used:
 1) To set the value for a 'discrete' feature with conflicting values.
 2) To set the value for an 'invisible' feature for the userspace.
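
A minimal sketch of how a single field could be merged when a newly
checked CPU disagrees with the value recorded so far, assuming the three
field types used by this patch. The function name merge_field and its
signature are hypothetical; the patch itself splits this between
update_cpu_ftr_reg() and arm64_ftr_safe_value().

```c
#include <assert.h>
#include <stdint.h>

enum ftr_type { FTR_DISCRETE, FTR_SCALAR_MIN, FTR_SCALAR_MAX };

/*
 * Return the system-wide value of one field after seeing another CPU.
 * 'cur' is the value recorded so far, 'new_val' the value read on the
 * CPU being checked, 'safe_val' the field's default safe value.
 */
static uint64_t merge_field(enum ftr_type type, uint64_t safe_val,
			    uint64_t cur, uint64_t new_val)
{
	if (cur == new_val)
		return cur;	/* no conflict, nothing to resolve */

	switch (type) {
	case FTR_DISCRETE:
		return safe_val;	/* conflicting discrete values */
	case FTR_SCALAR_MIN:
		return new_val < cur ? new_val : cur;
	case FTR_SCALAR_MAX:
		return new_val > cur ? new_val : cur;
	}

	assert(!"unknown feature type");
	return safe_val;
}
```

For example, a discrete field seen as 1 on one CPU and 2 on another
collapses to its safe value, while a SCALAR_MIN field keeps the smaller
of the two.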

The infrastructure keeps track of the following values for a feature
register:
 - user visible value
 - system wide safe value
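
The two tracked values are combined into the view handed to userspace by
the MRS emulation patch in this series (user_val | (sys_val & user_mask)).
A minimal sketch of that combination, with struct ftr_reg_view and
user_view as hypothetical names:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of the patch's struct arm64_ftr_reg. */
struct ftr_reg_view {
	uint64_t user_mask;	/* bits of the visible fields */
	uint64_t sys_val;	/* system-wide safe value */
	uint64_t user_val;	/* safe values of the hidden fields */
};

/* Visible fields come from sys_val, hidden ones from user_val. */
static uint64_t user_view(const struct ftr_reg_view *r)
{
	/* user_val is only expected to carry bits of hidden fields */
	assert((r->user_val & r->user_mask) == 0);
	return r->user_val | (r->sys_val & r->user_mask);
}
```

With user_mask = 0xf0, sys_val = 0x1234 and user_val = 0xf00, the
resulting view is 0xf30: bits [7:4] are taken from the system value and
everything else from the hidden-field defaults.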

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpu.h |  149 ++++++++++++++++
 arch/arm64/kernel/cpuinfo.c  |  399 ++++++++++++++++++++++++++++++++++++++----
 2 files changed, 511 insertions(+), 37 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index a34de72..c7b0b89 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -16,10 +16,154 @@
 #ifndef __ASM_CPU_H
 #define __ASM_CPU_H
 
+#include <asm/sysreg.h>
 #include <linux/cpu.h>
 #include <linux/init.h>
 #include <linux/percpu.h>
 
+
+#define SYS_REG(op0, op1, crn, crm, op2) \
+			(sys_reg(op0, op1, crn, crm, op2) >> 5)
+
+#define SYS_ID_PFR0_EL1			SYS_REG(3, 0, 0, 1, 0)
+#define SYS_ID_PFR1_EL1			SYS_REG(3, 0, 0, 1, 1)
+#define SYS_ID_DFR0_EL1			SYS_REG(3, 0, 0, 1, 2)
+#define SYS_ID_MMFR0_EL1		SYS_REG(3, 0, 0, 1, 4)
+#define SYS_ID_MMFR1_EL1		SYS_REG(3, 0, 0, 1, 5)
+#define SYS_ID_MMFR2_EL1		SYS_REG(3, 0, 0, 1, 6)
+#define SYS_ID_MMFR3_EL1		SYS_REG(3, 0, 0, 1, 7)
+
+#define SYS_ID_ISAR0_EL1		SYS_REG(3, 0, 0, 2, 0)
+#define SYS_ID_ISAR1_EL1		SYS_REG(3, 0, 0, 2, 1)
+#define SYS_ID_ISAR2_EL1		SYS_REG(3, 0, 0, 2, 2)
+#define SYS_ID_ISAR3_EL1		SYS_REG(3, 0, 0, 2, 3)
+#define SYS_ID_ISAR4_EL1		SYS_REG(3, 0, 0, 2, 4)
+#define SYS_ID_ISAR5_EL1		SYS_REG(3, 0, 0, 2, 5)
+#define SYS_ID_MMFR4_EL1		SYS_REG(3, 0, 0, 2, 6)
+
+#define SYS_MVFR0_EL1			SYS_REG(3, 0, 0, 3, 0)
+#define SYS_MVFR1_EL1			SYS_REG(3, 0, 0, 3, 1)
+#define SYS_MVFR2_EL1			SYS_REG(3, 0, 0, 3, 2)
+
+#define SYS_ID_AA64PFR0_EL1		SYS_REG(3, 0, 0, 4, 0)
+#define SYS_ID_AA64PFR1_EL1		SYS_REG(3, 0, 0, 4, 1)
+
+#define SYS_ID_AA64DFR0_EL1		SYS_REG(3, 0, 0, 5, 0)
+#define SYS_ID_AA64DFR1_EL1		SYS_REG(3, 0, 0, 5, 1)
+
+#define SYS_ID_AA64ISAR0_EL1		SYS_REG(3, 0, 0, 6, 0)
+#define SYS_ID_AA64ISAR1_EL1		SYS_REG(3, 0, 0, 6, 1)
+
+#define SYS_ID_AA64MMFR0_EL1		SYS_REG(3, 0, 0, 7, 0)
+#define SYS_ID_AA64MMFR1_EL1		SYS_REG(3, 0, 0, 7, 1)
+
+#define SYS_CNTFRQ_EL0			SYS_REG(3, 3, 14, 0, 0)
+#define SYS_CTR_EL0			SYS_REG(3, 3, 0, 0, 1)
+#define SYS_DCZID_EL0			SYS_REG(3, 3, 0, 0, 7)
+
+enum sys_id {
+	sys_cntfrq = SYS_CNTFRQ_EL0,
+	sys_ctr = SYS_CTR_EL0,
+	sys_dczid = SYS_DCZID_EL0,
+
+	sys_id_aa64dfr0 = SYS_ID_AA64DFR0_EL1,
+	sys_id_aa64dfr1 = SYS_ID_AA64DFR1_EL1,
+
+	sys_id_aa64isar0 = SYS_ID_AA64ISAR0_EL1,
+	sys_id_aa64isar1 = SYS_ID_AA64ISAR1_EL1,
+
+	sys_id_aa64mmfr0 = SYS_ID_AA64MMFR0_EL1,
+	sys_id_aa64mmfr1 = SYS_ID_AA64MMFR1_EL1,
+
+	sys_id_aa64pfr0 = SYS_ID_AA64PFR0_EL1,
+	sys_id_aa64pfr1 = SYS_ID_AA64PFR1_EL1,
+
+	sys_id_dfr0 = SYS_ID_DFR0_EL1,
+
+	sys_id_isar0 = SYS_ID_ISAR0_EL1,
+	sys_id_isar1 = SYS_ID_ISAR1_EL1,
+	sys_id_isar2 = SYS_ID_ISAR2_EL1,
+	sys_id_isar3 = SYS_ID_ISAR3_EL1,
+	sys_id_isar4 = SYS_ID_ISAR4_EL1,
+	sys_id_isar5 = SYS_ID_ISAR5_EL1,
+
+	sys_id_mmfr0 = SYS_ID_MMFR0_EL1,
+	sys_id_mmfr1 = SYS_ID_MMFR1_EL1,
+	sys_id_mmfr2 = SYS_ID_MMFR2_EL1,
+	sys_id_mmfr3 = SYS_ID_MMFR3_EL1,
+	sys_id_mmfr4 = SYS_ID_MMFR4_EL1,
+
+	sys_id_pfr0 = SYS_ID_PFR0_EL1,
+	sys_id_pfr1 = SYS_ID_PFR1_EL1,
+
+	sys_mvfr0 = SYS_MVFR0_EL1,
+	sys_mvfr1 = SYS_MVFR1_EL1,
+	sys_mvfr2 = SYS_MVFR2_EL1,
+};
+
+enum ftr_type {
+	FTR_DISCRETE,
+	FTR_SCALAR_MIN,
+	FTR_SCALAR_MAX,
+};
+
+struct arm64_ftr_bits {
+	bool		visible;	/* visible to userspace ? */
+	bool		strict;		/* CPU Sanity check
+					 *  strict matching required ? */
+	enum ftr_type	type;
+	u8		shift;
+	u64		mask;
+	u64		safe_val;	/* user visible safe value */
+};
+
+/*
+ * @arm64_ftr_reg - Feature register
+ * @user_mask 		User visible bits of val. Rest of them are
+ *			marked as 'not supported' as held in @user_val.
+ * @strict_mask 	Bits which should match across all CPUs for sanity.
+ * @sys_val		Safe value across the CPUs (system view)
+ * @user_val		'Invisible' fields of the sysreg filled with
+ *			 respective 'unsupported' value.
+ */
+struct arm64_ftr_reg {
+	enum sys_id		sys_id;
+	const char*		name;
+	u64			user_mask;
+	u64			strict_mask;
+	u64			sys_val;
+	u64			user_val;
+	struct arm64_ftr_bits*	ftr_bits;
+};
+
+#define FTR_STRICT	true
+#define FTR_NONSTRICT	false
+#define FTR_VISIBLE	true
+#define FTR_HIDDEN	false
+
+#define ARM64_FTR_BITS(ftr_visible, ftr_strict, ftr_type, ftr_shift, ftr_mask, ftr_safe_val)	\
+	{							\
+		.visible = ftr_visible,				\
+		.strict = ftr_strict,				\
+		.type = ftr_type,				\
+		.shift = ftr_shift,				\
+		.mask = ftr_mask,				\
+		.safe_val = ftr_safe_val,			\
+	}
+
+
+#define ARM64_FTR_END					\
+	{						\
+		.mask = 0,				\
+	}
+
+#define ARM64_FTR_REG(id, ftr_table)			\
+	{						\
+		.sys_id = sys_ ## id,			\
+		.name = #id,				\
+		.ftr_bits = &((ftr_table)[0]),		\
+	}
+
 /*
  * Records attributes of an individual CPU.
  */
@@ -64,4 +208,9 @@ void cpuinfo_store_cpu(void);
 void __init cpuinfo_store_boot_cpu(void);
 void __init setup_processor_features(void);
 
+static inline u64 arm64_ftr_value(struct arm64_ftr_bits *ftrp, u64 reg)
+{
+	return ((reg >> ftrp->shift) & ftrp->mask);
+}
+
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index a13468b..ae2a37f 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -31,6 +31,207 @@
 #include <linux/sched.h>
 #include <linux/smp.h>
 
+struct arm64_ftr_bits ftr_id_aa64isar0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 32, 0xffffffffUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 20, 0xfffUL, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_DISCRETE, 16, 0xfUL, 0),	// crc32
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_DISCRETE, 12, 0xfUL, 0),	// sha2
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),	// sha1
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_SCALAR_MIN, 4, 0xfUL, 0),	// aes
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),	// RAZ
+	ARM64_FTR_END,
+};
+
+#define id_aa64pfr0_simd_not_implemented	0xf
+#define id_aa64pfr0_fp_not_implemented		0xf
+#define id_aa64pfr0_ELx_64bit_only 		0x1
+struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 32, 0xffffffffUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 28, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 24, 0xfUL, 0),			// GIC
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_DISCRETE, 20, 0xfUL, id_aa64pfr0_simd_not_implemented), 	// simd
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_DISCRETE, 16, 0xfUL, id_aa64pfr0_fp_not_implemented),	// fp
+	/* Linux doesn't care about the EL3 */
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_DISCRETE, 12, 0xfUL, 0),			// EL3
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),			// EL2
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, id_aa64pfr0_ELx_64bit_only), // EL1
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, id_aa64pfr0_ELx_64bit_only), // EL0
+	ARM64_FTR_END,
+};
+
+#define id_aa64mmfr0_TGran4k_not_implemented		0xf
+#define id_aa64mmfr0_TGran64k_not_implemented		0xf
+#define id_aa64mmfr0_TGran16k_not_implemented		0x0
+struct arm64_ftr_bits ftr_id_aa64mmfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 32, 0xffffffffUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 28, 0xfUL, id_aa64mmfr0_TGran4k_not_implemented),	// TGran4
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 24, 0xfUL, id_aa64mmfr0_TGran64k_not_implemented),	// TGran64
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 20, 0xfUL, id_aa64mmfr0_TGran16k_not_implemented),	// TGran16
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 16, 0xfUL, 0),		// BigEndEL0
+	/* Linux shouldn't care about secure memory */
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_DISCRETE, 12, 0xfUL, 0),		// SNSMem
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),		// BigEndEL
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),		// ASID
+	/*
+	 * Differing PARange is fine as long as all peripherals and memory are mapped
+	 * within the minimum PARange of all CPUs
+	 */
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_SCALAR_MIN, 0, 0xfUL, 0),		// PARange
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_ctr[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 31, 0x1UL, 1),	// RAO
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 28, 0x7UL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MAX, 24, 0xfUL, 0),	// CWG
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 20, 0xfUL, 0),	// ERG
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 16, 0xfUL, 1),	// DminLine
+	/*
+	 * Linux can handle differing I-cache policies. Userspace JITs will
+	 * make use of *minLine
+	 */
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_DISCRETE, 14, 0x3UL, 0),	// L1Ip
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0x3ffUL, 0),	// RAZ
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 0, 0xfUL, 0),	// IminLine
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_id_mmfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 28, 0xfUL, 0),	// InnerShr
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 24, 0xfUL, 0),	// FCSE
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_SCALAR_MIN, 20, 0xfUL, 0),	// AuxReg
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 16, 0xfUL, 0),	// TCM
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 12, 0xfUL, 0),	// ShareLvl
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),	// OuterShr
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),	// PMSA
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),	// VMSA
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_id_aa64dfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 32, 0xffffffffUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 28, 0xfUL, 0),	// CTX_CMPs
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 20, 0xfUL, 0),	// WRPs
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 12, 0xfUL, 0),	// BRPs
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),	// PMUVer
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),	// TraceVer
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0x6),	// DebugVer
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_mvfr2[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xffffffUL, 0),	// RAZ
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),	// FPMisc
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),	// SIMDMisc
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_dczid[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 5, 0x7ffffffUL, 0),// RAZ
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0x1UL, 1),	// DZP
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_SCALAR_MIN, 0, 0xf, 0),	// BS
+	ARM64_FTR_END,
+};
+
+
+struct arm64_ftr_bits ftr_id_isar5[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 20, 0xfffUL, 0),	// RAZ
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 16, 0xfUL, 0),	// CRC32
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 12, 0xfUL, 0),	// SHA2
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),	// SHA1
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),	// AES
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),	// SEVL
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_id_mmfr4[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xffffffUL, 0),	// RAZ
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),	// ac2
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),	// RAZ
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_id_pfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 16, 0xffffUL, 0),	// RAZ
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 12, 0xfUL, 0),	// State3
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),	// State2
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),	// State1
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),	// State0
+	ARM64_FTR_END,
+};
+
+/*
+ * Common ftr bits for a 32bit register with all hidden, strict
+ * attributes, with 4bit feature fields and a default safe value of
+ * 0. Covers the following 32bit registers:
+ * id_isar[0-4], id_mmfr[1-3], id_pfr1, mvfr[0-1]
+ */
+struct arm64_ftr_bits ftr_generic_discrete_32bit[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 28, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 24, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 20, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 16, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 12, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 8, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 4, 0xfUL, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xfUL, 0),
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_generic[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, ~0x0ULL, 0),
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_generic32[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_DISCRETE, 0, 0xffffffffUL, 0),
+	ARM64_FTR_END,
+};
+
+struct arm64_ftr_bits ftr_aa64raz[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_DISCRETE, 0, ~0x0ULL, 0),
+	ARM64_FTR_END,
+};
+
+static struct arm64_ftr_reg arm64_regs[] = {
+
+	ARM64_FTR_REG(id_aa64isar0, ftr_id_aa64isar0),
+	ARM64_FTR_REG(id_aa64pfr0, ftr_id_aa64pfr0),
+	ARM64_FTR_REG(id_aa64pfr1, ftr_aa64raz),
+	ARM64_FTR_REG(id_aa64isar1, ftr_aa64raz),
+
+	ARM64_FTR_REG(id_aa64mmfr0, ftr_id_aa64mmfr0),
+	ARM64_FTR_REG(id_aa64dfr0, ftr_id_aa64dfr0),
+	ARM64_FTR_REG(id_aa64dfr1, ftr_generic),
+	ARM64_FTR_REG(id_aa64mmfr1, ftr_generic),
+	ARM64_FTR_REG(ctr, ftr_ctr),
+
+	ARM64_FTR_REG(dczid, ftr_dczid),
+	ARM64_FTR_REG(cntfrq, ftr_generic32),
+
+	ARM64_FTR_REG(id_dfr0, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_isar0, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_isar1, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_isar2, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_isar3, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_isar4, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_isar5, ftr_id_isar5),
+
+	ARM64_FTR_REG(id_mmfr0, ftr_id_mmfr0),
+	ARM64_FTR_REG(id_mmfr1, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_mmfr2, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_mmfr3, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(id_mmfr4, ftr_id_mmfr4),
+
+	ARM64_FTR_REG(id_pfr0, ftr_id_pfr0),
+	ARM64_FTR_REG(id_pfr1, ftr_generic_discrete_32bit),
+
+	ARM64_FTR_REG(mvfr0, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(mvfr1, ftr_generic_discrete_32bit),
+	ARM64_FTR_REG(mvfr2, ftr_mvfr2),
+
+};
+
 /*
  * In case the boot CPU is hotpluggable, we record its initial state and
  * current state separately. Certain system registers may contain different
@@ -92,22 +293,146 @@ static void update_cpu_features(struct cpuinfo_arm64 *info)
 	update_mixed_endian_el0_support(info);
 }
 
-static int check_reg_mask(char *name, u64 mask, u64 boot, u64 cur, int cpu)
+static inline int arm64_boot_cpuinfo_initialised(void)
+{
+	return (boot_cpu_data.reg_midr != 0);
+}
+
+static struct arm64_ftr_reg* get_arm64_sys_reg(enum sys_id sys_id)
+{
+	int i;
+
+	for(i = 0; i < ARRAY_SIZE(arm64_regs); i ++)
+		if (arm64_regs[i].sys_id == sys_id)
+			return &arm64_regs[i];
+	return NULL;
+}
+
+static u64 arm64_ftr_set_value(struct arm64_ftr_bits *ftrp, u64 reg, u64 ftr_val)
+{
+	u64 mask = ftrp->mask << ftrp->shift;
+
+	reg &= ~mask;
+	reg |= (ftr_val << ftrp->shift) & mask;
+	return reg;
+}
+
+/*
+ * Initialise the CPU feature register from Boot CPU values.
+ * Also initialises the user_mask & strict_mask for the register.
+ */
+static void init_cpu_ftr_reg(struct arm64_ftr_reg *reg,  u64 new)
+{
+	struct arm64_ftr_bits *ftrp = reg->ftr_bits;
+	u64 user_mask = 0, strict_mask = ~0x0ULL;
+
+	for(; ftrp->mask; ftrp++) {
+		u64 ftr_new = arm64_ftr_value(ftrp, new);
+
+		reg->sys_val = arm64_ftr_set_value(ftrp, reg->sys_val, ftr_new);
+		if (ftrp->visible)
+			user_mask |= (ftrp->mask << ftrp->shift);
+		else
+			reg->user_val = arm64_ftr_set_value(ftrp, reg->user_val,
+								ftrp->safe_val);
+		if (!ftrp->strict)
+			strict_mask &= ~(ftrp->mask << ftrp->shift);
+	}
+	reg->user_mask = user_mask;
+	reg->strict_mask = strict_mask;
+}
+
+#define INIT_FTR_REG(info, id) \
+		init_cpu_ftr_reg(get_arm64_sys_reg(sys_ ##id), (u64)info->reg_ ##id)
+
+static void init_cpu_ftrs(struct cpuinfo_arm64 *info)
 {
-	if ((boot & mask) == (cur & mask))
+	INIT_FTR_REG(info, ctr);
+	INIT_FTR_REG(info, dczid);
+	INIT_FTR_REG(info, cntfrq);
+	INIT_FTR_REG(info, id_aa64dfr0);
+	INIT_FTR_REG(info, id_aa64dfr1);
+	INIT_FTR_REG(info, id_aa64isar0);
+	INIT_FTR_REG(info, id_aa64isar1);
+	INIT_FTR_REG(info, id_aa64mmfr0);
+	INIT_FTR_REG(info, id_aa64mmfr1);
+	INIT_FTR_REG(info, id_aa64pfr0);
+	INIT_FTR_REG(info, id_aa64pfr1);
+	INIT_FTR_REG(info, id_dfr0);
+	INIT_FTR_REG(info, id_isar0);
+	INIT_FTR_REG(info, id_isar1);
+	INIT_FTR_REG(info, id_isar2);
+	INIT_FTR_REG(info, id_isar3);
+	INIT_FTR_REG(info, id_isar4);
+	INIT_FTR_REG(info, id_isar5);
+	INIT_FTR_REG(info, id_mmfr0);
+	INIT_FTR_REG(info, id_mmfr1);
+	INIT_FTR_REG(info, id_mmfr2);
+	INIT_FTR_REG(info, id_mmfr3);
+	INIT_FTR_REG(info, id_pfr0);
+	INIT_FTR_REG(info, id_pfr1);
+	INIT_FTR_REG(info, mvfr0);
+	INIT_FTR_REG(info, mvfr1);
+	INIT_FTR_REG(info, mvfr2);
+}
+
+static u64 arm64_ftr_safe_value(struct arm64_ftr_bits *ftrp, u64 new, u64 cur)
+{
+	switch(ftrp->type) {
+	case FTR_DISCRETE:
+		return ftrp->safe_val;
+	case FTR_SCALAR_MIN:
+		return new < cur ? new : cur;
+	case FTR_SCALAR_MAX:
+		return new > cur ? new : cur;
+	}
+
+	BUG();
+	return 0;
+}
+
+static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new, int cpu)
+{
+	struct arm64_ftr_bits *ftrp = reg->ftr_bits;
+
+	for(; ftrp->mask; ftrp++) {
+
+		u64 ftr_cur = arm64_ftr_value(ftrp, reg->sys_val);
+		u64 ftr_new = arm64_ftr_value(ftrp, new);
+
+		if (ftr_cur == ftr_new)
+			continue;
+		/* Find a safe value */
+		ftr_new = arm64_ftr_safe_value(ftrp, ftr_new, ftr_cur);
+		reg->sys_val = arm64_ftr_set_value(ftrp, reg->sys_val, ftr_new);
+	}
+
+}
+
+static int check_reg_mask(struct arm64_ftr_reg *reg, u64 boot, u64 cur, int cpu)
+{
+
+	if ((boot & reg->strict_mask) == (cur & reg->strict_mask))
 		return 0;
 
 	pr_warn("SANITY CHECK: Unexpected variation in %s. Boot CPU: %#016lx, CPU%d: %#016lx\n",
-		name, (unsigned long)boot, cpu, (unsigned long)cur);
+		reg->name, (unsigned long)boot, cpu, (unsigned long)cur);
 
 	return 1;
 }
 
-#define CHECK_MASK(field, mask, boot, cur, cpu) \
-	check_reg_mask(#field, mask, (boot)->reg_ ## field, (cur)->reg_ ## field, cpu)
-
-#define CHECK(field, boot, cur, cpu) \
-	CHECK_MASK(field, ~0ULL, boot, cur, cpu)
+#define CHECK_CPUINFO(field)			 					\
+	({										\
+		int __rc = 0;								\
+		struct arm64_ftr_reg *__regp = get_arm64_sys_reg(sys_ ## field);	\
+		if (__regp) {								\
+			__rc = check_reg_mask(__regp,					\
+						(boot)->reg_ ## field,			\
+						(cur)->reg_ ## field, cpu);		\
+			update_cpu_ftr_reg(__regp, cur->reg_ ## field, cpu);		\
+		}									\
+		__rc;									\
+	})
 
 /*
  * Verify that CPUs don't have unexpected differences that will cause problems.
@@ -123,17 +448,17 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
 	 * caches should look identical. Userspace JITs will make use of
 	 * *minLine.
 	 */
-	diff |= CHECK_MASK(ctr, 0xffff3fff, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(ctr);
 
 	/*
 	 * Userspace may perform DC ZVA instructions. Mismatched block sizes
 	 * could result in too much or too little memory being zeroed if a
 	 * process is preempted and migrated between CPUs.
 	 */
-	diff |= CHECK(dczid, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(dczid);
 
 	/* If different, timekeeping will be broken (especially with KVM) */
-	diff |= CHECK(cntfrq, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(cntfrq);
 
 	/*
 	 * The kernel uses self-hosted debug features and expects CPUs to
@@ -141,15 +466,15 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
 	 * and BRPs to be identical.
 	 * ID_AA64DFR1 is currently RES0.
 	 */
-	diff |= CHECK(id_aa64dfr0, boot, cur, cpu);
-	diff |= CHECK(id_aa64dfr1, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(id_aa64dfr0);
+	diff |= CHECK_CPUINFO(id_aa64dfr1);
 
 	/*
 	 * Even in big.LITTLE, processors should be identical instruction-set
 	 * wise.
 	 */
-	diff |= CHECK(id_aa64isar0, boot, cur, cpu);
-	diff |= CHECK(id_aa64isar1, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(id_aa64isar0);
+	diff |= CHECK_CPUINFO(id_aa64isar1);
 
 	/*
 	 * Differing PARange support is fine as long as all peripherals and
@@ -157,42 +482,42 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
 	 * Linux should not care about secure memory.
 	 * ID_AA64MMFR1 is currently RES0.
 	 */
-	diff |= CHECK_MASK(id_aa64mmfr0, 0xffffffffffff0ff0, boot, cur, cpu);
-	diff |= CHECK(id_aa64mmfr1, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(id_aa64mmfr0);
+	diff |= CHECK_CPUINFO(id_aa64mmfr1);
 
 	/*
 	 * EL3 is not our concern.
 	 * ID_AA64PFR1 is currently RES0.
 	 */
-	diff |= CHECK_MASK(id_aa64pfr0, 0xffffffffffff0fff, boot, cur, cpu);
-	diff |= CHECK(id_aa64pfr1, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(id_aa64pfr0);
+	diff |= CHECK_CPUINFO(id_aa64pfr1);
 
 	/*
 	 * If we have AArch32, we care about 32-bit features for compat. These
 	 * registers should be RES0 otherwise.
 	 */
-	diff |= CHECK(id_dfr0, boot, cur, cpu);
-	diff |= CHECK(id_isar0, boot, cur, cpu);
-	diff |= CHECK(id_isar1, boot, cur, cpu);
-	diff |= CHECK(id_isar2, boot, cur, cpu);
-	diff |= CHECK(id_isar3, boot, cur, cpu);
-	diff |= CHECK(id_isar4, boot, cur, cpu);
-	diff |= CHECK(id_isar5, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(id_dfr0);
+	diff |= CHECK_CPUINFO(id_isar0);
+	diff |= CHECK_CPUINFO(id_isar1);
+	diff |= CHECK_CPUINFO(id_isar2);
+	diff |= CHECK_CPUINFO(id_isar3);
+	diff |= CHECK_CPUINFO(id_isar4);
+	diff |= CHECK_CPUINFO(id_isar5);
 	/*
 	 * Regardless of the value of the AuxReg field, the AIFSR, ADFSR, and
 	 * ACTLR formats could differ across CPUs and therefore would have to
 	 * be trapped for virtualization anyway.
 	 */
-	diff |= CHECK_MASK(id_mmfr0, 0xff0fffff, boot, cur, cpu);
-	diff |= CHECK(id_mmfr1, boot, cur, cpu);
-	diff |= CHECK(id_mmfr2, boot, cur, cpu);
-	diff |= CHECK(id_mmfr3, boot, cur, cpu);
-	diff |= CHECK(id_pfr0, boot, cur, cpu);
-	diff |= CHECK(id_pfr1, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(id_mmfr0);
+	diff |= CHECK_CPUINFO(id_mmfr1);
+	diff |= CHECK_CPUINFO(id_mmfr2);
+	diff |= CHECK_CPUINFO(id_mmfr3);
+	diff |= CHECK_CPUINFO(id_pfr0);
+	diff |= CHECK_CPUINFO(id_pfr1);
 
-	diff |= CHECK(mvfr0, boot, cur, cpu);
-	diff |= CHECK(mvfr1, boot, cur, cpu);
-	diff |= CHECK(mvfr2, boot, cur, cpu);
+	diff |= CHECK_CPUINFO(mvfr0);
+	diff |= CHECK_CPUINFO(mvfr1);
+	diff |= CHECK_CPUINFO(mvfr2);
 
 	/*
 	 * Mismatched CPU features are a recipe for disaster. Don't even
@@ -239,7 +564,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 	cpuinfo_detect_icache_policy(info);
 
 	check_local_cpu_errata();
-	check_local_cpu_features();
+	cpuinfo_sanity_check(info);
 	update_cpu_features(info);
 }
 
@@ -247,13 +572,13 @@ void cpuinfo_store_cpu(void)
 {
 	struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data);
 	__cpuinfo_store_cpu(info);
-	cpuinfo_sanity_check(info);
 }
 
 void __init cpuinfo_store_boot_cpu(void)
 {
 	struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
 	__cpuinfo_store_cpu(info);
+	init_cpu_ftrs(info);
 
 	boot_cpu_data = *info;
 }
-- 
1.7.9.5



* [RFC PATCH 06/10] arm64: Add helper to decode register from instruction
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (4 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 05/10] arm64: Keep track of CPU feature registers Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 07/10] arm64: Expose feature registers by emulating MRS Suzuki K. Poulose
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

Add a helper to extract the register field from a given
instruction.
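
For illustration, a standalone sketch of the same decoding outside the
kernel. The enum reg_type and decode_register names are hypothetical; the
bit positions (Rt/Rd at bit 0, Rn at bit 5, Rt2/Ra at bit 10, Rm at
bit 16, five bits each) are the ones used by the helper in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for enum aarch64_insn_register_type. */
enum reg_type { REG_RT, REG_RD, REG_RN, REG_RT2, REG_RA, REG_RM };

/* Extract a 5-bit register number from an A64 instruction word. */
static uint32_t decode_register(enum reg_type type, uint32_t insn)
{
	unsigned int shift;

	switch (type) {
	case REG_RT:
	case REG_RD:
		shift = 0;
		break;
	case REG_RN:
		shift = 5;
		break;
	case REG_RT2:
	case REG_RA:
		shift = 10;
		break;
	case REG_RM:
		shift = 16;
		break;
	default:
		assert(!"unknown register type");
		return 0;
	}

	return (insn >> shift) & 0x1f;
}
```

For 'add x0, x1, x2' (0x8b020020) this yields Rd = 0, Rn = 1 and Rm = 2;
for 'mrs x3, midr_el1' (0xd5380003) it yields Rt = 3.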

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/insn.h |    2 ++
 arch/arm64/kernel/insn.c      |   29 +++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 30e50eb..6dea3bc 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -289,6 +289,8 @@ enum aarch64_insn_encoding_class aarch64_get_insn_class(u32 insn);
 u64 aarch64_insn_decode_immediate(enum aarch64_insn_imm_type type, u32 insn);
 u32 aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
 				  u32 insn, u64 imm);
+u32 aarch64_insn_decode_register(enum aarch64_insn_register_type type,
+					 u32 insn);
 u32 aarch64_insn_gen_branch_imm(unsigned long pc, unsigned long addr,
 				enum aarch64_insn_branch_type type);
 u32 aarch64_insn_gen_comp_branch_imm(unsigned long pc, unsigned long addr,
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index dd9671c..d44aa04 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -389,6 +389,35 @@ u32 __kprobes aarch64_insn_encode_immediate(enum aarch64_insn_imm_type type,
 	return insn;
 }
 
+u32 aarch64_insn_decode_register(enum aarch64_insn_register_type type,
+					u32 insn)
+{
+	int shift;
+
+	switch (type) {
+	case AARCH64_INSN_REGTYPE_RT:
+	case AARCH64_INSN_REGTYPE_RD:
+		shift = 0;
+		break;
+	case AARCH64_INSN_REGTYPE_RN:
+		shift = 5;
+		break;
+	case AARCH64_INSN_REGTYPE_RT2:
+	case AARCH64_INSN_REGTYPE_RA:
+		shift = 10;
+		break;
+	case AARCH64_INSN_REGTYPE_RM:
+		shift = 16;
+		break;
+	default:
+		pr_err("%s: unknown register type encoding %d\n", __func__,
+		       type);
+		return 0;
+	}
+
+	return (insn >> shift) & GENMASK(4, 0);
+}
+
 static u32 aarch64_insn_encode_register(enum aarch64_insn_register_type type,
 					u32 insn,
 					enum aarch64_insn_register reg)
-- 
1.7.9.5



* [RFC PATCH 07/10] arm64: Expose feature registers by emulating MRS
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (5 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 06/10] arm64: Add helper to decode register from instruction Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 08/10] arm64: Emulate ID registers Suzuki K. Poulose
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

This patch adds the hook for emulating the MRS instruction to
export the 'user visible' value of supported system registers.
We emulate only the following id space for system registers:
	Op0=3, Op1=0, CRn=0.

All other MRS accesses fall back to SIGILL.
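
A userspace sketch of the decoding and filtering described above, under
the assumption that the 16-bit field in bits [20:5] of the instruction is
laid out as Op0:Op1:CRn:CRm:Op2, matching the SYSREG_* accessors added by
this patch. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same mask/value pair as the undef hook: MRS with Op0 = 2 or 3. */
static bool is_mrs(uint32_t insn)
{
	return (insn & 0xfff00000u) == 0xd5300000u;
}

/* Bits [20:5] of an MRS instruction identify the system register. */
static uint32_t sysreg_id(uint32_t insn)
{
	assert(is_mrs(insn));
	return (insn >> 5) & 0xffffu;
}

static unsigned int id_op0(uint32_t id) { return (id >> 14) & 0x3; }
static unsigned int id_op1(uint32_t id) { return (id >> 11) & 0x7; }
static unsigned int id_crn(uint32_t id) { return (id >> 7) & 0xf; }
static unsigned int id_crm(uint32_t id) { return (id >> 3) & 0xf; }
static unsigned int id_op2(uint32_t id) { return id & 0x7; }

/* Mirrors the filter in the patch: Op0 = 3, Op1 = 0, CRn = 0. */
static bool in_emulated_space(uint32_t id)
{
	if (id_op0(id) != 3 || id_op1(id) != 0 || id_crn(id) != 0)
		return false;
	if (id_crm(id) == 0) {
		unsigned int op2 = id_op2(id);

		/* MIDR_EL1, MPIDR_EL1 and REVIDR_EL1 only */
		return op2 == 0 || op2 == 5 || op2 == 6;
	}
	return true;
}
```

For instance, 'mrs x0, id_aa64isar0_el1' (0xd5380600) decodes to
Op0=3, Op1=0, CRn=0, CRm=6, Op2=0 and is accepted, whereas
'mrs x0, cntvct_el0' (0xd53be040) has Op1=3, CRn=14 and is rejected.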

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpu.h |    6 ++++
 arch/arm64/kernel/cpuinfo.c  |   82 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 88 insertions(+)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index c7b0b89..2df3d81 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -61,6 +61,12 @@
 #define SYS_CTR_EL0			SYS_REG(3, 3, 0, 0, 1)
 #define SYS_DCZID_EL0			SYS_REG(3, 3, 0, 0, 7)
 
+#define SYSREG_Op0(id)		(((id) >> 14) & 0x3)
+#define SYSREG_Op1(id)		(((id) >> 11) & 0x7)
+#define SYSREG_CRn(id)		(((id) >> 7) & 0xf)
+#define SYSREG_CRm(id)		(((id) >> 3) & 0xf)
+#define SYSREG_Op2(id)		(((id) >> 0) & 0x7)
+
 enum sys_id {
 	sys_cntfrq = SYS_CNTFRQ_EL0,
 	sys_ctr = SYS_CTR_EL0,
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index ae2a37f..36e5058 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -19,6 +19,8 @@
 #include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/cpufeature.h>
+#include <asm/insn.h>
+#include <asm/traps.h>
 
 #include <linux/bitops.h>
 #include <linux/bug.h>
@@ -787,3 +789,83 @@ const struct seq_operations cpuinfo_op = {
 	.show	= c_show
 };
 
+/*
+ * We emulate only the following system register space.
+ * 	Op0 = 0x3, CRn = 0x0, Op1 = 0x0
+ * Further, at the moment,  with CRm = 0, Op2 should be one of :
+ *	0(MIDR_EL1)
+ *	5(MPIDR_EL1),
+ *  	6(REVIDR_EL1)
+ * See Table C5-6 System instruction encodings for System register accesses,
+ * ARMv8 ARM(ARM DDI 0487A.f) for more details.
+ */
+static int is_emulated(u32 id)
+{
+	if (SYSREG_Op0(id) != 0x3 ||
+	    SYSREG_CRn(id) != 0x0 ||
+	    SYSREG_Op1(id) != 0x0)
+		return 0;
+	if (SYSREG_CRm(id) == 0) {
+		switch(SYSREG_Op2(id)) {
+		default:
+			return 0;
+		case 0:
+		case 5:
+		case 6:
+			return 1;
+		}
+	}
+	return 1;
+}
+
+static int emulate_sys_reg(u32 id, u64 *valp)
+{
+	struct arm64_ftr_reg *regp;
+
+	if (!is_emulated(id))
+		return -EINVAL;
+
+	regp = get_arm64_sys_reg(id);
+	if (regp)
+		*valp = regp->user_val | (regp->sys_val & regp->user_mask);
+	else {
+		/*
+		 * Registers we don't track are either IMPLEMENTATION DEFINED
+		 * (e.g, ID_AFR0_EL1) or reserved RAZ.
+		 */
+		*valp = 0;
+	}
+	return 0;
+}
+
+static int emulate_mrs(struct pt_regs *regs, u32 insn)
+{
+	int rc = 0;
+	u32 sys_reg, dst;
+	u64 val = 0;
+
+	sys_reg = (u32)aarch64_insn_decode_immediate(AARCH64_INSN_IMM_16, insn);
+	rc = emulate_sys_reg(sys_reg, &val);
+	if (rc)
+		return rc;
+	dst = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT ,insn);
+	regs->user_regs.regs[dst] = val;
+	regs->pc += 4;
+	return 0;
+}
+
+static struct undef_hook mrs_hook = {
+	.instr_mask = 0xfff00000,
+	.instr_val  = 0xd5300000,
+	.pstate_mask = COMPAT_PSR_MODE_MASK,
+	.pstate_val = PSR_MODE_EL0t,
+	.fn = emulate_mrs,
+};
+
+int __init arm64_cpuinfo_init(void)
+{
+	register_undef_hook(&mrs_hook);
+	return 0;
+}
+
+late_initcall(arm64_cpuinfo_init);
-- 
1.7.9.5



* [RFC PATCH 08/10] arm64: Emulate ID registers
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (6 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 07/10] arm64: Expose feature registers by emulating MRS Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 09/10] arm64: Read system wide CPUID value Suzuki K. Poulose
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

This patch adds emulation for the ID registers (i.e. Op0=3,
Op1=0, CRn=0, CRm=0).

Expose MIDR_EL1 of the CPU on which the 'mrs' instruction is
executed. Users should be aware that, on a heterogeneous system,
there is no guarantee that the value read still describes the CPU
the task is running on, as the task could be migrated to another
CPU in between.

MPIDR_EL1 and REVIDR_EL1 are not visible and hence return their safe
default values as per the rules.

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpu.h |    7 +++++++
 arch/arm64/kernel/cpuinfo.c  |   20 ++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 2df3d81..cb25cb9 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -25,6 +25,10 @@
 #define SYS_REG(op0, op1, crn, crm, op2) \
 			(sys_reg(op0, op1, crn, crm, op2) >> 5)
 
+#define SYS_MIDR_EL1			SYS_REG(3, 0, 0, 0, 0)
+#define SYS_MPIDR_EL1			SYS_REG(3, 0, 0, 0, 5)
+#define SYS_REVIDR_EL1			SYS_REG(3, 0, 0, 0, 6)
+
 #define SYS_ID_PFR0_EL1			SYS_REG(3, 0, 0, 1, 0)
 #define SYS_ID_PFR1_EL1			SYS_REG(3, 0, 0, 1, 1)
 #define SYS_ID_DFR0_EL1			SYS_REG(3, 0, 0, 1, 2)
@@ -67,6 +71,9 @@
 #define SYSREG_CRm(id)		(((id) >> 3) & 0xf)
 #define SYSREG_Op2(id)		(((id) >> 0) & 0x7)
 
+/* Safe value for MPIDR_EL1: Bit31:RES1, Bit30:U:0, Bit24:MT:1 */
+#define SYS_MPIDR_SAFE_VAL	((1UL<<31)|(1UL<<24))
+
 enum sys_id {
 	sys_cntfrq = SYS_CNTFRQ_EL0,
 	sys_ctr = SYS_CTR_EL0,
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 36e5058..9145eef 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -818,6 +818,23 @@ static int is_emulated(u32 id)
 	return 1;
 }
 
+static int emulate_id_reg(u32 id, u64 *valp)
+{
+	switch (id) {
+	case SYS_MIDR_EL1:
+		*valp = read_cpuid_id();
+		return 0;
+	case SYS_MPIDR_EL1:
+		*valp = SYS_MPIDR_SAFE_VAL;
+		return 0;
+	case SYS_REVIDR_EL1:
+		*valp = 0;
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 static int emulate_sys_reg(u32 id, u64 *valp)
 {
 	struct arm64_ftr_reg *regp;
@@ -825,6 +842,9 @@ static int emulate_sys_reg(u32 id, u64 *valp)
 	if (!is_emulated(id))
 		return -EINVAL;
 
+	if (SYSREG_CRm(id) == 0)
+		return emulate_id_reg(id, valp);
+
 	regp = get_arm64_sys_reg(id);
 	if (regp)
 		*valp = regp->user_val | (regp->sys_val & regp->user_mask);
-- 
1.7.9.5
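
As an illustration of what userspace might do with the emulated
MIDR_EL1 value, the sketch below splits it into its architectural
fields. It is not part of the patch; struct midr_fields and
midr_decode() are hypothetical names, and the field positions are the
ones given for MIDR_EL1 in the ARMv8 ARM.

```c
#include <stdint.h>

/* MIDR_EL1 fields (bit positions as per the ARMv8 ARM). */
struct midr_fields {
	unsigned int implementer;	/* bits [31:24] */
	unsigned int variant;		/* bits [23:20] */
	unsigned int architecture;	/* bits [19:16] */
	unsigned int partnum;		/* bits [15:4]  */
	unsigned int revision;		/* bits [3:0]   */
};

/* Split a raw MIDR_EL1 value into its fields. */
static struct midr_fields midr_decode(uint64_t midr)
{
	struct midr_fields f;

	f.implementer  = (unsigned int)((midr >> 24) & 0xff);
	f.variant      = (unsigned int)((midr >> 20) & 0xf);
	f.architecture = (unsigned int)((midr >> 16) & 0xf);
	f.partnum      = (unsigned int)((midr >> 4) & 0xfff);
	f.revision     = (unsigned int)(midr & 0xf);
	return f;
}
```

As noted in the commit message, the decoded value only describes the
CPU that executed the 'mrs'; the task may have moved since.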



* [RFC PATCH 09/10] arm64: Read system wide CPUID value
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (7 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 08/10] arm64: Emulate ID registers Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` [RFC PATCH 10/10] arm64: Use system-wide safe value of CPU feature register Suzuki K. Poulose
  2015-07-24  9:43 ` sample: arm64 cpu feature: Test program Suzuki K. Poulose
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

Add an API for reading the system-wide safe value of a CPUID register.

Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpu.h |    2 ++
 arch/arm64/kernel/cpuinfo.c  |    9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index cb25cb9..17a871d 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -177,6 +177,7 @@ struct arm64_ftr_reg {
 		.ftr_bits = &((ftr_table)[0]),		\
 	}
 
+#define read_system_cpuid(id)	read_system_reg(SYS_##id)
 /*
  * Records attributes of an individual CPU.
  */
@@ -226,4 +227,5 @@ static inline u64 arm64_ftr_value(struct arm64_ftr_bits *ftrp, u64 reg)
 	return ((reg >> ftrp->shift) & ftrp->mask);
 }
 
+u64 read_system_reg(enum sys_id id);
 #endif /* __ASM_CPU_H */
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 9145eef..7d140f7 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -310,6 +310,15 @@ static struct arm64_ftr_reg* get_arm64_sys_reg(enum sys_id sys_id)
 	return NULL;
 }
 
+u64 read_system_reg(enum sys_id id)
+{
+	struct arm64_ftr_reg *regp = get_arm64_sys_reg(id);
+
+	/* We shouldn't get a request for an unsupported register */
+	BUG_ON(!regp);
+	return regp->sys_val;
+}
+
 static u64 arm64_ftr_set_value(struct arm64_ftr_bits *ftrp, u64 reg, u64 ftr_val)
 {
 	u64 mask = ftrp->mask << ftrp->shift;
-- 
1.7.9.5



* [RFC PATCH 10/10] arm64: Use system-wide safe value of CPU feature register
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (8 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 09/10] arm64: Read system wide CPUID value Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  2015-07-24  9:43 ` sample: arm64 cpu feature: Test program Suzuki K. Poulose
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose, Marc Zyngier

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

Now that we track the system-wide safe value of a CPU feature,
make use of that whenever possible, including ELF_HWCAPS.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/cpuinfo.c        |   18 +++---------------
 arch/arm64/kernel/debug-monitors.c |    6 ++++--
 arch/arm64/kernel/fpsimd.c         |    5 +++--
 arch/arm64/kernel/hw_breakpoint.c  |    5 +++--
 arch/arm64/kvm/reset.c             |    3 ++-
 arch/arm64/kvm/sys_regs.c          |    5 +++--
 6 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 7d140f7..678e7f6 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -241,7 +241,6 @@ static struct arm64_ftr_reg arm64_regs[] = {
  */
 DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data);
 static struct cpuinfo_arm64 boot_cpu_data;
-static bool mixed_endian_el0 = true;
 
 static char *icache_policy_str[] = {
 	[ICACHE_POLICY_RESERVED] = "RESERVED/UNKNOWN",
@@ -282,17 +281,7 @@ bool cpu_supports_mixed_endian_el0(void)
 
 bool system_supports_mixed_endian_el0(void)
 {
-	return mixed_endian_el0;
-}
-
-static void update_mixed_endian_el0_support(struct cpuinfo_arm64 *info)
-{
-	mixed_endian_el0 &= id_aa64mmfr0_mixed_endian_el0(info->reg_id_aa64mmfr0);
-}
-
-static void update_cpu_features(struct cpuinfo_arm64 *info)
-{
-	update_mixed_endian_el0_support(info);
+	return id_aa64mmfr0_mixed_endian_el0(read_system_cpuid(ID_AA64MMFR0_EL1));
 }
 
 static inline int arm64_boot_cpuinfo_initialised(void)
@@ -576,7 +565,6 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 
 	check_local_cpu_errata();
 	cpuinfo_sanity_check(info);
-	update_cpu_features(info);
 }
 
 void cpuinfo_store_cpu(void)
@@ -617,7 +605,7 @@ void __init setup_processor_features(void)
 	 * The blocks we test below represent incremental functionality
 	 * for non-negative values. Negative values are reserved.
 	 */
-	features = read_cpuid(ID_AA64ISAR0_EL1);
+	features = read_system_cpuid(ID_AA64ISAR0_EL1);
 	block = (features >> 4) & 0xf;
 	if (!(block & 0x8)) {
 		switch (block) {
@@ -648,7 +636,7 @@ void __init setup_processor_features(void)
 	 * ID_ISAR5_EL1 carries similar information as above, but pertaining to
 	 * the Aarch32 32-bit execution state.
 	 */
-	features = read_cpuid(ID_ISAR5_EL1);
+	features = read_system_cpuid(ID_ISAR5_EL1);
 	block = (features >> 4) & 0xf;
 	if (!(block & 0x8)) {
 		switch (block) {
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
index b056369..c21de2b 100644
--- a/arch/arm64/kernel/debug-monitors.c
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -26,14 +26,16 @@
 #include <linux/stat.h>
 #include <linux/uaccess.h>
 
-#include <asm/debug-monitors.h>
+
 #include <asm/cputype.h>
+#include <asm/cpu.h>
+#include <asm/debug-monitors.h>
 #include <asm/system_misc.h>
 
 /* Determine debug architecture. */
 u8 debug_monitors_arch(void)
 {
-	return read_cpuid(ID_AA64DFR0_EL1) & 0xf;
+	return read_system_cpuid(ID_AA64DFR0_EL1) & 0xf;
 }
 
 /*
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 44d6f75..d0961c5 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -25,8 +25,9 @@
 #include <linux/signal.h>
 #include <linux/hardirq.h>
 
-#include <asm/fpsimd.h>
+#include <asm/cpu.h>
 #include <asm/cputype.h>
+#include <asm/fpsimd.h>
 
 #define FPEXC_IOF	(1 << 0)
 #define FPEXC_DZF	(1 << 1)
@@ -331,7 +332,7 @@ static inline void fpsimd_hotplug_init(void) { }
  */
 static int __init fpsimd_init(void)
 {
-	u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
+	u64 pfr = read_system_cpuid(ID_AA64PFR0_EL1);
 
 	if (pfr & (0xf << 16)) {
 		pr_notice("Floating-point is not implemented\n");
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 7a1a5da..2f05e3d 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -28,6 +28,7 @@
 #include <linux/ptrace.h>
 #include <linux/smp.h>
 
+#include <asm/cpu.h>
 #include <asm/current.h>
 #include <asm/debug-monitors.h>
 #include <asm/hw_breakpoint.h>
@@ -51,13 +52,13 @@ static int core_num_wrps;
 /* Determine number of BRP registers available. */
 static int get_num_brps(void)
 {
-	return ((read_cpuid(ID_AA64DFR0_EL1) >> 12) & 0xf) + 1;
+	return ((read_system_cpuid(ID_AA64DFR0_EL1) >> 12) & 0xf) + 1;
 }
 
 /* Determine number of WRP registers available. */
 static int get_num_wrps(void)
 {
-	return ((read_cpuid(ID_AA64DFR0_EL1) >> 20) & 0xf) + 1;
+	return ((read_system_cpuid(ID_AA64DFR0_EL1) >> 20) & 0xf) + 1;
 }
 
 int hw_breakpoint_slots(int type)
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 0b43265..92abdc0 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -25,6 +25,7 @@
 
 #include <kvm/arm_arch_timer.h>
 
+#include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/ptrace.h>
 #include <asm/kvm_arm.h>
@@ -52,7 +53,7 @@ static bool cpu_has_32bit_el1(void)
 {
 	u64 pfr0;
 
-	pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+	pfr0 = read_system_cpuid(ID_AA64PFR0_EL1);
 	return !!(pfr0 & 0x20);
 }
 
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c370b40..80ad47d 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -25,6 +25,7 @@
 #include <linux/uaccess.h>
 
 #include <asm/cacheflush.h>
+#include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/debug-monitors.h>
 #include <asm/esr.h>
@@ -490,8 +491,8 @@ static bool trap_dbgidr(struct kvm_vcpu *vcpu,
 	if (p->is_write) {
 		return ignore_write(vcpu, p);
 	} else {
-		u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
-		u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
+		u64 dfr = read_system_cpuid(ID_AA64DFR0_EL1);
+		u64 pfr = read_system_cpuid(ID_AA64PFR0_EL1);
 		u32 el3 = !!((pfr >> 12) & 0xf);
 
 		*vcpu_reg(vcpu, p->Rt) = ((((dfr >> 20) & 0xf) << 28) |
-- 
1.7.9.5



* sample: arm64 cpu feature: Test program
  2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
                   ` (9 preceding siblings ...)
  2015-07-24  9:43 ` [RFC PATCH 10/10] arm64: Use system-wide safe value of CPU feature register Suzuki K. Poulose
@ 2015-07-24  9:43 ` Suzuki K. Poulose
  10 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-07-24  9:43 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: catalin.marinas, will.deacon, mark.rutland, edward.nevill, aph,
	linux-kernel, Suzuki K. Poulose

From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>

Attached is a sample program that fetches all the emulated
cpu feature registers and traps on accessing an invalid
id register.


---

/*
 * show_sysregs_all: Sample program to retrieve the
 * CPU feature registers.
 *
 * Author: Suzuki K. Poulose <suzuki.poulose@arm.com>
 *
 * Copyright (C) 2015 ARM Ltd.
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 */



#include <stdio.h>

///
/* Copied from asm/sysreg.h */
#define sys_reg(op0, op1, crn, crm, op2) \
        ((((op0)&3)<<19)|((op1)<<16)|((crn)<<12)|((crm)<<8)|((op2)<<5))

asm(
"       .irp    num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
"       .equ    __reg_num_x\\num, \\num\n"
"       .endr\n"
"       .equ    __reg_num_xzr, 31\n"
"\n"
"       .macro  mrs_s, rt, sreg\n"
"       .inst   0xd5200000|(\\sreg)|(__reg_num_\\rt)\n"
"       .endm\n"
"\n"
"       .macro  msr_s, sreg, rt\n"
"       .inst   0xd5000000|(\\sreg)|(__reg_num_\\rt)\n"
"       .endm\n"
);
///

#define get_cpu_ftr(id) ({					\
		unsigned long long __val;			\
		asm("mrs %0, "#id : "=r" (__val));		\
		printf("%-20s: 0x%016llx\n", #id, __val);	\
	})
#define __get_id_reg(id) ({						\
		unsigned long long __val;				\
		asm("mrs_s %0, "#id : "=r" (__val));			\
		printf("%-20s: 0x%016llx\n", get_id_str(id), __val);	\
	})
#define get_id_reg(x) __get_id_reg(x)

#define get_id_range(CRm) \
	get_id_reg(sys_reg(3, 0, 0, CRm, 0)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 1)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 2)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 3)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 4)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 5)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 6)); \
	get_id_reg(sys_reg(3, 0, 0, CRm, 7));

#define Op0(id)		(((id) >> 19) & 0x3)
#define Op1(id)		(((id) >> 16) & 0x7)
#define CRn(id)		(((id) >> 12) & 0xf)
#define CRm(id)		(((id) >> 8) & 0xf)
#define Op2(id)		(((id) >> 5) & 0x7)

char *get_id_str(int id)
{
	static char buf[64];

	sprintf(buf, "S%d_%d_%d_%d_%d",
			Op0(id), Op1(id), CRn(id), CRm(id), Op2(id));
	return buf;
}

int main(void)
{
	get_id_range(1);
	get_id_range(2);
	get_id_range(3);
	get_id_range(4);
	get_id_range(5);
	get_id_range(6);
	get_id_range(7);
	get_cpu_ftr(MIDR_EL1);
	get_cpu_ftr(REVIDR_EL1);
	get_cpu_ftr(MPIDR_EL1);

	/* Trap */
	get_id_reg(sys_reg(3, 0, 0, 0, 2));
	return 0;
}



* Re: [RFC PATCH 05/10] arm64: Keep track of CPU feature registers
  2015-07-24  9:43 ` [RFC PATCH 05/10] arm64: Keep track of CPU feature registers Suzuki K. Poulose
@ 2015-08-05 14:58   ` Suzuki K. Poulose
  0 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-08-05 14:58 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Catalin Marinas, Will Deacon, Mark Rutland, edward.nevill, aph,
	linux-kernel

On 24/07/15 10:43, Suzuki K. Poulose wrote:
> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>
> This patch adds an infrastructure to keep track of the CPU feature
> registers on the system. This patch also consolidates the cpuinfo
> SANITY checks which ensures that we don't have conflicting feature
> supports across the CPUs.
>
> Each register has a set of feature bits defined by the architecture.
> We define the following attributes:
>
>   1) strict - If strict matching is required for the field across the
>      all the CPUs for SANITY checks.
>   2) visible - If the field is exposed to the userspace (See documentation
>      for more details).
>
> The default 'safe' value for the feature is also defined, which will be
> used:
>   1) To set the value for a 'discrete' feature with conflicting values.
>   2) To set the value for an 'invisible' feature for the userspace.
>
> The infrastructure keeps track of the following values for a feature
> register:
>   - user visible value
>   - system wide safe value
>
> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
> ---
>   arch/arm64/include/asm/cpu.h |  149 ++++++++++++++++
>   arch/arm64/kernel/cpuinfo.c  |  399 ++++++++++++++++++++++++++++++++++++++----
>   2 files changed, 511 insertions(+), 37 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index a34de72..c7b0b89 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
...
> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
> index a13468b..ae2a37f 100644
> --- a/arch/arm64/kernel/cpuinfo.c
> +++ b/arch/arm64/kernel/cpuinfo.c
> @@ -31,6 +31,207 @@
>   #include <linux/sched.h>
>   #include <linux/smp.h>
>
...
> -#define CHECK_MASK(field, mask, boot, cur, cpu) \
> -       check_reg_mask(#field, mask, (boot)->reg_ ## field, (cur)->reg_ ## field, cpu)
> -
> -#define CHECK(field, boot, cur, cpu) \
> -       CHECK_MASK(field, ~0ULL, boot, cur, cpu)
> +#define CHECK_CPUINFO(field)                                                           \
> +       ({                                                                              \
> +               int __rc = 0;                                                           \
> +               struct arm64_ftr_reg *__regp = get_arm64_sys_reg(sys_ ## field);        \
> +               if (__regp) {                                                           \
> +                       __rc = check_reg_mask(__regp,                                   \
> +                                               (boot)->reg_ ## field,                  \
> +                                               (cur)->reg_ ## field, cpu);             \
> +                       update_cpu_ftr_reg(__regp, cur->reg_ ## field, cpu);            \
> +               }                                                                       \
> +               __rc;                                                                   \
> +       })
>
>   /*
>    * Verify that CPUs don't have unexpected differences that will cause problems.
> @@ -123,17 +448,17 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
>           * caches should look identical. Userspace JITs will make use of
>           * *minLine.
>           */
> -       diff |= CHECK_MASK(ctr, 0xffff3fff, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(ctr);
>
>          /*
>           * Userspace may perform DC ZVA instructions. Mismatched block sizes
>           * could result in too much or too little memory being zeroed if a
>           * process is preempted and migrated between CPUs.
>           */
> -       diff |= CHECK(dczid, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(dczid);
>
>          /* If different, timekeeping will be broken (especially with KVM) */
> -       diff |= CHECK(cntfrq, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(cntfrq);
>
>          /*
>           * The kernel uses self-hosted debug features and expects CPUs to
> @@ -141,15 +466,15 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
>           * and BRPs to be identical.
>           * ID_AA64DFR1 is currently RES0.
>           */
> -       diff |= CHECK(id_aa64dfr0, boot, cur, cpu);
> -       diff |= CHECK(id_aa64dfr1, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(id_aa64dfr0);
> +       diff |= CHECK_CPUINFO(id_aa64dfr1);
>
>          /*
>           * Even in big.LITTLE, processors should be identical instruction-set
>           * wise.
>           */
> -       diff |= CHECK(id_aa64isar0, boot, cur, cpu);
> -       diff |= CHECK(id_aa64isar1, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(id_aa64isar0);
> +       diff |= CHECK_CPUINFO(id_aa64isar1);
>
>          /*
>           * Differing PARange support is fine as long as all peripherals and
> @@ -157,42 +482,42 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
>           * Linux should not care about secure memory.
>           * ID_AA64MMFR1 is currently RES0.
>           */
> -       diff |= CHECK_MASK(id_aa64mmfr0, 0xffffffffffff0ff0, boot, cur, cpu);
> -       diff |= CHECK(id_aa64mmfr1, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(id_aa64mmfr0);
> +       diff |= CHECK_CPUINFO(id_aa64mmfr1);
>
>          /*
>           * EL3 is not our concern.
>           * ID_AA64PFR1 is currently RES0.
>           */
> -       diff |= CHECK_MASK(id_aa64pfr0, 0xffffffffffff0fff, boot, cur, cpu);
> -       diff |= CHECK(id_aa64pfr1, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(id_aa64pfr0);
> +       diff |= CHECK_CPUINFO(id_aa64pfr1);
>
>          /*
>           * If we have AArch32, we care about 32-bit features for compat. These
>           * registers should be RES0 otherwise.
>           */
> -       diff |= CHECK(id_dfr0, boot, cur, cpu);
> -       diff |= CHECK(id_isar0, boot, cur, cpu);
> -       diff |= CHECK(id_isar1, boot, cur, cpu);
> -       diff |= CHECK(id_isar2, boot, cur, cpu);
> -       diff |= CHECK(id_isar3, boot, cur, cpu);
> -       diff |= CHECK(id_isar4, boot, cur, cpu);
> -       diff |= CHECK(id_isar5, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(id_dfr0);
> +       diff |= CHECK_CPUINFO(id_isar0);
> +       diff |= CHECK_CPUINFO(id_isar1);
> +       diff |= CHECK_CPUINFO(id_isar2);
> +       diff |= CHECK_CPUINFO(id_isar3);
> +       diff |= CHECK_CPUINFO(id_isar4);
> +       diff |= CHECK_CPUINFO(id_isar5);
>          /*
>           * Regardless of the value of the AuxReg field, the AIFSR, ADFSR, and
>           * ACTLR formats could differ across CPUs and therefore would have to
>           * be trapped for virtualization anyway.
>           */
> -       diff |= CHECK_MASK(id_mmfr0, 0xff0fffff, boot, cur, cpu);
> -       diff |= CHECK(id_mmfr1, boot, cur, cpu);
> -       diff |= CHECK(id_mmfr2, boot, cur, cpu);
> -       diff |= CHECK(id_mmfr3, boot, cur, cpu);
> -       diff |= CHECK(id_pfr0, boot, cur, cpu);
> -       diff |= CHECK(id_pfr1, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(id_mmfr0);
> +       diff |= CHECK_CPUINFO(id_mmfr1);
> +       diff |= CHECK_CPUINFO(id_mmfr2);
> +       diff |= CHECK_CPUINFO(id_mmfr3);
> +       diff |= CHECK_CPUINFO(id_pfr0);
> +       diff |= CHECK_CPUINFO(id_pfr1);
>
> -       diff |= CHECK(mvfr0, boot, cur, cpu);
> -       diff |= CHECK(mvfr1, boot, cur, cpu);
> -       diff |= CHECK(mvfr2, boot, cur, cpu);
> +       diff |= CHECK_CPUINFO(mvfr0);
> +       diff |= CHECK_CPUINFO(mvfr1);
> +       diff |= CHECK_CPUINFO(mvfr2);
>
>          /*
>           * Mismatched CPU features are a recipe for disaster. Don't even
> @@ -239,7 +564,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
>          cpuinfo_detect_icache_policy(info);
>
>          check_local_cpu_errata();
> -       check_local_cpu_features();
> +       cpuinfo_sanity_check(info);
These two changes shouldn't be there; I have fixed them locally.
>          update_cpu_features(info);
>   }
>
> @@ -247,13 +572,13 @@ void cpuinfo_store_cpu(void)
>   {
>          struct cpuinfo_arm64 *info = this_cpu_ptr(&cpu_data);
>          __cpuinfo_store_cpu(info);
> -       cpuinfo_sanity_check(info);
As above, this line should be retained here.
>   }
>
>   void __init cpuinfo_store_boot_cpu(void)
>   {
>          struct cpuinfo_arm64 *info = &per_cpu(cpu_data, 0);
>          __cpuinfo_store_cpu(info);
> +       init_cpu_ftrs(info);

Thanks
Suzuki



* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-07-24  9:43 ` [RFC PATCH 01/10] arm64: feature registers: Documentation Suzuki K. Poulose
@ 2015-08-10 16:06   ` Catalin Marinas
  2015-08-10 17:36     ` Suzuki K. Poulose
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-08-10 16:06 UTC (permalink / raw)
  To: Suzuki K. Poulose
  Cc: linux-arm-kernel, mark.rutland, aph, will.deacon, linux-kernel,
	edward.nevill

Hi Suzuki,

On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
> 
> Documentation of the infrastructure
> 
> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>

The implementation looks fine but I think the main discussion will be
around the goal of this feature and the ABI that it introduces. So I'll
just write my thoughts on this patch (I could as well have replied to
the cover letter).

Another question: who's going to use this feature? I know people asked
in private but I'd like to have some public statements.

> --- /dev/null
> +++ b/Documentation/arm64/cpu-feature-registers.txt
> @@ -0,0 +1,185 @@
> +		ARM64 CPU Feature Registers
> +		===========================
> +
> +Author: Suzuki K. Poulose <suzuki.poulose@arm.com>
> +
> +
> +This file describes the API for exporting the AArch64 CPU ID/feature registers
> +to userspace.
> +
> +1. Motivation
> +---------------
> +
> +The ARM architecture defines a set of feature registers, which describe
> +the capabilities of the CPU/system. Access to these system registers is
> +restricted from EL0 and there is no reliable way for an application to
> +extract this information to make better decisions at runtime. There is
> +limited information available to the application via ELF_HWCAPs, however
> +there are some issues with their usage.
> +
> + a) Any change to the HWCAPs requires an update to userspace (e.g libc)
> +    to detect the new changes, which can take a long time to appear in
> +    distributions. Exposing the registers allows applications to get the
> +    information without requiring other userspace components to be updated.

How does it help if you have a new CPUID field or even a new value in an
existing field? Doesn't userspace need to be changed anyway to make use
of the new feature? I don't think that's a valid argument.

> + b) Access to HWCAPs is sometimes restricted (e.g prior to libc, or when ld is
> +    initialised at startup time).

That's useful indeed.

> + c) HWCAPs cannot represent non-boolean information effectively. The
> +    architecture defines a canonical format for representing features
> +    in the ID registers; this is well defined and is capable of
> +    representing all valid architecture variations. Exposing the ID
> +    registers avoids having to come up with HWCAP representations
> +    and parsing code.

So far we've managed to cope with the boolean state of HWCAP, at least
for information relevant to user space. One thing it doesn't cover is
MIDR_EL1.

But the question here is whether we continue to add HWCAP bits even once
we expose the CPUID registers to userspace. IMO, we should continue to add
the HWCAP bits matching new CPUID features, for a few reasons:

1. It's the current interface that we have and the bits can be checked
   in standard C code without having to issue arm64-specific instructions

2. We still need features listed in /proc/cpuinfo, at least for humans
   reading this file or scripts that can't issue mrs instructions

And to debunk some of the counter arguments:

a) Running out of HWCAP bits - I really doubt this, we can always
   introduce 64 more via a new elf_hwcapX

b) Non-boolean information - The CPUID scheme (not MIDR) is pretty much
   boolean, each increment of the field adding a new feature or
   extending an existing one. We could do the same with HWCAP bits (e.g.
   HWCAP_FEATUREv4)

> +2. Requirements
> +-----------------
> +
> + a) Safety :
> +    Applications should be able to use the information provided by the
> +    infrastructure to run optimally safely across the system. This has
> +    greater implications on a system with heterogeneous CPUs. The
> +    infrastructure exports a value that is safe across all the available
> +    CPU on the system.
> +
> +    e.g, If at least one CPU doesn't implement CRC32 instructions, while others
> +    do, we should report that the CRC32 is not implemented. Otherwise an
> +    application could crash when scheduled on the CPU which doesn't support
> +    CRC32.

Agreed.
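
To make the CRC32 example concrete, here is a rough sketch of how a
system-wide safe value could be derived for one field. It assumes the
field is one of the "incremental" 4-bit signed fields, so the lowest
value seen on any CPU is the safe one; the patch's per-field rules may
differ, and id_field() and safe_field_value() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Extract a 4-bit ID field and sign-extend it (0xf becomes -1). */
static int id_field(uint64_t reg, unsigned int shift)
{
	int v = (int)((reg >> shift) & 0xf);

	return (v & 0x8) ? v - 16 : v;
}

/*
 * Lowest value of the field across all CPUs, i.e. the feature level
 * that every CPU in the system provides. ncpus must be at least 1.
 */
static int safe_field_value(const uint64_t *regs, size_t ncpus,
			    unsigned int shift)
{
	int safe = id_field(regs[0], shift);
	size_t i;

	for (i = 1; i < ncpus; i++) {
		int v = id_field(regs[i], shift);

		if (v < safe)
			safe = v;
	}
	return safe;
}
```

For ID_AA64ISAR0_EL1 the CRC32 field sits at bits [19:16], so with
per-CPU values where one CPU reports 0 there, the result is 0 and CRC32
is reported as not implemented.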

> + b) Security :
> +    Applications should only be able to receive information that is relevant
> +    to the normal operation in userspace. Hence, some of the fields
> +    are masked out and the values of the fields are set to indicate the
> +    feature is 'not supported' (See the 'visible' field in the
> +    table in Section 4). Also, the kernel may manipulate the fields based on what
> +    it supports. e.g, If FP is not supported by the kernel, the values
> +    could indicate that the FP is not available (even when the CPU provides
> +    it).

That's fine as well.
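
For reference, the masking described here is what the emulation patch
does with 'regp->user_val | (regp->sys_val & regp->user_mask)'. A small
standalone sketch of that merge follows; user_view() is a hypothetical
name and the register values in the note below are made up.

```c
#include <stdint.h>

/*
 * Build the value userspace sees for one register:
 *  - bits covered by user_mask come from the system-wide safe value,
 *  - all other bits come from user_val, which holds the 'not supported'
 *    encoding for each hidden field.
 * user_val is expected to have no bits set inside user_mask.
 */
static uint64_t user_view(uint64_t sys_val, uint64_t user_mask,
			  uint64_t user_val)
{
	return user_val | (sys_val & user_mask);
}
```

For example, with sys_val = 0x11120 and only the fields at bits [19:16]
and [11:4] visible (user_mask = 0xf0ff0), a hidden field at [15:12]
whose 'not supported' encoding is 0xf gives 0x1f120.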

What we don't cover is what to do with emulated features. Luckily, we
don't have any for AArch64 currently. If we ever need to do this, do we
fake the CPUID to pretend we have the feature, or do we not expose it at
all? I would vote for the latter, but that's probably too vague to make
any decision now.

> + c) Implementation Defined Features
> +    The infrastructure doesn't expose any register which is
> +    IMPLEMENTATION DEFINED as per ARMv8-A Architecture and is set to 0.

It may be worth adding somewhere the (as yet unwritten) rules of the CPUID
fields: each field is a 4-bit signed value that originally reads as zero
(RAZ). When a feature is added or extended, the field is incremented. If an
existing feature whose CPUID field is 0 is removed, the field becomes
negative (0xf).
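
Under those rules a feature test reduces to a signed comparison. A
minimal sketch, assuming every field follows the scheme above (not all
ID register fields necessarily do); cpuid_field() and
feature_at_least() are hypothetical names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Read a 4-bit CPUID field as a signed value (0xf reads as -1). */
static int cpuid_field(uint64_t reg, unsigned int shift)
{
	int v = (int)((reg >> shift) & 0xf);

	return v >= 8 ? v - 16 : v;
}

/*
 * True if the field reports at least min_level. Because each increment
 * adds to or extends the previous level, a higher value implies all the
 * lower ones, and a negative value means the baseline feature is gone.
 */
static bool feature_at_least(uint64_t reg, unsigned int shift, int min_level)
{
	return cpuid_field(reg, shift) >= min_level;
}
```

This matches the existing checks in the series: a field of 0xf in
ID_AA64PFR0_EL1[19:16] reads as -1, so a test for level 0 (FP present)
fails.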

> + d) CPU Identification :
> +    MIDR_EL1 is exposed to help identify the processor. On a heterogeneous
> +    system, this could be racy (just like getcpu()). The process could be
> +    migrated to another CPU by the time we use the register value. Hence,

s/we use/it uses/

> +    there is no guarantee that the value reflects the processor that it is
> +    currently executing on.

You could extend this a bit, something like "unless the CPU affinity is
set".

Anyway, for this reason, we decided not to expose REVIDR since it can
only be read in conjunction with MIDR.

> +The list of supported registers and the attributes of individual
> +feature bits are listed in section 4. Unless there is absolute necessity,
> +we don't encourage the addition of new feature registers to the list.
> +In any case, it should comply to the requirements listed above.
> +
> +3. Implementation
> +--------------------
> +
> +The infrastructure is built on the emulation of the 'MRS' instruction.
> +Accessing a restricted system register from an application generates an
> +exception and ends up in SIGILL being delivered to the process.
> +The infrastructure hooks into the exception handler and emulates the
> +operation if the source belongs to the supported system register space.
> +
> +The infrastructure emulates only the following system register space:
> +	Op0=3, Op1=0, CRn=0
> +
> +(See Table C5-6 'System instruction encodings for System register accesses'
> + in ARMv8 ARM, for the list of registers).
> +
> +
> +The following rules are applied to the value returned by the infrastructure:
> +
> + a) The value of an 'IMPLEMENTATION DEFINED' field is set to 0.
> + b) The value of a reserved field is set to the reserved value(as
> +    defined by the architecture).

Do we expose any IMPLEMENTATION DEFINED or reserved field to userspace?

> + c) The value of a field marked as not 'visible', is set to indicate
> +    the feature is missing (as defined by the architecture).
> + d) The value of a 'visible' field holds the system wide safe value
> +    for the particular feature(except for MIDR_EL1, see section 4)

I'm slightly confused by the visible/not-visible definition. GIC for
example may be present but we don't want to expose it to user, hence you
marked it as "not visible" in the table. But the feature is definitely
not missing, it may be present and we just decided not to expose it to
EL0 since it is not relevant.

-- 
Catalin


* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-10 16:06   ` Catalin Marinas
@ 2015-08-10 17:36     ` Suzuki K. Poulose
  2015-08-10 17:48       ` Ard Biesheuvel
                         ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-08-10 17:36 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, Mark Rutland, aph, Will Deacon, linux-kernel,
	edward.nevill

On 10/08/15 17:06, Catalin Marinas wrote:
> Hi Suzuki,
>
> On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
>> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>>
>> Documentation of the infrastructure
>>
>> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
>
> The implementation looks fine but I think the main discussion will be
> around the goal of this feature and the ABI that it introduces. So I'll
> just write my thoughts on this patch (I could as well have replied to
> the cover letter).
>
> Another question: who's going to use this feature? I know people asked
> in private but I'd like to have some public statements.

Right, I am hoping that folks from glibc / JIT / GCC will respond to
this thread.

>
...

>> + a) Any change to the HWCAPs requires an update to userspace (e.g libc)
>> +    to detect the new changes, which can take a long time to appear in
>> +    distributions. Exposing the registers allows applications to get the
>> +    information without requiring other userspace components to be updated.
>
> How does it help if you have a new CPUID field or even a new value in an
> existing field? Doesn't userspace need to be changed anyway to make use
> of the new feature? I don't think that's a valid argument.
>

Yes, userspace would need an update to work with the new CPUID field. I understand.
It is just that, "in the enterprise world", system libraries provided by the
distribution might take a bit longer to pick up the changes than a software vendor
would. I agree that's not a common case.


>> + b) Access to HWCAPs is sometimes restricted (e.g prior to libc, or when ld is
>> +    initialised at startup time).
>
> That's useful indeed.

OK

>
>> + c) HWCAPs cannot represent non-boolean information effectively. The
>> +    architecture defines a canonical format for representing features
>> +    in the ID registers; this is well defined and is capable of
>> +    representing all valid architecture variations. Exposing the ID
>> +    registers avoids having to come up with HWCAP representations
>> +    and parsing code.
>
> So far we've managed to cope with the boolean state of HWCAP, at least
> for information relevant to user space. One thing it doesn't cover is
> MIDR_EL1.
>
> But the question here is whether we continue to add HWCAP bits even when
> we exposed the CPUID registers to user. IMO, we should continue to add
> the HWCAP bits matching new CPUID features for a few reasons:

I don't have a strong opinion against it.

>
> 1. It's the current interface that we have and the bits can be checked
>     in standard C code without having to issue arm64-specific instructions
>

I agree. Maybe we could provide a library interface for this in the future?

> 2. We still need features listed in /proc/cpuinfo, at least for humans
>     reading this file or scripts that can't issue mrs instructions
>

Agreed, we still need to provide the features in /proc/cpuinfo. We could do
this without HWCAP if we decide not to update the list.

> And to debunk some of the counter arguments:
>
> a) Running out of HWCAP bits - I really doubt this, we can always
>     introduce 64 more via a new elf_hwcapX

OK :)

>
> b) Non-boolean information - The CPUID scheme (not MIDR) is pretty much
>     boolean, each increment of the field adding a new feature or
>     extending an existing one. We could do the same with HWCAP bits (e.g.
>     HWCAP_FEATUREv4)

OK

>> + b) Security :
>> +    Applications should only be able to receive information that is relevant
>> +    to the normal operation in userspace. Hence, some of the fields
>> +    are masked out and the values of the fields are set to indicate the
>> +    feature is 'not supported' (See the 'visible' field in the
>> +    table in Section 4). Also, the kernel may manipulate the fields based on what
>> +    it supports. e.g, If FP is not supported by the kernel, the values
>> +    could indicate that the FP is not available (even when the CPU provides
>> +    it).
>
> That's fine as well.
>
> What we don't cover is what to do with emulated features. Luckily, we
> don't have any for AArch64 currently. If we ever need to do this, do we
> fake the CPUID to pretend we have the feature or we don't expose it at
> all. I would vote for the latter but that's probably too vague to make
> any decision now.

Right. We could pretend to have a feature that userspace can safely
use, provided the kernel/hardware can handle it.

>
>> + c) Implementation Defined Features
>> +    The infrastructure doesn't expose any register which is
>> +    IMPLEMENTATION DEFINED as per ARMv8-A Architecture and is set to 0.
>
> It may be worth adding somewhere the (unwritten; yet) rules of the CPUID
> fields: original 4-bit signed field is RAZ. When a feature is added or
> extended, the field is incremented. If an existing feature is removed
> for which the CPUID field is 0, the field becomes negative (0xf).

Maybe I can add it as a 'Notes' section at the end?
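
To make the signed-field rule concrete, here is a rough sketch of how a
reader of such a field could decode it. The helper names are made up for
illustration and are not part of the series; the only thing taken from the
rule above is that the field is 4 bits wide, signed, and that 0xf means
"removed".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the 4-bit ID field starting at bit 'shift' and sign-extend it,
 * so that 0xf reads as -1 ("feature removed") and 0x1..0x7 read as 1..7.
 * Hypothetical helper, for illustration only.
 */
static inline int cpuid_field_signed(uint64_t reg, unsigned int shift)
{
	int field;

	assert(shift <= 60);	/* the field must fit in a 64-bit register */

	/* Isolate the four bits, then sign-extend from bit 3. */
	field = (int)((reg >> shift) & 0xf);
	return (field ^ 0x8) - 0x8;
}

/*
 * A feature is usable if the field is at 'min_level' or above. A negative
 * field (feature removed) never satisfies a non-negative 'min_level'.
 */
static inline int cpuid_has_feature(uint64_t reg, unsigned int shift,
				    int min_level)
{
	return cpuid_field_signed(reg, shift) >= min_level;
}
```

With this kind of comparison, a later increment of the field (feature
extended) still passes an older "level >= 1" check, which is the point of
the increment-only rule.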

>
>> + d) CPU Identification :
>> +    MIDR_EL1 is exposed to help identify the processor. On a heterogeneous
>> +    system, this could be racy (just like getcpu()). The process could be
>> +    migrated to another CPU by the time we use the register value. Hence,
>
> s/we use/it uses/

Will fix it.
  

>> +    there is no guarantee that the value reflects the processor that it is
>> +    currently executing on.
>
> You could extend this a bit, something like "unless the CPU affinity is
> set".

Sure, makes sense.

>
> Anyway, for this reason, we decided not to expose REVIDR since it can
> only be read in conjunction with MIDR.

Right.

...
>> +3. Implementation
>> +--------------------
>> +
...
>> +The infrastructure emulates only the following system register space:
>> +	Op0=3, Op1=0, CRn=0
>> +
>> +(See Table C5-6 'System instruction encodings for System register accesses'
>> + in ARMv8 ARM, for the list of registers).
>> +
>> +
>> +The following rules are applied to the value returned by the infrastructure:
>> +
>> + a) The value of an 'IMPLEMENTATION DEFINED' field is set to 0.
>> + b) The value of a reserved field is set to the reserved value(as
>> +    defined by the architecture).
>
> Do we expose any IMPLEMENTATION DEFINED or reserved field to user?

We don't. All such fields are marked invisible. The above rules define
how we fill those 'special' (invisible) fields. We 'emulate' all
accesses to the space (as defined above) with Op0=3, Op1=0 & CRn=0.
Out of this space, there are only a very few 'visible' fields (listed in
section 4). These rules define how the values are emulated.
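
As a rough sketch of the space check (not the code from the series; the
struct and function names are hypothetical), decoding a trapped MRS and
testing for Op0=3, Op1=0, CRn=0 could look like this. The bit positions
are those of the A64 MRS encoding.

```c
#include <stdint.h>

/* Fixed bits 31:20 of the A64 "MRS <Xt>, <sysreg>" encoding. */
#define MRS_FIXED_MASK	0xfff00000u
#define MRS_FIXED_BITS	0xd5300000u

/* Hypothetical container for the decoded operands. */
struct sysreg_enc {
	unsigned int op0, op1, crn, crm, op2, rt;
};

/* Returns 1 and fills 'enc' if 'insn' is an MRS, 0 otherwise. */
static int decode_mrs(uint32_t insn, struct sysreg_enc *enc)
{
	if ((insn & MRS_FIXED_MASK) != MRS_FIXED_BITS)
		return 0;

	enc->op0 = 2 + ((insn >> 19) & 0x1);	/* o0 selects Op0 = 2 or 3 */
	enc->op1 = (insn >> 16) & 0x7;
	enc->crn = (insn >> 12) & 0xf;
	enc->crm = (insn >> 8) & 0xf;
	enc->op2 = (insn >> 5) & 0x7;
	enc->rt  = insn & 0x1f;			/* destination register */
	return 1;
}

/* The only space the infrastructure emulates: Op0=3, Op1=0, CRn=0. */
static int in_emulated_id_space(const struct sysreg_enc *enc)
{
	return enc->op0 == 3 && enc->op1 == 0 && enc->crn == 0;
}
```

Anything that decodes as an MRS but falls outside this space would keep
the existing behaviour, i.e. SIGILL.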

>
>> + c) The value of a field marked as not 'visible', is set to indicate
>> +    the feature is missing (as defined by the architecture).
>> + d) The value of a 'visible' field holds the system wide safe value
>> +    for the particular feature(except for MIDR_EL1, see section 4)
>
> I'm slightly confused by the visible/not-visible definition. GIC for
> example may be present but we don't want to expose it to user, hence you
> marked it as "not visible" in the table. But the feature is definitely
> not missing, it may be present and we just decided not to expose it to
> EL0 since it is not relevant.

That's right. In this case, userspace will see that 'GIC' is not present
even though it is available. Btw, the system-wide value (exposed to the
system-wide users) could be different from what the user gets. E.g., if all the
CPUs have GIC system register access, the system view will have 'GIC' available.

Taking another example to explain rule (d), if all CPUs but one support CRC32
instructions, both the system view and the user view will have CRC32 disabled.
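
A minimal sketch of that "lowest value wins" idea, assuming an unsigned
4-bit field where a larger value means more functionality. The function
name is hypothetical and the real per-field policy in the series may
differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute a system-wide safe value for the 4-bit field at 'shift' by
 * taking the minimum over all CPUs. One CPU lacking the feature is
 * enough to report it as absent. Illustrative only.
 */
static unsigned int safe_field_lowest(const uint64_t *cpu_regs, size_t ncpus,
				      unsigned int shift)
{
	unsigned int safe = 0xf;
	size_t i;

	if (ncpus == 0)
		return 0;	/* nothing known: report "not supported" */

	for (i = 0; i < ncpus; i++) {
		unsigned int field = (unsigned int)((cpu_regs[i] >> shift) & 0xf);

		if (field < safe)
			safe = field;
	}
	return safe;
}
```

For instance, with CRC32 in ID_AA64ISAR0_EL1[19:16], three CPUs reporting
1 and one CPU reporting 0 give a safe value of 0.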


Thanks
Suzuki






* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-10 17:36     ` Suzuki K. Poulose
@ 2015-08-10 17:48       ` Ard Biesheuvel
  2015-08-11 14:23         ` Catalin Marinas
  2015-08-10 18:19       ` Andrew Haley
  2015-08-11 14:46       ` Catalin Marinas
  2 siblings, 1 reply; 24+ messages in thread
From: Ard Biesheuvel @ 2015-08-10 17:48 UTC (permalink / raw)
  To: Suzuki K. Poulose
  Cc: Catalin Marinas, Mark Rutland, aph, Will Deacon, linux-kernel,
	edward.nevill, linux-arm-kernel

On 10 August 2015 at 19:36, Suzuki K. Poulose <Suzuki.Poulose@arm.com> wrote:
> On 10/08/15 17:06, Catalin Marinas wrote:
>>
>> Hi Suzuki,
>>
>> On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
>>>
>>> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>>>
>>> Documentation of the infrastructure
>>>
>>> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
>>
>>
>> The implementation looks fine but I think the main discussion will be
>> around the goal of this feature and the ABI that it introduces. So I'll
>> just write my thoughts on this patch (I could as well have replied to
>> the cover letter).
>>
>> Another question: who's going to use this feature? I know people asked
>> in private but I'd like to have some public statements.
>
>
> Right, I am hoping that folks from glibc / JIT / GCC will respond to
> this thread.
>
>>
> ...
>
>>> + a) Any change to the HWCAPs requires an update to userspace (e.g libc)
>>> +    to detect the new changes, which can take a long time to appear in
>>> +    distributions. Exposing the registers allows applications to get the
>>> +    information without requiring other userspace components to be
>>> updated.
>>
>>
>> How does it help if you have a new CPUID field or even a new value in an
>> existing field? Doesn't userspace need to be changed anyway to make use
>> of the new feature? I don't think that's a valid argument.
>>
>
> Yes, the userspace would need an update to work with the new CPUID field. I
> understand.
> It is just that, "in the enterprise world" updates to the system libraries
> provided by
> the distribution might take a bit longer to provide the changes than a
> software vendor.
> I agree thats not a common case.
>
>
>>> + b) Access to HWCAPs is sometimes restricted (e.g prior to libc, or when
>>> ld is
>>> +    initialised at startup time).
>>
>>
>> That's useful indeed.
>
>
> OK
>
>>
>>> + c) HWCAPs cannot represent non-boolean information effectively. The
>>> +    architecture defines a canonical format for representing features
>>> +    in the ID registers; this is well defined and is capable of
>>> +    representing all valid architecture variations. Exposing the ID
>>> +    registers avoids having to come up with HWCAP representations
>>> +    and parsing code.
>>
>>
>> So far we've managed to cope with the boolean state of HWCAP, at least
>> for information relevant to user space. One thing it doesn't cover is
>> MIDR_EL1.
>>
>> But the question here is whether we continue to add HWCAP bits even when
>> we exposed the CPUID registers to user. IMO, we should continue to add
>> the HWCAP bits matching new CPUID features for a few reasons:
>
>
> I don't have a strong opinion against it.
>
>>
>> 1. It's the current interface that we have and the bits can be checked
>>     in standard C code without having to issue arm64-specific instructions
>>
>
> I agree. May be we could provide library interface for this in the future ?
>
>> 2. We still need features listed in /proc/cpuinfo, at least for humans
>>     reading this file or scripts that can't issue mrs instructions
>>
>
> Agreed, we still need to provide the features in /proc/cpuinfo. We could do
> this without HWCAP if we decide not to update the list.
>
>> And to debunk some of the counter arguments:
>>
>> a) Running out of HWCAP bits - I really doubt this, we can always
>>     introduce 64 more via a new elf_hwcapX
>

Note that ELF_HWCAP is also wired into ifunc resolution of GNU
indirect functions, which looks like a useful feature although it
isn't used that widely yet.

The ifunc prototype for aarch64 has only one 'long' parameter, and I
don't know if it is possible to extend that without having a bit in
HWCAPn to indicate that HWCAPn+1 is valid. Also, the ifunc resolvers
are restricted in the sense that they cannot use shared libraries or
code that uses constructors (AFAIR) so it may require a special static
library to call this CPU feature interface from such a resolver if
features are not covered by HWCAP bits.

So treating HWCAP bits as an endless supply may not be the wisest
approach here. Also, I think some alignment with the libc folks is
indeed in order.

-- 
Ard.


* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-10 17:36     ` Suzuki K. Poulose
  2015-08-10 17:48       ` Ard Biesheuvel
@ 2015-08-10 18:19       ` Andrew Haley
  2015-08-11  8:41         ` Suzuki K. Poulose
  2015-08-11 14:46       ` Catalin Marinas
  2 siblings, 1 reply; 24+ messages in thread
From: Andrew Haley @ 2015-08-10 18:19 UTC (permalink / raw)
  To: Suzuki K. Poulose, Catalin Marinas
  Cc: linux-arm-kernel, Mark Rutland, Will Deacon, linux-kernel, edward.nevill

On 08/10/2015 06:36 PM, Suzuki K. Poulose wrote:
> On 10/08/15 17:06, Catalin Marinas wrote:
>> Hi Suzuki,
>>
>> On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
>>> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>>>
>>> Documentation of the infrastructure
>>>
>>> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
>>
>> The implementation looks fine but I think the main discussion will be
>> around the goal of this feature and the ABI that it introduces. So I'll
>> just write my thoughts on this patch (I could as well have replied to
>> the cover letter).
>>
>> Another question: who's going to use this feature? I know people asked
>> in private but I'd like to have some public statements.
> 
> Right, I am hoping that folks from glibc / JIT / GCC will respond to
> this thread.

We certainly need it for OpenJDK.  We need to know the manufacturer,
part number, revision id, etc.  We already have workarounds in
OpenJDK for various bugs, and we also can generate better code if we
know the exact part.

I note that the REVIDR is not in this patch.  That seems odd, because
it can be used to identify minor revisions.
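
For reference, the fields in question can be pulled out of a MIDR_EL1
value roughly as follows. This is only a sketch with hypothetical names;
the bit layout is the architectural one (implementer [31:24], variant
[23:20], architecture [19:16], part number [15:4], revision [3:0]).

```c
#include <stdint.h>

/* Hypothetical decoded view of MIDR_EL1. */
struct midr_fields {
	unsigned int implementer;	/* bits 31:24, e.g. 0x41 for ARM Ltd */
	unsigned int variant;		/* bits 23:20, the "rN" in rNpM */
	unsigned int architecture;	/* bits 19:16 */
	unsigned int partnum;		/* bits 15:4 */
	unsigned int revision;		/* bits 3:0, the "pM" in rNpM */
};

static struct midr_fields midr_decode(uint32_t midr)
{
	struct midr_fields f;

	f.implementer  = (midr >> 24) & 0xff;
	f.variant      = (midr >> 20) & 0xf;
	f.architecture = (midr >> 16) & 0xf;
	f.partnum      = (midr >> 4) & 0xfff;
	f.revision     = midr & 0xf;
	return f;
}
```

A JIT would compare implementer and partnum against the parts it has
workarounds for, and use variant/revision to narrow that down.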

Andrew.




* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-10 18:19       ` Andrew Haley
@ 2015-08-11  8:41         ` Suzuki K. Poulose
  2015-08-11  8:58           ` Andrew Haley
  0 siblings, 1 reply; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-08-11  8:41 UTC (permalink / raw)
  To: Andrew Haley, Catalin Marinas
  Cc: linux-arm-kernel, Mark Rutland, Will Deacon, linux-kernel, edward.nevill

On 10/08/15 19:19, Andrew Haley wrote:
> On 08/10/2015 06:36 PM, Suzuki K. Poulose wrote:
>> On 10/08/15 17:06, Catalin Marinas wrote:
>>> Hi Suzuki,
>>>
>>> On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
>>>> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>>>>
>>>> Documentation of the infrastructure
>>>>
>>>> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
>>>
>>> The implementation looks fine but I think the main discussion will be
>>> around the goal of this feature and the ABI that it introduces. So I'll
>>> just write my thoughts on this patch (I could as well have replied to
>>> the cover letter).
>>>
>>> Another question: who's going to use this feature? I know people asked
>>> in private but I'd like to have some public statements.
>>
>> Right, I am hoping that folks from glibc / JIT / GCC will respond to
>> this thread.
>
> We certainly need it for OpenJDK.  We need to know the manufacturer,
> part number, revision id, etc.  We already have workarounds in
> OpenJDK for various bugs, and we also can generate better code if we
> know the exact part.

OK.

>
> I note that the REVIDR is not in this patch.  That seems odd, because
> it can be used to identify minor revisions.

The REVIDR has to be used in conjunction with the MIDR to make real sense.
We cannot guarantee that the REVIDR we read belongs to the same CPU on which
the MIDR was read (unless the process is pinned), and hence the user may not
be able to make any use of the information. Steve has a patch [1]
to expose the MIDR and REVIDR info via sysfs.

[1] https://lkml.org/lkml/2015/7/24/420

Thanks
Suzuki

>
> Andrew.
>
>



* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-11  8:41         ` Suzuki K. Poulose
@ 2015-08-11  8:58           ` Andrew Haley
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Haley @ 2015-08-11  8:58 UTC (permalink / raw)
  To: Suzuki K. Poulose, Catalin Marinas
  Cc: linux-arm-kernel, Mark Rutland, Will Deacon, linux-kernel, edward.nevill

On 11/08/15 09:41, Suzuki K. Poulose wrote:
> The REVIDR has to be used in conjunction with the MIDR to make real sense.

Sure, of course.

> We cannot guarantee that the REVIDR that we read (would) belong to the CPU
> where MIDR would have been read (unless the process is pinned) and hence the
> user may not be able to make any use of the information.

Well, yes, nothing is perfect.  The situation we're in now, for example,
is that if we see Cortex A57 we have to assume Cortex A53 and include
workarounds.  People might mix and match processors.  So it goes.

> Steve has a patch [1]
> to expose the MIDR,REVIDR info via sysfs.
> 
> [1] https://lkml.org/lkml/2015/7/24/420

OK.  As long as I can get at it I'm happy.

Andrew.



* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-10 17:48       ` Ard Biesheuvel
@ 2015-08-11 14:23         ` Catalin Marinas
  2015-08-11 15:37           ` Suzuki K. Poulose
  0 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-08-11 14:23 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Suzuki K. Poulose, Mark Rutland, aph, Will Deacon, linux-kernel,
	edward.nevill, linux-arm-kernel

On Mon, Aug 10, 2015 at 07:48:48PM +0200, Ard Biesheuvel wrote:
> > On 10/08/15 17:06, Catalin Marinas wrote:
> >> And to debunk some of the counter arguments:
> >>
> >> a) Running out of HWCAP bits - I really doubt this, we can always
> >>     introduce 64 more via a new elf_hwcapX
> 
> Note that ELF_HWCAP is also wired into ifunc resolution of GNU
> indirect functions, which looks like a useful feature although it
> isn't used that widely yet.

I forgot to mention, we also need an HWCAP_CPUID with these patches when
we expose the MRS interface. The ifunc resolver could use MRS when
available. But I would still keep adding HWCAP bits for new features,
even if we risk running out of the 64 bits we have now.

> The ifunc prototype for aarch64 has only one 'long' parameter, and I
> don't know if it is possible to extend that without having a bit in
> HWCAPn to indicate that HWCAPn+1 is valid. Also, the ifunc resolvers
> are restricted in the sense that they cannot use shared libraries or
> code that uses constructors (AFAIR) so it may require a special static
> library to call this CPU feature interface from such a resolver if
> features are not covered by HWCAP bits.

Or we could get some compiler intrinsics that generate the instruction
inline, just to avoid explicit asm.

> So treating HWCAP bits as an endless supply may not be the wisest
> approach here.

Probably not for ifunc, otherwise I don't think it hurts.

> Also, I think some alignment with the libc folks is indeed in order.

I agree (not sure how they feel about cross-posting though).

-- 
Catalin


* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-10 17:36     ` Suzuki K. Poulose
  2015-08-10 17:48       ` Ard Biesheuvel
  2015-08-10 18:19       ` Andrew Haley
@ 2015-08-11 14:46       ` Catalin Marinas
  2015-08-11 15:18         ` Suzuki K. Poulose
  2 siblings, 1 reply; 24+ messages in thread
From: Catalin Marinas @ 2015-08-11 14:46 UTC (permalink / raw)
  To: Suzuki K. Poulose
  Cc: Mark Rutland, aph, Will Deacon, linux-kernel, edward.nevill,
	linux-arm-kernel

On Mon, Aug 10, 2015 at 06:36:46PM +0100, Suzuki K. Poulose wrote:
> On 10/08/15 17:06, Catalin Marinas wrote:
> >On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
> >>From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
> >>
> >>Documentation of the infrastructure
> >>
> >>Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>
> >
> >The implementation looks fine but I think the main discussion will be
> >around the goal of this feature and the ABI that it introduces. So I'll
> >just write my thoughts on this patch (I could as well have replied to
> >the cover letter).
> >
> >Another question: who's going to use this feature? I know people asked
> >in private but I'd like to have some public statements.
> 
> Right, I am hoping that folks from glibc / JIT / GCC will respond to
> this thread.

Some of them didn't even want to be cc'ed ;)

> >>+ a) Any change to the HWCAPs requires an update to userspace (e.g libc)
> >>+    to detect the new changes, which can take a long time to appear in
> >>+    distributions. Exposing the registers allows applications to get the
> >>+    information without requiring other userspace components to be updated.
> >
> >How does it help if you have a new CPUID field or even a new value in an
> >existing field? Doesn't userspace need to be changed anyway to make use
> >of the new feature? I don't think that's a valid argument.
> 
> Yes, the userspace would need an update to work with the new CPUID field. I understand.
> It is just that, "in the enterprise world" updates to the system libraries provided by
> the distribution might take a bit longer to provide the changes than a software vendor.
> I agree thats not a common case.

What I meant is that for a new CPU feature, the user space needs
updating anyway to make use of (add support for) it, whether it checks
its presence via HWCAP or MRS. Let's say we get new crypto instructions,
existing user space won't even check for it because it doesn't know
there is a new CPUID field (or HWCAP bit).

> >2. We still need features listed in /proc/cpuinfo, at least for humans
> >    reading this file or scripts that can't issue mrs instructions
> 
> Agreed, we still need to provide the features in /proc/cpuinfo. We could do
> this without HWCAP if we decide not to update the list.

I agree, /proc/cpuinfo is doable without HWCAP. But since some software
ends up parsing /proc/cpuinfo anyway, I don't see why we should hide
HWCAP.

> >>+ c) Implementation Defined Features
> >>+    The infrastructure doesn't expose any register which is
> >>+    IMPLEMENTATION DEFINED as per ARMv8-A Architecture and is set to 0.
> >
> >It may be worth adding somewhere the (unwritten; yet) rules of the CPUID
> >fields: original 4-bit signed field is RAZ. When a feature is added or
> >extended, the field is incremented. If an existing feature is removed
> >for which the CPUID field is 0, the field becomes negative (0xf).
> 
> May be I can add it as an 'Notes' section at the end ?

Fine.

> >>+3. Implementation
> >>+--------------------
> >>+
> ...
> >>+The infrastructure emulates only the following system register space:
> >>+	Op0=3, Op1=0, CRn=0
> >>+
> >>+(See Table C5-6 'System instruction encodings for System register accesses'
> >>+ in ARMv8 ARM, for the list of registers).
> >>+
> >>+
> >>+The following rules are applied to the value returned by the infrastructure:
> >>+
> >>+ a) The value of an 'IMPLEMENTATION DEFINED' field is set to 0.
> >>+ b) The value of a reserved field is set to the reserved value(as
> >>+    defined by the architecture).
> >
> >Do we expose any IMPLEMENTATION DEFINED or reserved field to user?
> 
> We don't. All such fields are marked invisible. The above rules define
> how we fill those 'special' (invisible) fields. We 'emulate' all
> access to the space (as defined above) with Op0=3, Op1=0 & CRn=0.
> Out of this space, there are only a very few 'visible' fields(listed in
> section 4). These rules, define how the values are emulated.

Point b) above is a bit confusing - reserved field is set to the
reserved value. If the reserved value is non-zero, do we expose that
value to the user, or do we return zero as for other invisible fields?

> >>+ c) The value of a field marked as not 'visible', is set to indicate
> >>+    the feature is missing (as defined by the architecture).
> >>+ d) The value of a 'visible' field holds the system wide safe value
> >>+    for the particular feature(except for MIDR_EL1, see section 4)
> >
> >I'm slightly confused by the visible/not-visible definition. GIC for
> >example may be present but we don't want to expose it to user, hence you
> >marked it as "not visible" in the table. But the feature is definitely
> >not missing, it may be present and we just decided not to expose it to
> >EL0 since it is not relevant.
> 
> Thats right. In this case, the userspace will see that 'GIC' is not present
> even though it is available. Btw, the system wide value(exposed to the system
> wide users) could be different from what the user gets. e.g, if all the CPUs
> have GIC system register access, the system view will have 'GIC' available.
> 
> Taking another example to explain rule (d), if all CPUs but one supports CRC32
> instructions, both the system view and the user view will have CRC32 disabled.

OK. I missed the difference between "system wide view" and "user view".
I guess the former is not exposed to user.

As I mentioned in my reply to Ard, we need a HWCAP entry to inform the
user of the MRS emulation. My question is whether to use a single
HWCAP_CPUID or multiple for each ID register (e.g.
HWCAP_ID_AA64ISAR0_EL1 or a shorter HWCAP_ID_ISAR0). The advantage of
the latter is that we can expose new CPUID registers if any of them
appear (or there is a useful feature in a register we don't expose).
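
To illustrate the single-bit variant, user space would do something along
these lines before issuing MRS. This is a sketch only: the bit position
below is picked for illustration and nothing in this thread fixes it, and
the helper names are made up. getauxval() and AT_HWCAP are the existing
glibc interface.

```c
#include <stdint.h>
#include <sys/auxv.h>

/* Hypothetical bit position for the proposed HWCAP_CPUID. */
#define SKETCH_HWCAP_CPUID	(1UL << 11)

/* Pure check on an AT_HWCAP word, kept separate so it is easy to test. */
static int hwcaps_have_cpuid(unsigned long hwcaps)
{
	return (hwcaps & SKETCH_HWCAP_CPUID) != 0;
}

/*
 * Read ID_AA64ISAR0_EL1 only if the kernel advertised the emulation.
 * Returns 0 on success, -1 if the register must not be read.
 */
static int read_id_aa64isar0(uint64_t *val)
{
#if defined(__aarch64__)
	uint64_t v;

	if (!hwcaps_have_cpuid(getauxval(AT_HWCAP)))
		return -1;	/* MRS would raise SIGILL here */

	__asm__ volatile("mrs %0, id_aa64isar0_el1" : "=r" (v));
	*val = v;
	return 0;
#else
	(void)val;
	return -1;		/* not an arm64 build */
#endif
}
```

With one bit per ID register instead, the check would simply test a
different bit for each register before the corresponding MRS.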

-- 
Catalin


* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-11 14:46       ` Catalin Marinas
@ 2015-08-11 15:18         ` Suzuki K. Poulose
  0 siblings, 0 replies; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-08-11 15:18 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Mark Rutland, aph, Will Deacon, linux-kernel, edward.nevill,
	linux-arm-kernel

On 11/08/15 15:46, Catalin Marinas wrote:
> On Mon, Aug 10, 2015 at 06:36:46PM +0100, Suzuki K. Poulose wrote:
>> On 10/08/15 17:06, Catalin Marinas wrote:
>>> On Fri, Jul 24, 2015 at 10:43:47AM +0100, Suzuki K. Poulose wrote:
>>>> From: "Suzuki K. Poulose" <suzuki.poulose@arm.com>
>>>>
>>>> Documentation of the infrastructure
>>>>
>>>> Signed-off-by: Suzuki K. Poulose <suzuki.poulose@arm.com>

...

>>> + a) Any change to the HWCAPs requires an update to userspace (e.g libc)
>>>> +    to detect the new changes, which can take a long time to appear in
>>>> +    distributions. Exposing the registers allows applications to get the
>>>> +    information without requiring other userspace components to be updated.
>>>
>>> How does it help if you have a new CPUID field or even a new value in an
>>> existing field? Doesn't userspace need to be changed anyway to make use
>>> of the new feature? I don't think that's a valid argument.
>>
>> Yes, the userspace would need an update to work with the new CPUID field. I understand.
>> It is just that, "in the enterprise world" updates to the system libraries provided by
>> the distribution might take a bit longer to provide the changes than a software vendor.
>> I agree thats not a common case.
>
> What I meant is that for a new CPU feature, the user space needs
> updating anyway to make use of (add support for) it, whether it checks
> its presence via HWCAP or MRS. Let's say we get new crypto instructions,
> existing user space won't even check for it because it doesn't know
> there is a new CPUID field (or HWCAP bit).
>

I understand that, that's why I mentioned a 'software vendor' could roll out an
update independent of the 'system libraries' provided by the distribution, where
standard distributions might have their own schedule. But I agree that userspace
needs an update and my 'story' is not common :)

>>> 2. We still need features listed in /proc/cpuinfo, at least for humans
>>>     reading this file or scripts that can't issue mrs instructions
>>
>> Agreed, we still need to provide the features in /proc/cpuinfo. We could do
>> this without HWCAP if we decide not to update the list.
>
> I agree, /proc/cpuinfo is doable without HWCAP. But since some software
> ends up parsing /proc/cpuinfo anyway, I don't see why we should hide
> HWCAP.
>

Right.

>>>> +3. Implementation
>>>> +--------------------
>>>> +
>> ...
>>>> +The infrastructure emulates only the following system register space:
>>>> +	Op0=3, Op1=0, CRn=0
>>>> +
>>>> +(See Table C5-6 'System instruction encodings for System register accesses'
>>>> + in ARMv8 ARM, for the list of registers).
>>>> +
>>>> +
>>>> +The following rules are applied to the value returned by the infrastructure:
>>>> +
>>>> + a) The value of an 'IMPLEMENTATION DEFINED' field is set to 0.
>>>> + b) The value of a reserved field is set to the reserved value(as
>>>> +    defined by the architecture).
>>>
>>> Do we expose any IMPLEMENTATION DEFINED or reserved field to user?
>>
>> We don't. All such fields are marked invisible. The above rules define
>> how we fill those 'special' (invisible) fields. We 'emulate' all
>> access to the space (as defined above) with Op0=3, Op1=0 & CRn=0.
>> Out of this space, there are only a very few 'visible' fields(listed in
>> section 4). These rules, define how the values are emulated.
>
> Point b) above is a bit confusing - reserved field is set to the
> reserved value. If the reserved value is non-zero, do we expose such
> value to user or we return zero as for other invisible fields?

At the moment, we expose the reserved values so as not to confuse the user, which
- I thought - is safer than exposing 0. The value 0 could have a different meaning in
the future (if the field is not RES0 already).

>>>> + c) The value of a field marked as not 'visible', is set to indicate
>>>> +    the feature is missing (as defined by the architecture).
>>>> + d) The value of a 'visible' field holds the system wide safe value
>>>> +    for the particular feature(except for MIDR_EL1, see section 4)
>>>
>>> I'm slightly confused by the visible/not-visible definition. GIC for
>>> example may be present but we don't want to expose it to user, hence you
>>> marked it as "not visible" in the table. But the feature is definitely
>>> not missing, it may be present and we just decided not to expose it to
>>> EL0 since it is not relevant.
>>
>> Thats right. In this case, the userspace will see that 'GIC' is not present
>> even though it is available. Btw, the system wide value(exposed to the system
>> wide users) could be different from what the user gets. e.g, if all the CPUs
>> have GIC system register access, the system view will have 'GIC' available.
>>
>> Taking another example to explain rule (d), if all CPUs but one supports CRC32
>> instructions, both the system view and the user view will have CRC32 disabled.
>
> OK. I missed the difference between "system wide view" and "user view".
> I guess the former is not exposed to user.

Right, the system view and the user view are different. The user view only gets the
visible parts of the system view.
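
The relationship between the two views can be sketched as follows. This is only
an illustration of rules (c) and (d) and the CRC32 example, not code from the
series: the helper names, the visible_mask/hidden_val parameters and the "lowest
value wins" policy are assumptions, and that policy only suits unsigned fields
where a smaller value means less capable (signed fields such as
ID_AA64PFR0_EL1.FP would need a different rule).

```c
#include <stddef.h>
#include <stdint.h>

/* Extract the 4-bit ID field that starts at bit 'shift'. */
static unsigned int id_field(uint64_t reg, unsigned int shift)
{
	return (unsigned int)((reg >> shift) & 0xf);
}

/*
 * System view of one field: the lowest value reported by any CPU.
 * Assumption: the field is unsigned and a smaller value means
 * "less capable", so the minimum is safe on every CPU.
 */
static unsigned int system_safe_field(const uint64_t *per_cpu, size_t ncpus,
				      unsigned int shift)
{
	unsigned int safe = 0xf;
	size_t i;

	if (ncpus == 0)
		return 0;
	for (i = 0; i < ncpus; i++) {
		unsigned int v = id_field(per_cpu[i], shift);

		if (v < safe)
			safe = v;
	}
	return safe;
}

/*
 * User view: bits selected by visible_mask come from the system view,
 * every other bit reads as hidden_val (the "feature missing" or
 * reserved encoding). Both parameters are hypothetical.
 */
static uint64_t user_view(uint64_t system_val, uint64_t visible_mask,
			  uint64_t hidden_val)
{
	return (system_val & visible_mask) | (hidden_val & ~visible_mask);
}
```

With ID_AA64ISAR0_EL1.CRC32 at bits [19:16], three CPUs reporting 1 and one
reporting 0 give a system-wide value of 0, which is also what the user sees.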

>
> As I mentioned in my reply to Ard, we need a HWCAP entry to inform the
> user of the MRS emulation. My question is whether to use a single
> HWCAP_CPUID or multiple for each ID register (e.g.
> HWCAP_ID_AA64ISAR0_EL1 or a shorter HWCAP_ID_ISAR0). The advantage of
> the latter is that we can expose new CPUID registers if any of them
> appear (or there is a useful feature in a register we don't expose).

The current implementation more or less solves this issue: we expose a
safe value (which implies 'feature not available') for every register
in the ID space that we plan to emulate, including the reserved IDs. So if we
later decide to expose something new, userspace need not change. It can
continue to read the register safely, decide whether the feature is available,
and act accordingly.
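
As a rough sketch of what that looks like from userspace: the caller decodes a
4-bit field from the value it read and treats 0 as 'not available', so a field
that is hidden today and exposed later needs no change in the caller. The helper
names below are made up for illustration. The field positions are those of
ID_AA64ISAR0_EL1 in the ARMv8 ARM (AES at bits [7:4], CRC32 at bits [19:16]).

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_AA64ISAR0_AES_SHIFT		4
#define ID_AA64ISAR0_CRC32_SHIFT	16

/* Extract one 4-bit feature field from an ID_AA64ISAR0_EL1 value. */
static unsigned int isar0_field(uint64_t isar0, unsigned int shift)
{
	return (unsigned int)((isar0 >> shift) & 0xf);
}

/* CRC32 instructions are present when the field is non-zero. */
static bool has_crc32(uint64_t isar0)
{
	return isar0_field(isar0, ID_AA64ISAR0_CRC32_SHIFT) >= 1;
}

/* AES field: 1 means the AES instructions, 2 adds PMULL/PMULL2. */
static bool has_aes(uint64_t isar0)
{
	return isar0_field(isar0, ID_AA64ISAR0_AES_SHIFT) >= 1;
}

static bool has_pmull(uint64_t isar0)
{
	return isar0_field(isar0, ID_AA64ISAR0_AES_SHIFT) >= 2;
}
```

Comparing with '>=' rather than '==' keeps the check valid if a later
architecture revision defines a higher value for the same field.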


Thanks
Suzuki



* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-11 14:23         ` Catalin Marinas
@ 2015-08-11 15:37           ` Suzuki K. Poulose
  2015-09-10 15:55             ` Dave Martin
  0 siblings, 1 reply; 24+ messages in thread
From: Suzuki K. Poulose @ 2015-08-11 15:37 UTC (permalink / raw)
  To: Catalin Marinas, Ard Biesheuvel
  Cc: Mark Rutland, aph, Will Deacon, linux-kernel, edward.nevill,
	linux-arm-kernel

On 11/08/15 15:23, Catalin Marinas wrote:
> On Mon, Aug 10, 2015 at 07:48:48PM +0200, Ard Biesheuvel wrote:
>>> On 10/08/15 17:06, Catalin Marinas wrote:
>>>> And to debunk some of the counter arguments:
>>>>
>>>> a) Running out of HWCAP bits - I really doubt this, we can always
>>>>      introduce 64 more via a new elf_hwcapX
>>
>> Note that ELF_HWCAP is also wired into ifunc resolution of GNU
>> indirect functions, which looks like a useful feature although it
>> isn't used that widely yet.
>
> I forgot to mention, we also need an HWCAP_CPUID with these patches when
> we expose the MRS interface. The ifunc resolver could use MRS when
> available. But I would still keep adding HWCAP bits for new features,
> even if we risk running out of the 64-bit we have now.
>

Sure, I will add the HWCAP_CPUID in the next version of the series.

Thanks
Suzuki



* Re: [RFC PATCH 01/10] arm64: feature registers: Documentation
  2015-08-11 15:37           ` Suzuki K. Poulose
@ 2015-09-10 15:55             ` Dave Martin
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Martin @ 2015-09-10 15:55 UTC (permalink / raw)
  To: Suzuki K. Poulose
  Cc: Catalin Marinas, Ard Biesheuvel, Mark Rutland, aph, Will Deacon,
	linux-kernel, edward.nevill, linux-arm-kernel

On Tue, Aug 11, 2015 at 04:37:46PM +0100, Suzuki K. Poulose wrote:
> On 11/08/15 15:23, Catalin Marinas wrote:
> >On Mon, Aug 10, 2015 at 07:48:48PM +0200, Ard Biesheuvel wrote:
> >>>On 10/08/15 17:06, Catalin Marinas wrote:
> >>>>And to debunk some of the counter arguments:
> >>>>
> >>>>a) Running out of HWCAP bits - I really doubt this, we can always
> >>>>     introduce 64 more via a new elf_hwcapX
> >>
> >>Note that ELF_HWCAP is also wired into ifunc resolution of GNU
> >>indirect functions, which looks like a useful feature although it
> >>isn't used that widely yet.
> >
> >I forgot to mention, we also need an HWCAP_CPUID with these patches when
> >we expose the MRS interface. The ifunc resolver could use MRS when
> >available. But I would still keep adding HWCAP bits for new features,
> >even if we risk running out of the 64-bit we have now.
> >
> 
> Sure, I will add the HWCAP_CPUID in the next version of the series.

+1

Playing with this, I realise that I get a splat if my userspace code
tries to do an MRS for an ID register when this series is absent -- we
need an hwcap that we can check first.
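
A minimal sketch of such a check, assuming the hwcap ends up being called
HWCAP_CPUID as proposed above. The thread only settles on the name, so the bit
used when the libc headers do not define it (bit 11) is an assumption, as are
the function names. On a non-arm64 build the read helper simply reports failure.

```c
#include <stdint.h>
#include <sys/auxv.h>

#ifndef HWCAP_CPUID
#define HWCAP_CPUID	(1UL << 11)	/* assumption: bit position not fixed by this thread */
#endif

/* Returns non-zero if the kernel advertises MRS emulation for ID registers. */
static int mrs_emulation_available(unsigned long hwcap)
{
	return (hwcap & HWCAP_CPUID) != 0;
}

/*
 * Read ID_AA64ISAR0_EL1 only after checking the hwcap, so that a kernel
 * without the emulation never sees the trapping MRS.
 * Returns 0 on success and -1 if the register cannot be read safely.
 */
static int read_id_aa64isar0(uint64_t *val)
{
#if defined(__aarch64__)
	uint64_t v;

	if (!mrs_emulation_available(getauxval(AT_HWCAP)))
		return -1;
	__asm__ volatile("mrs %0, ID_AA64ISAR0_EL1" : "=r" (v));
	*val = v;
	return 0;
#else
	(void)val;
	return -1;
#endif
}
```

A caller that gets -1 back would fall back to the existing HWCAP bits for its
feature decisions.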

Cheers
---Dave



Thread overview: 24+ messages
2015-07-24  9:43 [RFC PATCH 00/10] arm64: Expose CPU feature registers Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 01/10] arm64: feature registers: Documentation Suzuki K. Poulose
2015-08-10 16:06   ` Catalin Marinas
2015-08-10 17:36     ` Suzuki K. Poulose
2015-08-10 17:48       ` Ard Biesheuvel
2015-08-11 14:23         ` Catalin Marinas
2015-08-11 15:37           ` Suzuki K. Poulose
2015-09-10 15:55             ` Dave Martin
2015-08-10 18:19       ` Andrew Haley
2015-08-11  8:41         ` Suzuki K. Poulose
2015-08-11  8:58           ` Andrew Haley
2015-08-11 14:46       ` Catalin Marinas
2015-08-11 15:18         ` Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 02/10] arm64: Make the CPU information more clear Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 03/10] arm64: Delay ELF HWCAP initialisation until all CPUs are up Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 04/10] arm64: Consolidate cpuinfo handling Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 05/10] arm64: Keep track of CPU feature registers Suzuki K. Poulose
2015-08-05 14:58   ` Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 06/10] arm64: Add helper to decode register from instruction Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 07/10] arm64: Expose feature registers by emulating MRS Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 08/10] arm64: Emulate ID registers Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 09/10] arm64: Read system wide CPUID value Suzuki K. Poulose
2015-07-24  9:43 ` [RFC PATCH 10/10] arm64: Use system-wide safe value of CPU feature register Suzuki K. Poulose
2015-07-24  9:43 ` sample: arm64 cpu feature: Test program Suzuki K. Poulose
