linux-acpi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/5] arm64: SPE ACPI enablement
@ 2019-05-03 23:24 Jeremy Linton
  2019-05-03 23:24 ` Jeremy Linton
                   ` (6 more replies)
  0 siblings, 7 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, Jeremy Linton, linux-acpi, catalin.marinas, john.garry,
	lenb

This patch series enables the Arm Statistical Profiling
Extension (SPE) on ACPI platforms.

This is possible because ACPI 6.3 uses a previously
reserved field in the MADT to store the SPE interrupt
number, similarly to how the normal PMU is described.
If a consistent valid interrupt exists across all the
cores in the system, a platform device is registered.
That then triggers the SPE module, which runs as normal.

We also add the ability to parse the PPTT for IDENTICAL
cores. We then use this to sanity check the single SPE
device we create. This creates a bit of a problem with
respect to the specification though. The specification
says that its legal for multiple tree's to exist in the
PPTT. We handle this fine, but what happens in the
case of multiple tree's is that the lack of a common
node with IDENTICAL set forces us to assume that there
are multiple non-IDENTICAL cores in the machine.

v2->v3: Previously a function pointer was being used
	  to handle the more complex node checking
	  required by the IDENTICAL flag. This version
	  simply checks for the IDENTICAL flag and calls
	  flag_identical() to preform the revision
	  and next node checks. (I think after reading
	  Raphael's comments for the Nth time, this is
	  actually what he was suggesting, which I
	  initially miss interpreted).
	Modify subject of first patch so that its clear
	  a that its a capitalization change rather,
	  than a logical C 'case' change.

v1->v2: Wrap the code which creates the SPE device in
	    a new CONFIG_ARM_SPE_ACPI ifdef.
	Move arm,spe-v1 device name into common header file
	Some comment/case sensitivity/function name changes.

Jeremy Linton (5):
  ACPI/PPTT: Trivial, change the capitalization of CPU
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  perf: arm_spe: Enable ACPI/Platform automatic module loading

 arch/arm64/include/asm/acpi.h |   3 +
 drivers/acpi/pptt.c           | 102 +++++++++++++++++++++++++---------
 drivers/perf/Kconfig          |   5 ++
 drivers/perf/arm_pmu_acpi.c   |  76 +++++++++++++++++++++++++
 drivers/perf/arm_spe_pmu.c    |  12 +++-
 include/linux/acpi.h          |   5 ++
 include/linux/perf/arm_pmu.h  |   2 +
 7 files changed, 176 insertions(+), 29 deletions(-)

-- 
2.21.0

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
@ 2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24 ` [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU Jeremy Linton
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Jeremy Linton

This patch series enables the Arm Statistical Profiling
Extension (SPE) on ACPI platforms.

This is possible because ACPI 6.3 uses a previously
reserved field in the MADT to store the SPE interrupt
number, similarly to how the normal PMU is described.
If a consistent valid interrupt exists across all the
cores in the system, a platform device is registered.
That then triggers the SPE module, which runs as normal.

We also add the ability to parse the PPTT for IDENTICAL
cores. We then use this to sanity check the single SPE
device we create. This creates a bit of a problem with
respect to the specification though. The specification
says that its legal for multiple tree's to exist in the
PPTT. We handle this fine, but what happens in the
case of multiple tree's is that the lack of a common
node with IDENTICAL set forces us to assume that there
are multiple non-IDENTICAL cores in the machine.

v2->v3: Previously a function pointer was being used
	  to handle the more complex node checking
	  required by the IDENTICAL flag. This version
	  simply checks for the IDENTICAL flag and calls
	  flag_identical() to preform the revision
	  and next node checks. (I think after reading
	  Raphael's comments for the Nth time, this is
	  actually what he was suggesting, which I
	  initially miss interpreted).
	Modify subject of first patch so that its clear
	  a that its a capitalization change rather,
	  than a logical C 'case' change.

v1->v2: Wrap the code which creates the SPE device in
	    a new CONFIG_ARM_SPE_ACPI ifdef.
	Move arm,spe-v1 device name into common header file
	Some comment/case sensitivity/function name changes.

Jeremy Linton (5):
  ACPI/PPTT: Trivial, change the capitalization of CPU
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  perf: arm_spe: Enable ACPI/Platform automatic module loading

 arch/arm64/include/asm/acpi.h |   3 +
 drivers/acpi/pptt.c           | 102 +++++++++++++++++++++++++---------
 drivers/perf/Kconfig          |   5 ++
 drivers/perf/arm_pmu_acpi.c   |  76 +++++++++++++++++++++++++
 drivers/perf/arm_spe_pmu.c    |  12 +++-
 include/linux/acpi.h          |   5 ++
 include/linux/perf/arm_pmu.h  |   2 +
 7 files changed, 176 insertions(+), 29 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
  2019-05-03 23:24 ` Jeremy Linton
@ 2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
  2019-05-07 18:12   ` Jeremy Linton
  2019-05-03 23:24 ` [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens Jeremy Linton
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, Rafael J . Wysocki, will.deacon,
	sudeep.holla, rjw, linuxarm, Jeremy Linton, linux-acpi,
	catalin.marinas, john.garry, lenb

CPU is an acronym and customarily capitalized. Much of the
commenting in the PPTT code has been using "cpu" rather
than "CPU". Correct that, and other human readable strings.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/pptt.c | 48 ++++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 065c4fc245d1..83a026765faa 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -164,7 +164,7 @@ static struct acpi_pptt_cache *acpi_find_cache_level(struct acpi_table_header *t
 }
 
 /**
- * acpi_count_levels() - Given a PPTT table, and a cpu node, count the caches
+ * acpi_count_levels() - Given a PPTT table, and a CPU node, count the caches
  * @table_hdr: Pointer to the head of the PPTT table
  * @cpu_node: processor node we wish to count caches for
  *
@@ -235,7 +235,7 @@ static int acpi_pptt_leaf_node(struct acpi_table_header *table_hdr,
 /**
  * acpi_find_processor_node() - Given a PPTT table find the requested processor
  * @table_hdr:  Pointer to the head of the PPTT table
- * @acpi_cpu_id: cpu we are searching for
+ * @acpi_cpu_id: CPU we are searching for
  *
  * Find the subtable entry describing the provided processor.
  * This is done by iterating the PPTT table looking for processor nodes
@@ -456,21 +456,21 @@ static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_ta
 
 static void acpi_pptt_warn_missing(void)
 {
-	pr_warn_once("No PPTT table found, cpu and cache topology may be inaccurate\n");
+	pr_warn_once("No PPTT table found, CPU and cache topology may be inaccurate\n");
 }
 
 /**
  * topology_get_acpi_cpu_tag() - Find a unique topology value for a feature
  * @table: Pointer to the head of the PPTT table
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  * @level: A level that terminates the search
  * @flag: A flag which terminates the search
  *
- * Get a unique value given a cpu, and a topology level, that can be
- * matched to determine which cpus share common topological features
+ * Get a unique value given a CPU, and a topology level, that can be
+ * matched to determine which CPUs share common topological features
  * at that level.
  *
- * Return: Unique value, or -ENOENT if unable to locate cpu
+ * Return: Unique value, or -ENOENT if unable to locate CPU
  */
 static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
 				     unsigned int cpu, int level, int flag)
@@ -510,7 +510,7 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
 		return -ENOENT;
 	}
 	retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
-	pr_debug("Topology Setup ACPI cpu %d, level %d ret = %d\n",
+	pr_debug("Topology Setup ACPI CPU %d, level %d ret = %d\n",
 		 cpu, level, retval);
 	acpi_put_table(table);
 
@@ -519,9 +519,9 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
 
 /**
  * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  *
- * Given a logical cpu number, returns the number of levels of cache represented
+ * Given a logical CPU number, returns the number of levels of cache represented
  * in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
  * indicating we didn't find any cache levels.
  *
@@ -534,7 +534,7 @@ int acpi_find_last_cache_level(unsigned int cpu)
 	int number_of_levels = 0;
 	acpi_status status;
 
-	pr_debug("Cache Setup find last level cpu=%d\n", cpu);
+	pr_debug("Cache Setup find last level CPU=%d\n", cpu);
 
 	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
 	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
@@ -551,14 +551,14 @@ int acpi_find_last_cache_level(unsigned int cpu)
 
 /**
  * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  *
  * Updates the global cache info provided by cpu_get_cacheinfo()
  * when there are valid properties in the acpi_pptt_cache nodes. A
  * successful parse may not result in any updates if none of the
  * cache levels have any valid flags set.  Futher, a unique value is
  * associated with each known CPU cache entry. This unique value
- * can be used to determine whether caches are shared between cpus.
+ * can be used to determine whether caches are shared between CPUs.
  *
  * Return: -ENOENT on failure to find table, or 0 on success
  */
@@ -567,7 +567,7 @@ int cache_setup_acpi(unsigned int cpu)
 	struct acpi_table_header *table;
 	acpi_status status;
 
-	pr_debug("Cache Setup ACPI cpu %d\n", cpu);
+	pr_debug("Cache Setup ACPI CPU %d\n", cpu);
 
 	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
 	if (ACPI_FAILURE(status)) {
@@ -582,8 +582,8 @@ int cache_setup_acpi(unsigned int cpu)
 }
 
 /**
- * find_acpi_cpu_topology() - Determine a unique topology value for a given cpu
- * @cpu: Kernel logical cpu number
+ * find_acpi_cpu_topology() - Determine a unique topology value for a given CPU
+ * @cpu: Kernel logical CPU number
  * @level: The topological level for which we would like a unique ID
  *
  * Determine a topology unique ID for each thread/core/cluster/mc_grouping
@@ -596,7 +596,7 @@ int cache_setup_acpi(unsigned int cpu)
  * other levels beyond this use a generated value to uniquely identify
  * a topological feature.
  *
- * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
  * Otherwise returns a value which represents a unique topological feature.
  */
 int find_acpi_cpu_topology(unsigned int cpu, int level)
@@ -606,12 +606,12 @@ int find_acpi_cpu_topology(unsigned int cpu, int level)
 
 /**
  * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  * @level: The cache level for which we would like a unique ID
  *
  * Determine a unique ID for each unified cache in the system
  *
- * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
  * Otherwise returns a value which represents a unique topological feature.
  */
 int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
@@ -643,17 +643,17 @@ int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
 
 
 /**
- * find_acpi_cpu_topology_package() - Determine a unique cpu package value
- * @cpu: Kernel logical cpu number
+ * find_acpi_cpu_topology_package() - Determine a unique CPU package value
+ * @cpu: Kernel logical CPU number
  *
- * Determine a topology unique package ID for the given cpu.
+ * Determine a topology unique package ID for the given CPU.
  * This ID can then be used to group peers, which will have matching ids.
  *
  * The search terminates when either a level is found with the PHYSICAL_PACKAGE
  * flag set or we reach a root node.
  *
- * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
- * Otherwise returns a value which represents the package for this cpu.
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
+ * Otherwise returns a value which represents the package for this CPU.
  */
 int find_acpi_cpu_topology_package(unsigned int cpu)
 {
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU
  2019-05-03 23:24 ` [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU Jeremy Linton
@ 2019-05-03 23:24   ` Jeremy Linton
  2019-05-07 18:12   ` Jeremy Linton
  1 sibling, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Jeremy Linton, Rafael J . Wysocki

CPU is an acronym and customarily capitalized. Much of the
commenting in the PPTT code has been using "cpu" rather
than "CPU". Correct that, and other human readable strings.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/pptt.c | 48 ++++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 065c4fc245d1..83a026765faa 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -164,7 +164,7 @@ static struct acpi_pptt_cache *acpi_find_cache_level(struct acpi_table_header *t
 }
 
 /**
- * acpi_count_levels() - Given a PPTT table, and a cpu node, count the caches
+ * acpi_count_levels() - Given a PPTT table, and a CPU node, count the caches
  * @table_hdr: Pointer to the head of the PPTT table
  * @cpu_node: processor node we wish to count caches for
  *
@@ -235,7 +235,7 @@ static int acpi_pptt_leaf_node(struct acpi_table_header *table_hdr,
 /**
  * acpi_find_processor_node() - Given a PPTT table find the requested processor
  * @table_hdr:  Pointer to the head of the PPTT table
- * @acpi_cpu_id: cpu we are searching for
+ * @acpi_cpu_id: CPU we are searching for
  *
  * Find the subtable entry describing the provided processor.
  * This is done by iterating the PPTT table looking for processor nodes
@@ -456,21 +456,21 @@ static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_ta
 
 static void acpi_pptt_warn_missing(void)
 {
-	pr_warn_once("No PPTT table found, cpu and cache topology may be inaccurate\n");
+	pr_warn_once("No PPTT table found, CPU and cache topology may be inaccurate\n");
 }
 
 /**
  * topology_get_acpi_cpu_tag() - Find a unique topology value for a feature
  * @table: Pointer to the head of the PPTT table
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  * @level: A level that terminates the search
  * @flag: A flag which terminates the search
  *
- * Get a unique value given a cpu, and a topology level, that can be
- * matched to determine which cpus share common topological features
+ * Get a unique value given a CPU, and a topology level, that can be
+ * matched to determine which CPUs share common topological features
  * at that level.
  *
- * Return: Unique value, or -ENOENT if unable to locate cpu
+ * Return: Unique value, or -ENOENT if unable to locate CPU
  */
 static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
 				     unsigned int cpu, int level, int flag)
@@ -510,7 +510,7 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
 		return -ENOENT;
 	}
 	retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
-	pr_debug("Topology Setup ACPI cpu %d, level %d ret = %d\n",
+	pr_debug("Topology Setup ACPI CPU %d, level %d ret = %d\n",
 		 cpu, level, retval);
 	acpi_put_table(table);
 
@@ -519,9 +519,9 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
 
 /**
  * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  *
- * Given a logical cpu number, returns the number of levels of cache represented
+ * Given a logical CPU number, returns the number of levels of cache represented
  * in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
  * indicating we didn't find any cache levels.
  *
@@ -534,7 +534,7 @@ int acpi_find_last_cache_level(unsigned int cpu)
 	int number_of_levels = 0;
 	acpi_status status;
 
-	pr_debug("Cache Setup find last level cpu=%d\n", cpu);
+	pr_debug("Cache Setup find last level CPU=%d\n", cpu);
 
 	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
 	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
@@ -551,14 +551,14 @@ int acpi_find_last_cache_level(unsigned int cpu)
 
 /**
  * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  *
  * Updates the global cache info provided by cpu_get_cacheinfo()
  * when there are valid properties in the acpi_pptt_cache nodes. A
  * successful parse may not result in any updates if none of the
  * cache levels have any valid flags set.  Futher, a unique value is
  * associated with each known CPU cache entry. This unique value
- * can be used to determine whether caches are shared between cpus.
+ * can be used to determine whether caches are shared between CPUs.
  *
  * Return: -ENOENT on failure to find table, or 0 on success
  */
@@ -567,7 +567,7 @@ int cache_setup_acpi(unsigned int cpu)
 	struct acpi_table_header *table;
 	acpi_status status;
 
-	pr_debug("Cache Setup ACPI cpu %d\n", cpu);
+	pr_debug("Cache Setup ACPI CPU %d\n", cpu);
 
 	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
 	if (ACPI_FAILURE(status)) {
@@ -582,8 +582,8 @@ int cache_setup_acpi(unsigned int cpu)
 }
 
 /**
- * find_acpi_cpu_topology() - Determine a unique topology value for a given cpu
- * @cpu: Kernel logical cpu number
+ * find_acpi_cpu_topology() - Determine a unique topology value for a given CPU
+ * @cpu: Kernel logical CPU number
  * @level: The topological level for which we would like a unique ID
  *
  * Determine a topology unique ID for each thread/core/cluster/mc_grouping
@@ -596,7 +596,7 @@ int cache_setup_acpi(unsigned int cpu)
  * other levels beyond this use a generated value to uniquely identify
  * a topological feature.
  *
- * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
  * Otherwise returns a value which represents a unique topological feature.
  */
 int find_acpi_cpu_topology(unsigned int cpu, int level)
@@ -606,12 +606,12 @@ int find_acpi_cpu_topology(unsigned int cpu, int level)
 
 /**
  * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
- * @cpu: Kernel logical cpu number
+ * @cpu: Kernel logical CPU number
  * @level: The cache level for which we would like a unique ID
  *
  * Determine a unique ID for each unified cache in the system
  *
- * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
  * Otherwise returns a value which represents a unique topological feature.
  */
 int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
@@ -643,17 +643,17 @@ int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
 
 
 /**
- * find_acpi_cpu_topology_package() - Determine a unique cpu package value
- * @cpu: Kernel logical cpu number
+ * find_acpi_cpu_topology_package() - Determine a unique CPU package value
+ * @cpu: Kernel logical CPU number
  *
- * Determine a topology unique package ID for the given cpu.
+ * Determine a topology unique package ID for the given CPU.
  * This ID can then be used to group peers, which will have matching ids.
  *
  * The search terminates when either a level is found with the PHYSICAL_PACKAGE
  * flag set or we reach a root node.
  *
- * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
- * Otherwise returns a value which represents the package for this cpu.
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
+ * Otherwise returns a value which represents the package for this CPU.
  */
 int find_acpi_cpu_topology_package(unsigned int cpu)
 {
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
  2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24 ` [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU Jeremy Linton
@ 2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
                     ` (2 more replies)
  2019-05-03 23:24 ` [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL Jeremy Linton
                   ` (3 subsequent siblings)
  6 siblings, 3 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, Jeremy Linton, linux-acpi, catalin.marinas, john.garry,
	lenb

ACPI 6.3 adds a flag to indicate that child nodes are all
identical cores. This is useful to authoritatively determine
if a set of (possibly offline) cores are identical or not.

Since the flag doesn't give us a unique id we can generate
one and use it to create bitmaps of sibling nodes, or simply
in a loop to determine if a subset of cores are identical.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
 include/linux/acpi.h |  5 +++++
 2 files changed, 31 insertions(+)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 83a026765faa..1865515297ca 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
 	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
 					  ACPI_PPTT_PHYSICAL_PACKAGE);
 }
+
+/**
+ * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
+ * @cpu: Kernel logical CPU number
+ *
+ * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
+ * implementation should have matching tags.
+ *
+ * The returned tag can be used to group peers with identical implementation.
+ *
+ * The search terminates when a level is found with the identical implementation
+ * flag set or we reach a root node.
+ *
+ * Due to limitations in the PPTT data structure, there may be rare situations
+ * where two cores in a heterogeneous machine may be identical, but won't have
+ * the same tag.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
+ * Otherwise returns a value which represents a group of identical cores
+ * similar to this CPU.
+ */
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
+					  ACPI_PPTT_ACPI_IDENTICAL);
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index d5dcebd7aad3..1444fb042898 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
 #ifdef CONFIG_ACPI_PPTT
 int find_acpi_cpu_topology(unsigned int cpu, int level);
 int find_acpi_cpu_topology_package(unsigned int cpu);
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
 int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
 #else
 static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
@@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
 {
 	return -EINVAL;
 }
+static int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+	return -EINVAL;
+}
 static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
 {
 	return -EINVAL;
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-03 23:24 ` [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens Jeremy Linton
@ 2019-05-03 23:24   ` Jeremy Linton
  2019-05-05  7:09   ` Kefeng Wang
  2019-06-07  9:49   ` Sudeep Holla
  2 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Jeremy Linton

ACPI 6.3 adds a flag to indicate that child nodes are all
identical cores. This is useful to authoritatively determine
if a set of (possibly offline) cores are identical or not.

Since the flag doesn't give us a unique id we can generate
one and use it to create bitmaps of sibling nodes, or simply
in a loop to determine if a subset of cores are identical.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
 include/linux/acpi.h |  5 +++++
 2 files changed, 31 insertions(+)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 83a026765faa..1865515297ca 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
 	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
 					  ACPI_PPTT_PHYSICAL_PACKAGE);
 }
+
+/**
+ * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
+ * @cpu: Kernel logical CPU number
+ *
+ * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
+ * implementation should have matching tags.
+ *
+ * The returned tag can be used to group peers with identical implementation.
+ *
+ * The search terminates when a level is found with the identical implementation
+ * flag set or we reach a root node.
+ *
+ * Due to limitations in the PPTT data structure, there may be rare situations
+ * where two cores in a heterogeneous machine may be identical, but won't have
+ * the same tag.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
+ * Otherwise returns a value which represents a group of identical cores
+ * similar to this CPU.
+ */
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
+					  ACPI_PPTT_ACPI_IDENTICAL);
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index d5dcebd7aad3..1444fb042898 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
 #ifdef CONFIG_ACPI_PPTT
 int find_acpi_cpu_topology(unsigned int cpu, int level);
 int find_acpi_cpu_topology_package(unsigned int cpu);
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
 int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
 #else
 static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
@@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
 {
 	return -EINVAL;
 }
+static int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+	return -EINVAL;
+}
 static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
 {
 	return -EINVAL;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
                   ` (2 preceding siblings ...)
  2019-05-03 23:24 ` [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens Jeremy Linton
@ 2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
  2019-06-07  9:53   ` Sudeep Holla
  2019-05-03 23:24 ` [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing Jeremy Linton
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, Jeremy Linton, linux-acpi, catalin.marinas, john.garry,
	lenb

The ACPI specification implies that the IDENTICAL flag should be
set on all non leaf nodes where the children are identical.
This means that we need to be searching for the last node with
the identical flag set rather than the first one.

Since this flag is also dependent on the table revision, we
need to add a bit of extra code to verify the table revision,
and the next node's state in the traversal. Since we want to
avoid function pointers here, lets just special case
the IDENTICAL flag.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 1865515297ca..456e1c0a35ae 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -432,17 +432,39 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
 	}
 }
 
+static bool flag_identical(struct acpi_table_header *table_hdr,
+			  struct acpi_pptt_processor *cpu)
+{
+	struct acpi_pptt_processor *next;
+
+	/* heterogeneous machines must use PPTT revision > 1 */
+	if (table_hdr->revision < 2)
+		return false;
+
+	/* Locate the last node in the tree with IDENTICAL set */
+	if (cpu->flags & ACPI_PPTT_ACPI_IDENTICAL) {
+		next = fetch_pptt_node(table_hdr, cpu->parent);
+		if (!(next && next->flags & ACPI_PPTT_ACPI_IDENTICAL))
+			return true;
+	}
+
+	return false;
+}
+
 /* Passing level values greater than this will result in search termination */
 #define PPTT_ABORT_PACKAGE 0xFF
 
-static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_table_header *table_hdr,
+static struct acpi_pptt_processor *acpi_find_processor_tag_id(struct acpi_table_header *table_hdr,
 								  struct acpi_pptt_processor *cpu,
 								  int level, int flag)
 {
 	struct acpi_pptt_processor *prev_node;
 
 	while (cpu && level) {
-		if (cpu->flags & flag)
+		if (flag == ACPI_PPTT_ACPI_IDENTICAL) {
+			if (flag_identical(table_hdr, cpu))
+				break;
+		} else if (cpu->flags & flag)
 			break;
 		pr_debug("level %d\n", level);
 		prev_node = fetch_pptt_node(table_hdr, cpu->parent);
@@ -480,7 +502,7 @@ static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
 
 	cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
 	if (cpu_node) {
-		cpu_node = acpi_find_processor_package_id(table, cpu_node,
+		cpu_node = acpi_find_processor_tag_id(table, cpu_node,
 							  level, flag);
 		/*
 		 * As per specification if the processor structure represents
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  2019-05-03 23:24 ` [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL Jeremy Linton
@ 2019-05-03 23:24   ` Jeremy Linton
  2019-06-07  9:53   ` Sudeep Holla
  1 sibling, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Jeremy Linton

The ACPI specification implies that the IDENTICAL flag should be
set on all non leaf nodes where the children are identical.
This means that we need to be searching for the last node with
the identical flag set rather than the first one.

Since this flag is also dependent on the table revision, we
need to add a bit of extra code to verify the table revision,
and the next node's state in the traversal. Since we want to
avoid function pointers here, lets just special case
the IDENTICAL flag.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 1865515297ca..456e1c0a35ae 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -432,17 +432,39 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
 	}
 }
 
+static bool flag_identical(struct acpi_table_header *table_hdr,
+			  struct acpi_pptt_processor *cpu)
+{
+	struct acpi_pptt_processor *next;
+
+	/* heterogeneous machines must use PPTT revision > 1 */
+	if (table_hdr->revision < 2)
+		return false;
+
+	/* Locate the last node in the tree with IDENTICAL set */
+	if (cpu->flags & ACPI_PPTT_ACPI_IDENTICAL) {
+		next = fetch_pptt_node(table_hdr, cpu->parent);
+		if (!(next && next->flags & ACPI_PPTT_ACPI_IDENTICAL))
+			return true;
+	}
+
+	return false;
+}
+
 /* Passing level values greater than this will result in search termination */
 #define PPTT_ABORT_PACKAGE 0xFF
 
-static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_table_header *table_hdr,
+static struct acpi_pptt_processor *acpi_find_processor_tag_id(struct acpi_table_header *table_hdr,
 								  struct acpi_pptt_processor *cpu,
 								  int level, int flag)
 {
 	struct acpi_pptt_processor *prev_node;
 
 	while (cpu && level) {
-		if (cpu->flags & flag)
+		if (flag == ACPI_PPTT_ACPI_IDENTICAL) {
+			if (flag_identical(table_hdr, cpu))
+				break;
+		} else if (cpu->flags & flag)
 			break;
 		pr_debug("level %d\n", level);
 		prev_node = fetch_pptt_node(table_hdr, cpu->parent);
@@ -480,7 +502,7 @@ static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
 
 	cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
 	if (cpu_node) {
-		cpu_node = acpi_find_processor_package_id(table, cpu_node,
+		cpu_node = acpi_find_processor_tag_id(table, cpu_node,
 							  level, flag);
 		/*
 		 * As per specification if the processor structure represents
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
                   ` (3 preceding siblings ...)
  2019-05-03 23:24 ` [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL Jeremy Linton
@ 2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
                     ` (2 more replies)
  2019-05-03 23:24 ` [PATCH v3 5/5] perf: arm_spe: Enable ACPI/Platform automatic module loading Jeremy Linton
  2019-05-04 11:06 ` [PATCH v3 0/5] arm64: SPE ACPI enablement Hanjun Guo
  6 siblings, 3 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, Jeremy Linton, linux-acpi, catalin.marinas, john.garry,
	lenb

ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/acpi.h |  3 ++
 drivers/perf/Kconfig          |  5 +++
 drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
 include/linux/perf/arm_pmu.h  |  2 +
 4 files changed, 86 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 7628efbe6c12..d10399b9f998 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -41,6 +41,9 @@
 	(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
 	(unsigned long)(entry) + (entry)->header.length > (end))
 
+#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
+	spe_interrupt) + sizeof(u16))
+
 /* Basic configuration for ACPI */
 #ifdef	CONFIG_ACPI
 pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index af9bc178495d..bc2647c64c9d 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -52,6 +52,11 @@ config ARM_PMU_ACPI
 	depends on ARM_PMU && ACPI
 	def_bool y
 
+config ARM_SPE_ACPI
+	depends on ARM_PMU_ACPI && ARM_SPE_PMU
+	def_bool y
+
+
 config ARM_DSU_PMU
 	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
 	depends on ARM64
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 0f197516d708..b0244e1e8c91 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
 	acpi_unregister_gsi(gsi);
 }
 
+#ifdef CONFIG_ARM_SPE_ACPI
+static struct resource spe_resources[] = {
+	{
+		/* irq */
+		.flags          = IORESOURCE_IRQ,
+	}
+};
+
+static struct platform_device spe_dev = {
+	.name = ARMV8_SPE_PDEV_NAME,
+	.id = -1,
+	.resource = spe_resources,
+	.num_resources = ARRAY_SIZE(spe_resources)
+};
+
+/*
+ * For lack of a better place, hook the normal PMU MADT walk
+ * and create a SPE device if we detect a recent MADT with
+ * a homogeneous PPI mapping.
+ */
+static int arm_spe_acpi_register_device(void)
+{
+	int cpu, ret, irq;
+	int hetid;
+	u16 gsi = 0;
+	bool first = true;
+
+	struct acpi_madt_generic_interrupt *gicc;
+
+	/*
+	 * sanity check all the GICC tables for the same interrupt number
+	 * for now we only support homogeneous ACPI/SPE machines.
+	 */
+	for_each_possible_cpu(cpu) {
+		gicc = acpi_cpu_get_madt_gicc(cpu);
+
+		if (gicc->header.length < ACPI_MADT_GICC_SPE)
+			return -ENODEV;
+		if (first) {
+			gsi = gicc->spe_interrupt;
+			if (!gsi)
+				return -ENODEV;
+			hetid = find_acpi_cpu_topology_hetero_id(cpu);
+			first = false;
+		} else if ((gsi != gicc->spe_interrupt) ||
+			   (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
+			pr_warn("ACPI: SPE must be homogeneous\n");
+			return -EINVAL;
+		}
+	}
+
+	irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE,
+				ACPI_ACTIVE_HIGH);
+	if (irq < 0) {
+		pr_warn("ACPI: SPE Unable to register interrupt: %d\n", gsi);
+		return irq;
+	}
+
+	spe_resources[0].start = irq;
+	ret = platform_device_register(&spe_dev);
+	if (ret < 0) {
+		pr_warn("ACPI: SPE: Unable to register device\n");
+		acpi_unregister_gsi(gsi);
+	}
+
+	return ret;
+}
+#else
+static inline int arm_spe_acpi_register_device(void)
+{
+	return -ENODEV;
+}
+#endif /* CONFIG_ARM_SPE_ACPI */
+
 static int arm_pmu_acpi_parse_irqs(void)
 {
 	int irq, cpu, irq_cpu, err;
@@ -279,6 +353,8 @@ static int arm_pmu_acpi_init(void)
 	if (acpi_disabled)
 		return 0;
 
+	arm_spe_acpi_register_device(); /* failures are expected */
+
 	ret = arm_pmu_acpi_parse_irqs();
 	if (ret)
 		return ret;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..784bc58f165a 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -175,4 +175,6 @@ void armpmu_free_irq(int irq, int cpu);
 
 #endif /* CONFIG_ARM_PMU */
 
+#define ARMV8_SPE_PDEV_NAME "arm,spe-v1"
+
 #endif /* __ARM_PMU_H__ */
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-05-03 23:24 ` [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing Jeremy Linton
@ 2019-05-03 23:24   ` Jeremy Linton
  2019-05-08 11:18   ` John Garry
  2019-06-07  9:57   ` Sudeep Holla
  2 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Jeremy Linton

ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPI's. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/acpi.h |  3 ++
 drivers/perf/Kconfig          |  5 +++
 drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
 include/linux/perf/arm_pmu.h  |  2 +
 4 files changed, 86 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 7628efbe6c12..d10399b9f998 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -41,6 +41,9 @@
 	(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
 	(unsigned long)(entry) + (entry)->header.length > (end))
 
+#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
+	spe_interrupt) + sizeof(u16))
+
 /* Basic configuration for ACPI */
 #ifdef	CONFIG_ACPI
 pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index af9bc178495d..bc2647c64c9d 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -52,6 +52,11 @@ config ARM_PMU_ACPI
 	depends on ARM_PMU && ACPI
 	def_bool y
 
+config ARM_SPE_ACPI
+	depends on ARM_PMU_ACPI && ARM_SPE_PMU
+	def_bool y
+
+
 config ARM_DSU_PMU
 	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
 	depends on ARM64
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 0f197516d708..b0244e1e8c91 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
 	acpi_unregister_gsi(gsi);
 }
 
+#ifdef CONFIG_ARM_SPE_ACPI
+static struct resource spe_resources[] = {
+	{
+		/* irq */
+		.flags          = IORESOURCE_IRQ,
+	}
+};
+
+static struct platform_device spe_dev = {
+	.name = ARMV8_SPE_PDEV_NAME,
+	.id = -1,
+	.resource = spe_resources,
+	.num_resources = ARRAY_SIZE(spe_resources)
+};
+
+/*
+ * For lack of a better place, hook the normal PMU MADT walk
+ * and create a SPE device if we detect a recent MADT with
+ * a homogeneous PPI mapping.
+ */
+static int arm_spe_acpi_register_device(void)
+{
+	int cpu, ret, irq;
+	int hetid;
+	u16 gsi = 0;
+	bool first = true;
+
+	struct acpi_madt_generic_interrupt *gicc;
+
+	/*
+	 * sanity check all the GICC tables for the same interrupt number
+	 * for now we only support homogeneous ACPI/SPE machines.
+	 */
+	for_each_possible_cpu(cpu) {
+		gicc = acpi_cpu_get_madt_gicc(cpu);
+
+		if (gicc->header.length < ACPI_MADT_GICC_SPE)
+			return -ENODEV;
+		if (first) {
+			gsi = gicc->spe_interrupt;
+			if (!gsi)
+				return -ENODEV;
+			hetid = find_acpi_cpu_topology_hetero_id(cpu);
+			first = false;
+		} else if ((gsi != gicc->spe_interrupt) ||
+			   (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
+			pr_warn("ACPI: SPE must be homogeneous\n");
+			return -EINVAL;
+		}
+	}
+
+	irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE,
+				ACPI_ACTIVE_HIGH);
+	if (irq < 0) {
+		pr_warn("ACPI: SPE Unable to register interrupt: %d\n", gsi);
+		return irq;
+	}
+
+	spe_resources[0].start = irq;
+	ret = platform_device_register(&spe_dev);
+	if (ret < 0) {
+		pr_warn("ACPI: SPE: Unable to register device\n");
+		acpi_unregister_gsi(gsi);
+	}
+
+	return ret;
+}
+#else
+static inline int arm_spe_acpi_register_device(void)
+{
+	return -ENODEV;
+}
+#endif /* CONFIG_ARM_SPE_ACPI */
+
 static int arm_pmu_acpi_parse_irqs(void)
 {
 	int irq, cpu, irq_cpu, err;
@@ -279,6 +353,8 @@ static int arm_pmu_acpi_init(void)
 	if (acpi_disabled)
 		return 0;
 
+	arm_spe_acpi_register_device(); /* failures are expected */
+
 	ret = arm_pmu_acpi_parse_irqs();
 	if (ret)
 		return ret;
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 4641e850b204..784bc58f165a 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -175,4 +175,6 @@ void armpmu_free_irq(int irq, int cpu);
 
 #endif /* CONFIG_ARM_PMU */
 
+#define ARMV8_SPE_PDEV_NAME "arm,spe-v1"
+
 #endif /* __ARM_PMU_H__ */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 5/5] perf: arm_spe: Enable ACPI/Platform automatic module loading
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
                   ` (4 preceding siblings ...)
  2019-05-03 23:24 ` [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing Jeremy Linton
@ 2019-05-03 23:24 ` Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
  2019-05-04 11:06 ` [PATCH v3 0/5] arm64: SPE ACPI enablement Hanjun Guo
  6 siblings, 1 reply; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, Jeremy Linton, linux-acpi, catalin.marinas, john.garry,
	lenb

Lets add the MODULE_TABLE and platform id_table entries so that
the SPE driver can attach to the ACPI platform device created by
the core pmu code.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/perf/arm_spe_pmu.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 7cb766dafe85..a11951b08330 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -38,6 +38,7 @@
 #include <linux/of_address.h>
 #include <linux/of_device.h>
 #include <linux/perf_event.h>
+#include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
 #include <linux/printk.h>
 #include <linux/slab.h>
@@ -1176,7 +1177,13 @@ static const struct of_device_id arm_spe_pmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_spe_pmu_of_match);
 
-static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
+static const struct platform_device_id arm_spe_match[] = {
+	{ ARMV8_SPE_PDEV_NAME, 0},
+	{ }
+};
+MODULE_DEVICE_TABLE(platform, arm_spe_match);
+
+static int arm_spe_pmu_device_probe(struct platform_device *pdev)
 {
 	int ret;
 	struct arm_spe_pmu *spe_pmu;
@@ -1236,11 +1243,12 @@ static int arm_spe_pmu_device_remove(struct platform_device *pdev)
 }
 
 static struct platform_driver arm_spe_pmu_driver = {
+	.id_table = arm_spe_match,
 	.driver	= {
 		.name		= DRVNAME,
 		.of_match_table	= of_match_ptr(arm_spe_pmu_of_match),
 	},
-	.probe	= arm_spe_pmu_device_dt_probe,
+	.probe	= arm_spe_pmu_device_probe,
 	.remove	= arm_spe_pmu_device_remove,
 };
 
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v3 5/5] perf: arm_spe: Enable ACPI/Platform automatic module loading
  2019-05-03 23:24 ` [PATCH v3 5/5] perf: arm_spe: Enable ACPI/Platform automatic module loading Jeremy Linton
@ 2019-05-03 23:24   ` Jeremy Linton
  0 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-03 23:24 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Jeremy Linton

Lets add the MODULE_TABLE and platform id_table entries so that
the SPE driver can attach to the ACPI platform device created by
the core pmu code.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
---
 drivers/perf/arm_spe_pmu.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 7cb766dafe85..a11951b08330 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -38,6 +38,7 @@
 #include <linux/of_address.h>
 #include <linux/of_device.h>
 #include <linux/perf_event.h>
+#include <linux/perf/arm_pmu.h>
 #include <linux/platform_device.h>
 #include <linux/printk.h>
 #include <linux/slab.h>
@@ -1176,7 +1177,13 @@ static const struct of_device_id arm_spe_pmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_spe_pmu_of_match);
 
-static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
+static const struct platform_device_id arm_spe_match[] = {
+	{ ARMV8_SPE_PDEV_NAME, 0},
+	{ }
+};
+MODULE_DEVICE_TABLE(platform, arm_spe_match);
+
+static int arm_spe_pmu_device_probe(struct platform_device *pdev)
 {
 	int ret;
 	struct arm_spe_pmu *spe_pmu;
@@ -1236,11 +1243,12 @@ static int arm_spe_pmu_device_remove(struct platform_device *pdev)
 }
 
 static struct platform_driver arm_spe_pmu_driver = {
+	.id_table = arm_spe_match,
 	.driver	= {
 		.name		= DRVNAME,
 		.of_match_table	= of_match_ptr(arm_spe_pmu_of_match),
 	},
-	.probe	= arm_spe_pmu_device_dt_probe,
+	.probe	= arm_spe_pmu_device_probe,
 	.remove	= arm_spe_pmu_device_remove,
 };
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
                   ` (5 preceding siblings ...)
  2019-05-03 23:24 ` [PATCH v3 5/5] perf: arm_spe: Enable ACPI/Platform automatic module loading Jeremy Linton
@ 2019-05-04 11:06 ` Hanjun Guo
  2019-05-04 11:06   ` Hanjun Guo
                     ` (2 more replies)
  6 siblings, 3 replies; 40+ messages in thread
From: Hanjun Guo @ 2019-05-04 11:06 UTC (permalink / raw)
  To: Jeremy Linton, linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, linux-acpi, catalin.marinas, john.garry, lenb

Hi Jeremy, Mark,

On 2019/5/4 7:24, Jeremy Linton wrote:
> This patch series enables the Arm Statistical Profiling
> Extension (SPE) on ACPI platforms.
> 
> This is possible because ACPI 6.3 uses a previously
> reserved field in the MADT to store the SPE interrupt
> number, similarly to how the normal PMU is described.
> If a consistent valid interrupt exists across all the
> cores in the system, a platform device is registered.
> That then triggers the SPE module, which runs as normal.
> 
> We also add the ability to parse the PPTT for IDENTICAL
> cores. We then use this to sanity check the single SPE
> device we create. This creates a bit of a problem with
> respect to the specification though. The specification
> says that its legal for multiple tree's to exist in the
> PPTT. We handle this fine, but what happens in the
> case of multiple tree's is that the lack of a common
> node with IDENTICAL set forces us to assume that there
> are multiple non-IDENTICAL cores in the machine.

Adding this patch set on top of latest mainline kernel,
and tested on D06 which has the SPE feature, in boot message
shows it was probed successfully:

arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]

but when I test it with spe events such as

perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls

it fails with:
failed to mmap with 12 (Cannot allocate memory),

Confirmed that patch [0] is merged and other perf events are working
fine.

I narrowed this issue down that mmap() failed to alloc 4M memory
in perf tool but seems have no relationship with this SPE patch set,
then I'm lost, could you take look please?

Thanks
Hanjun

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=528871b456026e6127d95b1b2bd8e3a003dc1614

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-04 11:06 ` [PATCH v3 0/5] arm64: SPE ACPI enablement Hanjun Guo
@ 2019-05-04 11:06   ` Hanjun Guo
  2019-05-07 17:58   ` Jeremy Linton
  2019-05-08 16:45   ` Sudeep Holla
  2 siblings, 0 replies; 40+ messages in thread
From: Hanjun Guo @ 2019-05-04 11:06 UTC (permalink / raw)
  To: Jeremy Linton, linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry

Hi Jeremy, Mark,

On 2019/5/4 7:24, Jeremy Linton wrote:
> This patch series enables the Arm Statistical Profiling
> Extension (SPE) on ACPI platforms.
> 
> This is possible because ACPI 6.3 uses a previously
> reserved field in the MADT to store the SPE interrupt
> number, similarly to how the normal PMU is described.
> If a consistent valid interrupt exists across all the
> cores in the system, a platform device is registered.
> That then triggers the SPE module, which runs as normal.
> 
> We also add the ability to parse the PPTT for IDENTICAL
> cores. We then use this to sanity check the single SPE
> device we create. This creates a bit of a problem with
> respect to the specification though. The specification
> says that its legal for multiple tree's to exist in the
> PPTT. We handle this fine, but what happens in the
> case of multiple tree's is that the lack of a common
> node with IDENTICAL set forces us to assume that there
> are multiple non-IDENTICAL cores in the machine.

Adding this patch set on top of latest mainline kernel,
and tested on D06 which has the SPE feature, in boot message
shows it was probed successfully:

arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]

but when I test it with spe events such as

perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls

it fails with:
failed to mmap with 12 (Cannot allocate memory),

Confirmed that patch [0] is merged and other perf events are working
fine.

I narrowed this issue down that mmap() failed to alloc 4M memory
in perf tool but seems have no relationship with this SPE patch set,
then I'm lost, could you take look please?

Thanks
Hanjun

[0]: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=528871b456026e6127d95b1b2bd8e3a003dc1614


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-03 23:24 ` [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
@ 2019-05-05  7:09   ` Kefeng Wang
  2019-05-05  7:09     ` Kefeng Wang
  2019-05-07 18:26     ` Jeremy Linton
  2019-06-07  9:49   ` Sudeep Holla
  2 siblings, 2 replies; 40+ messages in thread
From: Kefeng Wang @ 2019-05-05  7:09 UTC (permalink / raw)
  To: Jeremy Linton, linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, catalin.marinas, will.deacon,
	rjw, linuxarm, linux-acpi, sudeep.holla, lenb


On 2019/5/4 7:24, Jeremy Linton wrote:
> ACPI 6.3 adds a flag to indicate that child nodes are all
> identical cores. This is useful to authoritatively determine
> if a set of (possibly offline) cores are identical or not.
>
> Since the flag doesn't give us a unique id we can generate
> one and use it to create bitmaps of sibling nodes, or simply
> in a loop to determine if a subset of cores are identical.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
>  include/linux/acpi.h |  5 +++++
>  2 files changed, 31 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 83a026765faa..1865515297ca 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
>  	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>  					  ACPI_PPTT_PHYSICAL_PACKAGE);
>  }
> +
> +/**
> + * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
> + * @cpu: Kernel logical CPU number
> + *
> + * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
> + * implementation should have matching tags.
> + *
> + * The returned tag can be used to group peers with identical implementation.
> + *
> + * The search terminates when a level is found with the identical implementation
> + * flag set or we reach a root node.
> + *
> + * Due to limitations in the PPTT data structure, there may be rare situations
> + * where two cores in a heterogeneous machine may be identical, but won't have
> + * the same tag.
> + *
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
> + * Otherwise returns a value which represents a group of identical cores
> + * similar to this CPU.
> + */
> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> +{
> +	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
> +					  ACPI_PPTT_ACPI_IDENTICAL);
> +}
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index d5dcebd7aad3..1444fb042898 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
>  #ifdef CONFIG_ACPI_PPTT
>  int find_acpi_cpu_topology(unsigned int cpu, int level);
>  int find_acpi_cpu_topology_package(unsigned int cpu);
> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
>  int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
>  #else
>  static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
> @@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
>  {
>  	return -EINVAL;
>  }
static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> +static int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> +{
> +	return -EINVAL;
> +}
>  static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
>  {
>  	return -EINVAL;

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-05  7:09   ` Kefeng Wang
@ 2019-05-05  7:09     ` Kefeng Wang
  2019-05-07 18:26     ` Jeremy Linton
  1 sibling, 0 replies; 40+ messages in thread
From: Kefeng Wang @ 2019-05-05  7:09 UTC (permalink / raw)
  To: Jeremy Linton, linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, linux-acpi, catalin.marinas, lenb


On 2019/5/4 7:24, Jeremy Linton wrote:
> ACPI 6.3 adds a flag to indicate that child nodes are all
> identical cores. This is useful to authoritatively determine
> if a set of (possibly offline) cores are identical or not.
>
> Since the flag doesn't give us a unique id we can generate
> one and use it to create bitmaps of sibling nodes, or simply
> in a loop to determine if a subset of cores are identical.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
>  include/linux/acpi.h |  5 +++++
>  2 files changed, 31 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 83a026765faa..1865515297ca 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
>  	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>  					  ACPI_PPTT_PHYSICAL_PACKAGE);
>  }
> +
> +/**
> + * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
> + * @cpu: Kernel logical CPU number
> + *
> + * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
> + * implementation should have matching tags.
> + *
> + * The returned tag can be used to group peers with identical implementation.
> + *
> + * The search terminates when a level is found with the identical implementation
> + * flag set or we reach a root node.
> + *
> + * Due to limitations in the PPTT data structure, there may be rare situations
> + * where two cores in a heterogeneous machine may be identical, but won't have
> + * the same tag.
> + *
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
> + * Otherwise returns a value which represents a group of identical cores
> + * similar to this CPU.
> + */
> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> +{
> +	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
> +					  ACPI_PPTT_ACPI_IDENTICAL);
> +}
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index d5dcebd7aad3..1444fb042898 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
>  #ifdef CONFIG_ACPI_PPTT
>  int find_acpi_cpu_topology(unsigned int cpu, int level);
>  int find_acpi_cpu_topology_package(unsigned int cpu);
> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
>  int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
>  #else
>  static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
> @@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
>  {
>  	return -EINVAL;
>  }
static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> +static int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> +{
> +	return -EINVAL;
> +}
>  static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
>  {
>  	return -EINVAL;


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-04 11:06 ` [PATCH v3 0/5] arm64: SPE ACPI enablement Hanjun Guo
  2019-05-04 11:06   ` Hanjun Guo
@ 2019-05-07 17:58   ` Jeremy Linton
  2019-05-07 17:58     ` Jeremy Linton
  2019-05-08  9:35     ` Hanjun Guo
  2019-05-08 16:45   ` Sudeep Holla
  2 siblings, 2 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-07 17:58 UTC (permalink / raw)
  To: Hanjun Guo, linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, linux-acpi, catalin.marinas, john.garry, lenb

Hi,

On 5/4/19 6:06 AM, Hanjun Guo wrote:
> Hi Jeremy, Mark,
> 
> On 2019/5/4 7:24, Jeremy Linton wrote:
>> This patch series enables the Arm Statistical Profiling
>> Extension (SPE) on ACPI platforms.
>>
>> This is possible because ACPI 6.3 uses a previously
>> reserved field in the MADT to store the SPE interrupt
>> number, similarly to how the normal PMU is described.
>> If a consistent valid interrupt exists across all the
>> cores in the system, a platform device is registered.
>> That then triggers the SPE module, which runs as normal.
>>
>> We also add the ability to parse the PPTT for IDENTICAL
>> cores. We then use this to sanity check the single SPE
>> device we create. This creates a bit of a problem with
>> respect to the specification though. The specification
>> says that its legal for multiple tree's to exist in the
>> PPTT. We handle this fine, but what happens in the
>> case of multiple tree's is that the lack of a common
>> node with IDENTICAL set forces us to assume that there
>> are multiple non-IDENTICAL cores in the machine.
> 
> Adding this patch set on top of latest mainline kernel,
> and tested on D06 which has the SPE feature, in boot message
> shows it was probed successfully:
> 
> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
> 
> but when I test it with spe events such as
> 
> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
> 
> it fails with:
> failed to mmap with 12 (Cannot allocate memory),
> 
> Confirmed that patch [0] is merged and other perf events are working
> fine.

Its pretty easy to get into the weeds with this driver, does it work 
with examples like:

https://lkml.org/lkml/2018/1/14/122

> I narrowed this issue down that mmap() failed to alloc 4M memory
> in perf tool but seems have no relationship with this SPE patch set,
> then I'm lost, could you take look please?
> 
> Thanks
> Hanjun
> 
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=528871b456026e6127d95b1b2bd8e3a003dc1614
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-07 17:58   ` Jeremy Linton
@ 2019-05-07 17:58     ` Jeremy Linton
  2019-05-08  9:35     ` Hanjun Guo
  1 sibling, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-07 17:58 UTC (permalink / raw)
  To: Hanjun Guo, linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry

Hi,

On 5/4/19 6:06 AM, Hanjun Guo wrote:
> Hi Jeremy, Mark,
> 
> On 2019/5/4 7:24, Jeremy Linton wrote:
>> This patch series enables the Arm Statistical Profiling
>> Extension (SPE) on ACPI platforms.
>>
>> This is possible because ACPI 6.3 uses a previously
>> reserved field in the MADT to store the SPE interrupt
>> number, similarly to how the normal PMU is described.
>> If a consistent valid interrupt exists across all the
>> cores in the system, a platform device is registered.
>> That then triggers the SPE module, which runs as normal.
>>
>> We also add the ability to parse the PPTT for IDENTICAL
>> cores. We then use this to sanity check the single SPE
>> device we create. This creates a bit of a problem with
>> respect to the specification though. The specification
>> says that its legal for multiple tree's to exist in the
>> PPTT. We handle this fine, but what happens in the
>> case of multiple tree's is that the lack of a common
>> node with IDENTICAL set forces us to assume that there
>> are multiple non-IDENTICAL cores in the machine.
> 
> Adding this patch set on top of latest mainline kernel,
> and tested on D06 which has the SPE feature, in boot message
> shows it was probed successfully:
> 
> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
> 
> but when I test it with spe events such as
> 
> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
> 
> it fails with:
> failed to mmap with 12 (Cannot allocate memory),
> 
> Confirmed that patch [0] is merged and other perf events are working
> fine.

Its pretty easy to get into the weeds with this driver, does it work 
with examples like:

https://lkml.org/lkml/2018/1/14/122

> I narrowed this issue down that mmap() failed to alloc 4M memory
> in perf tool but seems have no relationship with this SPE patch set,
> then I'm lost, could you take look please?
> 
> Thanks
> Hanjun
> 
> [0]: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?id=528871b456026e6127d95b1b2bd8e3a003dc1614
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU
  2019-05-03 23:24 ` [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
@ 2019-05-07 18:12   ` Jeremy Linton
  2019-05-07 18:12     ` Jeremy Linton
  1 sibling, 1 reply; 40+ messages in thread
From: Jeremy Linton @ 2019-05-07 18:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, Rafael J . Wysocki, will.deacon,
	sudeep.holla, rjw, linuxarm, linux-acpi, catalin.marinas,
	john.garry, lenb

Hi,


On 5/3/19 6:24 PM, Jeremy Linton wrote:
> CPU is an acronym and customarily capitalized. Much of the
> commenting in the PPTT code has been using "cpu" rather
> than "CPU". Correct that, and other human readable strings.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Looks like a similar patch (covering more than this file) landed as 
603fadf33604a, so this patch should be dropped.



> ---
>   drivers/acpi/pptt.c | 48 ++++++++++++++++++++++-----------------------
>   1 file changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 065c4fc245d1..83a026765faa 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -164,7 +164,7 @@ static struct acpi_pptt_cache *acpi_find_cache_level(struct acpi_table_header *t
>   }
>   
>   /**
> - * acpi_count_levels() - Given a PPTT table, and a cpu node, count the caches
> + * acpi_count_levels() - Given a PPTT table, and a CPU node, count the caches
>    * @table_hdr: Pointer to the head of the PPTT table
>    * @cpu_node: processor node we wish to count caches for
>    *
> @@ -235,7 +235,7 @@ static int acpi_pptt_leaf_node(struct acpi_table_header *table_hdr,
>   /**
>    * acpi_find_processor_node() - Given a PPTT table find the requested processor
>    * @table_hdr:  Pointer to the head of the PPTT table
> - * @acpi_cpu_id: cpu we are searching for
> + * @acpi_cpu_id: CPU we are searching for
>    *
>    * Find the subtable entry describing the provided processor.
>    * This is done by iterating the PPTT table looking for processor nodes
> @@ -456,21 +456,21 @@ static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_ta
>   
>   static void acpi_pptt_warn_missing(void)
>   {
> -	pr_warn_once("No PPTT table found, cpu and cache topology may be inaccurate\n");
> +	pr_warn_once("No PPTT table found, CPU and cache topology may be inaccurate\n");
>   }
>   
>   /**
>    * topology_get_acpi_cpu_tag() - Find a unique topology value for a feature
>    * @table: Pointer to the head of the PPTT table
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    * @level: A level that terminates the search
>    * @flag: A flag which terminates the search
>    *
> - * Get a unique value given a cpu, and a topology level, that can be
> - * matched to determine which cpus share common topological features
> + * Get a unique value given a CPU, and a topology level, that can be
> + * matched to determine which CPUs share common topological features
>    * at that level.
>    *
> - * Return: Unique value, or -ENOENT if unable to locate cpu
> + * Return: Unique value, or -ENOENT if unable to locate CPU
>    */
>   static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
>   				     unsigned int cpu, int level, int flag)
> @@ -510,7 +510,7 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
>   		return -ENOENT;
>   	}
>   	retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
> -	pr_debug("Topology Setup ACPI cpu %d, level %d ret = %d\n",
> +	pr_debug("Topology Setup ACPI CPU %d, level %d ret = %d\n",
>   		 cpu, level, retval);
>   	acpi_put_table(table);
>   
> @@ -519,9 +519,9 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
>   
>   /**
>    * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    *
> - * Given a logical cpu number, returns the number of levels of cache represented
> + * Given a logical CPU number, returns the number of levels of cache represented
>    * in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
>    * indicating we didn't find any cache levels.
>    *
> @@ -534,7 +534,7 @@ int acpi_find_last_cache_level(unsigned int cpu)
>   	int number_of_levels = 0;
>   	acpi_status status;
>   
> -	pr_debug("Cache Setup find last level cpu=%d\n", cpu);
> +	pr_debug("Cache Setup find last level CPU=%d\n", cpu);
>   
>   	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
>   	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> @@ -551,14 +551,14 @@ int acpi_find_last_cache_level(unsigned int cpu)
>   
>   /**
>    * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    *
>    * Updates the global cache info provided by cpu_get_cacheinfo()
>    * when there are valid properties in the acpi_pptt_cache nodes. A
>    * successful parse may not result in any updates if none of the
>    * cache levels have any valid flags set.  Futher, a unique value is
>    * associated with each known CPU cache entry. This unique value
> - * can be used to determine whether caches are shared between cpus.
> + * can be used to determine whether caches are shared between CPUs.
>    *
>    * Return: -ENOENT on failure to find table, or 0 on success
>    */
> @@ -567,7 +567,7 @@ int cache_setup_acpi(unsigned int cpu)
>   	struct acpi_table_header *table;
>   	acpi_status status;
>   
> -	pr_debug("Cache Setup ACPI cpu %d\n", cpu);
> +	pr_debug("Cache Setup ACPI CPU %d\n", cpu);
>   
>   	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
>   	if (ACPI_FAILURE(status)) {
> @@ -582,8 +582,8 @@ int cache_setup_acpi(unsigned int cpu)
>   }
>   
>   /**
> - * find_acpi_cpu_topology() - Determine a unique topology value for a given cpu
> - * @cpu: Kernel logical cpu number
> + * find_acpi_cpu_topology() - Determine a unique topology value for a given CPU
> + * @cpu: Kernel logical CPU number
>    * @level: The topological level for which we would like a unique ID
>    *
>    * Determine a topology unique ID for each thread/core/cluster/mc_grouping
> @@ -596,7 +596,7 @@ int cache_setup_acpi(unsigned int cpu)
>    * other levels beyond this use a generated value to uniquely identify
>    * a topological feature.
>    *
> - * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
>    * Otherwise returns a value which represents a unique topological feature.
>    */
>   int find_acpi_cpu_topology(unsigned int cpu, int level)
> @@ -606,12 +606,12 @@ int find_acpi_cpu_topology(unsigned int cpu, int level)
>   
>   /**
>    * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    * @level: The cache level for which we would like a unique ID
>    *
>    * Determine a unique ID for each unified cache in the system
>    *
> - * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
>    * Otherwise returns a value which represents a unique topological feature.
>    */
>   int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
> @@ -643,17 +643,17 @@ int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
>   
>   
>   /**
> - * find_acpi_cpu_topology_package() - Determine a unique cpu package value
> - * @cpu: Kernel logical cpu number
> + * find_acpi_cpu_topology_package() - Determine a unique CPU package value
> + * @cpu: Kernel logical CPU number
>    *
> - * Determine a topology unique package ID for the given cpu.
> + * Determine a topology unique package ID for the given CPU.
>    * This ID can then be used to group peers, which will have matching ids.
>    *
>    * The search terminates when either a level is found with the PHYSICAL_PACKAGE
>    * flag set or we reach a root node.
>    *
> - * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
> - * Otherwise returns a value which represents the package for this cpu.
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
> + * Otherwise returns a value which represents the package for this CPU.
>    */
>   int find_acpi_cpu_topology_package(unsigned int cpu)
>   {
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU
  2019-05-07 18:12   ` Jeremy Linton
@ 2019-05-07 18:12     ` Jeremy Linton
  0 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-07 18:12 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Rafael J . Wysocki

Hi,


On 5/3/19 6:24 PM, Jeremy Linton wrote:
> CPU is an acronym and customarily capitalized. Much of the
> commenting in the PPTT code has been using "cpu" rather
> than "CPU". Correct that, and other human readable strings.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Looks like a similar patch (covering more than this file) landed as 
603fadf33604a, so this patch should be dropped.



> ---
>   drivers/acpi/pptt.c | 48 ++++++++++++++++++++++-----------------------
>   1 file changed, 24 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 065c4fc245d1..83a026765faa 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -164,7 +164,7 @@ static struct acpi_pptt_cache *acpi_find_cache_level(struct acpi_table_header *t
>   }
>   
>   /**
> - * acpi_count_levels() - Given a PPTT table, and a cpu node, count the caches
> + * acpi_count_levels() - Given a PPTT table, and a CPU node, count the caches
>    * @table_hdr: Pointer to the head of the PPTT table
>    * @cpu_node: processor node we wish to count caches for
>    *
> @@ -235,7 +235,7 @@ static int acpi_pptt_leaf_node(struct acpi_table_header *table_hdr,
>   /**
>    * acpi_find_processor_node() - Given a PPTT table find the requested processor
>    * @table_hdr:  Pointer to the head of the PPTT table
> - * @acpi_cpu_id: cpu we are searching for
> + * @acpi_cpu_id: CPU we are searching for
>    *
>    * Find the subtable entry describing the provided processor.
>    * This is done by iterating the PPTT table looking for processor nodes
> @@ -456,21 +456,21 @@ static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_ta
>   
>   static void acpi_pptt_warn_missing(void)
>   {
> -	pr_warn_once("No PPTT table found, cpu and cache topology may be inaccurate\n");
> +	pr_warn_once("No PPTT table found, CPU and cache topology may be inaccurate\n");
>   }
>   
>   /**
>    * topology_get_acpi_cpu_tag() - Find a unique topology value for a feature
>    * @table: Pointer to the head of the PPTT table
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    * @level: A level that terminates the search
>    * @flag: A flag which terminates the search
>    *
> - * Get a unique value given a cpu, and a topology level, that can be
> - * matched to determine which cpus share common topological features
> + * Get a unique value given a CPU, and a topology level, that can be
> + * matched to determine which CPUs share common topological features
>    * at that level.
>    *
> - * Return: Unique value, or -ENOENT if unable to locate cpu
> + * Return: Unique value, or -ENOENT if unable to locate CPU
>    */
>   static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
>   				     unsigned int cpu, int level, int flag)
> @@ -510,7 +510,7 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
>   		return -ENOENT;
>   	}
>   	retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
> -	pr_debug("Topology Setup ACPI cpu %d, level %d ret = %d\n",
> +	pr_debug("Topology Setup ACPI CPU %d, level %d ret = %d\n",
>   		 cpu, level, retval);
>   	acpi_put_table(table);
>   
> @@ -519,9 +519,9 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
>   
>   /**
>    * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    *
> - * Given a logical cpu number, returns the number of levels of cache represented
> + * Given a logical CPU number, returns the number of levels of cache represented
>    * in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
>    * indicating we didn't find any cache levels.
>    *
> @@ -534,7 +534,7 @@ int acpi_find_last_cache_level(unsigned int cpu)
>   	int number_of_levels = 0;
>   	acpi_status status;
>   
> -	pr_debug("Cache Setup find last level cpu=%d\n", cpu);
> +	pr_debug("Cache Setup find last level CPU=%d\n", cpu);
>   
>   	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
>   	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> @@ -551,14 +551,14 @@ int acpi_find_last_cache_level(unsigned int cpu)
>   
>   /**
>    * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    *
>    * Updates the global cache info provided by cpu_get_cacheinfo()
>    * when there are valid properties in the acpi_pptt_cache nodes. A
>    * successful parse may not result in any updates if none of the
>    * cache levels have any valid flags set.  Futher, a unique value is
>    * associated with each known CPU cache entry. This unique value
> - * can be used to determine whether caches are shared between cpus.
> + * can be used to determine whether caches are shared between CPUs.
>    *
>    * Return: -ENOENT on failure to find table, or 0 on success
>    */
> @@ -567,7 +567,7 @@ int cache_setup_acpi(unsigned int cpu)
>   	struct acpi_table_header *table;
>   	acpi_status status;
>   
> -	pr_debug("Cache Setup ACPI cpu %d\n", cpu);
> +	pr_debug("Cache Setup ACPI CPU %d\n", cpu);
>   
>   	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
>   	if (ACPI_FAILURE(status)) {
> @@ -582,8 +582,8 @@ int cache_setup_acpi(unsigned int cpu)
>   }
>   
>   /**
> - * find_acpi_cpu_topology() - Determine a unique topology value for a given cpu
> - * @cpu: Kernel logical cpu number
> + * find_acpi_cpu_topology() - Determine a unique topology value for a given CPU
> + * @cpu: Kernel logical CPU number
>    * @level: The topological level for which we would like a unique ID
>    *
>    * Determine a topology unique ID for each thread/core/cluster/mc_grouping
> @@ -596,7 +596,7 @@ int cache_setup_acpi(unsigned int cpu)
>    * other levels beyond this use a generated value to uniquely identify
>    * a topological feature.
>    *
> - * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
>    * Otherwise returns a value which represents a unique topological feature.
>    */
>   int find_acpi_cpu_topology(unsigned int cpu, int level)
> @@ -606,12 +606,12 @@ int find_acpi_cpu_topology(unsigned int cpu, int level)
>   
>   /**
>    * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
> - * @cpu: Kernel logical cpu number
> + * @cpu: Kernel logical CPU number
>    * @level: The cache level for which we would like a unique ID
>    *
>    * Determine a unique ID for each unified cache in the system
>    *
> - * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
>    * Otherwise returns a value which represents a unique topological feature.
>    */
>   int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
> @@ -643,17 +643,17 @@ int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
>   
>   
>   /**
> - * find_acpi_cpu_topology_package() - Determine a unique cpu package value
> - * @cpu: Kernel logical cpu number
> + * find_acpi_cpu_topology_package() - Determine a unique CPU package value
> + * @cpu: Kernel logical CPU number
>    *
> - * Determine a topology unique package ID for the given cpu.
> + * Determine a topology unique package ID for the given CPU.
>    * This ID can then be used to group peers, which will have matching ids.
>    *
>    * The search terminates when either a level is found with the PHYSICAL_PACKAGE
>    * flag set or we reach a root node.
>    *
> - * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
> - * Otherwise returns a value which represents the package for this cpu.
> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
> + * Otherwise returns a value which represents the package for this CPU.
>    */
>   int find_acpi_cpu_topology_package(unsigned int cpu)
>   {
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-05  7:09   ` Kefeng Wang
  2019-05-05  7:09     ` Kefeng Wang
@ 2019-05-07 18:26     ` Jeremy Linton
  2019-05-07 18:26       ` Jeremy Linton
  1 sibling, 1 reply; 40+ messages in thread
From: Jeremy Linton @ 2019-05-07 18:26 UTC (permalink / raw)
  To: Kefeng Wang, linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, catalin.marinas, will.deacon,
	rjw, linuxarm, linux-acpi, sudeep.holla, lenb

Hi,

On 5/5/19 2:09 AM, Kefeng Wang wrote:
> 
> On 2019/5/4 7:24, Jeremy Linton wrote:
>> ACPI 6.3 adds a flag to indicate that child nodes are all
>> identical cores. This is useful to authoritatively determine
>> if a set of (possibly offline) cores are identical or not.
>>
>> Since the flag doesn't give us a unique id we can generate
>> one and use it to create bitmaps of sibling nodes, or simply
>> in a loop to determine if a subset of cores are identical.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
>>   include/linux/acpi.h |  5 +++++
>>   2 files changed, 31 insertions(+)
>>
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 83a026765faa..1865515297ca 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
>>   	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>>   					  ACPI_PPTT_PHYSICAL_PACKAGE);
>>   }
>> +
>> +/**
>> + * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
>> + * @cpu: Kernel logical CPU number
>> + *
>> + * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
>> + * implementation should have matching tags.
>> + *
>> + * The returned tag can be used to group peers with identical implementation.
>> + *
>> + * The search terminates when a level is found with the identical implementation
>> + * flag set or we reach a root node.
>> + *
>> + * Due to limitations in the PPTT data structure, there may be rare situations
>> + * where two cores in a heterogeneous machine may be identical, but won't have
>> + * the same tag.
>> + *
>> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
>> + * Otherwise returns a value which represents a group of identical cores
>> + * similar to this CPU.
>> + */
>> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
>> +{
>> +	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>> +					  ACPI_PPTT_ACPI_IDENTICAL);
>> +}
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index d5dcebd7aad3..1444fb042898 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
>>   #ifdef CONFIG_ACPI_PPTT
>>   int find_acpi_cpu_topology(unsigned int cpu, int level);
>>   int find_acpi_cpu_topology_package(unsigned int cpu);
>> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
>>   int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
>>   #else
>>   static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
>> @@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
>>   {
>>   	return -EINVAL;
>>   }
> static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)

Yes, good catch, I just saw that warning.


>> +static int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
>> +{
>> +	return -EINVAL;
>> +}
>>   static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
>>   {
>>   	return -EINVAL;
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-07 18:26     ` Jeremy Linton
@ 2019-05-07 18:26       ` Jeremy Linton
  0 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-07 18:26 UTC (permalink / raw)
  To: Kefeng Wang, linux-arm-kernel
  Cc: mark.rutland, lorenzo.pieralisi, will.deacon, sudeep.holla, rjw,
	linuxarm, linux-acpi, catalin.marinas, lenb

Hi,

On 5/5/19 2:09 AM, Kefeng Wang wrote:
> 
> On 2019/5/4 7:24, Jeremy Linton wrote:
>> ACPI 6.3 adds a flag to indicate that child nodes are all
>> identical cores. This is useful to authoritatively determine
>> if a set of (possibly offline) cores are identical or not.
>>
>> Since the flag doesn't give us a unique id we can generate
>> one and use it to create bitmaps of sibling nodes, or simply
>> in a loop to determine if a subset of cores are identical.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
>>   include/linux/acpi.h |  5 +++++
>>   2 files changed, 31 insertions(+)
>>
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 83a026765faa..1865515297ca 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
>>   	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>>   					  ACPI_PPTT_PHYSICAL_PACKAGE);
>>   }
>> +
>> +/**
>> + * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
>> + * @cpu: Kernel logical CPU number
>> + *
>> + * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
>> + * implementation should have matching tags.
>> + *
>> + * The returned tag can be used to group peers with identical implementation.
>> + *
>> + * The search terminates when a level is found with the identical implementation
>> + * flag set or we reach a root node.
>> + *
>> + * Due to limitations in the PPTT data structure, there may be rare situations
>> + * where two cores in a heterogeneous machine may be identical, but won't have
>> + * the same tag.
>> + *
>> + * Return: -ENOENT if the PPTT doesn't exist, or the CPU cannot be found.
>> + * Otherwise returns a value which represents a group of identical cores
>> + * similar to this CPU.
>> + */
>> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
>> +{
>> +	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>> +					  ACPI_PPTT_ACPI_IDENTICAL);
>> +}
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index d5dcebd7aad3..1444fb042898 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
>>   #ifdef CONFIG_ACPI_PPTT
>>   int find_acpi_cpu_topology(unsigned int cpu, int level);
>>   int find_acpi_cpu_topology_package(unsigned int cpu);
>> +int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
>>   int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
>>   #else
>>   static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
>> @@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
>>   {
>>   	return -EINVAL;
>>   }
> static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)

Yes, good catch, I just saw that warning.


>> +static int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
>> +{
>> +	return -EINVAL;
>> +}
>>   static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
>>   {
>>   	return -EINVAL;
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-07 17:58   ` Jeremy Linton
  2019-05-07 17:58     ` Jeremy Linton
@ 2019-05-08  9:35     ` Hanjun Guo
  2019-05-08 16:51       ` Sudeep Holla
  1 sibling, 1 reply; 40+ messages in thread
From: Hanjun Guo @ 2019-05-08  9:35 UTC (permalink / raw)
  To: Jeremy Linton, linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm,
	john.garry, Hongbo Yao, Alexander Shishkin

+Cc Alexander.

On 2019/5/8 1:58, Jeremy Linton wrote:
> Hi,
> 
> On 5/4/19 6:06 AM, Hanjun Guo wrote:
>> Hi Jeremy, Mark,
>>
>> On 2019/5/4 7:24, Jeremy Linton wrote:
>>> This patch series enables the Arm Statistical Profiling
>>> Extension (SPE) on ACPI platforms.
>>>
>>> This is possible because ACPI 6.3 uses a previously
>>> reserved field in the MADT to store the SPE interrupt
>>> number, similarly to how the normal PMU is described.
>>> If a consistent valid interrupt exists across all the
>>> cores in the system, a platform device is registered.
>>> That then triggers the SPE module, which runs as normal.
>>>
>>> We also add the ability to parse the PPTT for IDENTICAL
>>> cores. We then use this to sanity check the single SPE
>>> device we create. This creates a bit of a problem with
>>> respect to the specification though. The specification
>>> says that its legal for multiple tree's to exist in the
>>> PPTT. We handle this fine, but what happens in the
>>> case of multiple tree's is that the lack of a common
>>> node with IDENTICAL set forces us to assume that there
>>> are multiple non-IDENTICAL cores in the machine.
>>
>> Adding this patch set on top of latest mainline kernel,
>> and tested on D06 which has the SPE feature, in boot message
>> shows it was probed successfully:
>>
>> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
>>
>> but when I test it with spe events such as
>>
>> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
>>
>> it fails with:
>> failed to mmap with 12 (Cannot allocate memory),
>>
>> Confirmed that patch [0] is merged and other perf events are working
>> fine.
> 
> Its pretty easy to get into the weeds with this driver, does it work with examples like:
> 
> https://lkml.org/lkml/2018/1/14/122

No, not work at all.

SPE works on 5.0, but not work after 5.1-rc1, bisected to this commit:

5768402fd9c6 perf/ring_buffer: Use high order allocations for AUX buffers optimistically

Cced Alexander as well as Alexander is the author of above patch.

Thanks
Hanjun


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-05-03 23:24 ` [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
@ 2019-05-08 11:18   ` John Garry
  2019-05-08 20:04     ` Jeremy Linton
  2019-06-07  9:57   ` Sudeep Holla
  2 siblings, 1 reply; 40+ messages in thread
From: John Garry @ 2019-05-08 11:18 UTC (permalink / raw)
  To: Jeremy Linton, linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm

On 04/05/2019 00:24, Jeremy Linton wrote:
> ACPI 6.3 adds additional fields to the MADT GICC
> structure to describe SPE PPI's. We pick these out
> of the cached reference to the madt_gicc structure
> similarly to the core PMU code. We then create a platform
> device referring to the IRQ and let the user/module loader
> decide whether to load the SPE driver.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/include/asm/acpi.h |  3 ++
>  drivers/perf/Kconfig          |  5 +++
>  drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
>  include/linux/perf/arm_pmu.h  |  2 +
>  4 files changed, 86 insertions(+)
>
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index 7628efbe6c12..d10399b9f998 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -41,6 +41,9 @@
>  	(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
>  	(unsigned long)(entry) + (entry)->header.length > (end))
>
> +#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
> +	spe_interrupt) + sizeof(u16))
> +
>  /* Basic configuration for ACPI */
>  #ifdef	CONFIG_ACPI
>  pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index af9bc178495d..bc2647c64c9d 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -52,6 +52,11 @@ config ARM_PMU_ACPI
>  	depends on ARM_PMU && ACPI
>  	def_bool y
>
> +config ARM_SPE_ACPI

Is it possible to just use this check in arm_pmu_acpi.c instead, to 
avoid introducing another config here:

if defined(CONFIG_ARM_SPE_PMU)

> +	depends on ARM_PMU_ACPI && ARM_SPE_PMU
> +	def_bool y
> +
> +

nit: extra line

>  config ARM_DSU_PMU
>  	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
>  	depends on ARM64
> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
> index 0f197516d708..b0244e1e8c91 100644
> --- a/drivers/perf/arm_pmu_acpi.c
> +++ b/drivers/perf/arm_pmu_acpi.c
> @@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
>  	acpi_unregister_gsi(gsi);
>  }
>
> +#ifdef CONFIG_ARM_SPE_ACPI
> +static struct resource spe_resources[] = {
> +	{
> +		/* irq */
> +		.flags          = IORESOURCE_IRQ,
> +	}
> +};
> +
> +static struct platform_device spe_dev = {
> +	.name = ARMV8_SPE_PDEV_NAME,
> +	.id = -1,
> +	.resource = spe_resources,
> +	.num_resources = ARRAY_SIZE(spe_resources)
> +};
> +
> +/*
> + * For lack of a better place, hook the normal PMU MADT walk
> + * and create a SPE device if we detect a recent MADT with
> + * a homogeneous PPI mapping.
> + */
> +static int arm_spe_acpi_register_device(void)
> +{
> +	int cpu, ret, irq;
> +	int hetid;

nit: not sure why you use multiple lines here

> +	u16 gsi = 0;
> +	bool first = true;
> +

nit: extra line, and gicc could be declared within the loop in which 
it's used to limit scope.

> +	struct acpi_madt_generic_interrupt *gicc;
> +
> +	/*
> +	 * sanity check all the GICC tables for the same interrupt number
> +	 * for now we only support homogeneous ACPI/SPE machines.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		gicc = acpi_cpu_get_madt_gicc(cpu);
> +
> +		if (gicc->header.length < ACPI_MADT_GICC_SPE)
> +			return -ENODEV;
> +		if (first) {
> +			gsi = gicc->spe_interrupt;
> +			if (!gsi)
> +				return -ENODEV;
> +			hetid = find_acpi_cpu_topology_hetero_id(cpu);
> +			first = false;
> +		} else if ((gsi != gicc->spe_interrupt) ||
> +			   (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
> +			pr_warn("ACPI: SPE must be homogeneous\n");
> +			return -EINVAL;
> +		}
> +	}
> +
> +	irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE,
> +				ACPI_ACTIVE_HIGH);
> +	if (irq < 0) {

is irq == 0 a valid Linux IRQ number? From checking 
irq_create_fw_spec_mapping(), it does not seem to be.

> +		pr_warn("ACPI: SPE Unable to register interrupt: %d\n", gsi);
> +		return irq;
> +	}
> +
> +	spe_resources[0].start = irq;
> +	ret = platform_device_register(&spe_dev);
> +	if (ret < 0) {
> +		pr_warn("ACPI: SPE: Unable to register device\n");
> +		acpi_unregister_gsi(gsi);
> +	}
> +
> +	return ret;
> +}
> +#else
> +static inline int arm_spe_acpi_register_device(void)
> +{
> +	return -ENODEV;
> +}
> +#endif /* CONFIG_ARM_SPE_ACPI */
> +
>  static int arm_pmu_acpi_parse_irqs(void)
>  {
>  	int irq, cpu, irq_cpu, err;
> @@ -279,6 +353,8 @@ static int arm_pmu_acpi_init(void)
>  	if (acpi_disabled)
>  		return 0;
>
> +	arm_spe_acpi_register_device(); /* failures are expected */
> +
>  	ret = arm_pmu_acpi_parse_irqs();
>  	if (ret)
>  		return ret;
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 4641e850b204..784bc58f165a 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -175,4 +175,6 @@ void armpmu_free_irq(int irq, int cpu);
>
>  #endif /* CONFIG_ARM_PMU */
>
> +#define ARMV8_SPE_PDEV_NAME "arm,spe-v1"
> +
>  #endif /* __ARM_PMU_H__ */
>

Thanks!




^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-04 11:06 ` [PATCH v3 0/5] arm64: SPE ACPI enablement Hanjun Guo
  2019-05-04 11:06   ` Hanjun Guo
  2019-05-07 17:58   ` Jeremy Linton
@ 2019-05-08 16:45   ` Sudeep Holla
  2 siblings, 0 replies; 40+ messages in thread
From: Sudeep Holla @ 2019-05-08 16:45 UTC (permalink / raw)
  To: Hanjun Guo
  Cc: Jeremy Linton, linux-arm-kernel, linux-acpi, catalin.marinas,
	will.deacon, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Sudeep Holla

On Sat, May 04, 2019 at 07:06:19PM +0800, Hanjun Guo wrote:
> Hi Jeremy, Mark,
>
> On 2019/5/4 7:24, Jeremy Linton wrote:
> > This patch series enables the Arm Statistical Profiling
> > Extension (SPE) on ACPI platforms.
> >
> > This is possible because ACPI 6.3 uses a previously
> > reserved field in the MADT to store the SPE interrupt
> > number, similarly to how the normal PMU is described.
> > If a consistent valid interrupt exists across all the
> > cores in the system, a platform device is registered.
> > That then triggers the SPE module, which runs as normal.
> >
> > We also add the ability to parse the PPTT for IDENTICAL
> > cores. We then use this to sanity check the single SPE
> > device we create. This creates a bit of a problem with
> > respect to the specification though. The specification
> > says that its legal for multiple tree's to exist in the
> > PPTT. We handle this fine, but what happens in the
> > case of multiple tree's is that the lack of a common
> > node with IDENTICAL set forces us to assume that there
> > are multiple non-IDENTICAL cores in the machine.
>
> Adding this patch set on top of latest mainline kernel,
> and tested on D06 which has the SPE feature, in boot message
> shows it was probed successfully:
>
> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
>
> but when I test it with spe events such as
>
> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
>
> it fails with:
> failed to mmap with 12 (Cannot allocate memory),
>
> Confirmed that patch [0] is merged and other perf events are working
> fine.
>
> I narrowed this issue down that mmap() failed to alloc 4M memory
> in perf tool but seems have no relationship with this SPE patch set,
> then I'm lost, could you take look please?
>

Thanks for pointing this out. I had last tested SPE only with v5.0 and
missed completely to check on v5.1. FWIW, I can reproduce this issue
on v5.1

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-08  9:35     ` Hanjun Guo
@ 2019-05-08 16:51       ` Sudeep Holla
  2019-05-09  9:28         ` Will Deacon
  0 siblings, 1 reply; 40+ messages in thread
From: Sudeep Holla @ 2019-05-08 16:51 UTC (permalink / raw)
  To: Hanjun Guo
  Cc: Jeremy Linton, linux-arm-kernel, linux-acpi, catalin.marinas,
	will.deacon, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Hongbo Yao, Alexander Shishkin,
	Sudeep Holla

On Wed, May 08, 2019 at 05:35:51PM +0800, Hanjun Guo wrote:
> +Cc Alexander.
>
> On 2019/5/8 1:58, Jeremy Linton wrote:
> > Hi,
> >
> > On 5/4/19 6:06 AM, Hanjun Guo wrote:
> >> Hi Jeremy, Mark,
> >>
> >> On 2019/5/4 7:24, Jeremy Linton wrote:
> >>> This patch series enables the Arm Statistical Profiling
> >>> Extension (SPE) on ACPI platforms.
> >>>
> >>> This is possible because ACPI 6.3 uses a previously
> >>> reserved field in the MADT to store the SPE interrupt
> >>> number, similarly to how the normal PMU is described.
> >>> If a consistent valid interrupt exists across all the
> >>> cores in the system, a platform device is registered.
> >>> That then triggers the SPE module, which runs as normal.
> >>>
> >>> We also add the ability to parse the PPTT for IDENTICAL
> >>> cores. We then use this to sanity check the single SPE
> >>> device we create. This creates a bit of a problem with
> >>> respect to the specification though. The specification
> >>> says that its legal for multiple tree's to exist in the
> >>> PPTT. We handle this fine, but what happens in the
> >>> case of multiple tree's is that the lack of a common
> >>> node with IDENTICAL set forces us to assume that there
> >>> are multiple non-IDENTICAL cores in the machine.
> >>
> >> Adding this patch set on top of latest mainline kernel,
> >> and tested on D06 which has the SPE feature, in boot message
> >> shows it was probed successfully:
> >>
> >> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
> >>
> >> but when I test it with spe events such as
> >>
> >> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
> >>
> >> it fails with:
> >> failed to mmap with 12 (Cannot allocate memory),
> >>
> >> Confirmed that patch [0] is merged and other perf events are working
> >> fine.
> >
> > Its pretty easy to get into the weeds with this driver, does it work with examples like:
> >
> > https://lkml.org/lkml/2018/1/14/122
>
> No, not work at all.
>
> SPE works on 5.0, but not work after 5.1-rc1, bisected to this commit:
>
> 5768402fd9c6 perf/ring_buffer: Use high order allocations for AUX buffers optimistically
>

Indeed this patch breaks SPE. As mentioned in the patch, it uses high
order allocations for AUX buffers and SPE PMU setup_aux explicitly
fails with the warning "unexpected high-order page for auxbuf!" if
it encounters one.

I don't know the intention of that check in SPE. Will ?

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-05-08 11:18   ` John Garry
@ 2019-05-08 20:04     ` Jeremy Linton
  0 siblings, 0 replies; 40+ messages in thread
From: Jeremy Linton @ 2019-05-08 20:04 UTC (permalink / raw)
  To: John Garry, linux-arm-kernel
  Cc: linux-acpi, catalin.marinas, will.deacon, rjw, lenb,
	mark.rutland, lorenzo.pieralisi, sudeep.holla, linuxarm

Hi,

Thanks for taking a look at this,

On 5/8/19 6:18 AM, John Garry wrote:
e> On 04/05/2019 00:24, Jeremy Linton wrote:
>> ACPI 6.3 adds additional fields to the MADT GICC
>> structure to describe SPE PPI's. We pick these out
>> of the cached reference to the madt_gicc structure
>> similarly to the core PMU code. We then create a platform
>> device referring to the IRQ and let the user/module loader
>> decide whether to load the SPE driver.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>  arch/arm64/include/asm/acpi.h |  3 ++
>>  drivers/perf/Kconfig          |  5 +++
>>  drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
>>  include/linux/perf/arm_pmu.h  |  2 +
>>  4 files changed, 86 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/acpi.h 
>> b/arch/arm64/include/asm/acpi.h
>> index 7628efbe6c12..d10399b9f998 100644
>> --- a/arch/arm64/include/asm/acpi.h
>> +++ b/arch/arm64/include/asm/acpi.h
>> @@ -41,6 +41,9 @@
>>      (!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
>>      (unsigned long)(entry) + (entry)->header.length > (end))
>>
>> +#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct 
>> acpi_madt_generic_interrupt, \
>> +    spe_interrupt) + sizeof(u16))
>> +
>>  /* Basic configuration for ACPI */
>>  #ifdef    CONFIG_ACPI
>>  pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
>> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
>> index af9bc178495d..bc2647c64c9d 100644
>> --- a/drivers/perf/Kconfig
>> +++ b/drivers/perf/Kconfig
>> @@ -52,6 +52,11 @@ config ARM_PMU_ACPI
>>      depends on ARM_PMU && ACPI
>>      def_bool y
>>
>> +config ARM_SPE_ACPI
> 
> Is it possible to just use this check in arm_pmu_acpi.c instead, to 
> avoid introducing another config here:
> 
> if defined(CONFIG_ARM_SPE_PMU)

I'm sure it works, if this is preferred, sure..


> 
>> +    depends on ARM_PMU_ACPI && ARM_SPE_PMU
>> +    def_bool y
>> +
>> +
> 
> nit: extra line
> 
>>  config ARM_DSU_PMU
>>      tristate "ARM DynamIQ Shared Unit (DSU) PMU"
>>      depends on ARM64
>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>> index 0f197516d708..b0244e1e8c91 100644
>> --- a/drivers/perf/arm_pmu_acpi.c
>> +++ b/drivers/perf/arm_pmu_acpi.c
>> @@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
>>      acpi_unregister_gsi(gsi);
>>  }
>>
>> +#ifdef CONFIG_ARM_SPE_ACPI
>> +static struct resource spe_resources[] = {
>> +    {
>> +        /* irq */
>> +        .flags          = IORESOURCE_IRQ,
>> +    }
>> +};
>> +
>> +static struct platform_device spe_dev = {
>> +    .name = ARMV8_SPE_PDEV_NAME,
>> +    .id = -1,
>> +    .resource = spe_resources,
>> +    .num_resources = ARRAY_SIZE(spe_resources)
>> +};
>> +
>> +/*
>> + * For lack of a better place, hook the normal PMU MADT walk
>> + * and create a SPE device if we detect a recent MADT with
>> + * a homogeneous PPI mapping.
>> + */
>> +static int arm_spe_acpi_register_device(void)
>> +{
>> +    int cpu, ret, irq;
>> +    int hetid;
>  > nit: not sure why you use multiple lines here

Functional grouping, but I should probably re-arrange them...


> 
>> +    u16 gsi = 0;
>> +    bool first = true;
>> +
> 
> nit: extra line, and gicc could be declared within the loop in which 
> it's used to limit scope.
> 
>> +    struct acpi_madt_generic_interrupt *gicc;
>> +
>> +    /*
>> +     * sanity check all the GICC tables for the same interrupt number
>> +     * for now we only support homogeneous ACPI/SPE machines.
>> +     */
>> +    for_each_possible_cpu(cpu) {
>> +        gicc = acpi_cpu_get_madt_gicc(cpu);
>> +
>> +        if (gicc->header.length < ACPI_MADT_GICC_SPE)
>> +            return -ENODEV;
>> +        if (first) {
>> +            gsi = gicc->spe_interrupt;
>> +            if (!gsi)
>> +                return -ENODEV;
>> +            hetid = find_acpi_cpu_topology_hetero_id(cpu);
>> +            first = false;
>> +        } else if ((gsi != gicc->spe_interrupt) ||
>> +               (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
>> +            pr_warn("ACPI: SPE must be homogeneous\n");
>> +            return -EINVAL;
>> +        }
>> +    }
>> +
>> +    irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE,
>> +                ACPI_ACTIVE_HIGH);
>> +    if (irq < 0) {
> 
> is irq == 0 a valid Linux IRQ number? From checking 
> irq_create_fw_spec_mapping(), it does not seem to be.

I think this is a bit of a trick question, acpi_register_gsi's 
commented/implementations/etc seem to assume that 0 may be a valid 
interrupt, for example

"
  * Returns: a valid linux IRQ number on success
  *          -EINVAL on failure
"

And various pieces of code have >=0  valid IRQ checks. So... I don't 
think its a problem written this way. It leaves the door open for a 
possible 0 despite that likely not being a valid interrupt.. :)


> 
>> +        pr_warn("ACPI: SPE Unable to register interrupt: %d\n", gsi);
>> +        return irq;
>> +    }
>> +
>> +    spe_resources[0].start = irq;
>> +    ret = platform_device_register(&spe_dev);
>> +    if (ret < 0) {
>> +        pr_warn("ACPI: SPE: Unable to register device\n");
>> +        acpi_unregister_gsi(gsi);
>> +    }
>> +
>> +    return ret;
>> +}
>> +#else
>> +static inline int arm_spe_acpi_register_device(void)
>> +{
>> +    return -ENODEV;
>> +}
>> +#endif /* CONFIG_ARM_SPE_ACPI */
>> +
>>  static int arm_pmu_acpi_parse_irqs(void)
>>  {
>>      int irq, cpu, irq_cpu, err;
>> @@ -279,6 +353,8 @@ static int arm_pmu_acpi_init(void)
>>      if (acpi_disabled)
>>          return 0;
>>
>> +    arm_spe_acpi_register_device(); /* failures are expected */
>> +
>>      ret = arm_pmu_acpi_parse_irqs();
>>      if (ret)
>>          return ret;
>> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
>> index 4641e850b204..784bc58f165a 100644
>> --- a/include/linux/perf/arm_pmu.h
>> +++ b/include/linux/perf/arm_pmu.h
>> @@ -175,4 +175,6 @@ void armpmu_free_irq(int irq, int cpu);
>>
>>  #endif /* CONFIG_ARM_PMU */
>>
>> +#define ARMV8_SPE_PDEV_NAME "arm,spe-v1"
>> +
>>  #endif /* __ARM_PMU_H__ */
>>
> 
> Thanks!
> 
> 
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-08 16:51       ` Sudeep Holla
@ 2019-05-09  9:28         ` Will Deacon
  2019-05-09 10:35           ` Sudeep Holla
  0 siblings, 1 reply; 40+ messages in thread
From: Will Deacon @ 2019-05-09  9:28 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Hanjun Guo, Jeremy Linton, linux-arm-kernel, linux-acpi,
	catalin.marinas, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Hongbo Yao, Alexander Shishkin

On Wed, May 08, 2019 at 05:51:49PM +0100, Sudeep Holla wrote:
> On Wed, May 08, 2019 at 05:35:51PM +0800, Hanjun Guo wrote:
> > +Cc Alexander.
> >
> > On 2019/5/8 1:58, Jeremy Linton wrote:
> > > Hi,
> > >
> > > On 5/4/19 6:06 AM, Hanjun Guo wrote:
> > >> Hi Jeremy, Mark,
> > >>
> > >> On 2019/5/4 7:24, Jeremy Linton wrote:
> > >>> This patch series enables the Arm Statistical Profiling
> > >>> Extension (SPE) on ACPI platforms.
> > >>>
> > >>> This is possible because ACPI 6.3 uses a previously
> > >>> reserved field in the MADT to store the SPE interrupt
> > >>> number, similarly to how the normal PMU is described.
> > >>> If a consistent valid interrupt exists across all the
> > >>> cores in the system, a platform device is registered.
> > >>> That then triggers the SPE module, which runs as normal.
> > >>>
> > >>> We also add the ability to parse the PPTT for IDENTICAL
> > >>> cores. We then use this to sanity check the single SPE
> > >>> device we create. This creates a bit of a problem with
> > >>> respect to the specification though. The specification
> > >>> says that its legal for multiple tree's to exist in the
> > >>> PPTT. We handle this fine, but what happens in the
> > >>> case of multiple tree's is that the lack of a common
> > >>> node with IDENTICAL set forces us to assume that there
> > >>> are multiple non-IDENTICAL cores in the machine.
> > >>
> > >> Adding this patch set on top of latest mainline kernel,
> > >> and tested on D06 which has the SPE feature, in boot message
> > >> shows it was probed successfully:
> > >>
> > >> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
> > >>
> > >> but when I test it with spe events such as
> > >>
> > >> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
> > >>
> > >> it fails with:
> > >> failed to mmap with 12 (Cannot allocate memory),
> > >>
> > >> Confirmed that patch [0] is merged and other perf events are working
> > >> fine.
> > >
> > > Its pretty easy to get into the weeds with this driver, does it work with examples like:
> > >
> > > https://lkml.org/lkml/2018/1/14/122
> >
> > No, not work at all.
> >
> > SPE works on 5.0, but not work after 5.1-rc1, bisected to this commit:
> >
> > 5768402fd9c6 perf/ring_buffer: Use high order allocations for AUX buffers optimistically
> >
> 
> Indeed this patch breaks SPE. As mentioned in the patch, it uses high
> order allocations for AUX buffers and SPE PMU setup_aux explicitly
> fails with the warning "unexpected high-order page for auxbuf!" if
> it encounters one.
> 
> I don't know the intention of that check in SPE. Will ?

Since SPE uses virtual addressing, we don't really care about the underlying
page layout so there's no need to use higher-order allocations. I suppose we
could theoretically map them at the pmd level in some cases, but ignoring
them should also be harmless and I suspect you can delete the check.

Does the patch below fix the problem?

Will

--->8

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 7cb766dafe85..e120f933412a 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -855,16 +855,8 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
 	if (!pglist)
 		goto out_free_buf;
 
-	for (i = 0; i < nr_pages; ++i) {
-		struct page *page = virt_to_page(pages[i]);
-
-		if (PagePrivate(page)) {
-			pr_warn("unexpected high-order page for auxbuf!");
-			goto out_free_pglist;
-		}
-
+	for (i = 0; i < nr_pages; ++i)
 		pglist[i] = virt_to_page(pages[i]);
-	}
 
 	buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);
 	if (!buf->base)

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-09  9:28         ` Will Deacon
@ 2019-05-09 10:35           ` Sudeep Holla
  2019-05-09 14:13             ` Sudeep Holla
  2019-05-13 11:10             ` Hanjun Guo
  0 siblings, 2 replies; 40+ messages in thread
From: Sudeep Holla @ 2019-05-09 10:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: Hanjun Guo, Jeremy Linton, linux-arm-kernel, linux-acpi,
	catalin.marinas, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Hongbo Yao, Alexander Shishkin,
	Sudeep Holla

On Thu, May 09, 2019 at 10:28:11AM +0100, Will Deacon wrote:
> On Wed, May 08, 2019 at 05:51:49PM +0100, Sudeep Holla wrote:
> > On Wed, May 08, 2019 at 05:35:51PM +0800, Hanjun Guo wrote:
> > > +Cc Alexander.
> > >
> > > On 2019/5/8 1:58, Jeremy Linton wrote:
> > > > Hi,
> > > >
> > > > On 5/4/19 6:06 AM, Hanjun Guo wrote:
> > > >> Hi Jeremy, Mark,
> > > >>
> > > >> On 2019/5/4 7:24, Jeremy Linton wrote:
> > > >>> This patch series enables the Arm Statistical Profiling
> > > >>> Extension (SPE) on ACPI platforms.
> > > >>>
> > > >>> This is possible because ACPI 6.3 uses a previously
> > > >>> reserved field in the MADT to store the SPE interrupt
> > > >>> number, similarly to how the normal PMU is described.
> > > >>> If a consistent valid interrupt exists across all the
> > > >>> cores in the system, a platform device is registered.
> > > >>> That then triggers the SPE module, which runs as normal.
> > > >>>
> > > >>> We also add the ability to parse the PPTT for IDENTICAL
> > > >>> cores. We then use this to sanity check the single SPE
> > > >>> device we create. This creates a bit of a problem with
> > > >>> respect to the specification though. The specification
> > > >>> says that its legal for multiple tree's to exist in the
> > > >>> PPTT. We handle this fine, but what happens in the
> > > >>> case of multiple tree's is that the lack of a common
> > > >>> node with IDENTICAL set forces us to assume that there
> > > >>> are multiple non-IDENTICAL cores in the machine.
> > > >>
> > > >> Adding this patch set on top of latest mainline kernel,
> > > >> and tested on D06 which has the SPE feature, in boot message
> > > >> shows it was probed successfully:
> > > >>
> > > >> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
> > > >>
> > > >> but when I test it with spe events such as
> > > >>
> > > >> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
> > > >>
> > > >> it fails with:
> > > >> failed to mmap with 12 (Cannot allocate memory),
> > > >>
> > > >> Confirmed that patch [0] is merged and other perf events are working
> > > >> fine.
> > > >
> > > > Its pretty easy to get into the weeds with this driver, does it work with examples like:
> > > >
> > > > https://lkml.org/lkml/2018/1/14/122
> > >
> > > No, not work at all.
> > >
> > > SPE works on 5.0, but not work after 5.1-rc1, bisected to this commit:
> > >
> > > 5768402fd9c6 perf/ring_buffer: Use high order allocations for AUX buffers optimistically
> > >
> > 
> > Indeed this patch breaks SPE. As mentioned in the patch, it uses high
> > order allocations for AUX buffers and SPE PMU setup_aux explicitly
> > fails with the warning "unexpected high-order page for auxbuf!" if
> > it encounters one.
> > 
> > I don't know the intention of that check in SPE. Will ?
> 
> Since SPE uses virtual addressing, we don't really care about the underlying
> page layout so there's no need to use higher-order allocations. I suppose we
> could theoretically map them at the pmd level in some cases, but ignoring
> them should also be harmless and I suspect you can delete the check.
>

Yes, I did a quick look to see if we can do that, but couldn't find a clue.
Not sure if that's any optimisation, we can use order from page_private
and set the values accordingly ?

> Does the patch below fix the problem?
>

Yes it should help, I tried exactly the same thing yesterday and it does
fix the issue.

Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-09 10:35           ` Sudeep Holla
@ 2019-05-09 14:13             ` Sudeep Holla
  2019-05-13 10:56               ` Will Deacon
  2019-05-13 11:10             ` Hanjun Guo
  1 sibling, 1 reply; 40+ messages in thread
From: Sudeep Holla @ 2019-05-09 14:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: Hanjun Guo, Jeremy Linton, linux-arm-kernel, linux-acpi,
	catalin.marinas, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Hongbo Yao, Alexander Shishkin,
	Sudeep Holla

On Thu, May 09, 2019 at 11:35:59AM +0100, Sudeep Holla wrote:
> On Thu, May 09, 2019 at 10:28:11AM +0100, Will Deacon wrote:

[...]

> >
> > Since SPE uses virtual addressing, we don't really care about the underlying
> > page layout so there's no need to use higher-order allocations. I suppose we
> > could theoretically map them at the pmd level in some cases, but ignoring
> > them should also be harmless and I suspect you can delete the check.
> >
>
> Yes, I did a quick look to see if we can do that, but couldn't find a clue.
> Not sure if that's any optimisation, we can use order from page_private
> and set the values accordingly ?
>
And I forgot to add the diff that I mentioned above, something like the
patch below.

Regards,
Sudeep

-->8

diff --git i/drivers/perf/arm_spe_pmu.c w/drivers/perf/arm_spe_pmu.c
index 7cb766dafe85..45cd62517080 100644
--- i/drivers/perf/arm_spe_pmu.c
+++ w/drivers/perf/arm_spe_pmu.c
@@ -827,7 +827,7 @@ static void arm_spe_pmu_read(struct perf_event *event)
 static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
 				   int nr_pages, bool snapshot)
 {
-	int i, cpu = event->cpu;
+	int i, j, cpu = event->cpu;
 	struct page **pglist;
 	struct arm_spe_pmu_buf *buf;
 
@@ -859,11 +859,12 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
 		struct page *page = virt_to_page(pages[i]);
 
 		if (PagePrivate(page)) {
-			pr_warn("unexpected high-order page for auxbuf!");
-			goto out_free_pglist;
+			for (j = 0; j < 1 << page_private(page); j++)
+				pglist[i + j] = page++;
+			i += j - 1;
+		} else {
+			pglist[i] = page;
 		}
-
-		pglist[i] = virt_to_page(pages[i]);
 	}
 
 	buf->base = vmap(pglist, nr_pages, VM_MAP, PAGE_KERNEL);

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-09 14:13             ` Sudeep Holla
@ 2019-05-13 10:56               ` Will Deacon
  2019-05-13 11:31                 ` Sudeep Holla
  0 siblings, 1 reply; 40+ messages in thread
From: Will Deacon @ 2019-05-13 10:56 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Hanjun Guo, Jeremy Linton, linux-arm-kernel, linux-acpi,
	catalin.marinas, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Hongbo Yao, Alexander Shishkin

Hi Sudeep,

On Thu, May 09, 2019 at 03:13:50PM +0100, Sudeep Holla wrote:
> On Thu, May 09, 2019 at 11:35:59AM +0100, Sudeep Holla wrote:
> > On Thu, May 09, 2019 at 10:28:11AM +0100, Will Deacon wrote:
> 
> [...]
> 
> > >
> > > Since SPE uses virtual addressing, we don't really care about the underlying
> > > page layout so there's no need to use higher-order allocations. I suppose we
> > > could theoretically map them at the pmd level in some cases, but ignoring
> > > them should also be harmless and I suspect you can delete the check.
> > >
> >
> > Yes, I did a quick look to see if we can do that, but couldn't find a clue.
> > Not sure if that's any optimisation, we can use order from page_private
> > and set the values accordingly ?
> >
> And I forgot to add the diff that I mentioned above, something like the
> patch below.
> 
> Regards,
> Sudeep
> 
> -->8
> 
> diff --git i/drivers/perf/arm_spe_pmu.c w/drivers/perf/arm_spe_pmu.c
> index 7cb766dafe85..45cd62517080 100644
> --- i/drivers/perf/arm_spe_pmu.c
> +++ w/drivers/perf/arm_spe_pmu.c
> @@ -827,7 +827,7 @@ static void arm_spe_pmu_read(struct perf_event *event)
>  static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
>  				   int nr_pages, bool snapshot)
>  {
> -	int i, cpu = event->cpu;
> +	int i, j, cpu = event->cpu;
>  	struct page **pglist;
>  	struct arm_spe_pmu_buf *buf;
>  
> @@ -859,11 +859,12 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
>  		struct page *page = virt_to_page(pages[i]);
>  
>  		if (PagePrivate(page)) {
> -			pr_warn("unexpected high-order page for auxbuf!");
> -			goto out_free_pglist;
> +			for (j = 0; j < 1 << page_private(page); j++)
> +				pglist[i + j] = page++;
> +			i += j - 1;
> +		} else {
> +			pglist[i] = page;

Hmm. Given that vmap() doesn't do anything special for high-order pages
and rb_alloc_aux()/rb_alloc_aux_page() already split the allocation up
for the page array, what does your change accomplish on top of that?

Will

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-09 10:35           ` Sudeep Holla
  2019-05-09 14:13             ` Sudeep Holla
@ 2019-05-13 11:10             ` Hanjun Guo
  1 sibling, 0 replies; 40+ messages in thread
From: Hanjun Guo @ 2019-05-13 11:10 UTC (permalink / raw)
  To: Sudeep Holla, Will Deacon
  Cc: Jeremy Linton, linux-arm-kernel, linux-acpi, catalin.marinas,
	rjw, lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry,
	Hongbo Yao, Alexander Shishkin

On 2019/5/9 18:35, Sudeep Holla wrote:
> On Thu, May 09, 2019 at 10:28:11AM +0100, Will Deacon wrote:
>> On Wed, May 08, 2019 at 05:51:49PM +0100, Sudeep Holla wrote:
>>> On Wed, May 08, 2019 at 05:35:51PM +0800, Hanjun Guo wrote:
[...]
>>>>>>
>>>>>> Adding this patch set on top of latest mainline kernel,
>>>>>> and tested on D06 which has the SPE feature, in boot message
>>>>>> shows it was probed successfully:
>>>>>>
>>>>>> arm_spe_pmu arm,spe-v1: probed for CPUs 0-95 [max_record_sz 128, align 4, features 0x7]
>>>>>>
>>>>>> but when I test it with spe events such as
>>>>>>
>>>>>> perf record -c 1024 -e arm_spe_0/branch_filter=0/ -o spe ls
>>>>>>
>>>>>> it fails with:
>>>>>> failed to mmap with 12 (Cannot allocate memory),
>>>>>>
>>>>>> Confirmed that patch [0] is merged and other perf events are working
>>>>>> fine.
>>>>>
>>>>> Its pretty easy to get into the weeds with this driver, does it work with examples like:
>>>>>
>>>>> https://lkml.org/lkml/2018/1/14/122
>>>>
>>>> No, not work at all.
>>>>
>>>> SPE works on 5.0, but not work after 5.1-rc1, bisected to this commit:
>>>>
>>>> 5768402fd9c6 perf/ring_buffer: Use high order allocations for AUX buffers optimistically
>>>>
>>>
>>> Indeed this patch breaks SPE. As mentioned in the patch, it uses high
>>> order allocations for AUX buffers and SPE PMU setup_aux explicitly
>>> fails with the warning "unexpected high-order page for auxbuf!" if
>>> it encounters one.
>>>
>>> I don't know the intention of that check in SPE. Will ?
>>
>> Since SPE uses virtual addressing, we don't really care about the underlying
>> page layout so there's no need to use higher-order allocations. I suppose we
>> could theoretically map them at the pmd level in some cases, but ignoring
>> them should also be harmless and I suspect you can delete the check.
>>
> 
> Yes, I did a quick look to see if we can do that, but couldn't find a clue.
> Not sure if that's any optimisation, we can use order from page_private
> and set the values accordingly ?
> 
>> Does the patch below fix the problem?
>>
> 
> Yes it should help, I tried exactly the same thing yesterday and it does
> fix the issue.

Works for me too, thank you Sudeep and Will for looking into this issue.

Best Regards
Hanjun


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 0/5] arm64: SPE ACPI enablement
  2019-05-13 10:56               ` Will Deacon
@ 2019-05-13 11:31                 ` Sudeep Holla
  0 siblings, 0 replies; 40+ messages in thread
From: Sudeep Holla @ 2019-05-13 11:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: Hanjun Guo, Jeremy Linton, linux-arm-kernel, linux-acpi,
	catalin.marinas, rjw, lenb, mark.rutland, lorenzo.pieralisi,
	linuxarm, john.garry, Hongbo Yao, Sudeep Holla,
	Alexander Shishkin

On Mon, May 13, 2019 at 11:56:31AM +0100, Will Deacon wrote:
> Hi Sudeep,
> 
> On Thu, May 09, 2019 at 03:13:50PM +0100, Sudeep Holla wrote:
> > On Thu, May 09, 2019 at 11:35:59AM +0100, Sudeep Holla wrote:
> > > On Thu, May 09, 2019 at 10:28:11AM +0100, Will Deacon wrote:
> > 
> > [...]
> > 
> > > >
> > > > Since SPE uses virtual addressing, we don't really care about the underlying
> > > > page layout so there's no need to use higher-order allocations. I suppose we
> > > > could theoretically map them at the pmd level in some cases, but ignoring
> > > > them should also be harmless and I suspect you can delete the check.
> > > >
> > >
> > > Yes, I did a quick look to see if we can do that, but couldn't find a clue.
> > > Not sure if that's any optimisation, we can use order from page_private
> > > and set the values accordingly ?
> > >
> > And I forgot to add the diff that I mentioned above, something like the
> > patch below.
> > 
> > Regards,
> > Sudeep
> > 
> > -->8
> > 
> > diff --git i/drivers/perf/arm_spe_pmu.c w/drivers/perf/arm_spe_pmu.c
> > index 7cb766dafe85..45cd62517080 100644
> > --- i/drivers/perf/arm_spe_pmu.c
> > +++ w/drivers/perf/arm_spe_pmu.c
> > @@ -827,7 +827,7 @@ static void arm_spe_pmu_read(struct perf_event *event)
> >  static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
> >  				   int nr_pages, bool snapshot)
> >  {
> > -	int i, cpu = event->cpu;
> > +	int i, j, cpu = event->cpu;
> >  	struct page **pglist;
> >  	struct arm_spe_pmu_buf *buf;
> >  
> > @@ -859,11 +859,12 @@ static void *arm_spe_pmu_setup_aux(struct perf_event *event, void **pages,
> >  		struct page *page = virt_to_page(pages[i]);
> >  
> >  		if (PagePrivate(page)) {
> > -			pr_warn("unexpected high-order page for auxbuf!");
> > -			goto out_free_pglist;
> > +			for (j = 0; j < 1 << page_private(page); j++)
> > +				pglist[i + j] = page++;
> > +			i += j - 1;
> > +		} else {
> > +			pglist[i] = page;
> 
> Hmm. Given that vmap() doesn't do anything special for high-order pages
> and rb_alloc_aux()/rb_alloc_aux_page() already split the allocation up
> for the page array, what does your change accomplish on top of that?
> 

Not much, instead of computing page ptr for each page using virt_to_page,
we jump pointers automatically for all the pages that are private.

page_private(page) holds the order. i.e. for 2MB high order allocation
we can skip calling virt_to_page for 511 pages that are contiguous.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  2019-05-03 23:24 ` [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
  2019-05-05  7:09   ` Kefeng Wang
@ 2019-06-07  9:49   ` Sudeep Holla
  2 siblings, 0 replies; 40+ messages in thread
From: Sudeep Holla @ 2019-06-07  9:49 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry,
	Sudeep Holla

On Fri, May 03, 2019 at 06:24:04PM -0500, Jeremy Linton wrote:
> ACPI 6.3 adds a flag to indicate that child nodes are all
> identical cores. This is useful to authoritatively determine
> if a set of (possibly offline) cores are identical or not.
> 
> Since the flag doesn't give us a unique id we can generate
> one and use it to create bitmaps of sibling nodes, or simply
> in a loop to determine if a subset of cores are identical.
>

If possible reorder this patch with next just to be sure.
I know the user is not introduced until 4/5, but 3/5 kind of fixes
the implementation.


Apart from that, this looks fine to me.

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/pptt.c  | 26 ++++++++++++++++++++++++++
>  include/linux/acpi.h |  5 +++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 83a026765faa..1865515297ca 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
>  	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>  					  ACPI_PPTT_PHYSICAL_PACKAGE);
>  }
> +
> +/**
> + * find_acpi_cpu_topology_hetero_id() - Get a core architecture tag
> + * @cpu: Kernel logical CPU number
> + *
> + * Determine a unique heterogeneous tag for the given CPU. CPUs with the same
> + * implementation should have matching tags.
> + *
> + * The returned tag can be used to group peers with identical implementation.
> + *
> + * The search terminates when a level is found with the identical implementation
> + * flag set or we reach a root node.
> + *
> + * Due to limitations in the PPTT data structure, there may be rare situations
> + * where two cores in a heterogeneous machine may be identical, but won't have
> + * the same tag.
> + *

Indeed, it's unfortunate. I gave some thoughts if we can find ways to
avoid this. Hope we don't have to see such weird combinations with ACPI
based systems.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  2019-05-03 23:24 ` [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
@ 2019-06-07  9:53   ` Sudeep Holla
  2019-06-07 13:15     ` Jeremy Linton
  1 sibling, 1 reply; 40+ messages in thread
From: Sudeep Holla @ 2019-06-07  9:53 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry,
	Sudeep Holla

On Fri, May 03, 2019 at 06:24:05PM -0500, Jeremy Linton wrote:
> The ACPI specification implies that the IDENTICAL flag should be
> set on all non leaf nodes where the children are identical.
> This means that we need to be searching for the last node with
> the identical flag set rather than the first one.
> 
> Since this flag is also dependent on the table revision, we
> need to add a bit of extra code to verify the table revision,
> and the next node's state in the traversal. Since we want to
> avoid function pointers here, lets just special case
> the IDENTICAL flag.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/pptt.c | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 1865515297ca..456e1c0a35ae 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -432,17 +432,39 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
>  	}
>  }
>  
> +static bool flag_identical(struct acpi_table_header *table_hdr,
> +			  struct acpi_pptt_processor *cpu)

Not sure if it's email client problem, but I see quite a few mis-alignment
with parenthesis like above one.

> +{
> +	struct acpi_pptt_processor *next;
> +
> +	/* heterogeneous machines must use PPTT revision > 1 */
> +	if (table_hdr->revision < 2)
> +		return false;
> +
> +	/* Locate the last node in the tree with IDENTICAL set */
> +	if (cpu->flags & ACPI_PPTT_ACPI_IDENTICAL) {
> +		next = fetch_pptt_node(table_hdr, cpu->parent);
> +		if (!(next && next->flags & ACPI_PPTT_ACPI_IDENTICAL))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
>  /* Passing level values greater than this will result in search termination */
>  #define PPTT_ABORT_PACKAGE 0xFF
>  
> -static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_table_header *table_hdr,
> +static struct acpi_pptt_processor *acpi_find_processor_tag_id(struct acpi_table_header *table_hdr,
>  								  struct acpi_pptt_processor *cpu,
>  								  int level, int flag)
>  {
>  	struct acpi_pptt_processor *prev_node;
>  
>  	while (cpu && level) {
> -		if (cpu->flags & flag)
> +		if (flag == ACPI_PPTT_ACPI_IDENTICAL) {

flag_identical anyways check the flag, so I assume you can drop the above
check.

> +			if (flag_identical(table_hdr, cpu))
> +				break;
> +		} else if (cpu->flags & flag)
>  			break;
>  		pr_debug("level %d\n", level);
>  		prev_node = fetch_pptt_node(table_hdr, cpu->parent);
> @@ -480,7 +502,7 @@ static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
>  
>  	cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
>  	if (cpu_node) {
> -		cpu_node = acpi_find_processor_package_id(table, cpu_node,
> +		cpu_node = acpi_find_processor_tag_id(table, cpu_node,
>  							  level, flag);


Again misaligned, may be that's because of renaming.

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-05-03 23:24 ` [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing Jeremy Linton
  2019-05-03 23:24   ` Jeremy Linton
  2019-05-08 11:18   ` John Garry
@ 2019-06-07  9:57   ` Sudeep Holla
  2019-06-07 13:28     ` Jeremy Linton
  2 siblings, 1 reply; 40+ messages in thread
From: Sudeep Holla @ 2019-06-07  9:57 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry,
	Sudeep Holla

On Fri, May 03, 2019 at 06:24:06PM -0500, Jeremy Linton wrote:
> ACPI 6.3 adds additional fields to the MADT GICC
> structure to describe SPE PPI's. We pick these out
> of the cached reference to the madt_gicc structure
> similarly to the core PMU code. We then create a platform
> device referring to the IRQ and let the user/module loader
> decide whether to load the SPE driver.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/include/asm/acpi.h |  3 ++
>  drivers/perf/Kconfig          |  5 +++
>  drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
>  include/linux/perf/arm_pmu.h  |  2 +
>  4 files changed, 86 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index 7628efbe6c12..d10399b9f998 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -41,6 +41,9 @@
>  	(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
>  	(unsigned long)(entry) + (entry)->header.length > (end))
>  
> +#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
> +	spe_interrupt) + sizeof(u16))
> +
>  /* Basic configuration for ACPI */
>  #ifdef	CONFIG_ACPI
>  pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> index af9bc178495d..bc2647c64c9d 100644
> --- a/drivers/perf/Kconfig
> +++ b/drivers/perf/Kconfig
> @@ -52,6 +52,11 @@ config ARM_PMU_ACPI
>  	depends on ARM_PMU && ACPI
>  	def_bool y
>  
> +config ARM_SPE_ACPI
> +	depends on ARM_PMU_ACPI && ARM_SPE_PMU
> +	def_bool y
> +
> +
>  config ARM_DSU_PMU
>  	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
>  	depends on ARM64
> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
> index 0f197516d708..b0244e1e8c91 100644
> --- a/drivers/perf/arm_pmu_acpi.c
> +++ b/drivers/perf/arm_pmu_acpi.c
> @@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
>  	acpi_unregister_gsi(gsi);
>  }
>  
> +#ifdef CONFIG_ARM_SPE_ACPI
> +static struct resource spe_resources[] = {
> +	{
> +		/* irq */
> +		.flags          = IORESOURCE_IRQ,
> +	}
> +};
> +
> +static struct platform_device spe_dev = {
> +	.name = ARMV8_SPE_PDEV_NAME,
> +	.id = -1,
> +	.resource = spe_resources,
> +	.num_resources = ARRAY_SIZE(spe_resources)
> +};
> +
> +/*
> + * For lack of a better place, hook the normal PMU MADT walk
> + * and create a SPE device if we detect a recent MADT with
> + * a homogeneous PPI mapping.
> + */
> +static int arm_spe_acpi_register_device(void)
> +{
> +	int cpu, ret, irq;
> +	int hetid;
> +	u16 gsi = 0;
> +	bool first = true;
> +
> +	struct acpi_madt_generic_interrupt *gicc;
> +
> +	/*
> +	 * sanity check all the GICC tables for the same interrupt number
> +	 * for now we only support homogeneous ACPI/SPE machines.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		gicc = acpi_cpu_get_madt_gicc(cpu);
> +
> +		if (gicc->header.length < ACPI_MADT_GICC_SPE)
> +			return -ENODEV;
> +		if (first) {
> +			gsi = gicc->spe_interrupt;
> +			if (!gsi)
> +				return -ENODEV;
> +			hetid = find_acpi_cpu_topology_hetero_id(cpu);
> +			first = false;
> +		} else if ((gsi != gicc->spe_interrupt) ||
> +			   (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {

Sorry, I should have noticed this in patch 2 itself. Won't this break for
multi-socket system ? The hetid in that case will be package id, no ?

Otherwise the patch looks good.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  2019-06-07  9:53   ` Sudeep Holla
@ 2019-06-07 13:15     ` Jeremy Linton
  2019-06-07 13:47       ` Sudeep Holla
  0 siblings, 1 reply; 40+ messages in thread
From: Jeremy Linton @ 2019-06-07 13:15 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry

Hi,

Thanks for taking a look at this.

On 6/7/19 4:53 AM, Sudeep Holla wrote:
> On Fri, May 03, 2019 at 06:24:05PM -0500, Jeremy Linton wrote:
>> The ACPI specification implies that the IDENTICAL flag should be
>> set on all non leaf nodes where the children are identical.
>> This means that we need to be searching for the last node with
>> the identical flag set rather than the first one.
>>
>> Since this flag is also dependent on the table revision, we
>> need to add a bit of extra code to verify the table revision,
>> and the next node's state in the traversal. Since we want to
>> avoid function pointers here, lets just special case
>> the IDENTICAL flag.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/acpi/pptt.c | 28 +++++++++++++++++++++++++---
>>   1 file changed, 25 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 1865515297ca..456e1c0a35ae 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -432,17 +432,39 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
>>   	}
>>   }
>>   
>> +static bool flag_identical(struct acpi_table_header *table_hdr,
>> +			  struct acpi_pptt_processor *cpu)
> 
> Not sure if it's email client problem, but I see quite a few mis-alignment
> with parenthesis like above one.

It looks fine in the original editor/text patch, but yes in my email 
client I see it off by one (or two/three now that i'm replying). Its a 
mix of tabs/spaces and I've seen this happen before in my email client 
due to the leading "[>+]"?


> 
>> +{
>> +	struct acpi_pptt_processor *next;
>> +
>> +	/* heterogeneous machines must use PPTT revision > 1 */
>> +	if (table_hdr->revision < 2)
>> +		return false;
>> +
>> +	/* Locate the last node in the tree with IDENTICAL set */
>> +	if (cpu->flags & ACPI_PPTT_ACPI_IDENTICAL) {
>> +		next = fetch_pptt_node(table_hdr, cpu->parent);
>> +		if (!(next && next->flags & ACPI_PPTT_ACPI_IDENTICAL))
>> +			return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>>   /* Passing level values greater than this will result in search termination */
>>   #define PPTT_ABORT_PACKAGE 0xFF
>>   
>> -static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_table_header *table_hdr,
>> +static struct acpi_pptt_processor *acpi_find_processor_tag_id(struct acpi_table_header *table_hdr,
>>   								  struct acpi_pptt_processor *cpu,
>>   								  int level, int flag)
>>   {
>>   	struct acpi_pptt_processor *prev_node;
>>   
>>   	while (cpu && level) {
>> -		if (cpu->flags & flag)
>> +		if (flag == ACPI_PPTT_ACPI_IDENTICAL) {
> 
> flag_identical anyways check the flag, so I assume you can drop the above
> check.

? I think that would be a bug because then we would always be returning 
the value of the IDENTICAL flag rather than the other flags passed into 
this routine. This is the special case I think Raphael was asking for 
rather than the function pointer/callback interface.

> 
>> +			if (flag_identical(table_hdr, cpu))
>> +				break;
>> +		} else if (cpu->flags & flag)
>>   			break;
>>   		pr_debug("level %d\n", level);
>>   		prev_node = fetch_pptt_node(table_hdr, cpu->parent);
>> @@ -480,7 +502,7 @@ static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
>>   
>>   	cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
>>   	if (cpu_node) {
>> -		cpu_node = acpi_find_processor_package_id(table, cpu_node,
>> +		cpu_node = acpi_find_processor_tag_id(table, cpu_node,
>>   							  level, flag);
> 
> 
> Again misaligned, may be that's because of renaming.
> 
> --
> Regards,
> Sudeep
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-06-07  9:57   ` Sudeep Holla
@ 2019-06-07 13:28     ` Jeremy Linton
  2019-06-07 13:37       ` Sudeep Holla
  0 siblings, 1 reply; 40+ messages in thread
From: Jeremy Linton @ 2019-06-07 13:28 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry

Hi,

On 6/7/19 4:57 AM, Sudeep Holla wrote:
> On Fri, May 03, 2019 at 06:24:06PM -0500, Jeremy Linton wrote:
>> ACPI 6.3 adds additional fields to the MADT GICC
>> structure to describe SPE PPI's. We pick these out
>> of the cached reference to the madt_gicc structure
>> similarly to the core PMU code. We then create a platform
>> device referring to the IRQ and let the user/module loader
>> decide whether to load the SPE driver.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   arch/arm64/include/asm/acpi.h |  3 ++
>>   drivers/perf/Kconfig          |  5 +++
>>   drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
>>   include/linux/perf/arm_pmu.h  |  2 +
>>   4 files changed, 86 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
>> index 7628efbe6c12..d10399b9f998 100644
>> --- a/arch/arm64/include/asm/acpi.h
>> +++ b/arch/arm64/include/asm/acpi.h
>> @@ -41,6 +41,9 @@
>>   	(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
>>   	(unsigned long)(entry) + (entry)->header.length > (end))
>>   
>> +#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
>> +	spe_interrupt) + sizeof(u16))
>> +
>>   /* Basic configuration for ACPI */
>>   #ifdef	CONFIG_ACPI
>>   pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
>> diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
>> index af9bc178495d..bc2647c64c9d 100644
>> --- a/drivers/perf/Kconfig
>> +++ b/drivers/perf/Kconfig
>> @@ -52,6 +52,11 @@ config ARM_PMU_ACPI
>>   	depends on ARM_PMU && ACPI
>>   	def_bool y
>>   
>> +config ARM_SPE_ACPI
>> +	depends on ARM_PMU_ACPI && ARM_SPE_PMU
>> +	def_bool y
>> +
>> +
>>   config ARM_DSU_PMU
>>   	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
>>   	depends on ARM64
>> diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
>> index 0f197516d708..b0244e1e8c91 100644
>> --- a/drivers/perf/arm_pmu_acpi.c
>> +++ b/drivers/perf/arm_pmu_acpi.c
>> @@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
>>   	acpi_unregister_gsi(gsi);
>>   }
>>   
>> +#ifdef CONFIG_ARM_SPE_ACPI
>> +static struct resource spe_resources[] = {
>> +	{
>> +		/* irq */
>> +		.flags          = IORESOURCE_IRQ,
>> +	}
>> +};
>> +
>> +static struct platform_device spe_dev = {
>> +	.name = ARMV8_SPE_PDEV_NAME,
>> +	.id = -1,
>> +	.resource = spe_resources,
>> +	.num_resources = ARRAY_SIZE(spe_resources)
>> +};
>> +
>> +/*
>> + * For lack of a better place, hook the normal PMU MADT walk
>> + * and create a SPE device if we detect a recent MADT with
>> + * a homogeneous PPI mapping.
>> + */
>> +static int arm_spe_acpi_register_device(void)
>> +{
>> +	int cpu, ret, irq;
>> +	int hetid;
>> +	u16 gsi = 0;
>> +	bool first = true;
>> +
>> +	struct acpi_madt_generic_interrupt *gicc;
>> +
>> +	/*
>> +	 * sanity check all the GICC tables for the same interrupt number
>> +	 * for now we only support homogeneous ACPI/SPE machines.
>> +	 */
>> +	for_each_possible_cpu(cpu) {
>> +		gicc = acpi_cpu_get_madt_gicc(cpu);
>> +
>> +		if (gicc->header.length < ACPI_MADT_GICC_SPE)
>> +			return -ENODEV;
>> +		if (first) {
>> +			gsi = gicc->spe_interrupt;
>> +			if (!gsi)
>> +				return -ENODEV;
>> +			hetid = find_acpi_cpu_topology_hetero_id(cpu);
>> +			first = false;
>> +		} else if ((gsi != gicc->spe_interrupt) ||
>> +			   (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
> 
> Sorry, I should have noticed this in patch 2 itself. Won't this break for
> multi-socket system ? The hetid in that case will be package id, no ?

Your assuming a PPTT with multiple trees, one for each socket? Yes it 
breaks in that case, and is what I was complaining about in the cover 
letter because there won't be a common node with the IDENTICAL flag set. 
OTOH, I think it works fine for multi-socket given a single tree where 
the root has IDENTICAL set (I did some very light testing on such a 
machine).

I don't think there is a solution to this problem that won't break in 
the case of a heterogeneous machine with multiple sockets populated with 
differing cores.

> 
> Otherwise the patch looks good. 
> 
> --
> Regards,
> Sudeep
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing
  2019-06-07 13:28     ` Jeremy Linton
@ 2019-06-07 13:37       ` Sudeep Holla
  0 siblings, 0 replies; 40+ messages in thread
From: Sudeep Holla @ 2019-06-07 13:37 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry

On Fri, Jun 07, 2019 at 08:28:04AM -0500, Jeremy Linton wrote:
> Hi,
> 
> On 6/7/19 4:57 AM, Sudeep Holla wrote:
> > On Fri, May 03, 2019 at 06:24:06PM -0500, Jeremy Linton wrote:
> > > ACPI 6.3 adds additional fields to the MADT GICC
> > > structure to describe SPE PPI's. We pick these out
> > > of the cached reference to the madt_gicc structure
> > > similarly to the core PMU code. We then create a platform
> > > device referring to the IRQ and let the user/module loader
> > > decide whether to load the SPE driver.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   arch/arm64/include/asm/acpi.h |  3 ++
> > >   drivers/perf/Kconfig          |  5 +++
> > >   drivers/perf/arm_pmu_acpi.c   | 76 +++++++++++++++++++++++++++++++++++
> > >   include/linux/perf/arm_pmu.h  |  2 +
> > >   4 files changed, 86 insertions(+)
> > > 
> > > diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> > > index 7628efbe6c12..d10399b9f998 100644
> > > --- a/arch/arm64/include/asm/acpi.h
> > > +++ b/arch/arm64/include/asm/acpi.h
> > > @@ -41,6 +41,9 @@
> > >   	(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
> > >   	(unsigned long)(entry) + (entry)->header.length > (end))
> > > +#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
> > > +	spe_interrupt) + sizeof(u16))
> > > +
> > >   /* Basic configuration for ACPI */
> > >   #ifdef	CONFIG_ACPI
> > >   pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
> > > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > > index af9bc178495d..bc2647c64c9d 100644
> > > --- a/drivers/perf/Kconfig
> > > +++ b/drivers/perf/Kconfig
> > > @@ -52,6 +52,11 @@ config ARM_PMU_ACPI
> > >   	depends on ARM_PMU && ACPI
> > >   	def_bool y
> > > +config ARM_SPE_ACPI
> > > +	depends on ARM_PMU_ACPI && ARM_SPE_PMU
> > > +	def_bool y
> > > +
> > > +
> > >   config ARM_DSU_PMU
> > >   	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
> > >   	depends on ARM64
> > > diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
> > > index 0f197516d708..b0244e1e8c91 100644
> > > --- a/drivers/perf/arm_pmu_acpi.c
> > > +++ b/drivers/perf/arm_pmu_acpi.c
> > > @@ -74,6 +74,80 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
> > >   	acpi_unregister_gsi(gsi);
> > >   }
> > > +#ifdef CONFIG_ARM_SPE_ACPI
> > > +static struct resource spe_resources[] = {
> > > +	{
> > > +		/* irq */
> > > +		.flags          = IORESOURCE_IRQ,
> > > +	}
> > > +};
> > > +
> > > +static struct platform_device spe_dev = {
> > > +	.name = ARMV8_SPE_PDEV_NAME,
> > > +	.id = -1,
> > > +	.resource = spe_resources,
> > > +	.num_resources = ARRAY_SIZE(spe_resources)
> > > +};
> > > +
> > > +/*
> > > + * For lack of a better place, hook the normal PMU MADT walk
> > > + * and create a SPE device if we detect a recent MADT with
> > > + * a homogeneous PPI mapping.
> > > + */
> > > +static int arm_spe_acpi_register_device(void)
> > > +{
> > > +	int cpu, ret, irq;
> > > +	int hetid;
> > > +	u16 gsi = 0;
> > > +	bool first = true;
> > > +
> > > +	struct acpi_madt_generic_interrupt *gicc;
> > > +
> > > +	/*
> > > +	 * sanity check all the GICC tables for the same interrupt number
> > > +	 * for now we only support homogeneous ACPI/SPE machines.
> > > +	 */
> > > +	for_each_possible_cpu(cpu) {
> > > +		gicc = acpi_cpu_get_madt_gicc(cpu);
> > > +
> > > +		if (gicc->header.length < ACPI_MADT_GICC_SPE)
> > > +			return -ENODEV;
> > > +		if (first) {
> > > +			gsi = gicc->spe_interrupt;
> > > +			if (!gsi)
> > > +				return -ENODEV;
> > > +			hetid = find_acpi_cpu_topology_hetero_id(cpu);
> > > +			first = false;
> > > +		} else if ((gsi != gicc->spe_interrupt) ||
> > > +			   (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
> > 
> > Sorry, I should have noticed this in patch 2 itself. Won't this break for
> > multi-socket system ? The hetid in that case will be package id, no ?
> 
> Your assuming a PPTT with multiple trees, one for each socket? Yes it breaks
> in that case, and is what I was complaining about in the cover letter
> because there won't be a common node with the IDENTICAL flag set. OTOH, I
> think it works fine for multi-socket given a single tree where the root has
> IDENTICAL set (I did some very light testing on such a machine).
>

Ah OK, so you are saying such system must have root with IDENTICAL flag
set. Sounds fine, but is there any way to check that ? It's not mandated
right. At least, I didn't have when I hacked up PPTT for Juno and no one
complains(iasl/uefi/linux). But yes that's a different problem, at-least
we know how to fix such PPTT topology.

> I don't think there is a solution to this problem that won't break in the
> case of a heterogeneous machine with multiple sockets populated with
> differing cores.
>

Agreed, I was worried about only about case, but having root node is
feasible solution.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  2019-06-07 13:15     ` Jeremy Linton
@ 2019-06-07 13:47       ` Sudeep Holla
  0 siblings, 0 replies; 40+ messages in thread
From: Sudeep Holla @ 2019-06-07 13:47 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-arm-kernel, linux-acpi, catalin.marinas, will.deacon, rjw,
	lenb, mark.rutland, lorenzo.pieralisi, linuxarm, john.garry,
	Sudeep Holla

On Fri, Jun 07, 2019 at 08:15:50AM -0500, Jeremy Linton wrote:
> Hi,
> 
> Thanks for taking a look at this.
> 
> On 6/7/19 4:53 AM, Sudeep Holla wrote:
> > On Fri, May 03, 2019 at 06:24:05PM -0500, Jeremy Linton wrote:
> > > The ACPI specification implies that the IDENTICAL flag should be
> > > set on all non leaf nodes where the children are identical.
> > > This means that we need to be searching for the last node with
> > > the identical flag set rather than the first one.
> > > 
> > > Since this flag is also dependent on the table revision, we
> > > need to add a bit of extra code to verify the table revision,
> > > and the next node's state in the traversal. Since we want to
> > > avoid function pointers here, lets just special case
> > > the IDENTICAL flag.
> > > 
> > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> > > ---
> > >   drivers/acpi/pptt.c | 28 +++++++++++++++++++++++++---
> > >   1 file changed, 25 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> > > index 1865515297ca..456e1c0a35ae 100644
> > > --- a/drivers/acpi/pptt.c
> > > +++ b/drivers/acpi/pptt.c
> > > @@ -432,17 +432,39 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
> > >   	}
> > >   }
> > > +static bool flag_identical(struct acpi_table_header *table_hdr,
> > > +			  struct acpi_pptt_processor *cpu)
> > 
> > Not sure if it's email client problem, but I see quite a few mis-alignment
> > with parenthesis like above one.
> 
> It looks fine in the original editor/text patch, but yes in my email client
> I see it off by one (or two/three now that i'm replying). Its a mix of
> tabs/spaces and I've seen this happen before in my email client due to the
> leading "[>+]"?
>

No I have configured(hopefully correctly) my client, but if you not seeing
issue with patch, that's fine.

> 
> > 
> > > +{
> > > +	struct acpi_pptt_processor *next;
> > > +
> > > +	/* heterogeneous machines must use PPTT revision > 1 */
> > > +	if (table_hdr->revision < 2)
> > > +		return false;
> > > +
> > > +	/* Locate the last node in the tree with IDENTICAL set */
> > > +	if (cpu->flags & ACPI_PPTT_ACPI_IDENTICAL) {
> > > +		next = fetch_pptt_node(table_hdr, cpu->parent);
> > > +		if (!(next && next->flags & ACPI_PPTT_ACPI_IDENTICAL))
> > > +			return true;
> > > +	}
> > > +
> > > +	return false;
> > > +}
> > > +
> > >   /* Passing level values greater than this will result in search termination */
> > >   #define PPTT_ABORT_PACKAGE 0xFF
> > > -static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_table_header *table_hdr,
> > > +static struct acpi_pptt_processor *acpi_find_processor_tag_id(struct acpi_table_header *table_hdr,
> > >   								  struct acpi_pptt_processor *cpu,
> > >   								  int level, int flag)
> > >   {
> > >   	struct acpi_pptt_processor *prev_node;
> > >   	while (cpu && level) {
> > > -		if (cpu->flags & flag)
> > > +		if (flag == ACPI_PPTT_ACPI_IDENTICAL) {
> > 
> > flag_identical anyways check the flag, so I assume you can drop the above
> > check.
> 
> ? I think that would be a bug because then we would always be returning the
> value of the IDENTICAL flag rather than the other flags passed into this
> routine. This is the special case I think Raphael was asking for rather than
> the function pointer/callback interface.
>

Ah OK, got it. Worth a comment ? I am sure I will forget next time I see this.

--
Regards,
Sudeep


^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2019-06-07 13:48 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-03 23:24 [PATCH v3 0/5] arm64: SPE ACPI enablement Jeremy Linton
2019-05-03 23:24 ` Jeremy Linton
2019-05-03 23:24 ` [PATCH v3 1/5] ACPI/PPTT: Trivial, change the capitalization of CPU Jeremy Linton
2019-05-03 23:24   ` Jeremy Linton
2019-05-07 18:12   ` Jeremy Linton
2019-05-07 18:12     ` Jeremy Linton
2019-05-03 23:24 ` [PATCH v3 2/5] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens Jeremy Linton
2019-05-03 23:24   ` Jeremy Linton
2019-05-05  7:09   ` Kefeng Wang
2019-05-05  7:09     ` Kefeng Wang
2019-05-07 18:26     ` Jeremy Linton
2019-05-07 18:26       ` Jeremy Linton
2019-06-07  9:49   ` Sudeep Holla
2019-05-03 23:24 ` [PATCH v3 3/5] ACPI/PPTT: Modify node flag detection to find last IDENTICAL Jeremy Linton
2019-05-03 23:24   ` Jeremy Linton
2019-06-07  9:53   ` Sudeep Holla
2019-06-07 13:15     ` Jeremy Linton
2019-06-07 13:47       ` Sudeep Holla
2019-05-03 23:24 ` [PATCH v3 4/5] arm_pmu: acpi: spe: Add initial MADT/SPE probing Jeremy Linton
2019-05-03 23:24   ` Jeremy Linton
2019-05-08 11:18   ` John Garry
2019-05-08 20:04     ` Jeremy Linton
2019-06-07  9:57   ` Sudeep Holla
2019-06-07 13:28     ` Jeremy Linton
2019-06-07 13:37       ` Sudeep Holla
2019-05-03 23:24 ` [PATCH v3 5/5] perf: arm_spe: Enable ACPI/Platform automatic module loading Jeremy Linton
2019-05-03 23:24   ` Jeremy Linton
2019-05-04 11:06 ` [PATCH v3 0/5] arm64: SPE ACPI enablement Hanjun Guo
2019-05-04 11:06   ` Hanjun Guo
2019-05-07 17:58   ` Jeremy Linton
2019-05-07 17:58     ` Jeremy Linton
2019-05-08  9:35     ` Hanjun Guo
2019-05-08 16:51       ` Sudeep Holla
2019-05-09  9:28         ` Will Deacon
2019-05-09 10:35           ` Sudeep Holla
2019-05-09 14:13             ` Sudeep Holla
2019-05-13 10:56               ` Will Deacon
2019-05-13 11:31                 ` Sudeep Holla
2019-05-13 11:10             ` Hanjun Guo
2019-05-08 16:45   ` Sudeep Holla

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).