* Re: [PATCH v7 00/13] Support PPTT for ARM64
  2018-03-01 12:06   ` Sudeep Holla
@ 2018-02-27 18:49     ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-27 18:49 UTC (permalink / raw)
  To: Sudeep Holla, linux-acpi
  Cc: linux-arm-kernel, lorenzo.pieralisi, hanjun.guo, rjw,
	will.deacon, catalin.marinas, gregkh, mark.rutland, linux-kernel,
	linux-riscv, wangxiongfeng2, vkilari, ahs3, dietmar.eggemann,
	morten.rasmussen, palmer, lenb, john.garry, austinwc, tnowicki

On 03/01/2018 06:06 AM, Sudeep Holla wrote:
> Hi Jeremy,
> 
> On 28/02/18 22:06, Jeremy Linton wrote:
>> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
>> used to describe the processor and cache topology. Ideally it is
>> used to extend/override information provided by the hardware, but
>> right now ARM64 is entirely dependent on firmware provided tables.
>>
>> This patch parses the table for the cache topology and CPU topology.
>> When we enable ACPI/PPTT for arm64 we map the physical_id to the
>> PPTT node flagged as the physical package by the firmware.
>> This results in topologies that match what the remainder of the
>> system expects. To avoid inverted scheduler domains we then
>> set the MC domain equal to the largest cache within the socket
>> below the NUMA domain.
>>
> I remember reviewing and acknowledging most of the cacheinfo stuff with
> a couple of minor suggestions for v6. I don't see any Acked-by tags in
> this series and don't know if I need to review/ack any more cacheinfo
> related patches.

Hi,

Yes, I didn't put them in because I changed the functionality in 2/13 
and there is a bug fix in 5/13. I thought you might want to do a quick 
diff of the git v6->v7 tree.

Although, given that most of the changes were in response to your 
comments in v6, I probably should have just put the tags in.
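
(For reference, that comparison would be something along the lines of:

  $ git diff pptt_v6..pptt_v7 -- drivers/base/cacheinfo.c drivers/acpi/pptt.c

where the pptt_v6 branch name is an assumption, by analogy with the
pptt_v7 branch published with this series.)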


Thanks,

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-01 15:52     ` Morten Rasmussen
@ 2018-02-27 20:18       ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-27 20:18 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, rjw, will.deacon, catalin.marinas, gregkh,
	mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2, vkilari,
	ahs3, dietmar.eggemann, palmer, lenb, john.garry, austinwc,
	tnowicki

Hi,


First, thanks for taking a look at this.

On 03/01/2018 09:52 AM, Morten Rasmussen wrote:
> Hi Jeremy,
> 
> On Wed, Feb 28, 2018 at 04:06:19PM -0600, Jeremy Linton wrote:
>> Now that we have an accurate view of the physical topology
>> we need to represent it correctly to the scheduler. In the
>> case of NUMA in socket, we need to assure that the sched domain
>> we build for the MC layer isn't larger than the DIE above it.
> 
> MC shouldn't be larger than any of the NUMA domains either.

Right, that is one of the things this patch is ensuring.

> 
>> To do this correctly, we should really base that on the cache
>> topology immediately below the NUMA node (for NUMA in socket)
>> or below the physical package for normal NUMA configurations.
> 
> That means we wouldn't support multi-die NUMA nodes?

You mean a bottom-level NUMA domain that crosses multiple sockets/dies? 
That should work. This patch is picking the widest cache layer below the 
smallest of the package or NUMA grouping. What actually happens depends 
on the topology. Given a case where there are multiple dies in a socket, 
and the NUMA domain is at the socket level, the MC is going to reflect 
the caching topology immediately below the socket. In the case of 
multiple dies with a cache that crosses them in the socket, the MC is 
basically going to be the socket; otherwise, if the widest cache is per 
die, or some narrower grouping (cluster?), then that is what ends up in 
the MC. (This is easier with some pictures.)
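
(To make that concrete, here is a minimal userspace sketch of the
selection rule; plain bitmasks stand in for cpumask_t, and the masks
below describe a hypothetical 16-CPU node, not any real machine:

#include <stdio.h>
#include <stdint.h>

/* mirrors cpumask_subset(): every CPU in a is also in b */
static int subset(uint64_t a, uint64_t b)
{
	return (a & ~b) == 0;
}

int main(void)
{
	uint64_t node = 0xffff;	/* smaller of package/NUMA mask: CPUs 0-15 */
	/* per-level cache sibling masks for CPU0, smallest level first */
	uint64_t cache[] = { 0x0003, 0x000f, 0x00ff };
	int level, pick = -1;

	for (level = 0; level < 3; level++)
		if (subset(cache[level], node))
			pick = level;	/* remember the widest subset seen */

	if (pick >= 0)
		printf("MC mask: %#jx (cache level index %d)\n",
		       (uintmax_t)cache[pick], pick);
	return 0;
}

This prints 0xff: the widest cache mask that is still contained within
the node mask wins, never anything wider than the node.)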

> 
>> This patch creates a set of early cache_siblings masks, then
>> when the scheduler requests the coregroup mask we pick the
>> smaller of the physical package siblings, or the numa siblings
>> and locate the largest cache which is an entire subset of
>> those siblings. If we are unable to find a proper subset of
>> cores then we retain the original behavior and return the
>> core_sibling list.
> 
> IIUC, for numa-in-package it is a strict requirement that there is a
> cache that spans the entire NUMA node? For example, having a NUMA node
> consisting of two clusters with per-cluster caches only wouldn't be
> supported?

Everything is supported; the MC is reflecting the cache topology. We 
just use the physical/NUMA topology to help us pick which layer of the 
cache topology lands in the MC (unless, of course, we fail to find a 
PPTT/cache topology, in which case we fall back to the old behavior of 
core_siblings, which can reflect the MPIDR/etc.).

> 
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   arch/arm64/include/asm/topology.h |  5 +++
>>   arch/arm64/kernel/topology.c      | 64 +++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 69 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
>> index 6b10459e6905..08db3e4e44e1 100644
>> --- a/arch/arm64/include/asm/topology.h
>> +++ b/arch/arm64/include/asm/topology.h
>> @@ -4,12 +4,17 @@
>>   
>>   #include <linux/cpumask.h>
>>   
>> +#define MAX_CACHE_CHECKS 4
>> +
>>   struct cpu_topology {
>>   	int thread_id;
>>   	int core_id;
>>   	int package_id;
>> +	int cache_id[MAX_CACHE_CHECKS];
>>   	cpumask_t thread_sibling;
>>   	cpumask_t core_sibling;
>> +	cpumask_t cache_siblings[MAX_CACHE_CHECKS];
>> +	int cache_level;
>>   };
>>   
>>   extern struct cpu_topology cpu_topology[NR_CPUS];
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index bd1aae438a31..1809dc9d347c 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -212,8 +212,42 @@ static int __init parse_dt_topology(void)
>>   struct cpu_topology cpu_topology[NR_CPUS];
>>   EXPORT_SYMBOL_GPL(cpu_topology);
>>   
>> +static void find_llc_topology_for_cpu(int cpu)
> 
> Isn't it more find core/node siblings? Or is it a requirement that the
> last level cache spans exactly one NUMA node? For example, a package
> level cache isn't allowed for numa-in-package?

Yes, it's a core siblings group, but more like a 
widest_core_siblings_sharing_a_cache_equalorsmaller_than_the_smallest_of_numa_or_package()

LLC is a bit of a misnomer because it's the 'LLC' within the package/PX 
domain. It's possible there is an LLC grouping larger than whatever we 
pick, but we don't care.


> 
>> +{
>> +	/* first determine if we are a NUMA in package */
>> +	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
>> +	int indx;
>> +
>> +	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
>> +		/* not numa in package, lets use the package siblings */
>> +		node_mask = &cpu_topology[cpu].core_sibling;
>> +	}
>> +
>> +	/*
>> +	 * node_mask should represent the smallest package/numa grouping
>> +	 * lets search for the largest cache smaller than the node_mask.
>> +	 */
>> +	for (indx = 0; indx < MAX_CACHE_CHECKS; indx++) {
>> +		cpumask_t *cache_sibs = &cpu_topology[cpu].cache_siblings[indx];
>> +
>> +		if (cpu_topology[cpu].cache_id[indx] < 0)
>> +			continue;
>> +
>> +		if (cpumask_subset(cache_sibs, node_mask))
>> +			cpu_topology[cpu].cache_level = indx;
> 
> I don't think this guarantees that the cache level we found matches exactly
> the NUMA node. Taking the two cluster NUMA node example from above, we
> would set cache_level to point at the per-cluster cache as it is a
> subset of the NUMA node but it would only span half of the node. Or am I
> missing something?

I think you got it. If the system is a traditional ARM system with 
shared L2s at the cluster level, doesn't have any L3s/etc., and the 
NUMA node crosses multiple clusters, then you get the cluster L2 
grouping in the MC.

I think this is what we want, particularly since the newer/larger 
machines do have L3+ caches contained within their sockets or NUMA 
domains, so you end up with that as the MC.


> 
>> +	}
>> +}
>> +
>>   const struct cpumask *cpu_coregroup_mask(int cpu)
>>   {
>> +	int *llc = &cpu_topology[cpu].cache_level;
>> +
>> +	if (*llc == -1)
>> +		find_llc_topology_for_cpu(cpu);
>> +
>> +	if (*llc != -1)
>> +		return &cpu_topology[cpu].cache_siblings[*llc];
>> +
>>   	return &cpu_topology[cpu].core_sibling;
>>   }
>>   
>> @@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
>>   {
>>   	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
>>   	int cpu;
>> +	int idx;
>>   
>>   	/* update core and thread sibling masks */
>>   	for_each_possible_cpu(cpu) {
>> @@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
>>   		if (cpuid_topo->package_id != cpu_topo->package_id)
>>   			continue;
>>   
>> +		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
>> +			cpumask_t *lsib;
>> +			int cput_id = cpuid_topo->cache_id[idx];
>> +
>> +			if (cput_id == cpu_topo->cache_id[idx]) {
>> +				lsib = &cpuid_topo->cache_siblings[idx];
>> +				cpumask_set_cpu(cpu, lsib);
>> +			}
> 
> Shouldn't the cache_id validity be checked here? I don't think it breaks
> anything though.

It could be, but since it's explicitly looking for unified caches, it's 
likely that some of the levels are invalid. Invalid levels get ignored 
later on, so we don't really care whether they are valid here.

> 
> Overall, I think this is more or less in line with the MC domain
> shrinking I just mentioned in the v6 discussion. It is mostly the corner
> cases and assumption about the system topology I'm not sure about.

I think it's the corner cases I'm taking care of. The simple fix in v6 is 
to take the smaller of core_siblings or node_siblings, but that ignores 
cases with split L3s (or the L2-only example above). The idea here is to 
assure that the MC is following a cache topology. In my mind, it is more a 
question of how that is picked. The other way I see to do this is with 
a PX domain flag in the PPTT. We could then pick the core grouping one 
below that flag. Doing it that way affords the firmware vendors a lever 
they can pull to optimize a given machine for the Linux scheduler behavior.

This seems a good first pass, given that such a flag isn't in the ACPI spec.

* [PATCH v7 00/13] Support PPTT for ARM64
@ 2018-02-28 22:06 ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
used to describe the processor and cache topology. Ideally it is
used to extend/override information provided by the hardware, but
right now ARM64 is entirely dependent on firmware provided tables.

This patch parses the table for the cache topology and CPU topology.
When we enable ACPI/PPTT for arm64 we map the physical_id to the
PPTT node flagged as the physical package by the firmware.
This results in topologies that match what the remainder of the
system expects. To avoid inverted scheduler domains we then
set the MC domain equal to the largest cache within the socket
below the NUMA domain.

For example on juno:
[root@mammon-juno-rh topology]# lstopo-no-graphics
  Package L#0
    L2 L#0 (1024KB)
      L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
      L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
      L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
      L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L2 L#1 (2048KB)
      L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
      L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
  HostBridge L#0
    PCIBridge
      PCIBridge
        PCIBridge
          PCI 1095:3132
            Block(Disk) L#0 "sda"
        PCIBridge
          PCI 1002:68f9
            GPU L#1 "renderD128"
            GPU L#2 "card0"
            GPU L#3 "controlD64"
        PCIBridge
          PCI 11ab:4380
            Net L#4 "enp8s0"
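
(Given that layout, the per-cluster L2s are the widest caches below the
package, so, per the selection described above, one would expect the MC
domain to cover CPUs 0-3 for the first cluster and CPUs 4-5 for the
second, with DIE spanning the whole package.)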

Git tree at:
http://linux-arm.org/git?p=linux-jlinton.git
branch: pptt_v7

v6->v7:
Add an additional patch to use the last cache level within the NUMA
  node or socket as the MC domain. This assures the MC domain is
  equal to or smaller than the DIE.

Addressed various formatting/etc. review comments.

Rebase to 4.16-rc2

v5->v6:
Add additional patches which refactor how the initial DT code sets
  up the cacheinfo structure so that it's not as dependent on the
  of_node stored in that tree. Once that is done, we rename it
  for use with the ACPI code.

Additionally, there were a fair number of minor name/location/etc.
  tweaks scattered about, made in response to review comments.

v4->v5:
Update the cache type from NOCACHE to UNIFIED when all the cache
  attributes we update are valid. This fixes a problem where caches
  which are entirely created by the PPTT don't show up in lstopo.

Give the PPTT its own firmware_node in the cache structure instead of
  sharing it with the of_node.

Move some pieces around between patches.

(see previous cover letters for further changes)

Jeremy Linton (13):
  drivers: base: cacheinfo: move cache_setup_of_node()
  drivers: base: cacheinfo: setup DT cache properties early
  cacheinfo: rename of_node to fw_token
  arm64/acpi: Create arch specific cpu to acpi id helper
  ACPI/PPTT: Add Processor Properties Topology Table parsing
  ACPI: Enable PPTT support on ARM64
  drivers: base cacheinfo: Add support for ACPI based firmware tables
  arm64: Add support for ACPI based firmware tables
  ACPI/PPTT: Add topology parsing code
  arm64: topology: rename cluster_id
  arm64: topology: enable ACPI/PPTT based CPU topology
  ACPI: Add PPTT to injectable table list
  arm64: topology: divorce MC scheduling domain from core_siblings

 arch/arm64/Kconfig                |   1 +
 arch/arm64/include/asm/acpi.h     |   4 +
 arch/arm64/include/asm/topology.h |   9 +-
 arch/arm64/kernel/cacheinfo.c     |  15 +-
 arch/arm64/kernel/topology.c      | 132 +++++++-
 arch/riscv/kernel/cacheinfo.c     |   1 -
 drivers/acpi/Kconfig              |   3 +
 drivers/acpi/Makefile             |   1 +
 drivers/acpi/pptt.c               | 642 ++++++++++++++++++++++++++++++++++++++
 drivers/acpi/tables.c             |   2 +-
 drivers/base/cacheinfo.c          | 157 +++++-----
 include/linux/acpi.h              |   4 +
 include/linux/cacheinfo.h         |  17 +-
 13 files changed, 882 insertions(+), 106 deletions(-)
 create mode 100644 drivers/acpi/pptt.c

-- 
2.13.6

* [PATCH v7 01/13] drivers: base: cacheinfo: move cache_setup_of_node()
  2018-02-28 22:06 ` Jeremy Linton
@ 2018-02-28 22:06   ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

In preparation for the next patch, and to aid in
review of that patch, let's move cache_setup_of_node
further down in the module without any changes.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/base/cacheinfo.c | 80 ++++++++++++++++++++++++------------------------
 1 file changed, 40 insertions(+), 40 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index edf726267282..09ccef7ddc99 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -32,46 +32,6 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 }
 
 #ifdef CONFIG_OF
-static int cache_setup_of_node(unsigned int cpu)
-{
-	struct device_node *np;
-	struct cacheinfo *this_leaf;
-	struct device *cpu_dev = get_cpu_device(cpu);
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
-	unsigned int index = 0;
-
-	/* skip if of_node is already populated */
-	if (this_cpu_ci->info_list->of_node)
-		return 0;
-
-	if (!cpu_dev) {
-		pr_err("No cpu device for CPU %d\n", cpu);
-		return -ENODEV;
-	}
-	np = cpu_dev->of_node;
-	if (!np) {
-		pr_err("Failed to find cpu%d device node\n", cpu);
-		return -ENOENT;
-	}
-
-	while (index < cache_leaves(cpu)) {
-		this_leaf = this_cpu_ci->info_list + index;
-		if (this_leaf->level != 1)
-			np = of_find_next_cache_node(np);
-		else
-			np = of_node_get(np);/* cpu node itself */
-		if (!np)
-			break;
-		this_leaf->of_node = np;
-		index++;
-	}
-
-	if (index != cache_leaves(cpu)) /* not all OF nodes populated */
-		return -ENOENT;
-
-	return 0;
-}
-
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
 {
@@ -202,6 +162,46 @@ static void cache_of_override_properties(unsigned int cpu)
 		cache_associativity(this_leaf);
 	}
 }
+
+static int cache_setup_of_node(unsigned int cpu)
+{
+	struct device_node *np;
+	struct cacheinfo *this_leaf;
+	struct device *cpu_dev = get_cpu_device(cpu);
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	unsigned int index = 0;
+
+	/* skip if of_node is already populated */
+	if (this_cpu_ci->info_list->of_node)
+		return 0;
+
+	if (!cpu_dev) {
+		pr_err("No cpu device for CPU %d\n", cpu);
+		return -ENODEV;
+	}
+	np = cpu_dev->of_node;
+	if (!np) {
+		pr_err("Failed to find cpu%d device node\n", cpu);
+		return -ENOENT;
+	}
+
+	while (index < cache_leaves(cpu)) {
+		this_leaf = this_cpu_ci->info_list + index;
+		if (this_leaf->level != 1)
+			np = of_find_next_cache_node(np);
+		else
+			np = of_node_get(np);/* cpu node itself */
+		if (!np)
+			break;
+		this_leaf->of_node = np;
+		index++;
+	}
+
+	if (index != cache_leaves(cpu)) /* not all OF nodes populated */
+		return -ENOENT;
+
+	return 0;
+}
 #else
 static void cache_of_override_properties(unsigned int cpu) { }
 static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
-- 
2.13.6

* [PATCH v7 02/13] drivers: base: cacheinfo: setup DT cache properties early
  2018-02-28 22:06 ` Jeremy Linton
@ 2018-02-28 22:06   ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

The original intent in cacheinfo was that an architecture
specific populate_cache_leaves() would probe the hardware
and then cache_shared_cpu_map_setup() and
cache_override_properties() would provide firmware help to
extend/expand upon what was probed. Arm64 was really
the only architecture that was working this way, and
with the removal of most of the hardware probing logic it
became clear that it was possible to simplify the logic a bit.

This patch combines the walk of the DT nodes with the
code updating the cache size/line_size and nr_sets.
cache_override_properties() (which was DT specific) is
then removed. The result is that cacheinfo.of_node is
no longer used as a temporary place to hold DT references
for future calls that update cache properties. That change
helps to clarify its one remaining use (matching
cacheinfo nodes that represent shared caches), which
will be used by the ACPI/PPTT code in the following patches.
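
(For context, these are the standard DT cache properties the helpers
in this patch read; a minimal, purely illustrative L2 node might look
like:

	l2: l2-cache {
		compatible = "cache";
		cache-unified;			/* cache_node_is_unified() */
		cache-size = <0x100000>;	/* cache_size(): 1 MiB */
		cache-line-size = <64>;		/* cache_get_line_size() */
		cache-sets = <1024>;		/* cache_nr_sets() */
	};
)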

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/riscv/kernel/cacheinfo.c |  1 -
 drivers/base/cacheinfo.c      | 65 +++++++++++++++++++------------------------
 2 files changed, 29 insertions(+), 37 deletions(-)

diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
index 10ed2749e246..0bc86e5f8f3f 100644
--- a/arch/riscv/kernel/cacheinfo.c
+++ b/arch/riscv/kernel/cacheinfo.c
@@ -20,7 +20,6 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 			 struct device_node *node,
 			 enum cache_type type, unsigned int level)
 {
-	this_leaf->of_node = node;
 	this_leaf->level = level;
 	this_leaf->type = type;
 	/* not a sector cache */
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 09ccef7ddc99..a872523e8951 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -71,7 +71,7 @@ static inline int get_cacheinfo_idx(enum cache_type type)
 	return type;
 }
 
-static void cache_size(struct cacheinfo *this_leaf)
+static void cache_size(struct cacheinfo *this_leaf, struct device_node *np)
 {
 	const char *propname;
 	const __be32 *cache_size;
@@ -80,13 +80,14 @@ static void cache_size(struct cacheinfo *this_leaf)
 	ct_idx = get_cacheinfo_idx(this_leaf->type);
 	propname = cache_type_info[ct_idx].size_prop;
 
-	cache_size = of_get_property(this_leaf->of_node, propname, NULL);
+	cache_size = of_get_property(np, propname, NULL);
 	if (cache_size)
 		this_leaf->size = of_read_number(cache_size, 1);
 }
 
 /* not cache_line_size() because that's a macro in include/linux/cache.h */
-static void cache_get_line_size(struct cacheinfo *this_leaf)
+static void cache_get_line_size(struct cacheinfo *this_leaf,
+				struct device_node *np)
 {
 	const __be32 *line_size;
 	int i, lim, ct_idx;
@@ -98,7 +99,7 @@ static void cache_get_line_size(struct cacheinfo *this_leaf)
 		const char *propname;
 
 		propname = cache_type_info[ct_idx].line_size_props[i];
-		line_size = of_get_property(this_leaf->of_node, propname, NULL);
+		line_size = of_get_property(np, propname, NULL);
 		if (line_size)
 			break;
 	}
@@ -107,7 +108,7 @@ static void cache_get_line_size(struct cacheinfo *this_leaf)
 		this_leaf->coherency_line_size = of_read_number(line_size, 1);
 }
 
-static void cache_nr_sets(struct cacheinfo *this_leaf)
+static void cache_nr_sets(struct cacheinfo *this_leaf, struct device_node *np)
 {
 	const char *propname;
 	const __be32 *nr_sets;
@@ -116,7 +117,7 @@ static void cache_nr_sets(struct cacheinfo *this_leaf)
 	ct_idx = get_cacheinfo_idx(this_leaf->type);
 	propname = cache_type_info[ct_idx].nr_sets_prop;
 
-	nr_sets = of_get_property(this_leaf->of_node, propname, NULL);
+	nr_sets = of_get_property(np, propname, NULL);
 	if (nr_sets)
 		this_leaf->number_of_sets = of_read_number(nr_sets, 1);
 }
@@ -135,32 +136,27 @@ static void cache_associativity(struct cacheinfo *this_leaf)
 		this_leaf->ways_of_associativity = (size / nr_sets) / line_size;
 }
 
-static bool cache_node_is_unified(struct cacheinfo *this_leaf)
+static bool cache_node_is_unified(struct cacheinfo *this_leaf,
+				  struct device_node *np)
 {
-	return of_property_read_bool(this_leaf->of_node, "cache-unified");
+	return of_property_read_bool(np, "cache-unified");
 }
 
-static void cache_of_override_properties(unsigned int cpu)
+static void cache_of_set_props(struct cacheinfo *this_leaf,
+			       struct device_node *np)
 {
-	int index;
-	struct cacheinfo *this_leaf;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
-
-	for (index = 0; index < cache_leaves(cpu); index++) {
-		this_leaf = this_cpu_ci->info_list + index;
-		/*
-		 * init_cache_level must setup the cache level correctly
-		 * overriding the architecturally specified levels, so
-		 * if type is NONE at this stage, it should be unified
-		 */
-		if (this_leaf->type == CACHE_TYPE_NOCACHE &&
-		    cache_node_is_unified(this_leaf))
-			this_leaf->type = CACHE_TYPE_UNIFIED;
-		cache_size(this_leaf);
-		cache_get_line_size(this_leaf);
-		cache_nr_sets(this_leaf);
-		cache_associativity(this_leaf);
-	}
+	/*
+	 * init_cache_level() must set up the cache level correctly,
+	 * overriding the architecturally specified levels, so if the
+	 * type is NONE at this stage, it should be unified
+	 */
+	if (this_leaf->type == CACHE_TYPE_NOCACHE &&
+	    cache_node_is_unified(this_leaf, np))
+		this_leaf->type = CACHE_TYPE_UNIFIED;
+	cache_size(this_leaf, np);
+	cache_get_line_size(this_leaf, np);
+	cache_nr_sets(this_leaf, np);
+	cache_associativity(this_leaf);
 }
 
 static int cache_setup_of_node(unsigned int cpu)
@@ -193,6 +189,7 @@ static int cache_setup_of_node(unsigned int cpu)
 			np = of_node_get(np);/* cpu node itself */
 		if (!np)
 			break;
+		cache_of_set_props(this_leaf, np);
 		this_leaf->of_node = np;
 		index++;
 	}
@@ -203,7 +200,6 @@ static int cache_setup_of_node(unsigned int cpu)
 	return 0;
 }
 #else
-static void cache_of_override_properties(unsigned int cpu) { }
 static inline int cache_setup_of_node(unsigned int cpu) { return 0; }
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
@@ -286,12 +282,6 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 	}
 }
 
-static void cache_override_properties(unsigned int cpu)
-{
-	if (of_have_populated_dt())
-		return cache_of_override_properties(cpu);
-}
-
 static void free_cache_attributes(unsigned int cpu)
 {
 	if (!per_cpu_cacheinfo(cpu))
@@ -325,6 +315,10 @@ static int detect_cache_attributes(unsigned int cpu)
 	if (per_cpu_cacheinfo(cpu) == NULL)
 		return -ENOMEM;
 
+	/*
+	 * populate_cache_leaves() may completely set up the cache leaves and
+	 * shared_cpu_map, or it may leave them partially set up.
+	 */
 	ret = populate_cache_leaves(cpu);
 	if (ret)
 		goto free_ci;
@@ -338,7 +332,6 @@ static int detect_cache_attributes(unsigned int cpu)
 		goto free_ci;
 	}
 
-	cache_override_properties(cpu);
 	return 0;
 
 free_ci:
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH v7 03/13] cacheinfo: rename of_node to fw_token
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

Rename of_node to fw_token and change its type to indicate
it is a generic pointer that is generally used only for
comparison purposes. In a later patch we will store an
ACPI/PPTT token pointer in fw_token so that the code which
builds the shared cpu masks can be reused.
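
A minimal standalone sketch of why an opaque token suffices (again
user-space C with invented names): the token is only compared for
equality and never dereferenced, so any unique firmware object
address works:

#include <stdbool.h>
#include <stdio.h>

struct leaf { void *fw_token; };

static bool leaves_are_shared(const struct leaf *a, const struct leaf *b)
{
	/* equality is all we need, for DT nodes and PPTT entries alike */
	return a->fw_token == b->fw_token;
}

int main(void)
{
	int fw_obj;                     /* stands in for a firmware node */
	struct leaf cpu0_l2 = { &fw_obj };
	struct leaf cpu1_l2 = { &fw_obj };

	printf("shared: %d\n", leaves_are_shared(&cpu0_l2, &cpu1_l2));
	return 0;
}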

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/base/cacheinfo.c  | 16 +++++++++-------
 include/linux/cacheinfo.h |  8 +++-----
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index a872523e8951..597aacb233fc 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -35,7 +35,7 @@ struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu)
 static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
 {
-	return sib_leaf->of_node == this_leaf->of_node;
+	return sib_leaf->fw_token == this_leaf->fw_token;
 }
 
 /* OF properties to query for a given cache type */
@@ -167,9 +167,10 @@ static int cache_setup_of_node(unsigned int cpu)
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 	unsigned int index = 0;
 
-	/* skip if of_node is already populated */
-	if (this_cpu_ci->info_list->of_node)
+	/* skip if fw_token is already populated */
+	if (this_cpu_ci->info_list->fw_token) {
 		return 0;
+	}
 
 	if (!cpu_dev) {
 		pr_err("No cpu device for CPU %d\n", cpu);
@@ -190,7 +191,7 @@ static int cache_setup_of_node(unsigned int cpu)
 		if (!np)
 			break;
 		cache_of_set_props(this_leaf, np);
-		this_leaf->of_node = np;
+		this_leaf->fw_token = np;
 		index++;
 	}
 
@@ -278,7 +279,7 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
 		}
-		of_node_put(this_leaf->of_node);
+		of_node_put(this_leaf->fw_token);
 	}
 }
 
@@ -323,8 +324,9 @@ static int detect_cache_attributes(unsigned int cpu)
 	if (ret)
 		goto free_ci;
 	/*
-	 * For systems using DT for cache hierarchy, of_node and shared_cpu_map
-	 * will be set up here only if they are not populated already
+	 * For systems using DT for cache hierarchy, fw_token
+	 * and shared_cpu_map will be set up here only if they are
+	 * not populated already
 	 */
 	ret = cache_shared_cpu_map_setup(cpu);
 	if (ret) {
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 3d9805297cda..0c6f658054d2 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -34,9 +34,8 @@ enum cache_type {
  * @shared_cpu_map: logical cpumask representing all the cpus sharing
  *	this cache node
  * @attributes: bitfield representing various cache attributes
- * @of_node: if devicetree is used, this represents either the cpu node in
- *	case there's no explicit cache node or the cache node itself in the
- *	device tree
+ * @fw_token: Unique value used to determine if different cacheinfo
+ *	structures represent a single hardware cache instance.
  * @disable_sysfs: indicates whether this node is visible to the user via
  *	sysfs or not
  * @priv: pointer to any private data structure specific to particular
@@ -65,8 +64,7 @@ struct cacheinfo {
 #define CACHE_ALLOCATE_POLICY_MASK	\
 	(CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
 #define CACHE_ID		BIT(4)
-
-	struct device_node *of_node;
+	void *fw_token;
 	bool disable_sysfs;
 	void *priv;
 };
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH v7 04/13] arm64/acpi: Create arch specific cpu to acpi id helper
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

It's helpful to be able to look up the acpi_processor_id
associated with a logical cpu. Provide an arm64 helper to do this.
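
For example, the PPTT parsing code later in this series uses the
helper when resolving a logical cpu to its processor node, roughly:

	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);

	cpu_node = acpi_find_processor_node(table_hdr, acpi_cpu_id);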

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/acpi.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 32f465a80e4e..0db62a4cbce2 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -86,6 +86,10 @@ static inline bool acpi_has_cpu_in_madt(void)
 }
 
 struct acpi_madt_generic_interrupt *acpi_cpu_get_madt_gicc(int cpu);
+static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
+{
+	return acpi_cpu_get_madt_gicc(cpu)->uid;
+}
 
 static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 void __init acpi_init_cpus(void);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

ACPI 6.2 adds a new table, which describes how processing units
are related to each other in a tree-like fashion. Caches are
also sprinkled throughout the tree and describe the properties
of the caches in relation to other caches and processing units.
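
As a standalone sketch of the lookup pattern this implies (user-space
C; the real table encodes parent links as byte offsets within the
PPTT rather than array indices, and offset 0 cannot reference a valid
subtable):

#include <stdio.h>

struct pnode { unsigned int parent; };  /* parent reference; 0 = none */

int main(void)
{
	/* index 0 is unused so that 0 can mean "no parent" */
	struct pnode table[] = { {0}, {0}, {1}, {2} }; /* pkg <- cluster <- core */
	unsigned int idx = 3, depth = 0;  /* start at the leaf (core) */

	while (idx) {                   /* leaves point toward the root */
		depth++;
		idx = table[idx].parent;
	}
	printf("nodes on the path from leaf to root: %u\n", depth);
	return 0;
}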

Add the code to parse the cache hierarchy and report the total
number of levels of cache for a given core using
acpi_find_last_cache_level(), as well as to fill out each core's
cache information with cache_setup_acpi() once the cpu_cacheinfo
structure has been populated by the arch-specific code.
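
Roughly, the intended call sequence from the arch cacheinfo hooks
looks like the following sketch (the actual arm64 wiring arrives in a
later patch in this series):

	levels = acpi_find_last_cache_level(cpu); /* how many levels to expect */
	/* ... arch code fills in the type/level of each cacheinfo leaf ... */
	cache_setup_acpi(cpu);                    /* apply valid PPTT properties */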

An additional patch later in the set adds the ability to report
peers in the topology using find_acpi_cpu_topology() to report a
unique ID for each processing unit at a given level in the tree.
These unique IDs can then be used to match related processing
units which exist as threads, COD (clusters on die), within a
given package, etc.
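
For instance (assuming the two-argument form that the later patch
introduces), two cpus can be tested for being peers at a given level
by comparing the returned IDs:

	if (find_acpi_cpu_topology(cpu0, level) ==
	    find_acpi_cpu_topology(cpu1, level)) {
		/* cpu0 and cpu1 are related at this level of the tree */
	}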

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 488 insertions(+)
 create mode 100644 drivers/acpi/pptt.c

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
new file mode 100644
index 000000000000..883e4318c6cd
--- /dev/null
+++ b/drivers/acpi/pptt.c
@@ -0,0 +1,488 @@
+/*
+ * Copyright (C) 2018, ARM
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * This file implements parsing of Processor Properties Topology Table (PPTT)
+ * which is optionally used to describe the processor and cache topology.
+ * Due to the relative pointers used throughout the table, this doesn't
+ * leverage the existing subtable parsing in the kernel.
+ *
+ * The PPTT structure is an inverted tree, with each node potentially
+ * holding one or two inverted tree data structures describing
+ * the caches available at that level. Each cache structure optionally
+ * contains properties describing the cache at a given level which can be
+ * used to override hardware probed values.
+ */
+#define pr_fmt(fmt) "ACPI PPTT: " fmt
+
+#include <linux/acpi.h>
+#include <linux/cacheinfo.h>
+#include <acpi/processor.h>
+
+/*
+ * Given the PPTT table, find and verify that the subtable entry
+ * is located within the table
+ */
+static struct acpi_subtable_header *fetch_pptt_subtable(
+	struct acpi_table_header *table_hdr, u32 pptt_ref)
+{
+	struct acpi_subtable_header *entry;
+
+	/* there isn't a subtable at reference 0 */
+	if (pptt_ref < sizeof(struct acpi_subtable_header))
+		return NULL;
+
+	if (pptt_ref + sizeof(struct acpi_subtable_header) > table_hdr->length)
+		return NULL;
+
+	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr, pptt_ref);
+
+	if (pptt_ref + entry->length > table_hdr->length)
+		return NULL;
+
+	return entry;
+}
+
+static struct acpi_pptt_processor *fetch_pptt_node(
+	struct acpi_table_header *table_hdr, u32 pptt_ref)
+{
+	return (struct acpi_pptt_processor *)fetch_pptt_subtable(table_hdr,
+								 pptt_ref);
+}
+
+static struct acpi_pptt_cache *fetch_pptt_cache(
+	struct acpi_table_header *table_hdr, u32 pptt_ref)
+{
+	return (struct acpi_pptt_cache *)fetch_pptt_subtable(table_hdr,
+							     pptt_ref);
+}
+
+static struct acpi_subtable_header *acpi_get_pptt_resource(
+	struct acpi_table_header *table_hdr,
+	struct acpi_pptt_processor *node, int resource)
+{
+	u32 *ref;
+
+	if (resource >= node->number_of_priv_resources)
+		return NULL;
+
+	ref = ACPI_ADD_PTR(u32, node, sizeof(struct acpi_pptt_processor));
+	ref += resource;
+
+	return fetch_pptt_subtable(table_hdr, *ref);
+}
+
+/*
+ * Match the type passed and special case the TYPE_UNIFIED so that
+ * it matches both ACPI_PPTT_CACHE_TYPE_UNIFIED(_ALT) types.
+ */
+static inline bool acpi_pptt_match_type(int table_type, int type)
+{
+	return (((table_type & ACPI_PPTT_MASK_CACHE_TYPE) == type) ||
+		(table_type & ACPI_PPTT_CACHE_TYPE_UNIFIED & type));
+}
+
+/*
+ * Attempt to find a given cache level, while counting the max number
+ * of cache levels for the cache node.
+ *
+ * Given a pptt resource, verify that it is a cache node, then walk
+ * down each level of caches, counting how many levels are found
+ * as well as checking the cache type (icache, dcache, unified). If a
+ * level & type match, then we set found, and continue the search.
+ * Once the entire cache branch has been walked, return its max
+ * depth.
+ */
+static int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
+				int local_level,
+				struct acpi_subtable_header *res,
+				struct acpi_pptt_cache **found,
+				int level, int type)
+{
+	struct acpi_pptt_cache *cache;
+
+	if (res->type != ACPI_PPTT_TYPE_CACHE)
+		return 0;
+
+	cache = (struct acpi_pptt_cache *) res;
+	while (cache) {
+		local_level++;
+
+		if ((local_level == level) &&
+		    (cache->flags & ACPI_PPTT_CACHE_TYPE_VALID) &&
+		    acpi_pptt_match_type(cache->attributes, type)) {
+			if ((*found != NULL) && (cache != *found))
+				pr_err("Found duplicate cache level/type unable to determine uniqueness\n");
+
+			pr_debug("Found cache @ level %d\n", level);
+			*found = cache;
+			/*
+			 * continue looking at this node's resource list
+			 * to verify that we don't find a duplicate
+			 * cache node.
+			 */
+		}
+		cache = fetch_pptt_cache(table_hdr, cache->next_level_of_cache);
+	}
+	return local_level;
+}
+
+/*
+ * Given a CPU node, look for cache levels that exist at this level, and then
+ * for each cache node, count how many levels exist below (logically above) it.
+ * If a level and type are specified, and we find that level/type, abort
+ * processing and return the acpi_pptt_cache structure.
+ */
+static struct acpi_pptt_cache *acpi_find_cache_level(
+	struct acpi_table_header *table_hdr,
+	struct acpi_pptt_processor *cpu_node,
+	int *starting_level, int level, int type)
+{
+	struct acpi_subtable_header *res;
+	int number_of_levels = *starting_level;
+	int resource = 0;
+	struct acpi_pptt_cache *ret = NULL;
+	int local_level;
+
+	/* walk down from processor node */
+	while ((res = acpi_get_pptt_resource(table_hdr, cpu_node, resource))) {
+		resource++;
+
+		local_level = acpi_pptt_walk_cache(table_hdr, *starting_level,
+						   res, &ret, level, type);
+		/*
+		 * We are looking for the max depth. Since it's possible for
+		 * a given node to have resources with differing depths,
+		 * verify that the depth we have found is the largest.
+		 */
+		if (number_of_levels < local_level)
+			number_of_levels = local_level;
+	}
+	if (number_of_levels > *starting_level)
+		*starting_level = number_of_levels;
+
+	return ret;
+}
+
+/*
+ * Given a processor node containing a processing unit, walk into it and count
+ * how many levels exist solely for it, and then walk up each level until we hit
+ * the root node (ignore the package level because it may be possible to have
+ * caches that exist across packages). Count the number of cache levels that
+ * exist at each level on the way up.
+ */
+static int acpi_process_node(struct acpi_table_header *table_hdr,
+			     struct acpi_pptt_processor *cpu_node)
+{
+	int total_levels = 0;
+
+	do {
+		acpi_find_cache_level(table_hdr, cpu_node, &total_levels, 0, 0);
+		cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
+	} while (cpu_node);
+
+	return total_levels;
+}
+
+/*
+ * Determine if the *node parameter is a leaf node by iterating the
+ * PPTT table, looking for nodes which reference it.
+ * Return 0 if we find a node referencing the passed node,
+ * or 1 if we don't.
+ */
+static int acpi_pptt_leaf_node(struct acpi_table_header *table_hdr,
+			       struct acpi_pptt_processor *node)
+{
+	struct acpi_subtable_header *entry;
+	unsigned long table_end;
+	u32 node_entry;
+	struct acpi_pptt_processor *cpu_node;
+	u32 proc_sz;
+
+	table_end = (unsigned long)table_hdr + table_hdr->length;
+	node_entry = ACPI_PTR_DIFF(node, table_hdr);
+	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
+			     sizeof(struct acpi_table_pptt));
+	proc_sz = sizeof(struct acpi_pptt_processor);
+
+	while (((unsigned long)entry + proc_sz) < table_end) {
+		cpu_node = (struct acpi_pptt_processor *)entry;
+		if ((entry->type == ACPI_PPTT_TYPE_PROCESSOR) &&
+		    (cpu_node->parent == node_entry))
+			return 0;
+		entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
+				     entry->length);
+	}
+	return 1;
+}
+
+/*
+ * Find the subtable entry describing the provided processor.
+ * This is done by iterating the PPTT table looking for processor nodes
+ * which have an acpi_processor_id that matches the acpi_cpu_id parameter
+ * passed into the function. If we find a node that matches this criterion,
+ * we verify that it's a leaf node in the topology rather than depending
+ * on the valid flag, which doesn't need to be set for leaf nodes.
+ */
+static struct acpi_pptt_processor *acpi_find_processor_node(
+	struct acpi_table_header *table_hdr,
+	u32 acpi_cpu_id)
+{
+	struct acpi_subtable_header *entry;
+	unsigned long table_end;
+	struct acpi_pptt_processor *cpu_node;
+	u32 proc_sz;
+
+	table_end = (unsigned long)table_hdr + table_hdr->length;
+	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
+			     sizeof(struct acpi_table_pptt));
+	proc_sz = sizeof(struct acpi_pptt_processor);
+
+	/* find the processor structure associated with this cpuid */
+	while (((unsigned long)entry + proc_sz) < table_end) {
+		cpu_node = (struct acpi_pptt_processor *)entry;
+
+		if (entry->length == 0) {
+			pr_err("Invalid zero length subtable\n");
+			break;
+		}
+		if ((entry->type == ACPI_PPTT_TYPE_PROCESSOR) &&
+		    (acpi_cpu_id == cpu_node->acpi_processor_id) &&
+		     acpi_pptt_leaf_node(table_hdr, cpu_node)) {
+			return (struct acpi_pptt_processor *)entry;
+		}
+
+		entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
+				     entry->length);
+	}
+
+	return NULL;
+}
+
+static int acpi_find_cache_levels(struct acpi_table_header *table_hdr,
+				  u32 acpi_cpu_id)
+{
+	int number_of_levels = 0;
+	struct acpi_pptt_processor *cpu;
+
+	cpu = acpi_find_processor_node(table_hdr, acpi_cpu_id);
+	if (cpu)
+		number_of_levels = acpi_process_node(table_hdr, cpu);
+
+	return number_of_levels;
+}
+
+/* Convert the linux cache_type to an ACPI PPTT cache type value */
+static u8 acpi_cache_type(enum cache_type type)
+{
+	switch (type) {
+	case CACHE_TYPE_DATA:
+		pr_debug("Looking for data cache\n");
+		return ACPI_PPTT_CACHE_TYPE_DATA;
+	case CACHE_TYPE_INST:
+		pr_debug("Looking for instruction cache\n");
+		return ACPI_PPTT_CACHE_TYPE_INSTR;
+	default:
+	case CACHE_TYPE_UNIFIED:
+		pr_debug("Looking for unified cache\n");
+		/*
+		 * It is important that ACPI_PPTT_CACHE_TYPE_UNIFIED
+		 * contains a bit pattern that will match both ACPI
+		 * unified cache encodings, because we use it later to
+		 * match either case.
+		 */
+		return ACPI_PPTT_CACHE_TYPE_UNIFIED;
+	}
+}
+
+/* find the ACPI node describing the cache type/level for the given CPU */
+static struct acpi_pptt_cache *acpi_find_cache_node(
+	struct acpi_table_header *table_hdr, u32 acpi_cpu_id,
+	enum cache_type type, unsigned int level,
+	struct acpi_pptt_processor **node)
+{
+	int total_levels = 0;
+	struct acpi_pptt_cache *found = NULL;
+	struct acpi_pptt_processor *cpu_node;
+	u8 acpi_type = acpi_cache_type(type);
+
+	pr_debug("Looking for CPU %d's level %d cache type %d\n",
+		 acpi_cpu_id, level, acpi_type);
+
+	cpu_node = acpi_find_processor_node(table_hdr, acpi_cpu_id);
+
+	while ((cpu_node) && (!found)) {
+		found = acpi_find_cache_level(table_hdr, cpu_node,
+					      &total_levels, level, acpi_type);
+		*node = cpu_node;
+		cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
+	}
+
+	return found;
+}
+
+/* total number of attributes checked by the properties code */
+#define PPTT_CHECKED_ATTRIBUTES 4
+
+/*
+ * The ACPI spec implies that the fields in the cache structures are used to
+ * extend and correct the information probed from the hardware. Let's only
+ * set fields that we determine are VALID.
+ */
+static void update_cache_properties(struct cacheinfo *this_leaf,
+				    struct acpi_pptt_cache *found_cache,
+				    struct acpi_pptt_processor *cpu_node)
+{
+	int valid_flags = 0;
+
+	if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
+		this_leaf->size = found_cache->size;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID) {
+		this_leaf->coherency_line_size = found_cache->line_size;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_NUMBER_OF_SETS_VALID) {
+		this_leaf->number_of_sets = found_cache->number_of_sets;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_ASSOCIATIVITY_VALID) {
+		this_leaf->ways_of_associativity = found_cache->associativity;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_WRITE_POLICY_VALID) {
+		switch (found_cache->attributes & ACPI_PPTT_MASK_WRITE_POLICY) {
+		case ACPI_PPTT_CACHE_POLICY_WT:
+			this_leaf->attributes = CACHE_WRITE_THROUGH;
+			break;
+		case ACPI_PPTT_CACHE_POLICY_WB:
+			this_leaf->attributes = CACHE_WRITE_BACK;
+			break;
+		}
+	}
+	if (found_cache->flags & ACPI_PPTT_ALLOCATION_TYPE_VALID) {
+		switch (found_cache->attributes & ACPI_PPTT_MASK_ALLOCATION_TYPE) {
+		case ACPI_PPTT_CACHE_READ_ALLOCATE:
+			this_leaf->attributes |= CACHE_READ_ALLOCATE;
+			break;
+		case ACPI_PPTT_CACHE_WRITE_ALLOCATE:
+			this_leaf->attributes |= CACHE_WRITE_ALLOCATE;
+			break;
+		case ACPI_PPTT_CACHE_RW_ALLOCATE:
+		case ACPI_PPTT_CACHE_RW_ALLOCATE_ALT:
+			this_leaf->attributes |=
+				CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE;
+			break;
+		}
+	}
+	/*
+	 * If the above flags are all valid, and the cache type is
+	 * NOCACHE, update the cache type as well.
+	 */
+	if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
+	    (valid_flags == PPTT_CHECKED_ATTRIBUTES))
+		this_leaf->type = CACHE_TYPE_UNIFIED;
+}
+
+/*
+ * Update the kernel cache information for each level of cache
+ * associated with the given acpi cpu.
+ */
+static void cache_setup_acpi_cpu(struct acpi_table_header *table,
+				 unsigned int cpu)
+{
+	struct acpi_pptt_cache *found_cache;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+	struct cacheinfo *this_leaf;
+	unsigned int index = 0;
+	struct acpi_pptt_processor *cpu_node = NULL;
+
+	while (index < get_cpu_cacheinfo(cpu)->num_leaves) {
+		this_leaf = this_cpu_ci->info_list + index;
+		found_cache = acpi_find_cache_node(table, acpi_cpu_id,
+						   this_leaf->type,
+						   this_leaf->level,
+						   &cpu_node);
+		pr_debug("found = %p %p\n", found_cache, cpu_node);
+		if (found_cache)
+			update_cache_properties(this_leaf,
+						found_cache,
+						cpu_node);
+
+		index++;
+	}
+}
+
+/**
+ * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
+ * @cpu: Kernel logical cpu number
+ *
+ * Given a logical cpu number, returns the number of levels of cache represented
+ * in the PPTT. Errors (such as a missing PPTT table) result in a return
+ * value of 0, indicating that no cache levels were found.
+ *
+ * Return: Cache levels visible to this core.
+ */
+int acpi_find_last_cache_level(unsigned int cpu)
+{
+	u32 acpi_cpu_id;
+	struct acpi_table_header *table;
+	int number_of_levels = 0;
+	acpi_status status;
+
+	pr_debug("Cache Setup find last level cpu=%d\n", cpu);
+
+	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status)) {
+		pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
+	} else {
+		number_of_levels = acpi_find_cache_levels(table, acpi_cpu_id);
+		acpi_put_table(table);
+	}
+	pr_debug("Cache Setup find last level level=%d\n", number_of_levels);
+
+	return number_of_levels;
+}
+
+/**
+ * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
+ * @cpu: Kernel logical cpu number
+ *
+ * Updates the global cache info provided by get_cpu_cacheinfo()
+ * when there are valid properties in the acpi_pptt_cache nodes. A
+ * successful parse may not result in any updates if none of the
+ * cache levels have any valid flags set.  Further, a unique value is
+ * associated with each known CPU cache entry. This unique value
+ * can be used to determine whether caches are shared between cpus.
+ *
+ * Return: -ENOENT on failure to find table, or 0 on success
+ */
+int cache_setup_acpi(unsigned int cpu)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+
+	pr_debug("Cache Setup ACPI cpu %d\n", cpu);
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status)) {
+		pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
+		return -ENOENT;
+	}
+
+	cache_setup_acpi_cpu(table, cpu);
+	acpi_put_table(table);
+
+	return status;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
@ 2018-02-28 22:06   ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-riscv

ACPI 6.2 adds a new table, which describes how processing units
are related to each other in tree like fashion. Caches are
also sprinkled throughout the tree and describe the properties
of the caches in relation to other caches and processing units.

Add the code to parse the cache hierarchy and report the total
number of levels of cache for a given core using
acpi_find_last_cache_level() as well as fill out the individual
cores cache information with cache_setup_acpi() once the
cpu_cacheinfo structure has been populated by the arch specific
code.

An additional patch later in the set adds the ability to report
peers in the topology using find_acpi_cpu_topology()
to report a unique ID for each processing unit at a given level
in the tree. These unique id's can then be used to match related
processing units which exist as threads, COD (clusters
on die), within a given package, etc.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 488 insertions(+)
 create mode 100644 drivers/acpi/pptt.c

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
new file mode 100644
index 000000000000..883e4318c6cd
--- /dev/null
+++ b/drivers/acpi/pptt.c
@@ -0,0 +1,488 @@
+/*
+ * Copyright (C) 2018, ARM
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * This file implements parsing of Processor Properties Topology Table (PPTT)
+ * which is optionally used to describe the processor and cache topology.
+ * Due to the relative pointers used throughout the table, this doesn't
+ * leverage the existing subtable parsing in the kernel.
+ *
+ * The PPTT structure is an inverted tree, with each node potentially
+ * holding one or two inverted tree data structures describing
+ * the caches available at that level. Each cache structure optionally
+ * contains properties describing the cache at a given level which can be
+ * used to override hardware probed values.
+ */
+#define pr_fmt(fmt) "ACPI PPTT: " fmt
+
+#include <linux/acpi.h>
+#include <linux/cacheinfo.h>
+#include <acpi/processor.h>
+
+/*
+ * Given the PPTT table, find and verify that the subtable entry
+ * is located within the table
+ */
+static struct acpi_subtable_header *fetch_pptt_subtable(
+	struct acpi_table_header *table_hdr, u32 pptt_ref)
+{
+	struct acpi_subtable_header *entry;
+
+	/* there isn't a subtable at reference 0 */
+	if (pptt_ref < sizeof(struct acpi_subtable_header))
+		return NULL;
+
+	if (pptt_ref + sizeof(struct acpi_subtable_header) > table_hdr->length)
+		return NULL;
+
+	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr, pptt_ref);
+
+	if (pptt_ref + entry->length > table_hdr->length)
+		return NULL;
+
+	return entry;
+}
+
+static struct acpi_pptt_processor *fetch_pptt_node(
+	struct acpi_table_header *table_hdr, u32 pptt_ref)
+{
+	return (struct acpi_pptt_processor *)fetch_pptt_subtable(table_hdr,
+								 pptt_ref);
+}
+
+static struct acpi_pptt_cache *fetch_pptt_cache(
+	struct acpi_table_header *table_hdr, u32 pptt_ref)
+{
+	return (struct acpi_pptt_cache *)fetch_pptt_subtable(table_hdr,
+							     pptt_ref);
+}
+
+static struct acpi_subtable_header *acpi_get_pptt_resource(
+	struct acpi_table_header *table_hdr,
+	struct acpi_pptt_processor *node, int resource)
+{
+	u32 *ref;
+
+	if (resource >= node->number_of_priv_resources)
+		return NULL;
+
+	ref = ACPI_ADD_PTR(u32, node, sizeof(struct acpi_pptt_processor));
+	ref += resource;
+
+	return fetch_pptt_subtable(table_hdr, *ref);
+}
+
+/*
+ * Match the type passed and special case the TYPE_UNIFIED so that
+ * it match both ACPI_PPTT_CACHE_TYPE_UNIFIED(_ALT) types.
+ */
+static inline bool acpi_pptt_match_type(int table_type, int type)
+{
+	return (((table_type & ACPI_PPTT_MASK_CACHE_TYPE) == type) ||
+		(table_type & ACPI_PPTT_CACHE_TYPE_UNIFIED & type));
+}
+
+/*
+ * Attempt to find a given cache level, while counting the max number
+ * of cache levels for the cache node.
+ *
+ * Given a pptt resource, verify that it is a cache node, then walk
+ * down each level of caches, counting how many levels are found
+ * as well as checking the cache type (icache, dcache, unified). If a
+ * level & type match, then we set found, and continue the search.
+ * Once the entire cache branch has been walked return its max
+ * depth.
+ */
+static int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
+				int local_level,
+				struct acpi_subtable_header *res,
+				struct acpi_pptt_cache **found,
+				int level, int type)
+{
+	struct acpi_pptt_cache *cache;
+
+	if (res->type != ACPI_PPTT_TYPE_CACHE)
+		return 0;
+
+	cache = (struct acpi_pptt_cache *) res;
+	while (cache) {
+		local_level++;
+
+		if ((local_level == level) &&
+		    (cache->flags & ACPI_PPTT_CACHE_TYPE_VALID) &&
+		    acpi_pptt_match_type(cache->attributes, type)) {
+			if ((*found != NULL) && (cache != *found))
+				pr_err("Found duplicate cache level/type unable to determine uniqueness\n");
+
+			pr_debug("Found cache @ level %d\n", level);
+			*found = cache;
+			/*
+			 * continue looking at this node's resource list
+			 * to verify that we don't find a duplicate
+			 * cache node.
+			 */
+		}
+		cache = fetch_pptt_cache(table_hdr, cache->next_level_of_cache);
+	}
+	return local_level;
+}
+
+/*
+ * Given a CPU node, look for cache levels that exist at this level, and then
+ * for each cache node, count how many levels exist below (logically above) it.
+ * If a level and type are specified, and we find that level/type, abort
+ * processing and return the acpi_pptt_cache structure.
+ */
+static struct acpi_pptt_cache *acpi_find_cache_level(
+	struct acpi_table_header *table_hdr,
+	struct acpi_pptt_processor *cpu_node,
+	int *starting_level, int level, int type)
+{
+	struct acpi_subtable_header *res;
+	int number_of_levels = *starting_level;
+	int resource = 0;
+	struct acpi_pptt_cache *ret = NULL;
+	int local_level;
+
+	/* walk down from processor node */
+	while ((res = acpi_get_pptt_resource(table_hdr, cpu_node, resource))) {
+		resource++;
+
+		local_level = acpi_pptt_walk_cache(table_hdr, *starting_level,
+						   res, &ret, level, type);
+		/*
+		 * We are looking for the max depth. Since it's possible for a
+		 * given node to have resources with differing depths, verify
+		 * that the depth we have found is the largest.
+		 */
+		if (number_of_levels < local_level)
+			number_of_levels = local_level;
+	}
+	if (number_of_levels > *starting_level)
+		*starting_level = number_of_levels;
+
+	return ret;
+}
+
+/*
+ * Given a processor node containing a processing unit, walk into it and count
+ * how many levels exist solely for it, and then walk up each level until we hit
+ * the root node (ignore the package level because it may be possible to have
+ * caches that exist across packages). Count the number of cache levels that
+ * exist at each level on the way up.
+ */
+static int acpi_process_node(struct acpi_table_header *table_hdr,
+			     struct acpi_pptt_processor *cpu_node)
+{
+	int total_levels = 0;
+
+	do {
+		acpi_find_cache_level(table_hdr, cpu_node, &total_levels, 0, 0);
+		cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
+	} while (cpu_node);
+
+	return total_levels;
+}
+
+/*
+ * Determine if the *node parameter is a leaf node by iterating the
+ * PPTT table, looking for nodes which reference it.
+ * Return 0 if we find a node referencing the passed node,
+ * or 1 if we don't.
+ */
+static int acpi_pptt_leaf_node(struct acpi_table_header *table_hdr,
+			       struct acpi_pptt_processor *node)
+{
+	struct acpi_subtable_header *entry;
+	unsigned long table_end;
+	u32 node_entry;
+	struct acpi_pptt_processor *cpu_node;
+	u32 proc_sz;
+
+	table_end = (unsigned long)table_hdr + table_hdr->length;
+	node_entry = ACPI_PTR_DIFF(node, table_hdr);
+	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
+			     sizeof(struct acpi_table_pptt));
+	proc_sz = sizeof(struct acpi_pptt_processor *);
+
+	while (((unsigned long)entry + proc_sz) < table_end) {
+		cpu_node = (struct acpi_pptt_processor *)entry;
+		if ((entry->type == ACPI_PPTT_TYPE_PROCESSOR) &&
+		    (cpu_node->parent == node_entry))
+			return 0;
+		entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
+				     entry->length);
+	}
+	return 1;
+}
+
+/*
+ * Find the subtable entry describing the provided processor.
+ * This is done by iterating the PPTT table looking for processor nodes
+ * which have an acpi_processor_id that matches the acpi_cpu_id parameter
+ * passed into the function. If we find a node that matches this criterion,
+ * we verify that it's a leaf node in the topology rather than depending
+ * on the valid flag, which doesn't need to be set for leaf nodes.
+ */
+static struct acpi_pptt_processor *acpi_find_processor_node(
+	struct acpi_table_header *table_hdr,
+	u32 acpi_cpu_id)
+{
+	struct acpi_subtable_header *entry;
+	unsigned long table_end;
+	struct acpi_pptt_processor *cpu_node;
+	u32 proc_sz;
+
+	table_end = (unsigned long)table_hdr + table_hdr->length;
+	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
+			     sizeof(struct acpi_table_pptt));
+	proc_sz = sizeof(struct acpi_pptt_processor *);
+
+	/* find the processor structure associated with this cpuid */
+	while (((unsigned long)entry + proc_sz) < table_end) {
+		cpu_node = (struct acpi_pptt_processor *)entry;
+
+		if (entry->length == 0) {
+			pr_err("Invalid zero length subtable\n");
+			break;
+		}
+		if ((entry->type == ACPI_PPTT_TYPE_PROCESSOR) &&
+		    (acpi_cpu_id == cpu_node->acpi_processor_id) &&
+		     acpi_pptt_leaf_node(table_hdr, cpu_node)) {
+			return (struct acpi_pptt_processor *)entry;
+		}
+
+		entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
+				     entry->length);
+	}
+
+	return NULL;
+}
+
+static int acpi_find_cache_levels(struct acpi_table_header *table_hdr,
+				  u32 acpi_cpu_id)
+{
+	int number_of_levels = 0;
+	struct acpi_pptt_processor *cpu;
+
+	cpu = acpi_find_processor_node(table_hdr, acpi_cpu_id);
+	if (cpu)
+		number_of_levels = acpi_process_node(table_hdr, cpu);
+
+	return number_of_levels;
+}
+
+/* Convert the Linux cache_type to an ACPI PPTT cache type value */
+static u8 acpi_cache_type(enum cache_type type)
+{
+	switch (type) {
+	case CACHE_TYPE_DATA:
+		pr_debug("Looking for data cache\n");
+		return ACPI_PPTT_CACHE_TYPE_DATA;
+	case CACHE_TYPE_INST:
+		pr_debug("Looking for instruction cache\n");
+		return ACPI_PPTT_CACHE_TYPE_INSTR;
+	default:
+	case CACHE_TYPE_UNIFIED:
+		pr_debug("Looking for unified cache\n");
+		/*
+		 * It is important that ACPI_PPTT_CACHE_TYPE_UNIFIED
+		 * contains the bit pattern that will match both
+		 * ACPI unified bit patterns because we use it later
+		 * to match both cases.
+		 */
+		return ACPI_PPTT_CACHE_TYPE_UNIFIED;
+	}
+}
+
+/* find the ACPI node describing the cache type/level for the given CPU */
+static struct acpi_pptt_cache *acpi_find_cache_node(
+	struct acpi_table_header *table_hdr, u32 acpi_cpu_id,
+	enum cache_type type, unsigned int level,
+	struct acpi_pptt_processor **node)
+{
+	int total_levels = 0;
+	struct acpi_pptt_cache *found = NULL;
+	struct acpi_pptt_processor *cpu_node;
+	u8 acpi_type = acpi_cache_type(type);
+
+	pr_debug("Looking for CPU %d's level %d cache type %d\n",
+		 acpi_cpu_id, level, acpi_type);
+
+	cpu_node = acpi_find_processor_node(table_hdr, acpi_cpu_id);
+
+	while ((cpu_node) && (!found)) {
+		found = acpi_find_cache_level(table_hdr, cpu_node,
+					      &total_levels, level, acpi_type);
+		*node = cpu_node;
+		cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
+	}
+
+	return found;
+}
+
+/* total number of attributes checked by the properties code */
+#define PPTT_CHECKED_ATTRIBUTES 4
+
+/*
+ * The ACPI spec implies that the fields in the cache structures are used to
+ * extend and correct the information probed from the hardware. Let's only
+ * set fields that we determine are VALID.
+ */
+static void update_cache_properties(struct cacheinfo *this_leaf,
+				    struct acpi_pptt_cache *found_cache,
+				    struct acpi_pptt_processor *cpu_node)
+{
+	int valid_flags = 0;
+
+	if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
+		this_leaf->size = found_cache->size;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID) {
+		this_leaf->coherency_line_size = found_cache->line_size;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_NUMBER_OF_SETS_VALID) {
+		this_leaf->number_of_sets = found_cache->number_of_sets;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_ASSOCIATIVITY_VALID) {
+		this_leaf->ways_of_associativity = found_cache->associativity;
+		valid_flags++;
+	}
+	if (found_cache->flags & ACPI_PPTT_WRITE_POLICY_VALID) {
+		switch (found_cache->attributes & ACPI_PPTT_MASK_WRITE_POLICY) {
+		case ACPI_PPTT_CACHE_POLICY_WT:
+			this_leaf->attributes = CACHE_WRITE_THROUGH;
+			break;
+		case ACPI_PPTT_CACHE_POLICY_WB:
+			this_leaf->attributes = CACHE_WRITE_BACK;
+			break;
+		}
+	}
+	if (found_cache->flags & ACPI_PPTT_ALLOCATION_TYPE_VALID) {
+		switch (found_cache->attributes & ACPI_PPTT_MASK_ALLOCATION_TYPE) {
+		case ACPI_PPTT_CACHE_READ_ALLOCATE:
+			this_leaf->attributes |= CACHE_READ_ALLOCATE;
+			break;
+		case ACPI_PPTT_CACHE_WRITE_ALLOCATE:
+			this_leaf->attributes |= CACHE_WRITE_ALLOCATE;
+			break;
+		case ACPI_PPTT_CACHE_RW_ALLOCATE:
+		case ACPI_PPTT_CACHE_RW_ALLOCATE_ALT:
+			this_leaf->attributes |=
+				CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE;
+			break;
+		}
+	}
+	/*
+	 * If all of the above flags are valid, and the cache type is
+	 * NOCACHE, update the cache type as well.
+	 */
+	if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
+	    (valid_flags == PPTT_CHECKED_ATTRIBUTES))
+		this_leaf->type = CACHE_TYPE_UNIFIED;
+}
+
+/*
+ * Update the kernel cache information for each level of cache
+ * associated with the given acpi cpu.
+ */
+static void cache_setup_acpi_cpu(struct acpi_table_header *table,
+				 unsigned int cpu)
+{
+	struct acpi_pptt_cache *found_cache;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+	struct cacheinfo *this_leaf;
+	unsigned int index = 0;
+	struct acpi_pptt_processor *cpu_node = NULL;
+
+	while (index < get_cpu_cacheinfo(cpu)->num_leaves) {
+		this_leaf = this_cpu_ci->info_list + index;
+		found_cache = acpi_find_cache_node(table, acpi_cpu_id,
+						   this_leaf->type,
+						   this_leaf->level,
+						   &cpu_node);
+		pr_debug("found = %p %p\n", found_cache, cpu_node);
+		if (found_cache)
+			update_cache_properties(this_leaf,
+						found_cache,
+						cpu_node);
+
+		index++;
+	}
+}
+
+/**
+ * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
+ * @cpu: Kernel logical cpu number
+ *
+ * Given a logical cpu number, returns the number of levels of cache represented
+ * in the PPTT. Errors caused by the lack of a PPTT table, or otherwise,
+ * return 0, indicating that no cache levels were found.
+ *
+ * Return: Cache levels visible to this core.
+ */
+int acpi_find_last_cache_level(unsigned int cpu)
+{
+	u32 acpi_cpu_id;
+	struct acpi_table_header *table;
+	int number_of_levels = 0;
+	acpi_status status;
+
+	pr_debug("Cache Setup find last level cpu=%d\n", cpu);
+
+	acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status)) {
+		pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
+	} else {
+		number_of_levels = acpi_find_cache_levels(table, acpi_cpu_id);
+		acpi_put_table(table);
+	}
+	pr_debug("Cache Setup find last level level=%d\n", number_of_levels);
+
+	return number_of_levels;
+}
+
+/**
+ * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
+ * @cpu: Kernel logical cpu number
+ *
+ * Updates the global cache info provided by cpu_get_cacheinfo()
+ * when there are valid properties in the acpi_pptt_cache nodes. A
+ * successful parse may not result in any updates if none of the
+ * cache levels have any valid flags set.  Further, a unique value is
+ * associated with each known CPU cache entry. This unique value
+ * can be used to determine whether caches are shared between cpus.
+ *
+ * Return: -ENOENT on failure to find table, or 0 on success
+ */
+int cache_setup_acpi(unsigned int cpu)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+
+	pr_debug("Cache Setup ACPI cpu %d\n", cpu);
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status)) {
+		pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
+		return -ENOENT;
+	}
+
+	cache_setup_acpi_cpu(table, cpu);
+	acpi_put_table(table);
+
+	return status;
+}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
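
As a concrete illustration of the walk implemented by acpi_pptt_walk_cache()
and acpi_process_node() above, here is a minimal stand-alone sketch (not part
of the patch; the toy structures and values are invented for illustration).
It models how the parser follows each processor node's private cache
resources down their next_level_of_cache chains, then repeats the count at
each parent node, keeping the maximum depth seen:

#include <stdio.h>

/* Toy stand-ins for the PPTT structures, invented for illustration */
struct toy_cache {
	struct toy_cache *next_level;	/* models next_level_of_cache */
};

struct toy_node {
	struct toy_node *parent;	/* models acpi_pptt_processor.parent */
	struct toy_cache *res;		/* first private cache resource */
};

/* Depth of one cache chain, as acpi_pptt_walk_cache() counts it */
static int chain_depth(struct toy_cache *c, int base)
{
	while (c) {
		base++;
		c = c->next_level;
	}
	return base;
}

/* Total levels visible to a leaf node, as acpi_process_node() counts them */
static int count_levels(struct toy_node *n)
{
	int total = 0;

	for (; n; n = n->parent) {
		int depth = chain_depth(n->res, total);

		if (depth > total)
			total = depth;
	}
	return total;
}

int main(void)
{
	struct toy_cache l2 = { NULL };
	struct toy_cache l1 = { &l2 };	/* L1 -> L2 hang off the core */
	struct toy_cache l3 = { NULL };	/* L3 hangs off the package node */
	struct toy_node pkg = { NULL, &l3 };
	struct toy_node core = { &pkg, &l1 };

	printf("levels = %d\n", count_levels(&core));	/* levels = 3 */
	return 0;
}

The real parser additionally bounds-checks every relative reference against
table_hdr->length (fetch_pptt_subtable()) and matches a requested level/type
along the way; the toy model keeps only the depth arithmetic.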

* [PATCH v7 06/13] ACPI: Enable PPTT support on ARM64
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

Now that we have a PPTT parser, in preparation for its use
on arm64, let's build it.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/Kconfig    | 1 +
 drivers/acpi/Kconfig  | 3 +++
 drivers/acpi/Makefile | 1 +
 3 files changed, 5 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7381eeb7ef8e..439684a71e18 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -7,6 +7,7 @@ config ARM64
 	select ACPI_REDUCED_HARDWARE_ONLY if ACPI
 	select ACPI_MCFG if ACPI
 	select ACPI_SPCR_TABLE if ACPI
+	select ACPI_PPTT if ACPI
 	select ARCH_CLOCKSOURCE_DATA
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEVMEM_IS_ALLOWED
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index f505e9a01b2d..55aa9789d474 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -534,6 +534,9 @@ config ACPI_CONFIGFS
 
 if ARM64
 source "drivers/acpi/arm64/Kconfig"
+
+config ACPI_PPTT
+	bool
 endif
 
 config TPS68470_PMIC_OPREGION
diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
index 41954a601989..b6056b566df4 100644
--- a/drivers/acpi/Makefile
+++ b/drivers/acpi/Makefile
@@ -87,6 +87,7 @@ obj-$(CONFIG_ACPI_BGRT)		+= bgrt.o
 obj-$(CONFIG_ACPI_CPPC_LIB)	+= cppc_acpi.o
 obj-$(CONFIG_ACPI_SPCR_TABLE)	+= spcr.o
 obj-$(CONFIG_ACPI_DEBUGGER_USER) += acpi_dbg.o
+obj-$(CONFIG_ACPI_PPTT) 	+= pptt.o
 
 # processor has its own "processor." module_param namespace
 processor-y			:= processor_driver.o
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* [PATCH v7 07/13] drivers: base cacheinfo: Add support for ACPI based firmware tables
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

Call ACPI cache parsing routines from base cacheinfo code if ACPI
is enabled. Also stub out cache_setup_acpi() so that individual
architectures can enable ACPI topology parsing.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c       |  1 +
 drivers/base/cacheinfo.c  | 14 ++++++++++----
 include/linux/cacheinfo.h |  9 +++++++++
 3 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 883e4318c6cd..c98f94ebd272 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -343,6 +343,7 @@ static void update_cache_properties(struct cacheinfo *this_leaf,
 {
 	int valid_flags = 0;
 
+	this_leaf->fw_token = cpu_node;
 	if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
 		this_leaf->size = found_cache->size;
 		valid_flags++;
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 597aacb233fc..2880e2ab01f5 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -206,7 +206,7 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 					   struct cacheinfo *sib_leaf)
 {
 	/*
-	 * For non-DT systems, assume unique level 1 cache, system-wide
+	 * For non-DT/ACPI systems, assume unique level 1 caches, system-wide
 	 * shared caches for all other levels. This will be used only if
 	 * arch specific code has not populated shared_cpu_map
 	 */
@@ -214,6 +214,11 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
 }
 #endif
 
+int __weak cache_setup_acpi(unsigned int cpu)
+{
+	return -ENOTSUPP;
+}
+
 static int cache_shared_cpu_map_setup(unsigned int cpu)
 {
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
@@ -227,8 +232,8 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
 	if (of_have_populated_dt())
 		ret = cache_setup_of_node(cpu);
 	else if (!acpi_disabled)
-		/* No cache property/hierarchy support yet in ACPI */
-		ret = -ENOTSUPP;
+		ret = cache_setup_acpi(cpu);
+
 	if (ret)
 		return ret;
 
@@ -279,7 +284,8 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
 			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
 			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
 		}
-		of_node_put(this_leaf->fw_token);
+		if (of_have_populated_dt())
+			of_node_put(this_leaf->fw_token);
 	}
 }
 
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 0c6f658054d2..1446d3f053a2 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -97,6 +97,15 @@ int func(unsigned int cpu)					\
 struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
 int init_cache_level(unsigned int cpu);
 int populate_cache_leaves(unsigned int cpu);
+int cache_setup_acpi(unsigned int cpu);
+int acpi_find_last_cache_level(unsigned int cpu);
+#ifndef CONFIG_ACPI
+int acpi_find_last_cache_level(unsigned int cpu)
+{
+	/* ACPI kernels should be built with PPTT support */
+	return 0;
+}
+#endif
 
 const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
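
The __weak stub added to drivers/base/cacheinfo.c above is a standard kernel
linking idiom: the driver core supplies a fallback, and any architecture can
override it simply by linking a strong definition of the same symbol (which
the PPTT code provides once ACPI_PPTT is built in). A minimal user-space
sketch of the mechanism, assuming GCC or clang, whose attribute the kernel's
__weak macro expands to (cache_setup_demo() is an invented name):

#include <stdio.h>

/* Weak fallback: used only when no strong definition is linked in */
__attribute__((weak)) int cache_setup_demo(unsigned int cpu)
{
	return -1;	/* stands in for -ENOTSUPP */
}

int main(void)
{
	/*
	 * Nothing else here defines cache_setup_demo(), so the weak
	 * fallback runs; linking another object containing a strong
	 * definition would silently replace it.
	 */
	printf("cache_setup_demo(0) = %d\n", cache_setup_demo(0));
	return 0;
}

The #ifndef CONFIG_ACPI fallback in cacheinfo.h serves the same purpose at
compile time for acpi_find_last_cache_level().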

* [PATCH v7 08/13] arm64: Add support for ACPI based firmware tables
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

The /sys cache entries should support ACPI/PPTT-generated cache
topology information. Let's detect ACPI systems and call
an arch-specific cache_setup_acpi() routine to update the
hardware-probed cache topology.

For arm64, if ACPI is enabled, determine the max number of cache
levels and populate them using the PPTT table if one is available.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/kernel/cacheinfo.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index 380f2e2fbed5..0bf0a835122f 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -17,6 +17,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/acpi.h>
 #include <linux/cacheinfo.h>
 #include <linux/of.h>
 
@@ -46,7 +47,7 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 
 static int __init_cache_level(unsigned int cpu)
 {
-	unsigned int ctype, level, leaves, of_level;
+	unsigned int ctype, level, leaves, fw_level;
 	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 
 	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
@@ -59,15 +60,19 @@ static int __init_cache_level(unsigned int cpu)
 		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
 	}
 
-	of_level = of_find_last_cache_level(cpu);
-	if (level < of_level) {
+	if (acpi_disabled)
+		fw_level = of_find_last_cache_level(cpu);
+	else
+		fw_level = acpi_find_last_cache_level(cpu);
+
+	if (level < fw_level) {
 		/*
 		 * some external caches not specified in CLIDR_EL1
 		 * the information may be available in the device tree
 		 * only unified external caches are considered here
 		 */
-		leaves += (of_level - level);
-		level = of_level;
+		leaves += (fw_level - level);
+		level = fw_level;
 	}
 
 	this_cpu_ci->num_levels = level;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
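
The arithmetic in __init_cache_level() deserves a worked example: CLIDR_EL1
only describes caches visible to the architecture, so when firmware (DT or,
now, PPTT) reports deeper levels, each extra level is counted as one unified
leaf. A small sketch with invented values:

#include <stdio.h>

int main(void)
{
	unsigned int level = 2;		/* CLIDR_EL1 shows L1 and L2 */
	unsigned int leaves = 3;	/* L1I + L1D + unified L2 */
	unsigned int fw_level = 3;	/* the PPTT also describes an L3 */

	/* the same extension step as __init_cache_level() above */
	if (level < fw_level) {
		leaves += fw_level - level;	/* one unified leaf per level */
		level = fw_level;
	}

	printf("levels=%u leaves=%u\n", level, leaves);	/* levels=3 leaves=4 */
	return 0;
}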

* [PATCH v7 09/13] ACPI/PPTT: Add topology parsing code
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

The PPTT can be used to determine the groupings of CPUs at
given levels in the system. Let's add a few routines to the PPTT
parsing code to return a unique id for each unique level in the
processor hierarchy. This can then be matched to build
thread/core/cluster/die/package/etc mappings for each processing
element in the system.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/pptt.c  | 153 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/acpi.h |   4 ++
 2 files changed, 157 insertions(+)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index c98f94ebd272..d66342c2eb29 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -424,6 +424,79 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
 	}
 }
 
+/* Passing level values greater than this will result in search termination */
+#define PPTT_ABORT_PACKAGE 0xFF
+
+/*
+ * Given an acpi_pptt_processor node, walk up until we identify the
+ * package that the node is associated with, we run out of levels
+ * to request, or the search is terminated by a flag match.
+ * The level parameter also serves to limit possible loops within the tree.
+ */
+static struct acpi_pptt_processor *acpi_find_processor_package_id(
+	struct acpi_table_header *table_hdr,
+	struct acpi_pptt_processor *cpu,
+	int level, int flag)
+{
+	struct acpi_pptt_processor *prev_node;
+
+	while (cpu && level) {
+		if (cpu->flags & flag)
+			break;
+		pr_debug("level %d\n", level);
+		prev_node = fetch_pptt_node(table_hdr, cpu->parent);
+		if (prev_node == NULL)
+			break;
+		cpu = prev_node;
+		level--;
+	}
+	return cpu;
+}
+
+/*
+ * Get a unique value given a cpu, and a topology level, that can be
+ * matched to determine which cpus share common topological features
+ * at that level.
+ */
+static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
+				     unsigned int cpu, int level, int flag)
+{
+	struct acpi_pptt_processor *cpu_node;
+	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+
+	cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
+	if (cpu_node) {
+		cpu_node = acpi_find_processor_package_id(table, cpu_node,
+							  level, flag);
+		/* Only the first level has a guaranteed id */
+		if (level == 0)
+			return cpu_node->acpi_processor_id;
+		return ACPI_PTR_DIFF(cpu_node, table);
+	}
+	pr_err_once("PPTT table found, but unable to locate core for %d\n",
+		    cpu);
+	return -ENOENT;
+}
+
+static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
+{
+	struct acpi_table_header *table;
+	acpi_status status;
+	int retval;
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status)) {
+		pr_err_once("No PPTT table found, cpu topology may be inaccurate\n");
+		return -ENOENT;
+	}
+	retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
+	pr_debug("Topology Setup ACPI cpu %d, level %d ret = %d\n",
+		 cpu, level, retval);
+	acpi_put_table(table);
+
+	return retval;
+}
+
 /**
  * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
  * @cpu: Kernel logical cpu number
@@ -487,3 +560,83 @@ int cache_setup_acpi(unsigned int cpu)
 
 	return status;
 }
+
+/**
+ * find_acpi_cpu_topology() - Determine a unique topology value for a given cpu
+ * @cpu: Kernel logical cpu number
+ * @level: The topological level for which we would like a unique ID
+ *
+ * Determine a topology-unique ID for each thread/core/cluster/mc_grouping
+ * /socket/etc. This ID can then be used to group peers, which will have
+ * matching ids.
+ *
+ * The search terminates when either the requested level is found or
+ * we reach a root node. Levels beyond the termination point will return the
+ * same unique ID. The unique ID for level 0 is the ACPI processor ID. All
+ * other levels beyond this use a generated value to uniquely identify
+ * a topological feature.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Otherwise returns a value which represents a unique topological feature.
+ */
+int find_acpi_cpu_topology(unsigned int cpu, int level)
+{
+	return find_acpi_cpu_topology_tag(cpu, level, 0);
+}
+
+/**
+ * find_acpi_cpu_cache_topology() - Determine a unique cache topology value
+ * @cpu: Kernel logical cpu number
+ * @level: The cache level for which we would like a unique ID
+ *
+ * Determine a unique ID for each unified cache in the system.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Otherwise returns a value which represents a unique topological feature.
+ */
+int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
+{
+	struct acpi_table_header *table;
+	struct acpi_pptt_cache *found_cache;
+	acpi_status status;
+	u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+	struct acpi_pptt_processor *cpu_node = NULL;
+	int ret = -1;
+
+	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+	if (ACPI_FAILURE(status)) {
+		pr_err_once("No PPTT table found, topology may be inaccurate\n");
+		return -ENOENT;
+	}
+
+	found_cache = acpi_find_cache_node(table, acpi_cpu_id,
+					   CACHE_TYPE_UNIFIED,
+					   level,
+					   &cpu_node);
+	if (found_cache)
+		ret = ACPI_PTR_DIFF(cpu_node, table);
+
+	acpi_put_table(table);
+
+	return ret;
+}
+
+
+/**
+ * find_acpi_cpu_topology_package() - Determine a unique cpu package value
+ * @cpu: Kernel logical cpu number
+ *
+ * Determine a topology unique package ID for the given cpu.
+ * This ID can then be used to group peers, which will have matching ids.
+ *
+ * The search terminates when either a level is found with the PHYSICAL_PACKAGE
+ * flag set or we reach a root node.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Otherwise returns a value which represents the package for this cpu.
+ */
+int find_acpi_cpu_topology_package(unsigned int cpu)
+{
+	return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
+					  ACPI_PPTT_PHYSICAL_PACKAGE);
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 968173ec2726..2c9b6a000ea7 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1290,4 +1290,8 @@ static inline int lpit_read_residency_count_address(u64 *address)
 }
 #endif
 
+int find_acpi_cpu_topology(unsigned int cpu, int level);
+int find_acpi_cpu_topology_package(unsigned int cpu);
+int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
+
 #endif	/*_LINUX_ACPI_H*/
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
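
For readers skimming the series, here is a minimal usage sketch of the
interface added above. It is illustrative only: example_log_core_tags()
is a hypothetical helper, not part of the patch, though
find_acpi_cpu_topology() and its level semantics are as documented in
the kerneldoc.

	/*
	 * Log the level 1 (core) tag for each possible cpu; cpus that
	 * report the same tag share that topological feature.
	 */
	static void example_log_core_tags(void)
	{
		int cpu, tag;

		for_each_possible_cpu(cpu) {
			tag = find_acpi_cpu_topology(cpu, 1);
			if (tag < 0)
				continue;	/* no PPTT, or cpu not described */
			pr_info("cpu%d: level 1 tag %d\n", cpu, tag);
		}
	}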

* [PATCH v7 10/13] arm64: topology: rename cluster_id
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

Let's match the name of the arm64 topology field
to the kernel macro that uses it.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/topology.h |  4 ++--
 arch/arm64/kernel/topology.c      | 26 +++++++++++++-------------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index c4f2d50491eb..6b10459e6905 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -7,14 +7,14 @@
 struct cpu_topology {
 	int thread_id;
 	int core_id;
-	int cluster_id;
+	int package_id;
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
 };
 
 extern struct cpu_topology cpu_topology[NR_CPUS];
 
-#define topology_physical_package_id(cpu)	(cpu_topology[cpu].cluster_id)
+#define topology_physical_package_id(cpu)	(cpu_topology[cpu].package_id)
 #define topology_core_id(cpu)		(cpu_topology[cpu].core_id)
 #define topology_core_cpumask(cpu)	(&cpu_topology[cpu].core_sibling)
 #define topology_sibling_cpumask(cpu)	(&cpu_topology[cpu].thread_sibling)
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 21868530018e..dc18b1e53194 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -47,7 +47,7 @@ static int __init get_cpu_for_node(struct device_node *node)
 	return cpu;
 }
 
-static int __init parse_core(struct device_node *core, int cluster_id,
+static int __init parse_core(struct device_node *core, int package_id,
 			     int core_id)
 {
 	char name[10];
@@ -63,7 +63,7 @@ static int __init parse_core(struct device_node *core, int cluster_id,
 			leaf = false;
 			cpu = get_cpu_for_node(t);
 			if (cpu >= 0) {
-				cpu_topology[cpu].cluster_id = cluster_id;
+				cpu_topology[cpu].package_id = package_id;
 				cpu_topology[cpu].core_id = core_id;
 				cpu_topology[cpu].thread_id = i;
 			} else {
@@ -85,7 +85,7 @@ static int __init parse_core(struct device_node *core, int cluster_id,
 			return -EINVAL;
 		}
 
-		cpu_topology[cpu].cluster_id = cluster_id;
+		cpu_topology[cpu].package_id = package_id;
 		cpu_topology[cpu].core_id = core_id;
 	} else if (leaf) {
 		pr_err("%pOF: Can't get CPU for leaf core\n", core);
@@ -101,7 +101,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 	bool leaf = true;
 	bool has_cores = false;
 	struct device_node *c;
-	static int cluster_id __initdata;
+	static int package_id __initdata;
 	int core_id = 0;
 	int i, ret;
 
@@ -140,7 +140,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 			}
 
 			if (leaf) {
-				ret = parse_core(c, cluster_id, core_id++);
+				ret = parse_core(c, package_id, core_id++);
 			} else {
 				pr_err("%pOF: Non-leaf cluster with core %s\n",
 				       cluster, name);
@@ -158,7 +158,7 @@ static int __init parse_cluster(struct device_node *cluster, int depth)
 		pr_warn("%pOF: empty cluster\n", cluster);
 
 	if (leaf)
-		cluster_id++;
+		package_id++;
 
 	return 0;
 }
@@ -194,7 +194,7 @@ static int __init parse_dt_topology(void)
 	 * only mark cores described in the DT as possible.
 	 */
 	for_each_possible_cpu(cpu)
-		if (cpu_topology[cpu].cluster_id == -1)
+		if (cpu_topology[cpu].package_id == -1)
 			ret = -EINVAL;
 
 out_map:
@@ -224,7 +224,7 @@ static void update_siblings_masks(unsigned int cpuid)
 	for_each_possible_cpu(cpu) {
 		cpu_topo = &cpu_topology[cpu];
 
-		if (cpuid_topo->cluster_id != cpu_topo->cluster_id)
+		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
 		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
@@ -245,7 +245,7 @@ void store_cpu_topology(unsigned int cpuid)
 	struct cpu_topology *cpuid_topo = &cpu_topology[cpuid];
 	u64 mpidr;
 
-	if (cpuid_topo->cluster_id != -1)
+	if (cpuid_topo->package_id != -1)
 		goto topology_populated;
 
 	mpidr = read_cpuid_mpidr();
@@ -259,19 +259,19 @@ void store_cpu_topology(unsigned int cpuid)
 		/* Multiprocessor system : Multi-threads per core */
 		cpuid_topo->thread_id  = MPIDR_AFFINITY_LEVEL(mpidr, 0);
 		cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 1);
-		cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 2) |
+		cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2) |
 					 MPIDR_AFFINITY_LEVEL(mpidr, 3) << 8;
 	} else {
 		/* Multiprocessor system : Single-thread per core */
 		cpuid_topo->thread_id  = -1;
 		cpuid_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 0);
-		cpuid_topo->cluster_id = MPIDR_AFFINITY_LEVEL(mpidr, 1) |
+		cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1) |
 					 MPIDR_AFFINITY_LEVEL(mpidr, 2) << 8 |
 					 MPIDR_AFFINITY_LEVEL(mpidr, 3) << 16;
 	}
 
 	pr_debug("CPU%u: cluster %d core %d thread %d mpidr %#016llx\n",
-		 cpuid, cpuid_topo->cluster_id, cpuid_topo->core_id,
+		 cpuid, cpuid_topo->package_id, cpuid_topo->core_id,
 		 cpuid_topo->thread_id, mpidr);
 
 topology_populated:
@@ -287,7 +287,7 @@ static void __init reset_cpu_topology(void)
 
 		cpu_topo->thread_id = -1;
 		cpu_topo->core_id = 0;
-		cpu_topo->cluster_id = -1;
+		cpu_topo->package_id = -1;
 
 		cpumask_clear(&cpu_topo->core_sibling);
 		cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
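
As context for the rename: in the MPIDR fallback path shown above, the
field being renamed is synthesized by packing affinity levels, so it
identifies a cluster rather than a physical socket. A small worked
example with made-up affinity values (single-thread-per-core case,
Aff0=3, Aff1=2, Aff2=1, Aff3=0):

	core_id    = 3;				/* Aff0 */
	package_id = 2 | (1 << 8) | (0 << 16);	/* Aff1..Aff3 = 0x102 */

That packed value is one reason the PPTT-provided package information
later in the series is preferable to the MPIDR guess.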

* [PATCH v7 11/13] arm64: topology: enable ACPI/PPTT based CPU topology
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

Propagate the topology information from the PPTT tree to the
cpu_topology array. We can get the thread_id and core_id by
assuming certain levels of the PPTT tree correspond to those
concepts. The package_id is flagged in the tree and can be found
by calling find_acpi_cpu_topology_package(), which terminates its
search when it finds an ACPI node flagged as the physical package.
If the tree doesn't contain enough levels to represent all of the
requested levels, the root node will be returned for all
subsequent levels.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/kernel/topology.c | 45 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index dc18b1e53194..bd1aae438a31 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -11,6 +11,7 @@
  * for more details.
  */
 
+#include <linux/acpi.h>
 #include <linux/arch_topology.h>
 #include <linux/cpu.h>
 #include <linux/cpumask.h>
@@ -22,6 +23,7 @@
 #include <linux/sched.h>
 #include <linux/sched/topology.h>
 #include <linux/slab.h>
+#include <linux/smp.h>
 #include <linux/string.h>
 
 #include <asm/cpu.h>
@@ -296,6 +298,45 @@ static void __init reset_cpu_topology(void)
 	}
 }
 
+#ifdef CONFIG_ACPI
+/*
+ * Propagate the topology information of the processor_topology_node tree to the
+ * cpu_topology array.
+ */
+static int __init parse_acpi_topology(void)
+{
+	bool is_threaded;
+	int cpu, topology_id;
+
+	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
+
+	for_each_possible_cpu(cpu) {
+		topology_id = find_acpi_cpu_topology(cpu, 0);
+		if (topology_id < 0)
+			return topology_id;
+
+		if (is_threaded) {
+			cpu_topology[cpu].thread_id = topology_id;
+			topology_id = find_acpi_cpu_topology(cpu, 1);
+			cpu_topology[cpu].core_id   = topology_id;
+		} else {
+			cpu_topology[cpu].thread_id  = -1;
+			cpu_topology[cpu].core_id    = topology_id;
+		}
+		topology_id = find_acpi_cpu_topology_package(cpu);
+		cpu_topology[cpu].package_id = topology_id;
+	}
+
+	return 0;
+}
+
+#else
+static inline int __init parse_acpi_topology(void)
+{
+	return -EINVAL;
+}
+#endif
+
 void __init init_cpu_topology(void)
 {
 	reset_cpu_topology();
@@ -304,6 +345,8 @@ void __init init_cpu_topology(void)
 	 * Discard anything that was parsed if we hit an error so we
 	 * don't use partial information.
 	 */
-	if (of_have_populated_dt() && parse_dt_topology())
+	if ((!acpi_disabled) && parse_acpi_topology())
+		reset_cpu_topology();
+	else if (of_have_populated_dt() && parse_dt_topology())
 		reset_cpu_topology();
 }
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
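
A hypothetical debug helper (not part of the patch) shows what the
parse leaves behind in the array:

	static void __init example_dump_topology(void)
	{
		int cpu;

		for_each_possible_cpu(cpu)
			pr_debug("cpu%d: thread %d core %d package %d\n",
				 cpu, cpu_topology[cpu].thread_id,
				 cpu_topology[cpu].core_id,
				 cpu_topology[cpu].package_id);
	}

On a non-threaded machine, thread_id stays -1 and core_id carries the
level 0 tag; on an MT machine, the level 0 tag becomes the thread_id
and the level 1 tag becomes the core_id, matching the logic above.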

* [PATCH v7 12/13] ACPI: Add PPTT to injectable table list
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton, Geoffrey Blake

Add ACPI_SIG_PPTT to the table so initrds can override the
system topology.

Signed-off-by: Geoffrey Blake <geoffrey.blake@arm.com>
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 drivers/acpi/tables.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index 7bcb66ccccf3..b2efaa503c34 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -457,7 +457,7 @@ static const char * const table_sigs[] = {
 	ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
 	ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
 	ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
-	NULL };
+	ACPI_SIG_PPTT, NULL };
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
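
For anyone wanting to exercise this: assuming CONFIG_ACPI_TABLE_UPGRADE
is enabled, a replacement PPTT can be supplied by prepending to the
initrd an uncompressed cpio archive containing the table as
kernel/firmware/acpi/pptt.aml (see
Documentation/acpi/initrd_table_override.txt for the procedure). This
makes it practical to test topologies the platform firmware doesn't
describe.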

* [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-02-28 22:06   ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki, Jeremy Linton

Now that we have an accurate view of the physical topology,
we need to represent it correctly to the scheduler. In the
case of NUMA in socket, we need to ensure that the sched domain
we build for the MC layer isn't larger than the DIE above it.
To do this correctly, we should really base that on the cache
topology immediately below the NUMA node (for NUMA in socket),
or below the physical package for normal NUMA configurations.

This patch creates a set of early cache_siblings masks. Then,
when the scheduler requests the coregroup mask, we pick the
smaller of the physical package siblings or the NUMA siblings,
and locate the largest cache which is an entire subset of
those siblings. If we are unable to find a proper subset of
cores, we retain the original behavior and return the
core_sibling list.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/topology.h |  5 +++
 arch/arm64/kernel/topology.c      | 64 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 6b10459e6905..08db3e4e44e1 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -4,12 +4,17 @@
 
 #include <linux/cpumask.h>
 
+#define MAX_CACHE_CHECKS 4
+
 struct cpu_topology {
 	int thread_id;
 	int core_id;
 	int package_id;
+	int cache_id[MAX_CACHE_CHECKS];
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
+	cpumask_t cache_siblings[MAX_CACHE_CHECKS];
+	int cache_level;
 };
 
 extern struct cpu_topology cpu_topology[NR_CPUS];
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index bd1aae438a31..1809dc9d347c 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -212,8 +212,42 @@ static int __init parse_dt_topology(void)
 struct cpu_topology cpu_topology[NR_CPUS];
 EXPORT_SYMBOL_GPL(cpu_topology);
 
+static void find_llc_topology_for_cpu(int cpu)
+{
+	/* first determine if we are a NUMA in package */
+	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
+	int indx;
+
+	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
+		/* not NUMA in package, let's use the package siblings */
+		node_mask = &cpu_topology[cpu].core_sibling;
+	}
+
+	/*
+	 * node_mask should represent the smallest package/NUMA grouping;
+	 * let's search for the largest cache smaller than the node_mask.
+	 */
+	for (indx = 0; indx < MAX_CACHE_CHECKS; indx++) {
+		cpumask_t *cache_sibs = &cpu_topology[cpu].cache_siblings[indx];
+
+		if (cpu_topology[cpu].cache_id[indx] < 0)
+			continue;
+
+		if (cpumask_subset(cache_sibs, node_mask))
+			cpu_topology[cpu].cache_level = indx;
+	}
+}
+
 const struct cpumask *cpu_coregroup_mask(int cpu)
 {
+	int *llc = &cpu_topology[cpu].cache_level;
+
+	if (*llc == -1)
+		find_llc_topology_for_cpu(cpu);
+
+	if (*llc != -1)
+		return &cpu_topology[cpu].cache_siblings[*llc];
+
 	return &cpu_topology[cpu].core_sibling;
 }
 
@@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
 {
 	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
 	int cpu;
+	int idx;
 
 	/* update core and thread sibling masks */
 	for_each_possible_cpu(cpu) {
@@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
+			cpumask_t *lsib;
+			int cache_id = cpuid_topo->cache_id[idx];
+
+			if (cache_id == cpu_topo->cache_id[idx]) {
+				lsib = &cpuid_topo->cache_siblings[idx];
+				cpumask_set_cpu(cpu, lsib);
+			}
+		}
+
 		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
 		if (cpu != cpuid)
 			cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
@@ -286,10 +331,18 @@ static void __init reset_cpu_topology(void)
 
 	for_each_possible_cpu(cpu) {
 		struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+		int idx;
 
 		cpu_topo->thread_id = -1;
 		cpu_topo->core_id = 0;
 		cpu_topo->package_id = -1;
+		cpu_topo->cache_level = -1;
+
+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
+			cpu_topo->cache_id[idx] = -1;
+			cpumask_clear(&cpu_topo->cache_siblings[idx]);
+			cpumask_set_cpu(cpu, &cpu_topo->cache_siblings[idx]);
+		}
 
 		cpumask_clear(&cpu_topo->core_sibling);
 		cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
@@ -311,6 +364,9 @@ static int __init parse_acpi_topology(void)
 	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
 
 	for_each_possible_cpu(cpu) {
+		int tidx = 0;
+		int i;
+
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -325,6 +381,14 @@ static int __init parse_acpi_topology(void)
 		}
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
+
+		for (i = 0; i < MAX_CACHE_CHECKS; i++) {
+			topology_id = find_acpi_cpu_cache_topology(cpu, i + 1);
+			if (topology_id > 0) {
+				cpu_topology[cpu].cache_id[tidx] = topology_id;
+				tidx++;
+			}
+		}
 	}
 
 	return 0;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread
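
To make the selection concrete, consider a hypothetical single-socket
part with two NUMA-in-package nodes (cpu numbers invented for
illustration):

	core_sibling (package)	: cpus 0-15
	cpumask_of_node(cpu0)	: cpus 0-7	(NUMA in package)
	cache_siblings[2] (L3)	: cpus 0-15	(spans the package)
	cache_siblings[1] (L2)	: cpus 0-7

Here node_mask is the NUMA node, the L3 mask fails the
cpumask_subset() test, and the L2 is the largest cache contained in
the node, so cpu_coregroup_mask(0) returns cpus 0-7 and the MC domain
stays within DIE as intended.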

* [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
@ 2018-02-28 22:06   ` Jeremy Linton
  0 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-02-28 22:06 UTC (permalink / raw)
  To: linux-riscv

Now that we have an accurate view of the physical topology
we need to represent it correctly to the scheduler. In the
case of NUMA in socket, we need to assure that the sched domain
we build for the MC layer isn't larger than the DIE above it.
To do this correctly, we should really base that on the cache
topology immediately below the NUMA node (for NUMA in socket)
or below the physical package for normal NUMA configurations.

This patch creates a set of early cache_siblings masks, then
when the scheduler requests the coregroup mask we pick the
smaller of the physical package siblings, or the numa siblings
and locate the largest cache which is an entire subset of
those siblings. If we are unable to find a proper subset of
cores then we retain the original behavior and return the
core_sibling list.

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
 arch/arm64/include/asm/topology.h |  5 +++
 arch/arm64/kernel/topology.c      | 64 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 6b10459e6905..08db3e4e44e1 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -4,12 +4,17 @@
 
 #include <linux/cpumask.h>
 
+#define MAX_CACHE_CHECKS 4
+
 struct cpu_topology {
 	int thread_id;
 	int core_id;
 	int package_id;
+	int cache_id[MAX_CACHE_CHECKS];
 	cpumask_t thread_sibling;
 	cpumask_t core_sibling;
+	cpumask_t cache_siblings[MAX_CACHE_CHECKS];
+	int cache_level;
 };
 
 extern struct cpu_topology cpu_topology[NR_CPUS];
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index bd1aae438a31..1809dc9d347c 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -212,8 +212,42 @@ static int __init parse_dt_topology(void)
 struct cpu_topology cpu_topology[NR_CPUS];
 EXPORT_SYMBOL_GPL(cpu_topology);
 
+static void find_llc_topology_for_cpu(int cpu)
+{
+	/* first determine if we are a NUMA in package */
+	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
+	int indx;
+
+	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
+		/* not numa in package, lets use the package siblings */
+		node_mask = &cpu_topology[cpu].core_sibling;
+	}
+
+	/*
+	 * node_mask should represent the smallest package/numa grouping
+	 * lets search for the largest cache smaller than the node_mask.
+	 */
+	for (indx = 0; indx < MAX_CACHE_CHECKS; indx++) {
+		cpumask_t *cache_sibs = &cpu_topology[cpu].cache_siblings[indx];
+
+		if (cpu_topology[cpu].cache_id[indx] < 0)
+			continue;
+
+		if (cpumask_subset(cache_sibs, node_mask))
+			cpu_topology[cpu].cache_level = indx;
+	}
+}
+
 const struct cpumask *cpu_coregroup_mask(int cpu)
 {
+	int *llc = &cpu_topology[cpu].cache_level;
+
+	if (*llc == -1)
+		find_llc_topology_for_cpu(cpu);
+
+	if (*llc != -1)
+		return &cpu_topology[cpu].cache_siblings[*llc];
+
 	return &cpu_topology[cpu].core_sibling;
 }
 
@@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
 {
 	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
 	int cpu;
+	int idx;
 
 	/* update core and thread sibling masks */
 	for_each_possible_cpu(cpu) {
@@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
 		if (cpuid_topo->package_id != cpu_topo->package_id)
 			continue;
 
+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
+			cpumask_t *lsib;
+			int cput_id = cpuid_topo->cache_id[idx];
+
+			if (cput_id == cpu_topo->cache_id[idx]) {
+				lsib = &cpuid_topo->cache_siblings[idx];
+				cpumask_set_cpu(cpu, lsib);
+			}
+		}
+
 		cpumask_set_cpu(cpuid, &cpu_topo->core_sibling);
 		if (cpu != cpuid)
 			cpumask_set_cpu(cpu, &cpuid_topo->core_sibling);
@@ -286,10 +331,18 @@ static void __init reset_cpu_topology(void)
 
 	for_each_possible_cpu(cpu) {
 		struct cpu_topology *cpu_topo = &cpu_topology[cpu];
+		int idx;
 
 		cpu_topo->thread_id = -1;
 		cpu_topo->core_id = 0;
 		cpu_topo->package_id = -1;
+		cpu_topo->cache_level = -1;
+
+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
+			cpu_topo->cache_id[idx] = -1;
+			cpumask_clear(&cpu_topo->cache_siblings[idx]);
+			cpumask_set_cpu(cpu, &cpu_topo->cache_siblings[idx]);
+		}
 
 		cpumask_clear(&cpu_topo->core_sibling);
 		cpumask_set_cpu(cpu, &cpu_topo->core_sibling);
@@ -311,6 +364,9 @@ static int __init parse_acpi_topology(void)
 	is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK;
 
 	for_each_possible_cpu(cpu) {
+		int tidx = 0;
+		int i;
+
 		topology_id = find_acpi_cpu_topology(cpu, 0);
 		if (topology_id < 0)
 			return topology_id;
@@ -325,6 +381,14 @@ static int __init parse_acpi_topology(void)
 		}
 		topology_id = find_acpi_cpu_topology_package(cpu);
 		cpu_topology[cpu].package_id = topology_id;
+
+		for (i = 0; i < MAX_CACHE_CHECKS; i++) {
+			topology_id = find_acpi_cpu_cache_topology(cpu, i + 1);
+			if (topology_id > 0) {
+				cpu_topology[cpu].cache_id[tidx] = topology_id;
+				tidx++;
+			}
+		}
 	}
 
 	return 0;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 02/13] drivers: base: cacheinfo: setup DT cache properties early
@ 2018-02-28 22:34     ` Palmer Dabbelt
  0 siblings, 0 replies; 136+ messages in thread
From: Palmer Dabbelt @ 2018-02-28 22:34 UTC (permalink / raw)
  To: jeremy.linton
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, rjw, Will Deacon, catalin.marinas, Greg KH,
	mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2, vkilari,
	ahs3, dietmar.eggemann, morten.rasmussen, lenb, john.garry,
	austinwc, tnowicki, jeremy.linton

On Wed, 28 Feb 2018 14:06:08 PST (-0800), jeremy.linton@arm.com wrote:
> The original intent in cacheinfo was that an architecture
> specific populate_cache_leaves() would probe the hardware
> and then cache_shared_cpu_map_setup() and
> cache_override_properties() would provide firmware help to
> extend/expand upon what was probed. Arm64 was really
> the only architecture that was working this way, and
> with the removal of most of the hardware probing logic it
> became clear that it was possible to simplify the logic a bit.
>
> This patch combines the walk of the DT nodes with the
> code updating the cache size/line_size and nr_sets.
> cache_override_properties() (which was DT specific) is
> then removed. The result is that cacheinfo.of_node is
> no longer used as a temporary place to hold DT references
> for future calls that update cache properties. That change
> helps to clarify its one remaining use (matching
> cacheinfo nodes that represent shared caches) which
> will be used by the ACPI/PPTT code in the following patches.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/riscv/kernel/cacheinfo.c |  1 -
>  drivers/base/cacheinfo.c      | 65 +++++++++++++++++++------------------------
>  2 files changed, 29 insertions(+), 37 deletions(-)
>
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index 10ed2749e246..0bc86e5f8f3f 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -20,7 +20,6 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
>  			 struct device_node *node,
>  			 enum cache_type type, unsigned int level)
>  {
> -	this_leaf->of_node = node;
>  	this_leaf->level = level;
>  	this_leaf->type = type;
>  	/* not a sector cache */
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 09ccef7ddc99..a872523e8951 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -71,7 +71,7 @@ static inline int get_cacheinfo_idx(enum cache_type type)
>  	return type;
>  }

This looks good as far as RISC-V is concerned, though that's such a trivial 
part of the changeset it's not worth that much :).  Thanks!

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 00/13] Support PPTT for ARM64
  2018-02-28 22:06 ` Jeremy Linton
  (?)
@ 2018-03-01 12:06   ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-01 12:06 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki

Hi Jeremy,

On 28/02/18 22:06, Jeremy Linton wrote:
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
> used to describe the processor and cache topology. Ideally it is
> used to extend/override information provided by the hardware, but
> right now ARM64 is entirely dependent on firmware provided tables.
> 
> This patch parses the table for the cache topology and CPU topology.
> When we enable ACPI/PPTT for arm64 we map the physical_id to the
> PPTT node flagged as the physical package by the firmware.
> This results in topologies that match what the remainder of the
> system expects. To avoid inverted scheduler domains we then
> set the MC domain equal to the largest cache within the socket
> below the NUMA domain.
> 
I remember reviewing and acknowledging most of the cacheinfo stuff with
a couple of minor suggestions for v6. I don't see any Acked-by tags in
this series and don't know if I need to review/ack any more cacheinfo
related patches.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-01 15:52     ` Morten Rasmussen
  -1 siblings, 0 replies; 136+ messages in thread
From: Morten Rasmussen @ 2018-03-01 15:52 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, rjw, will.deacon, catalin.marinas, gregkh,
	mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2, vkilari,
	ahs3, dietmar.eggemann, palmer, lenb, john.garry, austinwc,
	tnowicki

Hi Jeremy,

On Wed, Feb 28, 2018 at 04:06:19PM -0600, Jeremy Linton wrote:
> Now that we have an accurate view of the physical topology
> we need to represent it correctly to the scheduler. In the
> case of NUMA in socket, we need to assure that the sched domain
> we build for the MC layer isn't larger than the DIE above it.

MC shouldn't be larger than any of the NUMA domains either.

> To do this correctly, we should really base that on the cache
> topology immediately below the NUMA node (for NUMA in socket)
> or below the physical package for normal NUMA configurations.

That means we wouldn't support multi-die NUMA nodes?

> This patch creates a set of early cache_siblings masks, then
> when the scheduler requests the coregroup mask we pick the
> smaller of the physical package siblings, or the numa siblings
> and locate the largest cache which is an entire subset of
> those siblings. If we are unable to find a proper subset of
> cores then we retain the original behavior and return the
> core_sibling list.

IIUC, for numa-in-package it is a strict requirement that there is a
cache that spans the entire NUMA node? For example, having a NUMA node
consisting of two clusters with per-cluster caches only wouldn't be
supported?
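
To make that concrete, consider a hypothetical layout (illustration
only, not a description of any real machine):

	NUMA node 0: CPUs 0-7
	  cluster 0: CPUs 0-3, per-cluster L2
	  cluster 1: CPUs 4-7, per-cluster L2
	  no cache spanning the whole node

With the selection loop quoted below, the largest cache that is a
subset of the node mask is a per-cluster L2, so MC would span only 4 of
the node's 8 CPUs.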

> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  arch/arm64/include/asm/topology.h |  5 +++
>  arch/arm64/kernel/topology.c      | 64 +++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
> index 6b10459e6905..08db3e4e44e1 100644
> --- a/arch/arm64/include/asm/topology.h
> +++ b/arch/arm64/include/asm/topology.h
> @@ -4,12 +4,17 @@
>  
>  #include <linux/cpumask.h>
>  
> +#define MAX_CACHE_CHECKS 4
> +
>  struct cpu_topology {
>  	int thread_id;
>  	int core_id;
>  	int package_id;
> +	int cache_id[MAX_CACHE_CHECKS];
>  	cpumask_t thread_sibling;
>  	cpumask_t core_sibling;
> +	cpumask_t cache_siblings[MAX_CACHE_CHECKS];
> +	int cache_level;
>  };
>  
>  extern struct cpu_topology cpu_topology[NR_CPUS];
> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
> index bd1aae438a31..1809dc9d347c 100644
> --- a/arch/arm64/kernel/topology.c
> +++ b/arch/arm64/kernel/topology.c
> @@ -212,8 +212,42 @@ static int __init parse_dt_topology(void)
>  struct cpu_topology cpu_topology[NR_CPUS];
>  EXPORT_SYMBOL_GPL(cpu_topology);
>  
> +static void find_llc_topology_for_cpu(int cpu)

Isn't this more a search for core/node siblings? Or is it a requirement that the
last level cache spans exactly one NUMA node? For example, a package
level cache isn't allowed for numa-in-package?

> +{
> +	/* first determine if we are a NUMA in package */
> +	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
> +	int indx;
> +
> +	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
> +		/* not numa in package, lets use the package siblings */
> +		node_mask = &cpu_topology[cpu].core_sibling;
> +	}
> +
> +	/*
> +	 * node_mask should represent the smallest package/numa grouping
> +	 * lets search for the largest cache smaller than the node_mask.
> +	 */
> +	for (indx = 0; indx < MAX_CACHE_CHECKS; indx++) {
> +		cpumask_t *cache_sibs = &cpu_topology[cpu].cache_siblings[indx];
> +
> +		if (cpu_topology[cpu].cache_id[indx] < 0)
> +			continue;
> +
> +		if (cpumask_subset(cache_sibs, node_mask))
> +			cpu_topology[cpu].cache_level = indx;

I don't think this guarantees that the cache level we found matches exactly
the NUMA node. Taking the two cluster NUMA node example from above, we
would set cache_level to point at the per-cluster cache as it is a
subset of the NUMA node but it would only span half of the node. Or am I
missing something?
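
A minimal sketch of the distinction, using the hypothetical two-cluster
layout above:

	/* node 0 = CPUs 0-7, per-cluster L2 = CPUs 0-3 */
	cpumask_subset(cluster_l2, node_mask);	/* true  */
	cpumask_equal(cluster_l2, node_mask);	/* false */

so cache_level can latch onto a cache that covers only part of the
node; cpumask_equal() would be the stricter test if an exact span were
intended.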

> +	}
> +}
> +
>  const struct cpumask *cpu_coregroup_mask(int cpu)
>  {
> +	int *llc = &cpu_topology[cpu].cache_level;
> +
> +	if (*llc == -1)
> +		find_llc_topology_for_cpu(cpu);
> +
> +	if (*llc != -1)
> +		return &cpu_topology[cpu].cache_siblings[*llc];
> +
>  	return &cpu_topology[cpu].core_sibling;
>  }
>  
> @@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
>  {
>  	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
>  	int cpu;
> +	int idx;
>  
>  	/* update core and thread sibling masks */
>  	for_each_possible_cpu(cpu) {
> @@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
>  		if (cpuid_topo->package_id != cpu_topo->package_id)
>  			continue;
>  
> +		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
> +			cpumask_t *lsib;
> +			int cput_id = cpuid_topo->cache_id[idx];
> +
> +			if (cput_id == cpu_topo->cache_id[idx]) {
> +				lsib = &cpuid_topo->cache_siblings[idx];
> +				cpumask_set_cpu(cpu, lsib);
> +			}

Shouldn't the cache_id validity be checked here? I don't think it breaks
anything though.

Overall, I think this is more or less in line with the MC domain
shrinking I just mentioned in the v6 discussion. It is mostly the corner
cases and assumption about the system topology I'm not sure about.

Morten

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 08/13] arm64: Add support for ACPI based firmware tables
@ 2018-03-03 21:58     ` kbuild test robot
  0 siblings, 0 replies; 136+ messages in thread
From: kbuild test robot @ 2018-03-03 21:58 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: kbuild-all, linux-acpi, linux-arm-kernel, sudeep.holla,
	lorenzo.pieralisi, hanjun.guo, rjw, will.deacon, catalin.marinas,
	gregkh, mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2,
	vkilari, ahs3, dietmar.eggemann, morten.rasmussen, palmer, lenb,
	john.garry, austinwc, tnowicki, Jeremy Linton

[-- Attachment #1: Type: text/plain, Size: 1230 bytes --]

Hi Jeremy,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on pm/linux-next]
[also build test ERROR on v4.16-rc3 next-20180302]
[cannot apply to arm64/for-next/core]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Jeremy-Linton/Support-PPTT-for-ARM64/20180304-005730
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: s390-default_defconfig (attached as .config)
compiler: s390x-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=s390 

All errors (new ones prefixed by >>):

   drivers/base/cacheinfo.o: In function `acpi_find_last_cache_level':
>> (.text+0x900): multiple definition of `acpi_find_last_cache_level'
   arch/s390/kernel/cache.o:(.text+0xc8): first defined here
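
One common way to resolve this kind of symbol clash (a sketch only, not
necessarily the fix that was merged) is to make the generic definition a
weak symbol, so an architecture like s390 that already provides
acpi_find_last_cache_level() wins at link time:

	/* sketch: generic weak stub, overridden by arch definitions */
	int __weak acpi_find_last_cache_level(unsigned int cpu)
	{
		return 0;	/* no cache topology known */
	}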

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 17664 bytes --]

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 10/13] arm64: topology: rename cluster_id
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-05 12:24     ` Mark Brown
  -1 siblings, 0 replies; 136+ messages in thread
From: Mark Brown @ 2018-03-05 12:24 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, mark.rutland, austinwc, tnowicki, catalin.marinas,
	palmer, will.deacon, linux-riscv, morten.rasmussen, vkilari,
	lorenzo.pieralisi, ahs3, lenb, john.garry, wangxiongfeng2,
	dietmar.eggemann, linux-arm-kernel, gregkh, rjw, linux-kernel,
	hanjun.guo, sudeep.holla

[-- Attachment #1: Type: text/plain, Size: 341 bytes --]

On Wed, Feb 28, 2018 at 04:06:16PM -0600, Jeremy Linton wrote:

> Lets match the name of the arm64 topology field
> to the kernel macro that uses it.

I called it cluster ID in the code because that's what the documentation
for MPIDR called it IIRC.  Googling around suggests that this naming may
be in use in some of the DynamiQ stuff.
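
For reference, a rough sketch (based on the arm64 MPIDR affinity
macros, not on code from this series) of where that cluster value has
traditionally come from when no firmware topology is available:

	u64 mpidr = read_cpuid_mpidr();

	/* without MT, Aff1 is what has usually been called the cluster */
	cpu_topo->core_id    = MPIDR_AFFINITY_LEVEL(mpidr, 0);
	cpu_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1);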

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-02-27 20:18       ` Jeremy Linton
  (?)
@ 2018-03-06 16:07         ` Morten Rasmussen
  -1 siblings, 0 replies; 136+ messages in thread
From: Morten Rasmussen @ 2018-03-06 16:07 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, rjw, will.deacon, catalin.marinas, gregkh,
	mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2, vkilari,
	ahs3, dietmar.eggemann, palmer, lenb, john.garry, austinwc,
	tnowicki

On Tue, Feb 27, 2018 at 02:18:47PM -0600, Jeremy Linton wrote:
> Hi,
> 
> 
> First, thanks for taking a look at this.
> 
> On 03/01/2018 09:52 AM, Morten Rasmussen wrote:
> >Hi Jeremy,
> >
> >On Wed, Feb 28, 2018 at 04:06:19PM -0600, Jeremy Linton wrote:
> >>Now that we have an accurate view of the physical topology
> >>we need to represent it correctly to the scheduler. In the
> >>case of NUMA in socket, we need to assure that the sched domain
> >>we build for the MC layer isn't larger than the DIE above it.
> >
> >MC shouldn't be larger than any of the NUMA domains either.
> 
> Right, that is one of the things this patch is assuring..
> 
> >
> >>To do this correctly, we should really base that on the cache
> >>topology immediately below the NUMA node (for NUMA in socket)
> >>or below the physical package for normal NUMA configurations.
> >
> >That means we wouldn't support multi-die NUMA nodes?
> 
> You mean a bottom level NUMA domain that crosses multiple sockets/dies? That
> should work. This patch is picking the widest cache layer below the smallest
> of the package or numa grouping. What actually happens depends on the
> topology. Given a case where there are multiple dies in a socket, and the
> numa domain is at the socket level the MC is going to reflect the caching
> topology immediately below the socket. In the case of multiple dies, with a
> cache that crosses them in socket, then the MC is basically going to be the
> socket, otherwise if the widest cache is per die, or some narrower grouping
> (cluster?) then that is what ends up in the MC. (this is easier with some
> pictures)

That is more or less what I meant. I think I got confused with the role
of "DIE" level, i.e. that top non-NUMA level, in this. The DIE level
cpumask spans exactly the NUMA node, so IIUC we have three scenarios:

1. Multi-die/socket/physical package NUMA node
   Top non-NUMA level (DIE) spans multiple packages. Bottom NUMA level
   spans multiple multi-package nodes. The MC mask reflects the last-level
   cache within the NUMA node which is most likely per-die or per-cluster
   (inside each die).

2. physical package == NUMA node
   The top non-NUMA (DIE) mask is the same as the core sibling mask.
   If there is cache spanning the entire node, the scheduler topology
   will eliminate a layer (DIE?), so bottom NUMA level would be right on
   top of MC spanning multiple physical packages. If there is no
   node-wide last level cache, DIE is preserved and MC matches the span
   of the last level cache.

3. numa-in-package
   Top non-NUMA (DIE) mask is not reflecting the actual die, but is
   reflecting the NUMA node. MC has a span equal to the largest shared
   cache span smaller than or equal to the NUMA node. If it is
   equal, DIE level is eliminated, otherwise DIE is preserved, but
   doesn't really represent die. Bottom non-NUMA level spans multiple
   in-package NUMA nodes.
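
The layer elimination mentioned in scenarios 2 and 3 is the scheduler's
degenerate-domain collapse; very roughly (a simplified sketch of the
logic in kernel/sched/topology.c, not the exact code):

	/* sketch: a parent spanning the same CPUs as its child is dropped */
	static bool parent_degenerate(struct sched_domain *sd)
	{
		return sd->parent &&
		       cpumask_equal(sched_domain_span(sd),
				     sched_domain_span(sd->parent));
	}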

As you said, multi-die nodes should work. However, I'm not sure if
shrinking MC to match a cache could cause us trouble, or if it should
just be shrunk to be the smaller of the node mask and core siblings.
Unless you have a node-wide last level cache, DIE level won't be
eliminated in scenarios 2 and 3, and could cause trouble. For
numa-in-package, you can end up with a DIE level inside the node where
the default flags don't favour aggressive spreading of tasks. The same
could be the case for per-package nodes (scenario 2).

Don't we end up redefining physical package to be last level cache
instead of using the PPTT flag for scenario 2 and 3?

I think DIE level should be eliminated for scenarios 2 and 3 like it is
for x86.
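
For reference, x86 already ties the MC mask directly to the last level
cache; roughly (from memory of arch/x86/kernel/smpboot.c, so treat it
as a sketch):

	const struct cpumask *cpu_coregroup_mask(int cpu)
	{
		return cpu_llc_shared_mask(cpu);
	}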

[...]

> >>This patch creates a set of early cache_siblings masks, then
> >>when the scheduler requests the coregroup mask we pick the
> >>smaller of the physical package siblings, or the numa siblings
> >>and locate the largest cache which is an entire subset of
> >>those siblings. If we are unable to find a proper subset of
> >>cores then we retain the original behavior and return the
> >>core_sibling list.
> >
> >IIUC, for numa-in-package it is a strict requirement that there is a
> >cache that span the entire NUMA node? For example, having a NUMA node
> >consisting of two clusters with per-cluster caches only wouldn't be
> >supported?
> 
> Everything is supported, the MC is reflecting the cache topology. We just
> use the physical/numa topology to help us pick which layer of cache topology
> lands in the MC. (unless of course we fail to find a PPTT/cache topology, in
> which case we fallback to the old behavior of the core_siblings which can
> reflect the MPIDR/etc).

I see. For this example we would end up with a "DIE" level and two MC
domains inside each node whether we have the PPTT table and cache
topology or not. I'm just wondering if everyone would be happy with
basing MC on last level cache instead of the smaller of physical package
and NUMA node.

> >>+{
> >>+	/* first determine if we are a NUMA in package */
> >>+	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
> >>+	int indx;
> >>+
> >>+	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
> >>+		/* not numa in package, lets use the package siblings */
> >>+		node_mask = &cpu_topology[cpu].core_sibling;
> >>+	}
> >>+
> >>+	/*
> >>+	 * node_mask should represent the smallest package/numa grouping
> >>+	 * let's search for the largest cache smaller than the node_mask.
> >>+	 */
> >>+	for (indx = 0; indx < MAX_CACHE_CHECKS; indx++) {
> >>+		cpumask_t *cache_sibs = &cpu_topology[cpu].cache_siblings[indx];
> >>+
> >>+		if (cpu_topology[cpu].cache_id[indx] < 0)
> >>+			continue;
> >>+
> >>+		if (cpumask_subset(cache_sibs, node_mask))
> >>+			cpu_topology[cpu].cache_level = indx;
> >
> >I don't think this guarantees that the cache level we found matches exactly
> >the NUMA node. Taking the two cluster NUMA node example from above, we
> >would set cache_level to point at the per-cluster cache as it is a
> >subset of the NUMA node but it would only span half of the node. Or am I
> >missing something?
> 
> I think you got it. If the system is a traditional ARM system with shared
> L2's at the cluster level and it doesn't have any L3's/etc and the NUMA node
> crosses multiple clusters then you get the cluster L2 grouping in the MC.
> 
> I think this is what we want. Particularly, since the newer/larger machines
> do have L3+'s contained within their sockets or numa domains, so you end up
> with that as the MC.

Okay, thanks for confirming.

> 
> 
> >
> >>+	}
> >>+}
> >>+
> >>  const struct cpumask *cpu_coregroup_mask(int cpu)
> >>  {
> >>+	int *llc = &cpu_topology[cpu].cache_level;
> >>+
> >>+	if (*llc == -1)
> >>+		find_llc_topology_for_cpu(cpu);
> >>+
> >>+	if (*llc != -1)
> >>+		return &cpu_topology[cpu].cache_siblings[*llc];
> >>+
> >>  	return &cpu_topology[cpu].core_sibling;

If we don't have any of the cache_sibling masks set up, i.e. we don't
have the cache topology, we would keep looking for it every time
cpu_coregroup_mask() is called. I'm not sure how extensively it is used,
but it could have a performance impact?
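
If that turns out to matter, one way to run the search only once, shown
here as a sketch using a hypothetical 'llc_searched' flag that is not
part of this series:

	const struct cpumask *cpu_coregroup_mask(int cpu)
	{
		struct cpu_topology *t = &cpu_topology[cpu];

		if (!t->llc_searched) {		/* hypothetical flag */
			find_llc_topology_for_cpu(cpu);
			t->llc_searched = true;
		}

		if (t->cache_level != -1)
			return &t->cache_siblings[t->cache_level];
		return &t->core_sibling;
	}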


> >>  }
> >>@@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
> >>  {
> >>  	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> >>  	int cpu;
> >>+	int idx;
> >>  	/* update core and thread sibling masks */
> >>  	for_each_possible_cpu(cpu) {
> >>@@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
> >>  		if (cpuid_topo->package_id != cpu_topo->package_id)
> >>  			continue;
> >>+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
> >>+			cpumask_t *lsib;
> >>+			int cput_id = cpuid_topo->cache_id[idx];
> >>+
> >>+			if (cput_id == cpu_topo->cache_id[idx]) {
> >>+				lsib = &cpuid_topo->cache_siblings[idx];
> >>+				cpumask_set_cpu(cpu, lsib);
> >>+			}
> >
> >Shouldn't the cache_id validity be checked here? I don't think it breaks
> >anything though.
> 
> It could be, but since it's explicitly looking for unified caches it's likely
> that some of the levels are invalid. Invalid levels get ignored later on so
> we don't really care if they are valid here.
> 
> >
> >Overall, I think this is more or less in line with the MC domain
> >shrinking I just mentioned in the v6 discussion. It is mostly the corner
> >cases and assumption about the system topology I'm not sure about.
> 
> I think its the corner cases i'm taking care of. The simple fix in v6 is to
> take the smaller of core_siblings or node_siblings, but that ignores cases
> with split L3s (or the L2 only example above). The idea here is to assure
> that MC is following a cache topology. In my mind, it is more a question of
> how that is picked. The other way I see to do this, is with a PX domain flag
> in the PPTT. We could then pick the core grouping one below that flag. Doing
> it that way affords the firmware vendors a lever they can pull to optimize a
> given machine for the linux scheduler behavior.

Okay. I think these assumptions/choices should be spelled out somewhere,
either as comments or in the commit message. As said above, I'm not sure
if the simple approach is better or not.

Using the cache span to define the MC level with a numa-in-cluster
switch like some Intel platforms seem to have, you could have two cores
being MC siblings with numa-in-package disabled and not being siblings
with numa-in-package enabled, unless you reconfigure the span of the
caches too and remember to update the ACPI cache topology.

Regarding firmware levers, we don't want vendors to optimize for Linux
scheduler behaviour, but a mechanism to detect how closely related cores
are could make selecting the right mask for MC level easier. As I see
it, we basically have to choose between MC being cache boundary based or
physical package based. This patch implements the former, the simple
solution (core_siblings mask or node_siblings mask) implements the
latter.
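
For comparison, the simple physical-package-based variant mentioned
above might look roughly like this (sketch only, untested):

	const struct cpumask *cpu_coregroup_mask(int cpu)
	{
		const cpumask_t *node = cpumask_of_node(cpu_to_node(cpu));
		const cpumask_t *core = &cpu_topology[cpu].core_sibling;

		/* MC becomes the smaller of the NUMA node and the package */
		return cpumask_subset(node, core) ? node : core;
	}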

Morten

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 01/13] drivers: base: cacheinfo: move cache_setup_of_node()
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 16:16     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 16:16 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> In preparation for the next patch, and to aid in
> review of that patch, let's move cache_setup_of_node
> further down in the module without any changes.
> 
I don't think this has changed since v6, so my ack stands.

Acked-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 02/13] drivers: base: cacheinfo: setup DT cache properties early
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 16:43     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 16:43 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi,
	hanjun.guo, rjw, will.deacon, catalin.marinas, gregkh,
	mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2, vkilari,
	ahs3, dietmar.eggemann, morten.rasmussen, palmer, lenb,
	john.garry, austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> The original intent in cacheinfo was that an architecture
> specific populate_cache_leaves() would probe the hardware
> and then cache_shared_cpu_map_setup() and
> cache_override_properties() would provide firmware help to
> extend/expand upon what was probed. Arm64 was really
> the only architecture that was working this way, and
> with the removal of most of the hardware probing logic it
> became clear that it was possible to simplify the logic a bit.
> 
> This patch combines the walk of the DT nodes with the
> code updating the cache size/line_size and nr_sets.
> cache_override_properties() (which was DT specific) is
> then removed. The result is that cacheinfo.of_node is
> no longer used as a temporary place to hold DT references
> for future calls that update cache properties. That change
> helps to clarify its one remaining use (matching
> cacheinfo nodes that represent shared caches) which
> will be used by the ACPI/PPTT code in the following patches.
> 

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 03/13] cacheinfo: rename of_node to fw_token
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 16:45     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 16:45 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> Rename and change the type of of_node to indicate
> it is a generic pointer which is generally only used
> for comparison purposes. In a later patch we will put
> an ACPI/PPTT token pointer in fw_token so that
> the code which builds the shared cpu masks can be reused.
> 
Thanks for renaming :)

Acked-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 06/13] ACPI: Enable PPTT support on ARM64
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 16:55     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 16:55 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> Now that we have a PPTT parser, in preparation for its use
> on arm64, let's build it.
> 

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 04/13] arm64/acpi: Create arch specific cpu to acpi id helper
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 17:13     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 17:13 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> It's helpful to be able to look up the acpi_processor_id associated
> with a logical cpu. Provide an arm64 helper to do this.
> 

This patch on its own is good, but it's quite generic, which made me look
at it again. Sorry for missing this earlier.

Can we use "per_cpu(processors, cpu)->acpi_id" at the call sites instead?
Or could you make that a generic helper using the above expression?
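
Something along these lines, as a sketch (the helper name here is
hypothetical):

/* generic wrapper around the per-cpu acpi_processor lookup */
static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
{
	return per_cpu(processors, cpu)->acpi_id;
}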

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 08/13] arm64: Add support for ACPI based firmware tables
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 17:23     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 17:23 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> The /sys cache entries should support ACPI/PPTT generated cache
> topology information. Let's detect ACPI systems and call
> an arch specific cache_setup_acpi() routine to update the hardware
> probed cache topology.
> 
> For arm64, if ACPI is enabled, determine the max number of cache
> levels and populate them using the PPTT table if one is available.
> 

I fail to understand how the kbuild failure report is associated with
this patch (most likely the previous patch is the culprit; I am looking
at it in detail now), but for this patch

Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 17:39     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 17:39 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> ACPI 6.2 adds a new table, which describes how processing units
> are related to each other in a tree-like fashion. Caches are
> also sprinkled throughout the tree and describe the properties
> of the caches in relation to other caches and processing units.
> 
> Add the code to parse the cache hierarchy and report the total
> number of levels of cache for a given core using
> acpi_find_last_cache_level() as well as fill out the individual
> cores' cache information with cache_setup_acpi() once the
> cpu_cacheinfo structure has been populated by the arch specific
> code.
> 
> An additional patch later in the set adds the ability to report
> peers in the topology using find_acpi_cpu_topology()
> to report a unique ID for each processing unit at a given level
> in the tree. These unique IDs can then be used to match related
> processing units which exist as threads, COD (clusters
> on die), within a given package, etc.
> 
> 
The more I look at the ACPI table parsing code, the more questions I
get :). So I just skimmed through it this time. Not sure why
ACPI_PTR_* is not used elsewhere in drivers/acpi/tables.c.
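
(ACPI_PTR_* here refers to the ACPICA pointer-arithmetic helpers the
parser leans on; a typical use, sketched rather than quoted from pptt.c:

	struct acpi_pptt_processor *entry =
		ACPI_ADD_PTR(struct acpi_pptt_processor, table_hdr,
			     sizeof(struct acpi_table_header));

which steps a typed pointer past the table header.)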

Anyway, for the cacheinfo part of this file:

Acked-by: Sudeep Holla <sudeep.holla@arm.com>

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 07/13] drivers: base cacheinfo: Add support for ACPI based firmware tables
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-06 17:50     ` Sudeep Holla
  -1 siblings, 0 replies; 136+ messages in thread
From: Sudeep Holla @ 2018-03-06 17:50 UTC (permalink / raw)
  To: Jeremy Linton, linux-acpi
  Cc: Sudeep Holla, linux-arm-kernel, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki



On 28/02/18 22:06, Jeremy Linton wrote:
> Call ACPI cache parsing routines from base cacheinfo code if ACPI
> is enabled. Also stub out cache_setup_acpi() so that individual
> architectures can enable ACPI topology parsing.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/pptt.c       |  1 +
>  drivers/base/cacheinfo.c  | 14 ++++++++++----
>  include/linux/cacheinfo.h |  9 +++++++++
>  3 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 883e4318c6cd..c98f94ebd272 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -343,6 +343,7 @@ static void update_cache_properties(struct cacheinfo *this_leaf,
>  {
>  	int valid_flags = 0;
>  
> +	this_leaf->fw_token = cpu_node;


Any reason why this can't part of 05/13 ?

>  	if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
>  		this_leaf->size = found_cache->size;
>  		valid_flags++;
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 597aacb233fc..2880e2ab01f5 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -206,7 +206,7 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>  					   struct cacheinfo *sib_leaf)
>  {
>  	/*
> -	 * For non-DT systems, assume unique level 1 cache, system-wide
> +	 * For non-DT/ACPI systems, assume unique level 1 caches, system-wide
>  	 * shared caches for all other levels. This will be used only if
>  	 * arch specific code has not populated shared_cpu_map
>  	 */
> @@ -214,6 +214,11 @@ static inline bool cache_leaves_are_shared(struct cacheinfo *this_leaf,
>  }
>  #endif
>  
> +int __weak cache_setup_acpi(unsigned int cpu)
> +{
> +	return -ENOTSUPP;
> +}
> +
>  static int cache_shared_cpu_map_setup(unsigned int cpu)
>  {
>  	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> @@ -227,8 +232,8 @@ static int cache_shared_cpu_map_setup(unsigned int cpu)
>  	if (of_have_populated_dt())
>  		ret = cache_setup_of_node(cpu);
>  	else if (!acpi_disabled)
> -		/* No cache property/hierarchy support yet in ACPI */
> -		ret = -ENOTSUPP;
> +		ret = cache_setup_acpi(cpu);
> +
>  	if (ret)
>  		return ret;
>  
> @@ -279,7 +284,8 @@ static void cache_shared_cpu_map_remove(unsigned int cpu)
>  			cpumask_clear_cpu(cpu, &sib_leaf->shared_cpu_map);
>  			cpumask_clear_cpu(sibling, &this_leaf->shared_cpu_map);
>  		}
> -		of_node_put(this_leaf->fw_token);
> +		if (of_have_populated_dt())
> +			of_node_put(this_leaf->fw_token);
>  	}
>  }
>  
> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index 0c6f658054d2..1446d3f053a2 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -97,6 +97,15 @@ int func(unsigned int cpu)					\
>  struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
>  int init_cache_level(unsigned int cpu);
>  int populate_cache_leaves(unsigned int cpu);
> +int cache_setup_acpi(unsigned int cpu);
> +int acpi_find_last_cache_level(unsigned int cpu);
> +#ifndef CONFIG_ACPI
> +int acpi_find_last_cache_level(unsigned int cpu)

The above 3 lines look weird; can't it be:

#ifdef CONFIG_ACPI
int acpi_find_last_cache_level(unsigned int cpu);
#else
static inline int acpi_find_last_cache_level(unsigned int cpu)
{
	/* ACPI kernels should be built with PPTT support */
	return 0;
}
#endif

Also, I think it should be CONFIG_ACPI_PPTT; otherwise it might cause
issues on platforms which define CONFIG_ACPI but not CONFIG_ACPI_PPTT.
I can only relate this to the s390 error reported by the kbuild robot.
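
i.e., the same stub keyed on the PPTT symbol instead (a sketch
restating the suggestion):

#ifdef CONFIG_ACPI_PPTT
int acpi_find_last_cache_level(unsigned int cpu);
#else
static inline int acpi_find_last_cache_level(unsigned int cpu)
{
	return 0;
}
#endif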

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-06 16:07         ` Morten Rasmussen
  (?)
@ 2018-03-06 22:22           ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-03-06 22:22 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, rjw, will.deacon, catalin.marinas, gregkh,
	mark.rutland, linux-kernel, linux-riscv, wangxiongfeng2, vkilari,
	ahs3, dietmar.eggemann, palmer, lenb, john.garry, austinwc,
	tnowicki

Hi,

On 03/06/2018 10:07 AM, Morten Rasmussen wrote:
> On Tue, Feb 27, 2018 at 02:18:47PM -0600, Jeremy Linton wrote:
>> Hi,
>>
>>
>> First, thanks for taking a look at this.
>>
>> On 03/01/2018 09:52 AM, Morten Rasmussen wrote:
>>> Hi Jeremy,
>>>
>>> On Wed, Feb 28, 2018 at 04:06:19PM -0600, Jeremy Linton wrote:
>>>> Now that we have an accurate view of the physical topology
>>>> we need to represent it correctly to the scheduler. In the
>>>> case of NUMA in socket, we need to assure that the sched domain
>>>> we build for the MC layer isn't larger than the DIE above it.
>>>
>>> MC shouldn't be larger than any of the NUMA domains either.
>>
>> Right, that is one of the things this patch is assuring..
>>
>>>
>>>> To do this correctly, we should really base that on the cache
> >>>> topology immediately below the NUMA node (for NUMA in socket)
> >>>> or below the physical package for normal NUMA configurations.
>>>
>>> That means we wouldn't support multi-die NUMA nodes?
>>
>> You mean a bottom level NUMA domain that crosses multiple sockets/dies? That
>> should work. This patch is picking the widest cache layer below the smallest
>> of the package or numa grouping. What actually happens depends on the
>> topology. Given a case where there are multiple dies in a socket, and the
>> numa domain is at the socket level the MC is going to reflect the caching
>> topology immediately below the socket. In the case of multiple dies, with a
>> cache that crosses them in socket, then the MC is basically going to be the
>> socket, otherwise if the widest cache is per die, or some narrower grouping
>> (cluster?) then that is what ends up in the MC. (this is easier with some
>> pictures)
> 
> That is more or less what I meant. I think I got confused with the role
> of "DIE" level, i.e. that top non-NUMA level, in this. The DIE level
> cpumask spans exactly the NUMA node, so IIUC we have three scenarios:
> 
> 1. Multi-die/socket/physical package NUMA node
>     Top non-NUMA level (DIE) spans multiple packages. Bottom NUMA level
>     spans multiple multi-package nodes. The MC mask reflects the last-level
>     cache within the NUMA node which is most likely per-die or per-cluster
>     (inside each die).
> 
> 2. physical package == NUMA node
>     The top non-NUMA (DIE) mask is the same as the core sibling mask.
>     If there is cache spanning the entire node, the scheduler topology
>     will eliminate a layer (DIE?), so bottom NUMA level would be right on
>     top of MC spanning multiple physical packages. If there is no
>     node-wide last level cache, DIE is preserved and MC matches the span
>     of the last level cache.
> 
> 3. numa-in-package
>     Top non-NUMA (DIE) mask is not reflecting the actual die, but is
>     reflecting the NUMA node. MC has a span equal to the largest shared
>     cache span smaller than or equal to the NUMA node. If it is
>     equal, DIE level is eliminated, otherwise DIE is preserved, but
>     doesn't really represent die. Bottom non-NUMA level spans multiple
>     in-package NUMA nodes.
> 
> As you said, multi-die nodes should work. However, I'm not sure if
> shrinking MC to match a cache could cause us trouble, or if it should
> just be shrunk to be the smaller of the node mask and core siblings.

Shrinking to the smaller of the numa or package is a fairly trivial
change, and I'm good with that change too. I discounted it because there
might be an advantage in case 2 if the internal hardware is actually a
case 3 (or just multiple rings/whatever, each with an L3). In those
cases the firmware vendor could play around with whichever
representation serves them best.

> Unless you have a node-wide last level cache DIE level won't be
> eliminated in scenario 2 and 3, and could cause trouble. For
> numa-in-package, you can end up with a DIE level inside the node where
> the default flags don't favour aggressive spreading of tasks. The same
> could be the case for per-package nodes (scenario 2).
> 
> Don't we end up redefining physical package to be last level cache
> instead of using the PPTT flag for scenario 2 and 3?

I'm not sure I understand: core_siblings isn't changing (it's still per
package), only the MC mapping, which normally is just core_siblings. For
all intents and purposes this patch is the same as v6, except for
numa-in-package, where the MC domain is being shrunk to the node
siblings. I'm just trying to set up the code for potential future cases
where the LLC isn't equal to the node or package.

> 
> I think DIE level should be eliminated for scenario 2 and 3 like it is
> for x86.

Ok, that is based on the assumption that MC will always be equal to 
either the package or node? If that assumption isn't true, then would 
you keep it, or maybe it doesn't matter?

> 
> [...]
> 
>>>> This patch creates a set of early cache_siblings masks, then
>>>> when the scheduler requests the coregroup mask we pick the
>>>> smaller of the physical package siblings, or the numa siblings
>>>> and locate the largest cache which is an entire subset of
>>>> those siblings. If we are unable to find a proper subset of
>>>> cores then we retain the original behavior and return the
>>>> core_sibling list.
>>>
>>> IIUC, for numa-in-package it is a strict requirement that there is a
>>> cache that spans the entire NUMA node? For example, having a NUMA node
>>> consisting of two clusters with per-cluster caches only wouldn't be
>>> supported?
>>
>> Everything is supported, the MC is reflecting the cache topology. We just
>> use the physical/numa topology to help us pick which layer of cache topology
>> lands in the MC. (unless of course we fail to find a PPTT/cache topology, in
>> which case we fall back to the old behavior of the core_siblings, which can
>> reflect the MPIDR/etc).
> 
> I see. For this example we would end up with a "DIE" level and two MC
> domains inside each node whether we have the PPTT table and cache
> topology or not. I'm just wondering if everyone would be happy with
> basing MC on last level cache instead of the smaller of physical package
> and NUMA node.

I can't judge that; my idea was simply to provide some flexibility to 
the firmware. I guess in theory someone who still wanted that split 
could push a numa domain down to whatever level they wanted to group.


> 
>>>> +{
>>>> +	/* first determine if we are a NUMA in package */
>>>> +	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
>>>> +	int indx;
>>>> +
>>>> +	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
>>>> +		/* not numa in package, lets use the package siblings */
>>>> +		node_mask = &cpu_topology[cpu].core_sibling;
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * node_mask should represent the smallest package/numa grouping
>>>> +	 * lets search for the largest cache smaller than the node_mask.
>>>> +	 */
>>>> +	for (indx = 0; indx < MAX_CACHE_CHECKS; indx++) {
>>>> +		cpumask_t *cache_sibs = &cpu_topology[cpu].cache_siblings[indx];
>>>> +
>>>> +		if (cpu_topology[cpu].cache_id[indx] < 0)
>>>> +			continue;
>>>> +
>>>> +		if (cpumask_subset(cache_sibs, node_mask))
>>>> +			cpu_topology[cpu].cache_level = indx;
>>>
>>> I don't think this guarantees that the cache level we found matches exactly
>>> the NUMA node. Taking the two cluster NUMA node example from above, we
>>> would set cache_level to point at the per-cluster cache as it is a
>>> subset of the NUMA node but it would only span half of the node. Or am I
>>> missing something?
>>
>> I think you got it. If the system is a traditional ARM system with shared
>> L2's at the cluster level and it doesn't have any L3's/etc and the NUMA node
>> crosses multiple clusters then you get the cluster L2 grouping in the MC.
>>
>> I think this is what we want. Particularly, since the newer/larger machines
>> do have L3+'s contained within their sockets or numa domains, so you end up
>> with that as the MC.
> 
> Okay, thanks for confirming.
> 
>>
>>
>>>
>>>> +	}
>>>> +}
>>>> +
>>>>   const struct cpumask *cpu_coregroup_mask(int cpu)
>>>>   {
>>>> +	int *llc = &cpu_topology[cpu].cache_level;
>>>> +
>>>> +	if (*llc == -1)
>>>> +		find_llc_topology_for_cpu(cpu);
>>>> +
>>>> +	if (*llc != -1)
>>>> +		return &cpu_topology[cpu].cache_siblings[*llc];
>>>> +
>>>>   	return &cpu_topology[cpu].core_sibling;
> 
> If we don't have any of the cache_sibling masks set up, i.e. we don't
> have the cache topology, we would keep looking for it every time
> cpu_coregroup_mask() is called. I'm not sure how extensively it is used,
> but it could have a performance impact?

It's only called when cores come online/offline (AFAIK).
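
If it ever became hot, a sentinel would avoid repeating the search on
systems with no cache topology. A sketch, assuming we add a
CACHE_LEVEL_NONE marker (not in the patch as posted):

#define CACHE_LEVEL_NONE	-2	/* searched, nothing found */

const struct cpumask *cpu_coregroup_mask(int cpu)
{
	int *llc = &cpu_topology[cpu].cache_level;

	if (*llc == -1) {
		find_llc_topology_for_cpu(cpu);
		if (*llc == -1)
			*llc = CACHE_LEVEL_NONE; /* don't search again */
	}

	if (*llc >= 0)
		return &cpu_topology[cpu].cache_siblings[*llc];

	return &cpu_topology[cpu].core_sibling;
}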

> 
> 
>>>>   }
>>>> @@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
>>>>   {
>>>>   	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
>>>>   	int cpu;
>>>> +	int idx;
>>>>   	/* update core and thread sibling masks */
>>>>   	for_each_possible_cpu(cpu) {
>>>> @@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
>>>>   		if (cpuid_topo->package_id != cpu_topo->package_id)
>>>>   			continue;
>>>> +		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
>>>> +			cpumask_t *lsib;
>>>> +			int cput_id = cpuid_topo->cache_id[idx];
>>>> +
>>>> +			if (cput_id == cpu_topo->cache_id[idx]) {
>>>> +				lsib = &cpuid_topo->cache_siblings[idx];
>>>> +				cpumask_set_cpu(cpu, lsib);
>>>> +			}
>>>
>>> Shouldn't the cache_id validity be checked here? I don't think it breaks
>>> anything though.
>>
>> It could be, but since it's explicitly looking for unified caches it's likely
>> that some of the levels are invalid. Invalid levels get ignored later on so
>> we don't really care if they are valid here.
>>
>>>
>>> Overall, I think this is more or less in line with the MC domain
>>> shrinking I just mentioned in the v6 discussion. It is mostly the corner
>>> cases and assumption about the system topology I'm not sure about.
>>
>> I think its the corner cases i'm taking care of. The simple fix in v6 is to
>> take the smaller of core_siblings or node_siblings, but that ignores cases
>> with split L3s (or the L2 only example above). The idea here is to assure
>> that MC is following a cache topology. In my mind, it is more a question of
>> how that is picked. The other way I see to do this, is with a PX domain flag
>> in the PPTT. We could then pick the core grouping one below that flag. Doing
>> it that way affords the firmware vendors a lever they can pull to optimize a
>> given machine for the linux scheduler behavior.
> 
> Okay. I think these assumptions/choices should be spelled out somewhere,
> either as comments or in the commit message. As said above, I'm not sure
> if the simple approach is better or not.
> 
> Using the cache span to define the MC level with a numa-in-cluster
> switch like some Intel platforms seem to have, you could have two cores
> being MC siblings with numa-in-package disabled and not being siblings
> with numa-in-package enabled, unless you reconfigure the span of the
> caches too and remember to update the ACPI cache topology.
> 
> Regarding firmware levers, we don't want vendors to optimize for Linux
> scheduler behaviour, but a mechanism to detect how closely related cores
> are could make selecting the right mask for MC level easier. As I see
> it, we basically have to choose between MC being cache boundary based or
> physical package based. This patch implements the former, the simple
> solution (core_siblings mask or node_siblings mask) implements the
> latter.

Basically, right now (AFAIK) the result is the same because the few 
machines I have access to have cache layers immediately below those 
boundaries, which are the same size as the package/die.

I'm ok with tossing this patch in favor of something like:

const struct cpumask *cpu_coregroup_mask(int cpu)
{
	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));

	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
		/* not numa in package, let's use the package siblings */
		return &cpu_topology[cpu].core_sibling;
	}

	return node_mask;
}
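
For a hypothetical numa-in-package part (one package containing two
nodes of eight cpus each), that returns the node mask, so MC would span
cpus 0-7 and 8-15 respectively, the DIE level would degenerate away
(its span would equal MC), and the NUMA level would sit on top.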


Mostly, because I want to merge the PPTT parts, and I only added this to 
fix the broken NUMA-in-package behavior....

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-06 22:22           ` Jeremy Linton
  (?)
@ 2018-03-07 13:06             ` Morten Rasmussen
  -1 siblings, 0 replies; 136+ messages in thread
From: Morten Rasmussen @ 2018-03-07 13:06 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: mark.rutland, vkilari, lorenzo.pieralisi, catalin.marinas,
	tnowicki, gregkh, will.deacon, dietmar.eggemann, rjw,
	linux-kernel, ahs3, linux-acpi, palmer, hanjun.guo, sudeep.holla,
	austinwc, linux-riscv, john.garry, wangxiongfeng2,
	linux-arm-kernel, lenb

On Tue, Mar 06, 2018 at 04:22:18PM -0600, Jeremy Linton wrote:
> >>>>To do this correctly, we should really base that on the cache
> >>>>topology immediately below the NUMA node (for NUMA in socket)
> >>>>or below the physical package for normal NUMA configurations.
> >>>
> >>>That means we wouldn't support multi-die NUMA nodes?
> >>
> >>You mean a bottom level NUMA domain that crosses multiple sockets/dies? That
> >>should work. This patch is picking the widest cache layer below the smallest
> >>of the package or numa grouping. What actually happens depends on the
> >>topology. Given a case where there are multiple dies in a socket, and the
> >>numa domain is at the socket level the MC is going to reflect the caching
> >>topology immediately below the socket. In the case of multiple dies, with a
> >>cache that crosses them in socket, then the MC is basically going to be the
> >>socket, otherwise if the widest cache is per die, or some narrower grouping
> >>(cluster?) then that is what ends up in the MC. (this is easier with some
> >>pictures)
> >
> >That is more or less what I meant. I think I got confused with the role
> >of "DIE" level, i.e. that top non-NUMA level, in this. The DIE level
> >cpumask spans exactly the NUMA node, so IIUC we have three scenarios:
> >
> >1. Multi-die/socket/physical package NUMA node
> >    Top non-NUMA level (DIE) spans multiple packages. Bottom NUMA level
> >    spans multiple multi-package nodes. The MC mask reflects the last-level
> >    cache within the NUMA node which is most likely per-die or per-cluster
> >    (inside each die).
> >
> >2. physical package == NUMA node
> >    The top non-NUMA (DIE) mask is the same as the core sibling mask.
> >    If there is cache spanning the entire node, the scheduler topology
> >    will eliminate a layer (DIE?), so bottom NUMA level would be right on
> >    top of MC spanning multiple physical packages. If there is no
> >    node-wide last level cache, DIE is preserved and MC matches the span
> >    of the last level cache.
> >
> >3. numa-in-package
> >    Top non-NUMA (DIE) mask is not reflecting the actual die, but is
> >    reflecting the NUMA node. MC has a span equal to the largest shared
> >    cache span smaller than or equal to the NUMA node. If it is
> >    equal, DIE level is eliminated, otherwise DIE is preserved, but
> >    doesn't really represent die. Bottom non-NUMA level spans multiple
> >    in-package NUMA nodes.
> >
> >As you said, multi-die nodes should work. However, I'm not sure if
> >shrinking MC to match a cache could cause us trouble, or if it should
> >just be shrunk to be the smaller of the node mask and core siblings.
> 
> Shrinking to the smaller of the numa or package is a fairly trivial change,
> and I'm good with that change too. I discounted it because there might be an
> advantage in case 2 if the internal hardware is actually a case 3 (or just
> multiple rings/whatever, each with an L3). In those cases the firmware vendor
> could play around with whichever representation serves them best.

Agreed. Distributed last level caches and interconnect speeds make it
virtually impossible to define MC in a way that works well for everyone
based on the topology information we have at hand.

> 
> >Unless you have a node-wide last level cache DIE level won't be
> >eliminated in scenario 2 and 3, and could cause trouble. For
> >numa-in-package, you can end up with a DIE level inside the node where
> >the default flags don't favour aggressive spreading of tasks. The same
> >could be the case for per-package nodes (scenario 2).
> >
> >Don't we end up redefining physical package to be last level cache
> >instead of using the PPTT flag for scenario 2 and 3?
> 
> I'm not sure I understand: core_siblings isn't changing (it's still per
> package), only the MC mapping, which normally is just core_siblings. For all
> intents and purposes this patch is the same as v6, except for
> numa-in-package, where the MC domain is being shrunk to the node siblings.
> I'm just trying to set up the code for potential future cases where the LLC
> isn't equal to the node or package.

Right, core_siblings remains the same. The scheduler topology just looks
a bit odd as we can have core_siblings spanning the full true physical
package and have DIE level as a subset of that with an MC level where
the MC siblings are a much smaller subset of cpus than core_siblings.

IOW, it would lead to having one topology used by the scheduler and
another used by the users of topology_core_cpumask() (which is not
many I think).
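
(For reference, a typical consumer just iterates the mask; a sketch of
hypothetical driver code, not an in-tree user:

	/* spread work across the cpus sharing cpu's physical package */
	for_each_cpu(i, topology_core_cpumask(cpu))
		queue_work_on(i, wq, &work[i]);

so those users would keep seeing the full package while the scheduler
sees something smaller.)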

Is there a good reason for diverging instead of adjusting the
core_sibling mask? On x86 the core_siblings mask is defined by the last
level cache span so they don't have this issue. 

> >I think DIE level should be eliminated for scenario 2 and 3 like it is
> >for x86.
> 
> Ok, that is based on the assumption that MC will always be equal to either
> the package or node? If that assumption isn't true, then would you keep it,
> or maybe it doesn't matter?

Yes. IIUC, MC is always equal to package or node on x86. They don't have
DIE in their numa-in-package topology as MC is equal to the node.

> >>>>+	}
> >>>>+}
> >>>>+
> >>>>  const struct cpumask *cpu_coregroup_mask(int cpu)
> >>>>  {
> >>>>+	int *llc = &cpu_topology[cpu].cache_level;
> >>>>+
> >>>>+	if (*llc == -1)
> >>>>+		find_llc_topology_for_cpu(cpu);
> >>>>+
> >>>>+	if (*llc != -1)
> >>>>+		return &cpu_topology[cpu].cache_siblings[*llc];
> >>>>+
> >>>>  	return &cpu_topology[cpu].core_sibling;
> >
> >If we don't have any of the cache_sibling masks set up, i.e. we don't
> >have the cache topology, we would keep looking for it every time
> >cpu_coregroup_mask() is called. I'm not sure how extensively it is used,
> >but it could have a performance impact?
> 
> It's only called when cores come online/offline (AFAIK).

Yes, it seems to only be used for sched_domain building. That can happen
as part of creating/modifying cpusets as well, but I guess the overhead
is less critical for all these cases.

> 
> >
> >
> >>>>  }
> >>>>@@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
> >>>>  {
> >>>>  	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> >>>>  	int cpu;
> >>>>+	int idx;
> >>>>  	/* update core and thread sibling masks */
> >>>>  	for_each_possible_cpu(cpu) {
> >>>>@@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
> >>>>  		if (cpuid_topo->package_id != cpu_topo->package_id)
> >>>>  			continue;
> >>>>+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
> >>>>+			cpumask_t *lsib;
> >>>>+			int cput_id = cpuid_topo->cache_id[idx];
> >>>>+
> >>>>+			if (cput_id == cpu_topo->cache_id[idx]) {
> >>>>+				lsib = &cpuid_topo->cache_siblings[idx];
> >>>>+				cpumask_set_cpu(cpu, lsib);
> >>>>+			}
> >>>
> >>>Shouldn't the cache_id validity be checked here? I don't think it breaks
> >>>anything though.
> >>
> >>It could be, but since it's explicitly looking for unified caches it's likely
> >>that some of the levels are invalid. Invalid levels get ignored later on so
> >>we don't really care if they are valid here.
> >>
> >>>
> >>>Overall, I think this is more or less in line with the MC domain
> >>>shrinking I just mentioned in the v6 discussion. It is mostly the corner
> >>>cases and assumption about the system topology I'm not sure about.
> >>
> >>I think its the corner cases i'm taking care of. The simple fix in v6 is to
> >>take the smaller of core_siblings or node_siblings, but that ignores cases
> >>with split L3s (or the L2 only example above). The idea here is to assure
> >>that MC is following a cache topology. In my mind, it is more a question of
> >>how that is picked. The other way I see to do this, is with a PX domain flag
> >>in the PPTT. We could then pick the core grouping one below that flag. Doing
> >>it that way affords the firmware vendors a lever they can pull to optimize a
> >>given machine for the linux scheduler behavior.
> >
> >Okay. I think these assumptions/choices should be spelled out somewhere,
> >either as comments or in the commit message. As said above, I'm not sure
> >if the simple approach is better or not.
> >
> >Using the cache span to define the MC level with a numa-in-cluster
> >switch like some Intel platforms seem to have, you could have two cores
> >being MC siblings with numa-in-package disabled and not being siblings
> >with numa-in-package enabled, unless you reconfigure the span of the
> >caches too and remember to update the ACPI cache topology.
> >
> >Regarding firmware levers, we don't want vendors to optimize for Linux
> >scheduler behaviour, but a mechanism to detect how closely related cores
> >are could make selecting the right mask for MC level easier. As I see
> >it, we basically have to choose between MC being cache boundary based or
> >physical package based. This patch implements the former, the simple
> >solution (core_siblings mask or node_siblings mask) implements the
> >latter.
> 
> Basically, right now (AFAIK) the result is the same because the few machines
> I have access to have cache layers immediately below those boundaries,
> which are the same size as the package/die.

Agreed, I'm more worried about what vendors will built in the future.

> I'm ok with tossing this patch in favor of something like:
> 
> const struct cpumask *cpu_coregroup_mask(int cpu)
> {
> 	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
> 
> 	if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
> 		/* not numa in package, let's use the package siblings */
> 		return &cpu_topology[cpu].core_sibling;
> 	}
> 
> 	return node_mask;
> }

I would prefer this simpler solution as it should eliminate DIE level
for all numa-in-package configurations. Although, I think we should consider
just shrinking the core_sibling mask instead of having a different MC
mask (cpu_coregroup_mask). Do you see any problems in doing that?
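
Roughly something like this in update_siblings_masks(), i.e. a sketch
of the alternative rather than a tested patch, and it would also change
what /sys/devices/system/cpu/*/topology/core_siblings reports:

	const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpuid));

	/* numa-in-package: clamp core_sibling to the node span */
	if (cpumask_subset(node_mask, &cpuid_topo->core_sibling))
		cpumask_copy(&cpuid_topo->core_sibling, node_mask);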

> Mostly, because I want to merge the PPTT parts, and I only added this to
> fix the broken NUMA-in-package behavior....

Understood. Whatever choice we make now to fix it will be with us
potentially forever. So unless we have a really good reason to do things
differently, I would prefer to follow what other architectures do. I
think the simple solution is closest to what x86 does.

Morten

^ permalink raw reply	[flat|nested] 136+ messages in thread

* [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
@ 2018-03-07 13:06             ` Morten Rasmussen
  0 siblings, 0 replies; 136+ messages in thread
From: Morten Rasmussen @ 2018-03-07 13:06 UTC (permalink / raw)
  To: linux-riscv

On Tue, Mar 06, 2018 at 04:22:18PM -0600, Jeremy Linton wrote:
> >>>>To do this correctly, we should really base that on the cache
> >>>>topology immediately below the NUMA node (for NUMA in socket) >> or below the physical package for normal NUMA configurations.
> >>>
> >>>That means we wouldn't support multi-die NUMA nodes?
> >>
> >>You mean a bottom level NUMA domain that crosses multiple sockets/dies? That
> >>should work. This patch is picking the widest cache layer below the smallest
> >>of the package or numa grouping. What actually happens depends on the
> >>topology. Given a case where there are multiple dies in a socket, and the
> >>numa domain is at the socket level the MC is going to reflect the caching
> >>topology immediately below the socket. In the case of multiple dies, with a
> >>cache that crosses them in socket, then the MC is basically going to be the
> >>socket, otherwise if the widest cache is per die, or some narrower grouping
> >>(cluster?) then that is what ends up in the MC. (this is easier with some
> >>pictures)
> >
> >That is more or less what I meant. I think I got confused with the role
> >of "DIE" level, i.e. that top non-NUMA level, in this. The DIE level
> >cpumask spans exactly the NUMA node, so IIUC we have three scenarios:
> >
> >1. Multi-die/socket/physical package NUMA node
> >    Top non-NUMA level (DIE) spans multiple packages. Bottom NUMA level
> >    spans multiple multi-package nodes. The MC mask reflects the last-level
> >    cache within the NUMA node which is most likely per-die or per-cluster
> >    (inside each die).
> >
> >2. physical package == NUMA node
> >    The top non-NUMA (DIE) mask is the same as the core sibling mask.
> >    If there is cache spanning the entire node, the scheduler topology
> >    will eliminate a layer (DIE?), so bottom NUMA level would be right on
> >    top of MC spanning multiple physical packages. If there is no
> >    node-wide last level cache, DIE is preserved and MC matches the span
> >    of the last level cache.
> >
> >3. numa-in-package
> >    Top non-NUMA (DIE) mask is not reflecting the actual die, but is
> >    reflecting the NUMA node. MC has a span equal to the largest share
> >    cache span smaller than or equal to the the NUMA node. If it is
> >    equal, DIE level is eliminated, otherwise DIE is preserved, but
> >    doesn't really represent die. Bottom non-NUMA level spans multiple
> >    in-package NUMA nodes.
> >
> >As you said, multi-die nodes should work. However, I'm not sure if
> >shrinking MC to match a cache could cause us trouble, or if it should
> >just be shrunk to be the smaller of the node mask and core siblings.
> 
> Shrinking to the smaller of the numa or package is fairly trivial change,
> I'm good with that change too.. I discounted it because there might be an
> advantage in case 2 if the internal hardware is actually a case 3 (or just
> multiple rings/whatever each with a L3). In those cases the firmware vendor
> could play around with whatever representation serves them the best.

Agreed. Distributed last level caches and interconnect speeds makes it
virtually impossible to define MC in a way that works well for everyone
based on the topology information we have at hand.

> 
> >Unless you have a node-wide last level cache DIE level won't be
> >eliminated in scenario 2 and 3, and could cause trouble. For
> >numa-in-package, you can end up with a DIE level inside the node where
> >the default flags don't favour aggressive spreading of tasks. The same
> >could be the case for per-package nodes (scenario 2).
> >
> >Don't we end up redefining physical package to be last level cache
> >instead of using the PPTT flag for scenario 2 and 3?
> 
> I'm not sure I understand, core_siblings isn't changing (its still per
> package). Only the MC mapping which normally is just core_siblings. For all
> intents right now this patch is the same as v6, except for the
> numa-in-package where the MC domain is being shrunk to the node siblings.
> I'm just trying to setup the code for potential future cases where the LLC
> isn't equal to the node or package.

Right, core_siblings remains the same. The scheduler topology just looks
a bit odd as we can have core_siblings spanning the full true physical
package and have DIE level as a subset of that with an MC level where
the MC siblings is a much smaller subset of cpus than core_siblings.

IOW, it would lead to having one topology used by the scheduler and
another used by the users of topology_core_cpumask() (which is not
many I think).

Is there a good reason for diverging instead of adjusting the
core_sibling mask? On x86 the core_siblings mask is defined by the last
level cache span so they don't have this issue. 

> >I think DIE level should be eliminated for scenario 2 and 3 like it is
> >for x86.
> 
> Ok, that is based on the assumption that MC will always be equal to either
> the package or node? If that assumption isn't true, then would you keep it,
> or maybe it doesn't matter?

Yes. IIUC, MC is always equal to package or node on x86. They don't have
DIE in their numa-in-package topology as MC is equal to the node.

> >>>>+	}
> >>>>+}
> >>>>+
> >>>>  const struct cpumask *cpu_coregroup_mask(int cpu)
> >>>>  {
> >>>>+	int *llc = &cpu_topology[cpu].cache_level;
> >>>>+
> >>>>+	if (*llc == -1)
> >>>>+		find_llc_topology_for_cpu(cpu);
> >>>>+
> >>>>+	if (*llc != -1)
> >>>>+		return &cpu_topology[cpu].cache_siblings[*llc];
> >>>>+
> >>>>  	return &cpu_topology[cpu].core_sibling;
> >
> >If we don't have any of the cache_sibling masks set up, i.e. we don't
> >have the cache topology, we would keep looking for it every time
> >cpu_coregroup_mask() is called. I'm not sure how extensively it is used,
> >but it could have a performance impact?
> 
> Its only called when cores come online/offline (AFAIK).

Yes, it seems to only be used for sched_domain building. That can happen
as part of creating/modifying cpusets as well, but I guess the overhead
is less critical for all these case.

> 
> >
> >
> >>>>  }
> >>>>@@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
> >>>>  {
> >>>>  	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
> >>>>  	int cpu;
> >>>>+	int idx;
> >>>>  	/* update core and thread sibling masks */
> >>>>  	for_each_possible_cpu(cpu) {
> >>>>@@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
> >>>>  		if (cpuid_topo->package_id != cpu_topo->package_id)
> >>>>  			continue;
> >>>>+		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
> >>>>+			cpumask_t *lsib;
> >>>>+			int cput_id = cpuid_topo->cache_id[idx];
> >>>>+
> >>>>+			if (cput_id == cpu_topo->cache_id[idx]) {
> >>>>+				lsib = &cpuid_topo->cache_siblings[idx];
> >>>>+				cpumask_set_cpu(cpu, lsib);
> >>>>+			}
> >>>
> >>>Shouldn't the cache_id validity be checked here? I don't think it breaks
> >>>anything though.
> >>
> >>It could be, but since its explicitly looking for unified caches its likely
> >>that some of the levels are invalid. Invalid levels get ignored later on so
> >>we don't really care if they are valid here.
> >>
> >>>
> >>>Overall, I think this is more or less in line with the MC domain
> >>>shrinking I just mentioned in the v6 discussion. It is mostly the corner
> >>>cases and assumption about the system topology I'm not sure about.
> >>
> >>I think its the corner cases i'm taking care of. The simple fix in v6 is to
> >>take the smaller of core_siblings or node_siblings, but that ignores cases
> >>with split L3s (or the L2 only example above). The idea here is to assure
> >>that MC is following a cache topology. In my mind, it is more a question of
> >>how that is picked. The other way I see to do this, is with a PX domain flag
> >>in the PPTT. We could then pick the core grouping one below that flag. Doing
> >>it that way affords the firmware vendors a lever they can pull to optimize a
> >>given machine for the linux scheduler behavior.
> >
> >Okay. I think these assumptions/choices should be spelled out somewhere,
> >either as comments or in the commit message. As said above, I'm not sure
> >if the simple approach is better or not.
> >
> >Using the cache span to define the MC level with a numa-in-cluster
> >switch like some Intel platform seems to have, you could two core being
> >MC siblings with numa-in-package disabled and them not being siblings
> >with numa-in-package enabled unless you reconfigure the span of the
> >caches too and remember to update the ACPI cache topology.
> >
> >Regarding firmware levers, we don't want vendors to optimize for Linux
> >scheduler behaviour, but a mechanism to detect how closely related cores
> >are could make selecting the right mask for MC level easier. As I see
> >it, we basically have to choose between MC being cache boundary based or
> >physical package based. This patch implements the former, the simple
> >solution (core_siblings mask or node_siblings mask) implements the
> >latter.
> 
> Basically, right now (AFAIK) the result is the same because the few machines
> I have access to have cache layers immediately below those boundaries which
> are the same size as the package/die.

Agreed, I'm more worried about what vendors will built in the future.

> I'm ok with tossing this patch in favor of something like:
> 
> const struct cpumask *cpu_coregroup_mask(int cpu)
> {
>    const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
>    if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) 	
>    {
>       /* not numa in package, lets use the package siblings */
>       return &cpu_topology[cpu].core_sibling;
>    }
>    return node_mask;
> }

I would prefer this simpler solution as it should eliminate DIE level
for all numa-in-package configurations. Although, I think we should consider
just shrinking the core_sibling mask instead of having a difference MC
mask (cpu_coregroup_mask). Do you see any problems in doing that?

> Mostly, because I want to merge the PPTT parts, and I only added this to
> clear the NUMA in package borken....

Understood. Whatever choice we make now to fix it will be with us
potentially forever. So unless we have really good reason to do things
differently, I would prefer to follow what other architectures do. I
think the simple solution is closest to what x86 does.

Morten

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-07 13:06             ` Morten Rasmussen
  (?)
@ 2018-03-07 16:19               ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-03-07 16:19 UTC (permalink / raw)
  To: Morten Rasmussen
  Cc: mark.rutland, vkilari, lorenzo.pieralisi, catalin.marinas,
	tnowicki, gregkh, will.deacon, dietmar.eggemann, rjw,
	linux-kernel, ahs3, linux-acpi, palmer, hanjun.guo, sudeep.holla,
	austinwc, linux-riscv, john.garry, wangxiongfeng2,
	linux-arm-kernel, lenb

Hi,

On 03/07/2018 07:06 AM, Morten Rasmussen wrote:
> On Tue, Mar 06, 2018 at 04:22:18PM -0600, Jeremy Linton wrote:
>>>>>> To do this correctly, we should really base that on the cache
>>>>>> topology immediately below the NUMA node (for NUMA in socket) or
>>>>>> below the physical package for normal NUMA configurations.
>>>>>
>>>>> That means we wouldn't support multi-die NUMA nodes?
>>>>
>>>> You mean a bottom level NUMA domain that crosses multiple sockets/dies? That
>>>> should work. This patch is picking the widest cache layer below the smallest
>>>> of the package or numa grouping. What actually happens depends on the
>>>> topology. Given a case where there are multiple dies in a socket, and the
>>>> numa domain is at the socket level the MC is going to reflect the caching
>>>> topology immediately below the socket. In the case of multiple dies, with a
>>>> cache that crosses them in socket, then the MC is basically going to be the
>>>> socket, otherwise if the widest cache is per die, or some narrower grouping
>>>> (cluster?) then that is what ends up in the MC. (this is easier with some
>>>> pictures)
>>>
>>> That is more or less what I meant. I think I got confused with the role
>>> of "DIE" level, i.e. that top non-NUMA level, in this. The DIE level
>>> cpumask spans exactly the NUMA node, so IIUC we have three scenarios:
>>>
>>> 1. Multi-die/socket/physical package NUMA node
>>>     Top non-NUMA level (DIE) spans multiple packages. Bottom NUMA level
>>>     spans multiple multi-package nodes. The MC mask reflects the last-level
>>>     cache within the NUMA node which is most likely per-die or per-cluster
>>>     (inside each die).
>>>
>>> 2. physical package == NUMA node
>>>     The top non-NUMA (DIE) mask is the same as the core sibling mask.
>>>     If there is cache spanning the entire node, the scheduler topology
>>>     will eliminate a layer (DIE?), so bottom NUMA level would be right on
>>>     top of MC spanning multiple physical packages. If there is no
>>>     node-wide last level cache, DIE is preserved and MC matches the span
>>>     of the last level cache.
>>>
>>> 3. numa-in-package
>>>     Top non-NUMA (DIE) mask is not reflecting the actual die, but is
>>>     reflecting the NUMA node. MC has a span equal to the largest shared
>>>     cache span smaller than or equal to the NUMA node. If it is
>>>     equal, DIE level is eliminated, otherwise DIE is preserved, but
>>>     doesn't really represent die. Bottom non-NUMA level spans multiple
>>>     in-package NUMA nodes.
>>>
>>> As you said, multi-die nodes should work. However, I'm not sure if
>>> shrinking MC to match a cache could cause us trouble, or if it should
>>> just be shrunk to be the smaller of the node mask and core siblings.
>>
>> Shrinking to the smaller of the numa or package is a fairly trivial change;
>> I'm good with that change too. I discounted it because there might be an
>> advantage in case 2 if the internal hardware is actually a case 3 (or just
>> multiple rings/whatever, each with an L3). In those cases the firmware vendor
>> could play around with whatever representation serves them the best.
> 
> Agreed. Distributed last level caches and interconnect speeds make it
> virtually impossible to define MC in a way that works well for everyone
> based on the topology information we have at hand.
> 
>>
>>> Unless you have a node-wide last level cache DIE level won't be
>>> eliminated in scenario 2 and 3, and could cause trouble. For
>>> numa-in-package, you can end up with a DIE level inside the node where
>>> the default flags don't favour aggressive spreading of tasks. The same
>>> could be the case for per-package nodes (scenario 2).
>>>
>>> Don't we end up redefining physical package to be last level cache
>>> instead of using the PPTT flag for scenario 2 and 3?
>>
>> I'm not sure I understand; core_siblings isn't changing (it's still per
>> package), only the MC mapping, which normally is just core_siblings. For all
>> intents right now this patch is the same as v6, except for the
>> numa-in-package case, where the MC domain is being shrunk to the node siblings.
>> I'm just trying to set up the code for potential future cases where the LLC
>> isn't equal to the node or package.
> 
> Right, core_siblings remains the same. The scheduler topology just looks
> a bit odd as we can have core_siblings spanning the full true physical
> package and have DIE level as a subset of that with an MC level where
> the MC siblings is a much smaller subset of cpus than core_siblings.
> 
> IOW, it would lead to having one topology used by the scheduler and
> another used by the users of topology_core_cpumask() (which is not
> many I think).
> 
> Is there a good reason for diverging instead of adjusting the
> core_sibling mask? On x86 the core_siblings mask is defined by the last
> level cache span so they don't have this issue.

I'm overwhelmingly convinced we are doing the right thing WRT the core
siblings at the moment. It's exported to user space, and the general
understanding is that it's a socket. Even with NUMA in package/on die, if
you run lscpu, lstopo, etc., they all understand the system topology
correctly doing it this way (AFAIK).
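(For reference, core_siblings is consumed through sysfs; a minimal
userspace reader, assuming the standard topology path exposed by
mainline kernels:)

#include <stdio.h>

int main(void)
{
	char buf[256];
	FILE *f = fopen("/sys/devices/system/cpu/cpu0/topology/core_siblings_list", "r");

	if (!f)
		return 1;	/* no sysfs topology on this system */
	if (fgets(buf, sizeof(buf), f))
		printf("cpu0 package siblings: %s", buf);
	fclose(f);
	return 0;
}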
> 
> Yes. IIUC, MC is always equal to package or node on x86. They don't have
> DIE in their numa-in-package topology as MC is equal to the node.
> 
>>>>>> +	}
>>>>>> +}
>>>>>> +
>>>>>>   const struct cpumask *cpu_coregroup_mask(int cpu)
>>>>>>   {
>>>>>> +	int *llc = &cpu_topology[cpu].cache_level;
>>>>>> +
>>>>>> +	if (*llc == -1)
>>>>>> +		find_llc_topology_for_cpu(cpu);
>>>>>> +
>>>>>> +	if (*llc != -1)
>>>>>> +		return &cpu_topology[cpu].cache_siblings[*llc];
>>>>>> +
>>>>>>   	return &cpu_topology[cpu].core_sibling;
>>>
>>> If we don't have any of the cache_sibling masks set up, i.e. we don't
>>> have the cache topology, we would keep looking for it every time
>>> cpu_coregroup_mask() is called. I'm not sure how extensively it is used,
>>> but it could have a performance impact?
>>
>> Its only called when cores come online/offline (AFAIK).
> 
> Yes, it seems to only be used for sched_domain building. That can happen
> as part of creating/modifying cpusets as well, but I guess the overhead
> is less critical for all these cases.
> 
>>
>>>
>>>
>>>>>>   }
>>>>>> @@ -221,6 +255,7 @@ static void update_siblings_masks(unsigned int cpuid)
>>>>>>   {
>>>>>>   	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
>>>>>>   	int cpu;
>>>>>> +	int idx;
>>>>>>   	/* update core and thread sibling masks */
>>>>>>   	for_each_possible_cpu(cpu) {
>>>>>> @@ -229,6 +264,16 @@ static void update_siblings_masks(unsigned int cpuid)
>>>>>>   		if (cpuid_topo->package_id != cpu_topo->package_id)
>>>>>>   			continue;
>>>>>> +		for (idx = 0; idx < MAX_CACHE_CHECKS; idx++) {
>>>>>> +			cpumask_t *lsib;
>>>>>> +			int cput_id = cpuid_topo->cache_id[idx];
>>>>>> +
>>>>>> +			if (cput_id == cpu_topo->cache_id[idx]) {
>>>>>> +				lsib = &cpuid_topo->cache_siblings[idx];
>>>>>> +				cpumask_set_cpu(cpu, lsib);
>>>>>> +			}
>>>>>
>>>>> Shouldn't the cache_id validity be checked here? I don't think it breaks
>>>>> anything though.
>>>>
>>>> It could be, but since it's explicitly looking for unified caches it's likely
>>>> that some of the levels are invalid. Invalid levels get ignored later on so
>>>> we don't really care if they are valid here.
>>>>
>>>>>
>>>>> Overall, I think this is more or less in line with the MC domain
>>>>> shrinking I just mentioned in the v6 discussion. It is mostly the corner
>>>>> cases and assumption about the system topology I'm not sure about.
>>>>
>>>> I think it's the corner cases I'm taking care of. The simple fix in v6 is to
>>>> take the smaller of core_siblings or node_siblings, but that ignores cases
>>>> with split L3s (or the L2 only example above). The idea here is to ensure
>>>> that MC is following a cache topology. In my mind, it is more a question of
>>>> how that is picked. The other way I see to do this is with a PX domain flag
>>>> in the PPTT. We could then pick the core grouping one below that flag. Doing
>>>> it that way affords the firmware vendors a lever they can pull to optimize a
>>>> given machine for the Linux scheduler behavior.
>>>
>>> Okay. I think these assumptions/choices should be spelled out somewhere,
>>> either as comments or in the commit message. As said above, I'm not sure
>>> if the simple approach is better or not.
>>>
>>> Using the cache span to define the MC level with a numa-in-cluster
>>> switch like some Intel platforms seem to have, you could have two cores
>>> being MC siblings with numa-in-package disabled and them not being
>>> siblings with numa-in-package enabled, unless you reconfigure the span
>>> of the caches too and remember to update the ACPI cache topology.
>>>
>>> Regarding firmware levers, we don't want vendors to optimize for Linux
>>> scheduler behaviour, but a mechanism to detect how closely related cores
>>> are could make selecting the right mask for MC level easier. As I see
>>> it, we basically have to choose between MC being cache boundary based or
>>> physical package based. This patch implements the former, the simple
>>> solution (core_siblings mask or node_siblings mask) implements the
>>> latter.
>>
>> Basically, right now (AFAIK) the result is the same because the few machines
>> I have access to have cache layers immediately below those boundaries which
>> are the same size as the package/die.
> 
> Agreed, I'm more worried about what vendors will build in the future.

Yes, and that is why I posted this rather than the code below, because I 
was leaving vendors a way to compensate for less regular machines.

> 
>> I'm ok with tossing this patch in favor of something like:
>>
>> const struct cpumask *cpu_coregroup_mask(int cpu)
>> {
>>     const cpumask_t *node_mask = cpumask_of_node(cpu_to_node(cpu));
>>
>>     if (!cpumask_subset(node_mask, &cpu_topology[cpu].core_sibling)) {
>>        /* not NUMA in package, let's use the package siblings */
>>        return &cpu_topology[cpu].core_sibling;
>>     }
>>     return node_mask;
>> }
> 
> I would prefer this simpler solution as it should eliminate DIE level
> for all numa-in-package configurations. Although, I think we should consider
> just shrinking the core_sibling mask instead of having a different MC
> mask (cpu_coregroup_mask). Do you see any problems in doing that?

My strongest opinion is leaning toward core_siblings being correct as it
stands. How the scheduler deals with that is less clear. I will toss the
above as a separate patch, and we can forget this one. I see dropping
DIE as a larger patch set defining an arch-specific scheduler topology
and tweaking the individual scheduler levels/flags/tuning. OTOH, unless
there is something particularly creative there, I don't see how to avoid
NUMA domains pushing deeper into the cache/system topology, which means
filling the MC layer (and possibly others) similarly to the above snippet.
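(For context, an arch-specific scheduler topology would mean giving arm64
its own table; a rough sketch patterned on the kernel's default_topology[]
in kernel/sched/topology.c -- the arm64_topology name and the init helper
here are illustrative, not from the posted series:)

static struct sched_domain_topology_level arm64_topology[] = {
#ifdef CONFIG_SCHED_MC
	/* MC spans whatever cpu_coregroup_mask() returns, as debated above */
	{ cpu_coregroup_mask, cpu_core_flags, SD_INIT_NAME(MC) },
#endif
	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
	{ NULL, },
};

static void __init arm64_sched_topology_init(void)
{
	/* Replace the default table; NUMA levels still stack on top of it */
	set_sched_topology(arm64_topology);
}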


> 
>> Mostly, because I want to merge the PPTT parts, and I only added this to
>> clear up the NUMA-in-package breakage....
> 
> Understood. Whatever choice we make now to fix it will be with us
> potentially forever. So unless we have really good reason to do things
> differently, I would prefer to follow what other architectures do. I
> think the simple solution is closest to what x86 does.

Sure, that sounds familiar... ;)

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 00/13] Support PPTT for ARM64
  2018-02-27 18:49     ` Jeremy Linton
  (?)
@ 2018-03-08 15:59       ` Ard Biesheuvel
  -1 siblings, 0 replies; 136+ messages in thread
From: Ard Biesheuvel @ 2018-03-08 15:59 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Sudeep Holla, linux-acpi, Mark Rutland, vkilari,
	Lorenzo Pieralisi, austinwc, tnowicki, Greg Kroah-Hartman,
	Rafael J. Wysocki, dietmar.eggemann, Will Deacon,
	Linux Kernel Mailing List, morten.rasmussen, Al Stone, palmer,
	Hanjun Guo, Catalin Marinas, linux-riscv, John Garry,
	wangxiongfeng2, linux-arm-kernel, Len Brown

On 27 February 2018 at 18:49, Jeremy Linton <jeremy.linton@arm.com> wrote:
> On 03/01/2018 06:06 AM, Sudeep Holla wrote:
>>
>> Hi Jeremy,
>>
>> On 28/02/18 22:06, Jeremy Linton wrote:
>>>
>>> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
>>> used to describe the processor and cache topology. Ideally it is
>>> used to extend/override information provided by the hardware, but
>>> right now ARM64 is entirely dependent on firmware provided tables.
>>>
>>> This patch parses the table for the cache topology and CPU topology.
>>> When we enable ACPI/PPTT for arm64 we map the physical_id to the
>>> PPTT node flagged as the physical package by the firmware.
>>> This results in topologies that match what the remainder of the
>>> system expects. To avoid inverted scheduler domains we then
>>> set the MC domain equal to the largest cache within the socket
>>> below the NUMA domain.
>>>
>> I remember reviewing and acknowledging most of the cacheinfo stuff with
>> couple of minor suggestions for v6. I don't see any Acked-by tags in
>> this series and don't know if I need to review/ack any more cacheinfo
>> related patches.
>
>
> Hi,
>
> Yes, I didn't put them in because I changed the functionality in 2/13 and
> there is a bug fix in 5/13. I thought you might want to do a quick diff of
> the git v6->v7 tree.
>
> Although given that most of the changes were in response to your comments in
> v6 I probably should have just put the tags in.
>

I get sane output from lstopo when applying these patches and booting
my Socionext SynQuacer in ACPI mode:

$ lstopo-no-graphics
Machine (31GB)
  Package L#0 + L3 L#0 (4096KB)
    L2 L#0 (256KB)
      L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
      L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
    L2 L#1 (256KB)
      L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
      L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
    L2 L#2 (256KB)
      L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
      L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
    L2 L#3 (256KB)
      L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
      L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
    L2 L#4 (256KB)
      L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
      L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
    L2 L#5 (256KB)
      L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
      L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
    L2 L#6 (256KB)
      L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#12)
      L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#13)
    L2 L#7 (256KB)
      L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#14)
      L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#15)
    L2 L#8 (256KB)
      L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#16)
      L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#17)
    L2 L#9 (256KB)
      L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#18)
      L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#19)
    L2 L#10 (256KB)
      L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU L#20 (P#20)
      L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU L#21 (P#21)
    L2 L#11 (256KB)
      L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU L#22 (P#22)
      L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU L#23 (P#23)
  HostBridge L#0
    PCIBridge
      PCIBridge
        PCI 1b21:0612
          Block(Disk) L#0 "sda"
  HostBridge L#3
    PCI 10de:128b
      GPU L#1 "renderD128"
      GPU L#2 "card0"
      GPU L#3 "controlD64"

So

Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

*However*, while hacking on the firmware that exposes the table, I
noticed that a malformed structure (incorrect size) can get the parser
in an infinite loop, hanging the boot after

[    8.244281] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
[    8.251780] Serial: AMBA driver
[    8.255042] msm_serial: driver initialized
[    8.259752] ACPI PPTT: Cache Setup ACPI cpu 0
[    8.264121] ACPI PPTT: Looking for data cache
[    8.268484] ACPI PPTT: Looking for CPU 0's level 1 cache type 0

so I guess the parsing code could be made a bit more robust?
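(To make the failure mode concrete, a toy userspace sketch of the kind of
length check that keeps such a walk from spinning; the subtable layout is
simplified and this is not the actual PPTT parser:)

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for an ACPI subtable header (type + length). */
struct subtable_header {
	uint8_t type;
	uint8_t length;
};

static void walk(const uint8_t *table, size_t size)
{
	size_t offset = 0;

	while (offset + sizeof(struct subtable_header) <= size) {
		struct subtable_header hdr;

		memcpy(&hdr, table + offset, sizeof(hdr));
		/* Without this check, a zero/short length loops forever. */
		if (hdr.length < sizeof(hdr) || offset + hdr.length > size) {
			printf("malformed entry at offset %zu, stopping\n", offset);
			return;
		}
		printf("entry type %u, length %u\n", hdr.type, hdr.length);
		offset += hdr.length;
	}
}

int main(void)
{
	uint8_t good[] = { 0, 4, 0, 0, 1, 2 };
	uint8_t bad[]  = { 0, 0, 0, 0 };	/* zero-length entry */

	walk(good, sizeof(good));
	walk(bad, sizeof(bad));
	return 0;
}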

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
  2018-02-28 22:06   ` Jeremy Linton
@ 2018-03-08 16:39     ` Ard Biesheuvel
  -1 siblings, 0 replies; 136+ messages in thread
From: Ard Biesheuvel @ 2018-03-08 16:39 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, Mark Rutland, austinwc, tnowicki, Catalin Marinas,
	palmer, Will Deacon, linux-riscv, morten.rasmussen, vkilari,
	Lorenzo Pieralisi, Al Stone, Len Brown, John Garry,
	wangxiongfeng2, dietmar.eggemann, linux-arm-kernel,
	Greg Kroah-Hartman, Rafael J. Wysocki, Linux Kernel Mailing List,
	Hanjun Guo, Sudeep Holla

On 28 February 2018 at 22:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
> ACPI 6.2 adds a new table, which describes how processing units
> are related to each other in a tree-like fashion. Caches are
> also sprinkled throughout the tree and describe the properties
> of the caches in relation to other caches and processing units.
>
> Add the code to parse the cache hierarchy and report the total
> number of levels of cache for a given core using
> acpi_find_last_cache_level() as well as fill out the individual
> core's cache information with cache_setup_acpi() once the
> cpu_cacheinfo structure has been populated by the arch-specific
> code.
>
> An additional patch later in the set adds the ability to report
> peers in the topology using find_acpi_cpu_topology()
> to report a unique ID for each processing unit at a given level
> in the tree. These unique IDs can then be used to match related
> processing units which exist as threads, COD (clusters
> on die), within a given package, etc.
>
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> ---
>  drivers/acpi/pptt.c | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 488 insertions(+)
>  create mode 100644 drivers/acpi/pptt.c
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> new file mode 100644
> index 000000000000..883e4318c6cd
> --- /dev/null
> +++ b/drivers/acpi/pptt.c
...
> +/* total number of attributes checked by the properties code */
> +#define PPTT_CHECKED_ATTRIBUTES 4
> +
> +/*
> + * The ACPI spec implies that the fields in the cache structures are used to
> + * extend and correct the information probed from the hardware. Let's only
> + * set fields that we determine are VALID.
> + */
> +static void update_cache_properties(struct cacheinfo *this_leaf,
> +                                   struct acpi_pptt_cache *found_cache,
> +                                   struct acpi_pptt_processor *cpu_node)
> +{
> +       int valid_flags = 0;
> +
> +       if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
> +               this_leaf->size = found_cache->size;
> +               valid_flags++;
> +       }
> +       if (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID) {
> +               this_leaf->coherency_line_size = found_cache->line_size;
> +               valid_flags++;
> +       }
> +       if (found_cache->flags & ACPI_PPTT_NUMBER_OF_SETS_VALID) {
> +               this_leaf->number_of_sets = found_cache->number_of_sets;
> +               valid_flags++;
> +       }
> +       if (found_cache->flags & ACPI_PPTT_ASSOCIATIVITY_VALID) {
> +               this_leaf->ways_of_associativity = found_cache->associativity;
> +               valid_flags++;
> +       }
> +       if (found_cache->flags & ACPI_PPTT_WRITE_POLICY_VALID) {
> +               switch (found_cache->attributes & ACPI_PPTT_MASK_WRITE_POLICY) {
> +               case ACPI_PPTT_CACHE_POLICY_WT:
> +                       this_leaf->attributes = CACHE_WRITE_THROUGH;
> +                       break;
> +               case ACPI_PPTT_CACHE_POLICY_WB:
> +                       this_leaf->attributes = CACHE_WRITE_BACK;
> +                       break;
> +               }
> +       }
> +       if (found_cache->flags & ACPI_PPTT_ALLOCATION_TYPE_VALID) {
> +               switch (found_cache->attributes & ACPI_PPTT_MASK_ALLOCATION_TYPE) {
> +               case ACPI_PPTT_CACHE_READ_ALLOCATE:
> +                       this_leaf->attributes |= CACHE_READ_ALLOCATE;
> +                       break;
> +               case ACPI_PPTT_CACHE_WRITE_ALLOCATE:
> +                       this_leaf->attributes |= CACHE_WRITE_ALLOCATE;
> +                       break;
> +               case ACPI_PPTT_CACHE_RW_ALLOCATE:
> +               case ACPI_PPTT_CACHE_RW_ALLOCATE_ALT:
> +                       this_leaf->attributes |=
> +                               CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE;
> +                       break;
> +               }
> +       }
> +       /*
> +        * If the above flags are valid, and the cache type is NOCACHE
> +        * update the cache type as well.
> +        */
> +       if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
> +           (valid_flags == PPTT_CHECKED_ATTRIBUTES))
> +               this_leaf->type = CACHE_TYPE_UNIFIED;

Why do we need the associativity and #sets attributes to be valid in
order to set the cache type?

I see how size and line size are rather fundamental properties, but
for a system cache, the geometry doesn't really matter.

> +}
> +
> +/*
> + * Update the kernel cache information for each level of cache
> + * associated with the given acpi cpu.
> + */
> +static void cache_setup_acpi_cpu(struct acpi_table_header *table,
> +                                unsigned int cpu)
> +{
> +       struct acpi_pptt_cache *found_cache;
> +       struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
> +       u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> +       struct cacheinfo *this_leaf;
> +       unsigned int index = 0;
> +       struct acpi_pptt_processor *cpu_node = NULL;
> +
> +       while (index < get_cpu_cacheinfo(cpu)->num_leaves) {
> +               this_leaf = this_cpu_ci->info_list + index;
> +               found_cache = acpi_find_cache_node(table, acpi_cpu_id,
> +                                                  this_leaf->type,
> +                                                  this_leaf->level,
> +                                                  &cpu_node);
> +               pr_debug("found = %p %p\n", found_cache, cpu_node);
> +               if (found_cache)
> +                       update_cache_properties(this_leaf,
> +                                               found_cache,
> +                                               cpu_node);
> +
> +               index++;
> +       }
> +}
> +
> +/**
> + * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
> + * @cpu: Kernel logical cpu number
> + *
> + * Given a logical cpu number, returns the number of levels of cache represented
> + * in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
> + * indicating we didn't find any cache levels.
> + *
> + * Return: Cache levels visible to this core.
> + */
> +int acpi_find_last_cache_level(unsigned int cpu)
> +{
> +       u32 acpi_cpu_id;
> +       struct acpi_table_header *table;
> +       int number_of_levels = 0;
> +       acpi_status status;
> +
> +       pr_debug("Cache Setup find last level cpu=%d\n", cpu);
> +
> +       acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> +       status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> +       if (ACPI_FAILURE(status)) {
> +               pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
> +       } else {
> +               number_of_levels = acpi_find_cache_levels(table, acpi_cpu_id);
> +               acpi_put_table(table);
> +       }
> +       pr_debug("Cache Setup find last level level=%d\n", number_of_levels);
> +
> +       return number_of_levels;
> +}
> +
> +/**
> + * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
> + * @cpu: Kernel logical cpu number
> + *
> + * Updates the global cache info provided by cpu_get_cacheinfo()
> + * when there are valid properties in the acpi_pptt_cache nodes. A
> + * successful parse may not result in any updates if none of the
> + * cache levels have any valid flags set.  Further, a unique value is
> + * associated with each known CPU cache entry. This unique value
> + * can be used to determine whether caches are shared between cpus.
> + *
> + * Return: -ENOENT on failure to find table, or 0 on success
> + */
> +int cache_setup_acpi(unsigned int cpu)
> +{
> +       struct acpi_table_header *table;
> +       acpi_status status;
> +
> +       pr_debug("Cache Setup ACPI cpu %d\n", cpu);
> +
> +       status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> +       if (ACPI_FAILURE(status)) {
> +               pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
> +               return -ENOENT;
> +       }
> +
> +       cache_setup_acpi_cpu(table, cpu);
> +       acpi_put_table(table);
> +
> +       return status;
> +}
> --
> 2.13.6
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 07/13] drivers: base cacheinfo: Add support for ACPI based firmware tables
  2018-02-28 22:06   ` Jeremy Linton
@ 2018-03-08 17:20     ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 136+ messages in thread
From: Lorenzo Pieralisi @ 2018-03-08 17:20 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, hanjun.guo, rjw,
	will.deacon, catalin.marinas, gregkh, mark.rutland, linux-kernel,
	linux-riscv, wangxiongfeng2, vkilari, ahs3, dietmar.eggemann,
	morten.rasmussen, palmer, lenb, john.garry, austinwc, tnowicki

On Wed, Feb 28, 2018 at 04:06:13PM -0600, Jeremy Linton wrote:

[...]

> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index 0c6f658054d2..1446d3f053a2 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -97,6 +97,15 @@ int func(unsigned int cpu)					\
>  struct cpu_cacheinfo *get_cpu_cacheinfo(unsigned int cpu);
>  int init_cache_level(unsigned int cpu);
>  int populate_cache_leaves(unsigned int cpu);
> +int cache_setup_acpi(unsigned int cpu);
> +int acpi_find_last_cache_level(unsigned int cpu);
> +#ifndef CONFIG_ACPI
> +int acpi_find_last_cache_level(unsigned int cpu)

This has got to be a static inline function declaration (see kbot report).

Lorenzo

> +{
> +	/* ACPI kernels should be built with PPTT support */
> +	return 0;
> +}
> +#endif
>  
>  const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
>  
> -- 
> 2.13.6
> 

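For reference, the change being asked for is roughly the following sketch
(the body simply mirrors the quoted stub): making the !CONFIG_ACPI
fallback a static inline definition keeps the header safe to include from
multiple translation units.

#ifndef CONFIG_ACPI
static inline int acpi_find_last_cache_level(unsigned int cpu)
{
	/* ACPI kernels should be built with PPTT support */
	return 0;
}
#endif
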
^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 00/13] Support PPTT for ARM64
  2018-03-08 15:59       ` Ard Biesheuvel
@ 2018-03-08 17:41         ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-03-08 17:41 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Sudeep Holla, linux-acpi, Mark Rutland, vkilari,
	Lorenzo Pieralisi, austinwc, tnowicki, Greg Kroah-Hartman,
	Rafael J. Wysocki, dietmar.eggemann, Will Deacon,
	Linux Kernel Mailing List, morten.rasmussen, Al Stone, palmer,
	Hanjun Guo, Catalin Marinas, linux-riscv, John Garry,
	wangxiongfeng2, linux-arm-kernel, Len Brown

Hi,

First thanks for testing this!!

On 03/08/2018 09:59 AM, Ard Biesheuvel wrote:
> On 27 February 2018 at 18:49, Jeremy Linton <jeremy.linton@arm.com> wrote:
>> On 03/01/2018 06:06 AM, Sudeep Holla wrote:
>>>
>>> Hi Jeremy,
>>>
>>> On 28/02/18 22:06, Jeremy Linton wrote:
>>>>
>>>> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
>>>> used to describe the processor and cache topology. Ideally it is
>>>> used to extend/override information provided by the hardware, but
>>>> right now ARM64 is entirely dependent on firmware provided tables.
>>>>
>>>> This patch parses the table for the cache topology and CPU topology.
>>>> When we enable ACPI/PPTT for arm64 we map the physical_id to the
>>>> PPTT node flagged as the physical package by the firmware.
>>>> This results in topologies that match what the remainder of the
>>>> system expects. To avoid inverted scheduler domains we then
>>>> set the MC domain equal to the largest cache within the socket
>>>> below the NUMA domain.
>>>>
>>> I remember reviewing and acknowledging most of the cacheinfo stuff with
>>> couple of minor suggestions for v6. I don't see any Acked-by tags in
>>> this series and don't know if I need to review/ack any more cacheinfo
>>> related patches.
>>
>>
>> Hi,
>>
>> Yes, I didn't put them in because I changed the functionality in 2/13 and
>> there is a bug fix in 5/13. I thought you might want to do a quick diff of
>> the git v6->v7 tree.
>>
>> Although given that most of the changes were in response to your comments in
>> v6 I probably should have just put the tags in.
>>
> 
> I get sane output from lstopo when applying these patches and booting
> my Socionext SynQuacer in ACPI mode:
> 
> $ lstopo-no-graphics
> Machine (31GB)
>    Package L#0 + L3 L#0 (4096KB)
>      L2 L#0 (256KB)
>        L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>        L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>      L2 L#1 (256KB)
>        L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>        L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>      L2 L#2 (256KB)
>        L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#4)
>        L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>      L2 L#3 (256KB)
>        L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#6)
>        L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>      L2 L#4 (256KB)
>        L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#8)
>        L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#9)
>      L2 L#5 (256KB)
>        L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#10)
>        L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#11)
>      L2 L#6 (256KB)
>        L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#12)
>        L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#13)
>      L2 L#7 (256KB)
>        L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#14)
>        L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#15)
>      L2 L#8 (256KB)
>        L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#16)
>        L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#17)
>      L2 L#9 (256KB)
>        L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#18)
>        L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#19)
>      L2 L#10 (256KB)
>        L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU L#20 (P#20)
>        L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU L#21 (P#21)
>      L2 L#11 (256KB)
>        L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU L#22 (P#22)
>        L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU L#23 (P#23)
>    HostBridge L#0
>      PCIBridge
>        PCIBridge
>          PCI 1b21:0612
>            Block(Disk) L#0 "sda"
>    HostBridge L#3
>      PCI 10de:128b
>        GPU L#1 "renderD128"
>        GPU L#2 "card0"
>        GPU L#3 "controlD64"
> 
> So
> 
> Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> 
> *However*, while hacking on the firmware that exposes the table, I
> noticed that a malformed structure (incorrect size) can get the parser
> into an infinite loop, hanging the boot after
> 
> [    8.244281] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
> [    8.251780] Serial: AMBA driver
> [    8.255042] msm_serial: driver initialized
> [    8.259752] ACPI PPTT: Cache Setup ACPI cpu 0
> [    8.264121] ACPI PPTT: Looking for data cache
> [    8.268484] ACPI PPTT: Looking for CPU 0's level 1 cache type 0
> 
> so I guess the parsing code could be made a bit more robust?
> 

I've been wondering how long it would take for someone to complain about 
one of these cases. I added a check in find_processor_node back a few 
revisions ago to deal with zero lengths causing infinite loops, but the 
leaf node check doesn't have one, and that is likely what you're hitting.

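A minimal sketch of the missing guard, assuming the usual ACPI subtable
walk (the names below are illustrative, not the actual pptt.c
identifiers): a subtable reporting length == 0 never advances the cursor,
so the loop must bail out rather than spin forever.

	struct acpi_subtable_header *entry = start;
	unsigned long table_end = (unsigned long)table + table->length;

	while ((unsigned long)entry + sizeof(*entry) < table_end) {
		/* A zero length entry would leave 'entry' stuck here forever. */
		if (entry->length == 0) {
			pr_warn("Invalid zero length subtable\n");
			break;
		}
		/* ... check whether this entry is the node being sought ... */
		entry = ACPI_ADD_PTR(struct acpi_subtable_header,
				     entry, entry->length);
	}
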
^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
  2018-03-08 16:39     ` Ard Biesheuvel
@ 2018-03-08 19:52       ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-03-08 19:52 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Mark Rutland, austinwc, tnowicki, Catalin Marinas, palmer,
	Will Deacon, linux-riscv, wangxiongfeng2, vkilari,
	Lorenzo Pieralisi, morten.rasmussen, linux-acpi, Len Brown,
	John Garry, Al Stone, linux-arm-kernel, Greg Kroah-Hartman,
	Rafael J. Wysocki, Linux Kernel Mailing List, Hanjun Guo,
	Sudeep Holla, dietmar.eggemann

Hi,

On 03/08/2018 10:39 AM, Ard Biesheuvel wrote:
> On 28 February 2018 at 22:06, Jeremy Linton <jeremy.linton@arm.com> wrote:
>> ACPI 6.2 adds a new table, which describes how processing units
>> are related to each other in a tree-like fashion. Caches are
>> also sprinkled throughout the tree and describe the properties
>> of the caches in relation to other caches and processing units.
>>
>> Add the code to parse the cache hierarchy and report the total
>> number of levels of cache for a given core using
>> acpi_find_last_cache_level() as well as fill out the individual
>> core's cache information with cache_setup_acpi() once the
>> cpu_cacheinfo structure has been populated by the arch-specific
>> code.
>>
>> An additional patch later in the set adds the ability to report
>> peers in the topology using find_acpi_cpu_topology()
>> to report a unique ID for each processing unit at a given level
>> in the tree. These unique IDs can then be used to match related
>> processing units which exist as threads, COD (clusters
>> on die), within a given package, etc.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
>> ---
>>   drivers/acpi/pptt.c | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 488 insertions(+)
>>   create mode 100644 drivers/acpi/pptt.c
>>
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> new file mode 100644
>> index 000000000000..883e4318c6cd
>> --- /dev/null
>> +++ b/drivers/acpi/pptt.c
> ...
>> +/* total number of attributes checked by the properties code */
>> +#define PPTT_CHECKED_ATTRIBUTES 4
>> +
>> +/*
>> + * The ACPI spec implies that the fields in the cache structures are used to
>> + * extend and correct the information probed from the hardware. Let's only
>> + * set fields that we determine are VALID.
>> + */
>> +static void update_cache_properties(struct cacheinfo *this_leaf,
>> +                                   struct acpi_pptt_cache *found_cache,
>> +                                   struct acpi_pptt_processor *cpu_node)
>> +{
>> +       int valid_flags = 0;
>> +
>> +       if (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) {
>> +               this_leaf->size = found_cache->size;
>> +               valid_flags++;
>> +       }
>> +       if (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID) {
>> +               this_leaf->coherency_line_size = found_cache->line_size;
>> +               valid_flags++;
>> +       }
>> +       if (found_cache->flags & ACPI_PPTT_NUMBER_OF_SETS_VALID) {
>> +               this_leaf->number_of_sets = found_cache->number_of_sets;
>> +               valid_flags++;
>> +       }
>> +       if (found_cache->flags & ACPI_PPTT_ASSOCIATIVITY_VALID) {
>> +               this_leaf->ways_of_associativity = found_cache->associativity;
>> +               valid_flags++;
>> +       }
>> +       if (found_cache->flags & ACPI_PPTT_WRITE_POLICY_VALID) {
>> +               switch (found_cache->attributes & ACPI_PPTT_MASK_WRITE_POLICY) {
>> +               case ACPI_PPTT_CACHE_POLICY_WT:
>> +                       this_leaf->attributes = CACHE_WRITE_THROUGH;
>> +                       break;
>> +               case ACPI_PPTT_CACHE_POLICY_WB:
>> +                       this_leaf->attributes = CACHE_WRITE_BACK;
>> +                       break;
>> +               }
>> +       }
>> +       if (found_cache->flags & ACPI_PPTT_ALLOCATION_TYPE_VALID) {
>> +               switch (found_cache->attributes & ACPI_PPTT_MASK_ALLOCATION_TYPE) {
>> +               case ACPI_PPTT_CACHE_READ_ALLOCATE:
>> +                       this_leaf->attributes |= CACHE_READ_ALLOCATE;
>> +                       break;
>> +               case ACPI_PPTT_CACHE_WRITE_ALLOCATE:
>> +                       this_leaf->attributes |= CACHE_WRITE_ALLOCATE;
>> +                       break;
>> +               case ACPI_PPTT_CACHE_RW_ALLOCATE:
>> +               case ACPI_PPTT_CACHE_RW_ALLOCATE_ALT:
>> +                       this_leaf->attributes |=
>> +                               CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE;
>> +                       break;
>> +               }
>> +       }
>> +       /*
>> +        * If the above flags are valid, and the cache type is NOCACHE
>> +        * update the cache type as well.
>> +        */
>> +       if ((this_leaf->type == CACHE_TYPE_NOCACHE) &&
>> +           (valid_flags == PPTT_CHECKED_ATTRIBUTES))
>> +               this_leaf->type = CACHE_TYPE_UNIFIED;
> 
> Why do we need the associativity and #sets attributes to be valid in
> order to set the cache type?

This happened a couple of revisions ago because it's better to force 
people to completely populate the attributes in the tables (particularly 
for nodes we are generating, which is what the _NOCACHE check above is 
detecting) than to leave half of them undefined and therefore exported to 
user-space in a way which makes the properties unreliable across machines.


> 
> I see how size and line size are rather fundamental properties, but
> for a system cache, the geometry doesn't really matter.

Originally all of them were required (to avoid having the discussion 
about which ones were "important"), but Sudeep was of the opinion that 
these were the "important" ones; I can see his point, and no one else 
spoke up. So, those are the ones that are required.

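For comparison, the relaxed check being suggested would be roughly the
following (a hypothetical sketch, not what the v7 patch does): only the
size and line-size valid flags gate the NOCACHE-to-UNIFIED promotion.

	if (this_leaf->type == CACHE_TYPE_NOCACHE &&
	    (found_cache->flags & ACPI_PPTT_SIZE_PROPERTY_VALID) &&
	    (found_cache->flags & ACPI_PPTT_LINE_SIZE_VALID))
		this_leaf->type = CACHE_TYPE_UNIFIED;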

> 
>> +}
>> +
>> +/*
>> + * Update the kernel cache information for each level of cache
>> + * associated with the given acpi cpu.
>> + */
>> +static void cache_setup_acpi_cpu(struct acpi_table_header *table,
>> +                                unsigned int cpu)
>> +{
>> +       struct acpi_pptt_cache *found_cache;
>> +       struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
>> +       u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
>> +       struct cacheinfo *this_leaf;
>> +       unsigned int index = 0;
>> +       struct acpi_pptt_processor *cpu_node = NULL;
>> +
>> +       while (index < get_cpu_cacheinfo(cpu)->num_leaves) {
>> +               this_leaf = this_cpu_ci->info_list + index;
>> +               found_cache = acpi_find_cache_node(table, acpi_cpu_id,
>> +                                                  this_leaf->type,
>> +                                                  this_leaf->level,
>> +                                                  &cpu_node);
>> +               pr_debug("found = %p %p\n", found_cache, cpu_node);
>> +               if (found_cache)
>> +                       update_cache_properties(this_leaf,
>> +                                               found_cache,
>> +                                               cpu_node);
>> +
>> +               index++;
>> +       }
>> +}
>> +
>> +/**
>> + * acpi_find_last_cache_level() - Determines the number of cache levels for a PE
>> + * @cpu: Kernel logical cpu number
>> + *
>> + * Given a logical cpu number, returns the number of levels of cache represented
>> + * in the PPTT. Errors caused by lack of a PPTT table, or otherwise, return 0
>> + * indicating we didn't find any cache levels.
>> + *
>> + * Return: Cache levels visible to this core.
>> + */
>> +int acpi_find_last_cache_level(unsigned int cpu)
>> +{
>> +       u32 acpi_cpu_id;
>> +       struct acpi_table_header *table;
>> +       int number_of_levels = 0;
>> +       acpi_status status;
>> +
>> +       pr_debug("Cache Setup find last level cpu=%d\n", cpu);
>> +
>> +       acpi_cpu_id = get_acpi_id_for_cpu(cpu);
>> +       status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
>> +       if (ACPI_FAILURE(status)) {
>> +               pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
>> +       } else {
>> +               number_of_levels = acpi_find_cache_levels(table, acpi_cpu_id);
>> +               acpi_put_table(table);
>> +       }
>> +       pr_debug("Cache Setup find last level level=%d\n", number_of_levels);
>> +
>> +       return number_of_levels;
>> +}
>> +
>> +/**
>> + * cache_setup_acpi() - Override CPU cache topology with data from the PPTT
>> + * @cpu: Kernel logical cpu number
>> + *
>> + * Updates the global cache info provided by cpu_get_cacheinfo()
>> + * when there are valid properties in the acpi_pptt_cache nodes. A
>> + * successful parse may not result in any updates if none of the
>> + * cache levels have any valid flags set.  Further, a unique value is
>> + * associated with each known CPU cache entry. This unique value
>> + * can be used to determine whether caches are shared between cpus.
>> + *
>> + * Return: -ENOENT on failure to find table, or 0 on success
>> + */
>> +int cache_setup_acpi(unsigned int cpu)
>> +{
>> +       struct acpi_table_header *table;
>> +       acpi_status status;
>> +
>> +       pr_debug("Cache Setup ACPI cpu %d\n", cpu);
>> +
>> +       status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
>> +       if (ACPI_FAILURE(status)) {
>> +               pr_err_once("No PPTT table found, cache topology may be inaccurate\n");
>> +               return -ENOENT;
>> +       }
>> +
>> +       cache_setup_acpi_cpu(table, cpu);
>> +       acpi_put_table(table);
>> +
>> +       return status;
>> +}
>> --
>> 2.13.6
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-07 13:06             ` Morten Rasmussen
  (?)
@ 2018-03-08 20:41               ` Brice Goglin
  -1 siblings, 0 replies; 136+ messages in thread
From: Brice Goglin @ 2018-03-08 20:41 UTC (permalink / raw)
  To: Morten Rasmussen, Jeremy Linton
  Cc: mark.rutland, vkilari, lorenzo.pieralisi, catalin.marinas,
	tnowicki, gregkh, will.deacon, dietmar.eggemann, rjw,
	linux-kernel, ahs3, linux-acpi, palmer, hanjun.guo, sudeep.holla,
	austinwc, linux-riscv, john.garry, wangxiongfeng2,
	linux-arm-kernel, lenb


> Is there a good reason for diverging instead of adjusting the
> core_sibling mask? On x86 the core_siblings mask is defined by the last
> level cache span so they don't have this issue. 

No. core_siblings is defined as the list of cores that have the same
physical_package_id (see the doc of sysfs topology files), and LLC can
be smaller than that.
Example with an E5v3 with cluster-on-die (two L3 per package, so
core_siblings is twice as large as the L3 cpumap):
https://www.open-mpi.org/projects/hwloc/lstopo/images/2XeonE5v3.v1.11.png
On AMD EPYC, you even have up to 8 LLC per package.
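
You can see the split directly in sysfs. A minimal check, as a sketch
(it assumes cpu0's last-level cache is index3, which varies by
machine):

#include <stdio.h>

/* Print one sysfs cpumask file, skipping it quietly if absent. */
static void show_mask(const char *path)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return;
	if (fgets(buf, sizeof(buf), f))
		printf("%s: %s", path, buf);
	fclose(f);
}

int main(void)
{
	/* Package-wide mask (CPUs sharing a physical_package_id)... */
	show_mask("/sys/devices/system/cpu/cpu0/topology/core_siblings");
	/* ...versus the span of the last-level cache. */
	show_mask("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
	return 0;
}

On the cluster-on-die E5v3 above, the first mask covers the whole
package while the second covers only half of it.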

Brice

^ permalink raw reply	[flat|nested] 136+ messages in thread

* RE: [PATCH v7 00/13] Support PPTT for ARM64
  2018-02-28 22:06 ` Jeremy Linton
  (?)
  (?)
@ 2018-03-14  9:57   ` vkilari
  -1 siblings, 0 replies; 136+ messages in thread
From: vkilari @ 2018-03-14  9:57 UTC (permalink / raw)
  To: 'Jeremy Linton', linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki

Hi Jeremy,

> -----Original Message-----
> From: Jeremy Linton <jeremy.linton@arm.com>
> Sent: Thursday, March 1, 2018 3:36 AM
> To: linux-acpi@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org; sudeep.holla@arm.com;
> lorenzo.pieralisi@arm.com; hanjun.guo@linaro.org; rjw@rjwysocki.net;
> will.deacon@arm.com; catalin.marinas@arm.com;
> gregkh@linuxfoundation.org; mark.rutland@arm.com;
> linux-kernel@vger.kernel.org; linux-riscv@lists.infradead.org;
> wangxiongfeng2@huawei.com; vkilari@codeaurora.org; ahs3@redhat.com;
> dietmar.eggemann@arm.com; morten.rasmussen@arm.com;
> palmer@sifive.com; lenb@kernel.org; john.garry@huawei.com;
> austinwc@codeaurora.org; tnowicki@caviumnetworks.com; Jeremy Linton
> <jeremy.linton@arm.com>
> Subject: [PATCH v7 00/13] Support PPTT for ARM64
> 
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
> used to describe the processor and cache topology. Ideally it is used
> to extend/override information provided by the hardware, but right now
> ARM64 is entirely dependent on firmware provided tables.
> 
> This patch parses the table for the cache topology and CPU topology.
> When we enable ACPI/PPTT for arm64 we map the physical_id to the PPTT
> node flagged as the physical package by the firmware.
> This results in topologies that match what the remainder of the system
> expects.
> To avoid inverted scheduler domains we then set the MC domain equal to the
> largest cache within the socket below the NUMA domain.
> 
> For example on juno:
> [root@mammon-juno-rh topology]# lstopo-no-graphics
>   Package L#0
>     L2 L#0 (1024KB)
>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>     L2 L#1 (2048KB)
>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
>   HostBridge L#0
>     PCIBridge
>       PCIBridge
>         PCIBridge
>           PCI 1095:3132
>             Block(Disk) L#0 "sda"
>         PCIBridge
>           PCI 1002:68f9
>             GPU L#1 "renderD128"
>             GPU L#2 "card0"
>             GPU L#3 "controlD64"
>         PCIBridge
>           PCI 11ab:4380
>             Net L#4 "enp8s0"
> 
> Git tree at:
> http://linux-arm.org/git?p=linux-jlinton.git
> branch: pptt_v7

Tested this series and it looks good.

Tested-by: Vijaya Kumar K <vkilari@codeaurora.org>
> 
> v6->v7:
> Add additional patch to use the last cache level within the NUMA
>   or socket as the MC domain. This assures the MC domain is
>   equal or smaller than the DIE.
> 
> Various formatting/etc review comments.
> 
> Rebase to 4.16rc2
> 
> v5->v6:
> Add additional patches which re-factor how the initial DT code sets
>   up the cacheinfo structure so that its not as dependent on the
>   of_node stored in that tree. Once that is done we rename it
>   for use with the ACPI code.
> 
> Additionally there were a fair number of minor name/location/etc
>   tweaks scattered about made in response to review comments.
> 
> v4->v5:
> Update the cache type from NOCACHE to UNIFIED when all the cache
>   attributes we update are valid. This fixes a problem where caches
>   which are entirely created by the PPTT don't show up in lstopo.
> 
> Give the PPTT its own firmware_node in the cache structure instead of
>   sharing it with the of_node.
> 
> Move some pieces around between patches.
> 
> (see previous cover letters for further changes)
> 
> Jeremy Linton (13):
>   drivers: base: cacheinfo: move cache_setup_of_node()
>   drivers: base: cacheinfo: setup DT cache properties early
>   cacheinfo: rename of_node to fw_token
>   arm64/acpi: Create arch specific cpu to acpi id helper
>   ACPI/PPTT: Add Processor Properties Topology Table parsing
>   ACPI: Enable PPTT support on ARM64
>   drivers: base cacheinfo: Add support for ACPI based firmware tables
>   arm64: Add support for ACPI based firmware tables
>   ACPI/PPTT: Add topology parsing code
>   arm64: topology: rename cluster_id
>   arm64: topology: enable ACPI/PPTT based CPU topology
>   ACPI: Add PPTT to injectable table list
>   arm64: topology: divorce MC scheduling domain from core_siblings
> 
>  arch/arm64/Kconfig                |   1 +
>  arch/arm64/include/asm/acpi.h     |   4 +
>  arch/arm64/include/asm/topology.h |   9 +-
>  arch/arm64/kernel/cacheinfo.c     |  15 +-
>  arch/arm64/kernel/topology.c      | 132 +++++++-
>  arch/riscv/kernel/cacheinfo.c     |   1 -
>  drivers/acpi/Kconfig              |   3 +
>  drivers/acpi/Makefile             |   1 +
>  drivers/acpi/pptt.c               | 642 ++++++++++++++++++++++++++++++++++++++
>  drivers/acpi/tables.c             |   2 +-
>  drivers/base/cacheinfo.c          | 157 +++++-----
>  include/linux/acpi.h              |   4 +
>  include/linux/cacheinfo.h         |  17 +-
>  13 files changed, 882 insertions(+), 106 deletions(-)
>  create mode 100644 drivers/acpi/pptt.c
> 
> --
> 2.13.6

^ permalink raw reply	[flat|nested] 136+ messages in thread

* RE: [PATCH v7 00/13] Support PPTT for ARM64
@ 2018-03-14  9:57   ` vkilari
  0 siblings, 0 replies; 136+ messages in thread
From: vkilari @ 2018-03-14  9:57 UTC (permalink / raw)
  To: 'Jeremy Linton', linux-acpi
  Cc: linux-arm-kernel, sudeep.holla, lorenzo.pieralisi, hanjun.guo,
	rjw, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki

Hi Jeremy,

> -----Original Message-----
> From: Jeremy Linton <jeremy.linton@arm.com>
> Sent: Thursday, March 1, 2018 3:36 AM
> To: linux-acpi@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org; sudeep.holla@arm.com;
> lorenzo.pieralisi@arm.com; hanjun.guo@linaro.org; rjw@rjwysocki.net;
> will.deacon@arm.com; catalin.marinas@arm.com;
> gregkh@linuxfoundation.org; mark.rutland@arm.com; linux-
> kernel@vger.kernel.org; linux-riscv@lists.infradead.org;
> wangxiongfeng2@huawei.com; vkilari@codeaurora.org; ahs3@redhat.com;
> dietmar.eggemann@arm.com; morten.rasmussen@arm.com;
> palmer@sifive.com; lenb@kernel.org; john.garry@huawei.com;
> austinwc@codeaurora.org; tnowicki@caviumnetworks.com; Jeremy Linton
> <jeremy.linton@arm.com>
> Subject: [PATCH v7 00/13] Support PPTT for ARM64
> 
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
used to
> describe the processor and cache topology. Ideally it is used to
extend/override
> information provided by the hardware, but right now ARM64 is entirely
> dependent on firmware provided tables.
> 
> This patch parses the table for the cache topology and CPU topology.
> When we enable ACPI/PPTT for arm64 we map the physical_id to the PPTT
> node flagged as the physical package by the firmware.
> This results in topologies that match what the remainder of the system
expects.
> To avoid inverted scheduler domains we then set the MC domain equal to the
> largest cache within the socket below the NUMA domain.
> 
> For example on juno:
> [root@mammon-juno-rh topology]# lstopo-no-graphics
>   Package L#0
>     L2 L#0 (1024KB)
>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>     L2 L#1 (2048KB)
>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
>   HostBridge L#0
>     PCIBridge
>       PCIBridge
>         PCIBridge
>           PCI 1095:3132
>             Block(Disk) L#0 "sda"
>         PCIBridge
>           PCI 1002:68f9
>             GPU L#1 "renderD128"
>             GPU L#2 "card0"
>             GPU L#3 "controlD64"
>         PCIBridge
>           PCI 11ab:4380
>             Net L#4 "enp8s0"
> 
> Git tree at:
> http://linux-arm.org/git?p=linux-jlinton.git
> branch: pptt_v7

Tested this series and looks good

Tested by: Vijaya Kumar K <vkilari@codeaurora.org>
> 
> v6->v7:
> Add additional patch to use the last cache level within the NUMA
>   or socket as the MC domain. This assures the MC domain is
>   equal or smaller than the DIE.
> 
> Various formatting/etc review comments.
> 
> Rebase to 4.16rc2
> 
> v5->v6:
> Add additional patches which re-factor how the initial DT code sets
>   up the cacheinfo structure so that its not as dependent on the
>   of_node stored in that tree. Once that is done we rename it
>   for use with the ACPI code.
> 
> Additionally there were a fair number of minor name/location/etc
>   tweaks scattered about made in response to review comments.
> 
> v4->v5:
> Update the cache type from NOCACHE to UNIFIED when all the cache
>   attributes we update are valid. This fixes a problem where caches
>   which are entirely created by the PPTT don't show up in lstopo.
> 
> Give the PPTT its own firmware_node in the cache structure instead of
>   sharing it with the of_node.
> 
> Move some pieces around between patches.
> 
> (see previous cover letters for futher changes)
> 
> Jeremy Linton (13):
>   drivers: base: cacheinfo: move cache_setup_of_node()
>   drivers: base: cacheinfo: setup DT cache properties early
>   cacheinfo: rename of_node to fw_token
>   arm64/acpi: Create arch specific cpu to acpi id helper
>   ACPI/PPTT: Add Processor Properties Topology Table parsing
>   ACPI: Enable PPTT support on ARM64
>   drivers: base cacheinfo: Add support for ACPI based firmware tables
>   arm64: Add support for ACPI based firmware tables
>   ACPI/PPTT: Add topology parsing code
>   arm64: topology: rename cluster_id
>   arm64: topology: enable ACPI/PPTT based CPU topology
>   ACPI: Add PPTT to injectable table list
>   arm64: topology: divorce MC scheduling domain from core_siblings
> 
>  arch/arm64/Kconfig                |   1 +
>  arch/arm64/include/asm/acpi.h     |   4 +
>  arch/arm64/include/asm/topology.h |   9 +-
>  arch/arm64/kernel/cacheinfo.c     |  15 +-
>  arch/arm64/kernel/topology.c      | 132 +++++++-
>  arch/riscv/kernel/cacheinfo.c     |   1 -
>  drivers/acpi/Kconfig              |   3 +
>  drivers/acpi/Makefile             |   1 +
>  drivers/acpi/pptt.c               | 642
> ++++++++++++++++++++++++++++++++++++++
>  drivers/acpi/tables.c             |   2 +-
>  drivers/base/cacheinfo.c          | 157 +++++-----
>  include/linux/acpi.h              |   4 +
>  include/linux/cacheinfo.h         |  17 +-
>  13 files changed, 882 insertions(+), 106 deletions(-)  create mode 100644
> drivers/acpi/pptt.c
> 
> --
> 2.13.6

^ permalink raw reply	[flat|nested] 136+ messages in thread

* [PATCH v7 00/13] Support PPTT for ARM64
@ 2018-03-14  9:57   ` vkilari
  0 siblings, 0 replies; 136+ messages in thread
From: vkilari at codeaurora.org @ 2018-03-14  9:57 UTC (permalink / raw)
  To: linux-riscv

Hi Jeremy,

> -----Original Message-----
> From: Jeremy Linton <jeremy.linton@arm.com>
> Sent: Thursday, March 1, 2018 3:36 AM
> To: linux-acpi at vger.kernel.org
> Cc: linux-arm-kernel at lists.infradead.org; sudeep.holla at arm.com;
> lorenzo.pieralisi at arm.com; hanjun.guo at linaro.org; rjw at rjwysocki.net;
> will.deacon at arm.com; catalin.marinas at arm.com;
> gregkh at linuxfoundation.org; mark.rutland at arm.com; linux-
> kernel at vger.kernel.org; linux-riscv at lists.infradead.org;
> wangxiongfeng2 at huawei.com; vkilari at codeaurora.org; ahs3 at redhat.com;
> dietmar.eggemann at arm.com; morten.rasmussen at arm.com;
> palmer at sifive.com; lenb at kernel.org; john.garry at huawei.com;
> austinwc at codeaurora.org; tnowicki at caviumnetworks.com; Jeremy Linton
> <jeremy.linton@arm.com>
> Subject: [PATCH v7 00/13] Support PPTT for ARM64
> 
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
used to
> describe the processor and cache topology. Ideally it is used to
extend/override
> information provided by the hardware, but right now ARM64 is entirely
> dependent on firmware provided tables.
> 
> This patch parses the table for the cache topology and CPU topology.
> When we enable ACPI/PPTT for arm64 we map the physical_id to the PPTT
> node flagged as the physical package by the firmware.
> This results in topologies that match what the remainder of the system
expects.
> To avoid inverted scheduler domains we then set the MC domain equal to the
> largest cache within the socket below the NUMA domain.
> 
> For example on juno:
> [root at mammon-juno-rh topology]# lstopo-no-graphics
>   Package L#0
>     L2 L#0 (1024KB)
>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>     L2 L#1 (2048KB)
>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
>   HostBridge L#0
>     PCIBridge
>       PCIBridge
>         PCIBridge
>           PCI 1095:3132
>             Block(Disk) L#0 "sda"
>         PCIBridge
>           PCI 1002:68f9
>             GPU L#1 "renderD128"
>             GPU L#2 "card0"
>             GPU L#3 "controlD64"
>         PCIBridge
>           PCI 11ab:4380
>             Net L#4 "enp8s0"
> 
> Git tree at:
> http://linux-arm.org/git?p=linux-jlinton.git
> branch: pptt_v7

Tested this series and looks good

Tested by: Vijaya Kumar K <vkilari@codeaurora.org>
> 
> v6->v7:
> Add additional patch to use the last cache level within the NUMA
>   or socket as the MC domain. This assures the MC domain is
>   equal or smaller than the DIE.
> 
> Various formatting/etc review comments.
> 
> Rebase to 4.16rc2
> 
> v5->v6:
> Add additional patches which re-factor how the initial DT code sets
>   up the cacheinfo structure so that its not as dependent on the
>   of_node stored in that tree. Once that is done we rename it
>   for use with the ACPI code.
> 
> Additionally there were a fair number of minor name/location/etc
>   tweaks scattered about made in response to review comments.
> 
> v4->v5:
> Update the cache type from NOCACHE to UNIFIED when all the cache
>   attributes we update are valid. This fixes a problem where caches
>   which are entirely created by the PPTT don't show up in lstopo.
> 
> Give the PPTT its own firmware_node in the cache structure instead of
>   sharing it with the of_node.
> 
> Move some pieces around between patches.
> 
> (see previous cover letters for futher changes)
> 
> Jeremy Linton (13):
>   drivers: base: cacheinfo: move cache_setup_of_node()
>   drivers: base: cacheinfo: setup DT cache properties early
>   cacheinfo: rename of_node to fw_token
>   arm64/acpi: Create arch specific cpu to acpi id helper
>   ACPI/PPTT: Add Processor Properties Topology Table parsing
>   ACPI: Enable PPTT support on ARM64
>   drivers: base cacheinfo: Add support for ACPI based firmware tables
>   arm64: Add support for ACPI based firmware tables
>   ACPI/PPTT: Add topology parsing code
>   arm64: topology: rename cluster_id
>   arm64: topology: enable ACPI/PPTT based CPU topology
>   ACPI: Add PPTT to injectable table list
>   arm64: topology: divorce MC scheduling domain from core_siblings
> 
>  arch/arm64/Kconfig                |   1 +
>  arch/arm64/include/asm/acpi.h     |   4 +
>  arch/arm64/include/asm/topology.h |   9 +-
>  arch/arm64/kernel/cacheinfo.c     |  15 +-
>  arch/arm64/kernel/topology.c      | 132 +++++++-
>  arch/riscv/kernel/cacheinfo.c     |   1 -
>  drivers/acpi/Kconfig              |   3 +
>  drivers/acpi/Makefile             |   1 +
>  drivers/acpi/pptt.c               | 642
> ++++++++++++++++++++++++++++++++++++++
>  drivers/acpi/tables.c             |   2 +-
>  drivers/base/cacheinfo.c          | 157 +++++-----
>  include/linux/acpi.h              |   4 +
>  include/linux/cacheinfo.h         |  17 +-
>  13 files changed, 882 insertions(+), 106 deletions(-)  create mode 100644
> drivers/acpi/pptt.c
> 
> --
> 2.13.6

^ permalink raw reply	[flat|nested] 136+ messages in thread

* [PATCH v7 00/13] Support PPTT for ARM64
@ 2018-03-14  9:57   ` vkilari
  0 siblings, 0 replies; 136+ messages in thread
From: vkilari at codeaurora.org @ 2018-03-14  9:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Jeremy,

> -----Original Message-----
> From: Jeremy Linton <jeremy.linton@arm.com>
> Sent: Thursday, March 1, 2018 3:36 AM
> To: linux-acpi at vger.kernel.org
> Cc: linux-arm-kernel at lists.infradead.org; sudeep.holla at arm.com;
> lorenzo.pieralisi at arm.com; hanjun.guo at linaro.org; rjw at rjwysocki.net;
> will.deacon at arm.com; catalin.marinas at arm.com;
> gregkh at linuxfoundation.org; mark.rutland at arm.com; linux-
> kernel at vger.kernel.org; linux-riscv at lists.infradead.org;
> wangxiongfeng2 at huawei.com; vkilari at codeaurora.org; ahs3 at redhat.com;
> dietmar.eggemann at arm.com; morten.rasmussen at arm.com;
> palmer at sifive.com; lenb at kernel.org; john.garry at huawei.com;
> austinwc at codeaurora.org; tnowicki at caviumnetworks.com; Jeremy Linton
> <jeremy.linton@arm.com>
> Subject: [PATCH v7 00/13] Support PPTT for ARM64
> 
> ACPI 6.2 adds the Processor Properties Topology Table (PPTT), which is
used to
> describe the processor and cache topology. Ideally it is used to
extend/override
> information provided by the hardware, but right now ARM64 is entirely
> dependent on firmware provided tables.
> 
> This patch parses the table for the cache topology and CPU topology.
> When we enable ACPI/PPTT for arm64 we map the physical_id to the PPTT
> node flagged as the physical package by the firmware.
> This results in topologies that match what the remainder of the system
expects.
> To avoid inverted scheduler domains we then set the MC domain equal to the
> largest cache within the socket below the NUMA domain.
> 
> For example on juno:
> [root at mammon-juno-rh topology]# lstopo-no-graphics
>   Package L#0
>     L2 L#0 (1024KB)
>       L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>       L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>       L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>       L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>     L2 L#1 (2048KB)
>       L1d L#4 (32KB) + L1i L#4 (48KB) + Core L#4 + PU L#4 (P#4)
>       L1d L#5 (32KB) + L1i L#5 (48KB) + Core L#5 + PU L#5 (P#5)
>   HostBridge L#0
>     PCIBridge
>       PCIBridge
>         PCIBridge
>           PCI 1095:3132
>             Block(Disk) L#0 "sda"
>         PCIBridge
>           PCI 1002:68f9
>             GPU L#1 "renderD128"
>             GPU L#2 "card0"
>             GPU L#3 "controlD64"
>         PCIBridge
>           PCI 11ab:4380
>             Net L#4 "enp8s0"
> 
> Git tree at:
> http://linux-arm.org/git?p=linux-jlinton.git
> branch: pptt_v7

Tested this series and looks good

Tested by: Vijaya Kumar K <vkilari@codeaurora.org>
> 
> v6->v7:
> Add additional patch to use the last cache level within the NUMA
>   or socket as the MC domain. This assures the MC domain is
>   equal or smaller than the DIE.
> 
> Various formatting/etc review comments.
> 
> Rebase to 4.16rc2
> 
> v5->v6:
> Add additional patches which re-factor how the initial DT code sets
>   up the cacheinfo structure so that its not as dependent on the
>   of_node stored in that tree. Once that is done we rename it
>   for use with the ACPI code.
> 
> Additionally there were a fair number of minor name/location/etc
>   tweaks scattered about made in response to review comments.
> 
> v4->v5:
> Update the cache type from NOCACHE to UNIFIED when all the cache
>   attributes we update are valid. This fixes a problem where caches
>   which are entirely created by the PPTT don't show up in lstopo.
> 
> Give the PPTT its own firmware_node in the cache structure instead of
>   sharing it with the of_node.
> 
> Move some pieces around between patches.
> 
> (see previous cover letters for futher changes)
> 
> Jeremy Linton (13):
>   drivers: base: cacheinfo: move cache_setup_of_node()
>   drivers: base: cacheinfo: setup DT cache properties early
>   cacheinfo: rename of_node to fw_token
>   arm64/acpi: Create arch specific cpu to acpi id helper
>   ACPI/PPTT: Add Processor Properties Topology Table parsing
>   ACPI: Enable PPTT support on ARM64
>   drivers: base cacheinfo: Add support for ACPI based firmware tables
>   arm64: Add support for ACPI based firmware tables
>   ACPI/PPTT: Add topology parsing code
>   arm64: topology: rename cluster_id
>   arm64: topology: enable ACPI/PPTT based CPU topology
>   ACPI: Add PPTT to injectable table list
>   arm64: topology: divorce MC scheduling domain from core_siblings
> 
>  arch/arm64/Kconfig                |   1 +
>  arch/arm64/include/asm/acpi.h     |   4 +
>  arch/arm64/include/asm/topology.h |   9 +-
>  arch/arm64/kernel/cacheinfo.c     |  15 +-
>  arch/arm64/kernel/topology.c      | 132 +++++++-
>  arch/riscv/kernel/cacheinfo.c     |   1 -
>  drivers/acpi/Kconfig              |   3 +
>  drivers/acpi/Makefile             |   1 +
>  drivers/acpi/pptt.c               | 642 ++++++++++++++++++++++++++++++++++++++
>  drivers/acpi/tables.c             |   2 +-
>  drivers/base/cacheinfo.c          | 157 +++++-----
>  include/linux/acpi.h              |   4 +
>  include/linux/cacheinfo.h         |  17 +-
>  13 files changed, 882 insertions(+), 106 deletions(-)
>  create mode 100644 drivers/acpi/pptt.c
> 
> --
> 2.13.6

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-08 20:41               ` Brice Goglin
  (?)
@ 2018-03-14 12:43                 ` Morten Rasmussen
  -1 siblings, 0 replies; 136+ messages in thread
From: Morten Rasmussen @ 2018-03-14 12:43 UTC (permalink / raw)
  To: Brice Goglin
  Cc: Jeremy Linton, mark.rutland, vkilari, lorenzo.pieralisi,
	catalin.marinas, tnowicki, gregkh, will.deacon, dietmar.eggemann,
	rjw, linux-kernel, ahs3, linux-acpi, palmer, hanjun.guo,
	sudeep.holla, austinwc, linux-riscv, john.garry, wangxiongfeng2,
	linux-arm-kernel, lenb

On Thu, Mar 08, 2018 at 09:41:17PM +0100, Brice Goglin wrote:
> 
> > Is there a good reason for diverging instead of adjusting the
> > core_sibling mask? On x86 the core_siblings mask is defined by the last
> > level cache span so they don't have this issue. 
> 
> No. core_siblings is defined as the list of cores that have the same
> physical_package_id (see the doc of sysfs topology files), and LLC can
> be smaller than that.
> Example with E5v3 with cluster-on-die (two L3 per package, core_siblings
> is twice as large as the L3 cpumap):
> https://www.open-mpi.org/projects/hwloc/lstopo/images/2XeonE5v3.v1.11.png
> On AMD EPYC, you even have up to 8 LLC per package.

Right, I missed the fact that x86 reports a different cpumask for
topology_core_cpumask(), which defines the core_siblings exported through
sysfs, than the mask used to define the MC level in the scheduler topology.
The sysfs core_siblings is defined by the package_id, while the MC level
is defined by the LLC.

Thanks for pointing this out.

On arm64, the MC level and sysfs core_siblings are currently defined using
the same mask, but we can't break sysfs, so using different masks is the
only option.
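
For illustration, a rough sketch of how cpu_coregroup_mask() could pick
the smaller of the node and package spans, reusing the existing arm64
cpu_topology[] bookkeeping (this is only a sketch, not the literal 13/13
patch, whose final logic may differ):

const struct cpumask *cpu_coregroup_mask(int cpu)
{
	const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu));

	/* not numa-in-package: shrink to the package siblings */
	if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask))
		core_mask = &cpu_topology[cpu].core_sibling;

	return core_mask;
}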

Morten 

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings
  2018-03-07 16:19               ` Jeremy Linton
  (?)
@ 2018-03-14 13:05                 ` Morten Rasmussen
  -1 siblings, 0 replies; 136+ messages in thread
From: Morten Rasmussen @ 2018-03-14 13:05 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: mark.rutland, vkilari, lorenzo.pieralisi, catalin.marinas,
	tnowicki, gregkh, will.deacon, dietmar.eggemann, rjw,
	linux-kernel, ahs3, linux-acpi, palmer, hanjun.guo, sudeep.holla,
	austinwc, linux-riscv, john.garry, wangxiongfeng2,
	linux-arm-kernel, lenb

On Wed, Mar 07, 2018 at 10:19:50AM -0600, Jeremy Linton wrote:
> Hi,
> 
> On 03/07/2018 07:06 AM, Morten Rasmussen wrote:
> >On Tue, Mar 06, 2018 at 04:22:18PM -0600, Jeremy Linton wrote:
> >>>>>>To do this correctly, we should really base that on the cache
> >>>>>>topology immediately below the NUMA node (for NUMA in socket)
> >>>>>>or below the physical package for normal NUMA configurations.
> >>>>>
> >>>>>That means we wouldn't support multi-die NUMA nodes?
> >>>>
> >>>>You mean a bottom level NUMA domain that crosses multiple sockets/dies? That
> >>>>should work. This patch is picking the widest cache layer below the smallest
> >>>>of the package or numa grouping. What actually happens depends on the
> >>>>topology. Given a case where there are multiple dies in a socket, and the
> >>>>numa domain is at the socket level the MC is going to reflect the caching
> >>>>topology immediately below the socket. In the case of multiple dies, with a
> >>>>cache that crosses them in socket, then the MC is basically going to be the
> >>>>socket, otherwise if the widest cache is per die, or some narrower grouping
> >>>>(cluster?) then that is what ends up in the MC. (this is easier with some
> >>>>pictures)
> >>>
> >>>That is more or less what I meant. I think I got confused with the role
> >>>of "DIE" level, i.e. that top non-NUMA level, in this. The DIE level
> >>>cpumask spans exactly the NUMA node, so IIUC we have three scenarios:
> >>>
> >>>1. Multi-die/socket/physical package NUMA node
> >>>    Top non-NUMA level (DIE) spans multiple packages. Bottom NUMA level
> >>>    spans multiple multi-package nodes. The MC mask reflects the last-level
> >>>    cache within the NUMA node which is most likely per-die or per-cluster
> >>>    (inside each die).
> >>>
> >>>2. physical package == NUMA node
> >>>    The top non-NUMA (DIE) mask is the same as the core sibling mask.
> >>>    If there is cache spanning the entire node, the scheduler topology
> >>>    will eliminate a layer (DIE?), so bottom NUMA level would be right on
> >>>    top of MC spanning multiple physical packages. If there is no
> >>>    node-wide last level cache, DIE is preserved and MC matches the span
> >>>    of the last level cache.
> >>>
> >>>3. numa-in-package
> >>>    Top non-NUMA (DIE) mask is not reflecting the actual die, but is
> >>>    reflecting the NUMA node. MC has a span equal to the largest shared
> >>>    cache span smaller than or equal to the NUMA node. If it is
> >>>    equal, DIE level is eliminated, otherwise DIE is preserved, but
> >>>    doesn't really represent die. Bottom non-NUMA level spans multiple
> >>>    in-package NUMA nodes.
> >>>
> >>>As you said, multi-die nodes should work. However, I'm not sure if
> >>>shrinking MC to match a cache could cause us trouble, or if it should
> >>>just be shrunk to be the smaller of the node mask and core siblings.
> >>
> >>Shrinking to the smaller of the numa or package is a fairly trivial change,
> >>I'm good with that change too. I discounted it because there might be an
> >>advantage in case 2 if the internal hardware is actually a case 3 (or just
> >>multiple rings/whatever each with a L3). In those cases the firmware vendor
> >>could play around with whatever representation serves them the best.
> >
> >Agreed. Distributed last level caches and interconnect speeds make it
> >virtually impossible to define MC in a way that works well for everyone
> >based on the topology information we have at hand.
> >
> >>
> >>>Unless you have a node-wide last level cache DIE level won't be
> >>>eliminated in scenario 2 and 3, and could cause trouble. For
> >>>numa-in-package, you can end up with a DIE level inside the node where
> >>>the default flags don't favour aggressive spreading of tasks. The same
> >>>could be the case for per-package nodes (scenario 2).
> >>>
> >>>Don't we end up redefining physical package to be last level cache
> >>>instead of using the PPTT flag for scenario 2 and 3?
> >>
> >>I'm not sure I understand, core_siblings isn't changing (it's still per
> >>package). Only the MC mapping which normally is just core_siblings. For all
> >>intents right now this patch is the same as v6, except for the
> >>numa-in-package where the MC domain is being shrunk to the node siblings.
> >>I'm just trying to set up the code for potential future cases where the LLC
> >>isn't equal to the node or package.
> >
> >Right, core_siblings remains the same. The scheduler topology just looks
> >a bit odd as we can have core_siblings spanning the full true physical
> >package and have DIE level as a subset of that with an MC level where
> >the MC siblings are a much smaller subset of cpus than core_siblings.
> >
> >IOW, it would lead to having one topology used by the scheduler and
> >another used by the users of topology_core_cpumask() (of which there
> >are not many, I think).
> >
> >Is there a good reason for diverging instead of adjusting the
> >core_sibling mask? On x86 the core_siblings mask is defined by the last
> >level cache span so they don't have this issue.
> 
> I'm overwhelmingly convinced we are doing the right thing WRT the core
> siblings at the moment. It's exported to user space, and the general
> understanding is that it's a socket. Even with numa in package/on die, if you
> run lscpu, lstopo, etc., they all understand the system topology correctly
> doing it this way (AFAIK).

Right. As said in my reply to Brice, I thought MC level and sysfs were
aligned, but they clearly aren't. I agree that treating them differently
is the right thing to do.

> >I would prefer this simpler solution as it should eliminate DIE level
> >for all numa-in-package configurations. Although, I think we should consider
> >just shrinking the core_sibling mask instead of having a difference MC
> >mask (cpu_coregroup_mask). Do you see any problems in doing that?
> My strongest opinion is leaning toward core_siblings being correct as it
> stands. How the scheduler deals with that is less clear. I will toss the
> above as a separate patch, and we can forget this one. I see dropping DIE as
> a larger patch set defining an arch specific scheduler topology and tweaking
> the individual scheduler level/flags/tuning. OTOH, unless there is something
> particularly creative there, I don't see how to avoid NUMA domains pushing
> deeper into the cache/system topology, which means filling the MC layer (and
> possibly others) similarly to the above snippet.

Agreed that core_siblings is correct. With the simple solution, DIE
shouldn't show up for any numa_in_package configurations, allowing NUMA
to sit directly on top of MC, which should mean the flags should be
roughly okay.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
  2018-02-28 22:06   ` Jeremy Linton
  (?)
@ 2018-03-19 10:46     ` Rafael J. Wysocki
  -1 siblings, 0 replies; 136+ messages in thread
From: Rafael J. Wysocki @ 2018-03-19 10:46 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki

On Wednesday, February 28, 2018 11:06:11 PM CET Jeremy Linton wrote:
> ACPI 6.2 adds a new table, which describes how processing units
> are related to each other in tree-like fashion. Caches are
> also sprinkled throughout the tree and describe the properties
> of the caches in relation to other caches and processing units.
> 
> Add the code to parse the cache hierarchy and report the total
> number of levels of cache for a given core using
> acpi_find_last_cache_level() as well as fill out the individual
> cores' cache information with cache_setup_acpi() once the
> cpu_cacheinfo structure has been populated by the arch specific
> code.
> 
> An additional patch later in the set adds the ability to report
> peers in the topology using find_acpi_cpu_topology()
> to report a unique ID for each processing unit at a given level
> in the tree. These unique id's can then be used to match related
> processing units which exist as threads, COD (clusters
> on die), within a given package, etc.
> 
> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>

A couple of cosmetic comments.

> ---
>  drivers/acpi/pptt.c | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 488 insertions(+)
>  create mode 100644 drivers/acpi/pptt.c
> 
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> new file mode 100644
> index 000000000000..883e4318c6cd
> --- /dev/null
> +++ b/drivers/acpi/pptt.c
> @@ -0,0 +1,488 @@

Use an SPDX license ID here and then you don't need the license boilerplate
below.
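
E.g., assuming the file stays GPL-2.0, the whole notice collapses to a
single first line:

// SPDX-License-Identifier: GPL-2.0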

> +/*
> + * Copyright (C) 2018, ARM
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * This file implements parsing of Processor Properties Topology Table (PPTT)
> + * which is optionally used to describe the processor and cache topology.
> + * Due to the relative pointers used throughout the table, this doesn't
> + * leverage the existing subtable parsing in the kernel.
> + *
> + * The PPTT structure is an inverted tree, with each node potentially
> + * holding one or two inverted tree data structures describing
> + * the caches available at that level. Each cache structure optionally
> + * contains properties describing the cache at a given level which can be
> + * used to override hardware probed values.
> + */
> +#define pr_fmt(fmt) "ACPI PPTT: " fmt
> +
> +#include <linux/acpi.h>
> +#include <linux/cacheinfo.h>
> +#include <acpi/processor.h>
> +
> +/*
> + * Given the PPTT table, find and verify that the subtable entry
> + * is located within the table
> + */

If you add a comment above a function, make it a kerneldoc.  That
never hurts, but really helps sometimes.
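
For instance, something along these lines for the function below (the
wording is only a suggestion):

/**
 * fetch_pptt_subtable() - validate and return a PPTT subtable reference
 * @table_hdr: pointer to the start of the PPTT
 * @pptt_ref: offset of the subtable from the start of the table
 *
 * Return: the subtable header, or NULL if the reference falls outside
 * the table.
 */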

> +static struct acpi_subtable_header *fetch_pptt_subtable(

Don't break the line here, it can be broken after the first arg.

That's OK even if it is more than 80 chars long.
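
I.e., something like:

static struct acpi_subtable_header *fetch_pptt_subtable(struct acpi_table_header *table_hdr,
							u32 pptt_ref)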

> +	struct acpi_table_header *table_hdr, u32 pptt_ref)
> +{
> +	struct acpi_subtable_header *entry;
> +
> +	/* there isn't a subtable at reference 0 */
> +	if (pptt_ref < sizeof(struct acpi_subtable_header))
> +		return NULL;
> +
> +	if (pptt_ref + sizeof(struct acpi_subtable_header) > table_hdr->length)
> +		return NULL;
> +
> +	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr, pptt_ref);
> +
> +	if (pptt_ref + entry->length > table_hdr->length)
> +		return NULL;
> +
> +	return entry;
> +}
> +
> +static struct acpi_pptt_processor *fetch_pptt_node(
> +	struct acpi_table_header *table_hdr, u32 pptt_ref)
> +{
> +	return (struct acpi_pptt_processor *)fetch_pptt_subtable(table_hdr,
> +								 pptt_ref);

You don't need to break this line too.

> +}
> +
> +static struct acpi_pptt_cache *fetch_pptt_cache(
> +	struct acpi_table_header *table_hdr, u32 pptt_ref)
> +{
> +	return (struct acpi_pptt_cache *)fetch_pptt_subtable(table_hdr,
> +							     pptt_ref);

And here.

> +}
> +
> +static struct acpi_subtable_header *acpi_get_pptt_resource(
> +	struct acpi_table_header *table_hdr,
> +	struct acpi_pptt_processor *node, int resource)
> +{
> +	u32 *ref;
> +
> +	if (resource >= node->number_of_priv_resources)
> +		return NULL;
> +
> +	ref = ACPI_ADD_PTR(u32, node, sizeof(struct acpi_pptt_processor));
> +	ref += resource;
> +
> +	return fetch_pptt_subtable(table_hdr, *ref);
> +}
> +
> +/*
> + * Match the type passed and special case the TYPE_UNIFIED so that
> + * it matches both ACPI_PPTT_CACHE_TYPE_UNIFIED(_ALT) types.
> + */
> +static inline bool acpi_pptt_match_type(int table_type, int type)
> +{
> +	return (((table_type & ACPI_PPTT_MASK_CACHE_TYPE) == type) ||
> +		(table_type & ACPI_PPTT_CACHE_TYPE_UNIFIED & type));
> +}
> +
> +/*
> + * Attempt to find a given cache level, while counting the max number
> + * of cache levels for the cache node.
> + *
> + * Given a pptt resource, verify that it is a cache node, then walk
> + * down each level of caches, counting how many levels are found
> + * as well as checking the cache type (icache, dcache, unified). If a
> + * level & type match, then we set found, and continue the search.
> + * Once the entire cache branch has been walked return its max
> + * depth.
> + */
> +static int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
> +				int local_level,
> +				struct acpi_subtable_header *res,
> +				struct acpi_pptt_cache **found,
> +				int level, int type)
> +{
> +	struct acpi_pptt_cache *cache;
> +
> +	if (res->type != ACPI_PPTT_TYPE_CACHE)
> +		return 0;
> +
> +	cache = (struct acpi_pptt_cache *) res;
> +	while (cache) {
> +		local_level++;
> +
> +		if ((local_level == level) &&
> +		    (cache->flags & ACPI_PPTT_CACHE_TYPE_VALID) &&
> +		    acpi_pptt_match_type(cache->attributes, type)) {
> +			if ((*found != NULL) && (cache != *found))

Inner parens are not necessary (and above and in some other places too).
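
E.g., the check above reads fine as:

	if (*found != NULL && cache != *found)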

> +				pr_err("Found duplicate cache level/type unable to determine uniqueness\n");

This is not an error, rather a statement that something is inconsistent in the
ACPI table, so please consider using a different log level here.
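
For example, inconsistencies in firmware tables are commonly reported
with pr_warn():

	pr_warn("Found duplicate cache level/type unable to determine uniqueness\n");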

[cut]

I guess I could post similar comments for the other general ACPI patches in
this series, so please address them in there too if applicable.

Thanks!

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing
  2018-03-19 10:46     ` Rafael J. Wysocki
  (?)
@ 2018-03-20 13:25       ` Jeremy Linton
  -1 siblings, 0 replies; 136+ messages in thread
From: Jeremy Linton @ 2018-03-20 13:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-acpi, linux-arm-kernel, sudeep.holla, lorenzo.pieralisi,
	hanjun.guo, will.deacon, catalin.marinas, gregkh, mark.rutland,
	linux-kernel, linux-riscv, wangxiongfeng2, vkilari, ahs3,
	dietmar.eggemann, morten.rasmussen, palmer, lenb, john.garry,
	austinwc, tnowicki

Hi,

Thanks for taking a look at this.

On 03/19/2018 05:46 AM, Rafael J. Wysocki wrote:
> On Wednesday, February 28, 2018 11:06:11 PM CET Jeremy Linton wrote:
>> ACPI 6.2 adds a new table, which describes how processing units
>> are related to each other in tree-like fashion. Caches are
>> also sprinkled throughout the tree and describe the properties
>> of the caches in relation to other caches and processing units.
>>
>> Add the code to parse the cache hierarchy and report the total
>> number of levels of cache for a given core using
>> acpi_find_last_cache_level() as well as fill out the individual
>> cores' cache information with cache_setup_acpi() once the
>> cpu_cacheinfo structure has been populated by the arch specific
>> code.
>>
>> An additional patch later in the set adds the ability to report
>> peers in the topology using find_acpi_cpu_topology()
>> to report a unique ID for each processing unit at a given level
>> in the tree. These unique id's can then be used to match related
>> processing units which exist as threads, COD (clusters
>> on die), within a given package, etc.
>>
>> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
> 
> A couple of cosmetic comments.
> 
>> ---
>>   drivers/acpi/pptt.c | 488 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 488 insertions(+)
>>   create mode 100644 drivers/acpi/pptt.c
>>
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> new file mode 100644
>> index 000000000000..883e4318c6cd
>> --- /dev/null
>> +++ b/drivers/acpi/pptt.c
>> @@ -0,0 +1,488 @@
> 
> Use an SPDX license ID here and then you don't need the license boilerplate
> below.

Sure, good point!

> 
>> +/*
>> + * Copyright (C) 2018, ARM
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * This file implements parsing of Processor Properties Topology Table (PPTT)
>> + * which is optionally used to describe the processor and cache topology.
>> + * Due to the relative pointers used throughout the table, this doesn't
>> + * leverage the existing subtable parsing in the kernel.
>> + *
>> + * The PPTT structure is an inverted tree, with each node potentially
>> + * holding one or two inverted tree data structures describing
>> + * the caches available at that level. Each cache structure optionally
>> + * contains properties describing the cache at a given level which can be
>> + * used to override hardware probed values.
>> + */
>> +#define pr_fmt(fmt) "ACPI PPTT: " fmt
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/cacheinfo.h>
>> +#include <acpi/processor.h>
>> +
>> +/*
>> + * Given the PPTT table, find and verify that the subtable entry
>> + * is located within the table
>> + */
> 
> If you add a comment above a function, make it a kerneldoc.  That
> never hurts, but really helps sometimes.

Sure, I did that for the visible symbols. I will convert the internal 
ones as well.

> 
>> +static struct acpi_subtable_header *fetch_pptt_subtable(
> 
> Don't break the line here, it can be broken after the first arg.
> 
> That's OK even if it is more than 80 chars long.
> 
>> +	struct acpi_table_header *table_hdr, u32 pptt_ref)
>> +{
>> +	struct acpi_subtable_header *entry;
>> +
>> +	/* there isn't a subtable at reference 0 */
>> +	if (pptt_ref < sizeof(struct acpi_subtable_header))
>> +		return NULL;
>> +
>> +	if (pptt_ref + sizeof(struct acpi_subtable_header) > table_hdr->length)
>> +		return NULL;
>> +
>> +	entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr, pptt_ref);
>> +
>> +	if (pptt_ref + entry->length > table_hdr->length)
>> +		return NULL;
>> +
>> +	return entry;
>> +}
>> +
>> +static struct acpi_pptt_processor *fetch_pptt_node(
>> +	struct acpi_table_header *table_hdr, u32 pptt_ref)
>> +{
>> +	return (struct acpi_pptt_processor *)fetch_pptt_subtable(table_hdr,
>> +								 pptt_ref);
> 
> You don't need to break this line too.

Sure.

> 
>> +}
>> +
>> +static struct acpi_pptt_cache *fetch_pptt_cache(
>> +	struct acpi_table_header *table_hdr, u32 pptt_ref)
>> +{
>> +	return (struct acpi_pptt_cache *)fetch_pptt_subtable(table_hdr,
>> +							     pptt_ref);
> 
> And here.
> 
>> +}
>> +
>> +static struct acpi_subtable_header *acpi_get_pptt_resource(
>> +	struct acpi_table_header *table_hdr,
>> +	struct acpi_pptt_processor *node, int resource)
>> +{
>> +	u32 *ref;
>> +
>> +	if (resource >= node->number_of_priv_resources)
>> +		return NULL;
>> +
>> +	ref = ACPI_ADD_PTR(u32, node, sizeof(struct acpi_pptt_processor));
>> +	ref += resource;
>> +
>> +	return fetch_pptt_subtable(table_hdr, *ref);
>> +}
>> +
>> +/*
>> + * Match the type passed and special case the TYPE_UNIFIED so that
>> + * it matches both ACPI_PPTT_CACHE_TYPE_UNIFIED(_ALT) types.
>> + */
>> +static inline bool acpi_pptt_match_type(int table_type, int type)
>> +{
>> +	return (((table_type & ACPI_PPTT_MASK_CACHE_TYPE) == type) ||
>> +		(table_type & ACPI_PPTT_CACHE_TYPE_UNIFIED & type));
>> +}
>> +
>> +/*
>> + * Attempt to find a given cache level, while counting the max number
>> + * of cache levels for the cache node.
>> + *
>> + * Given a pptt resource, verify that it is a cache node, then walk
>> + * down each level of caches, counting how many levels are found
>> + * as well as checking the cache type (icache, dcache, unified). If a
>> + * level & type match, then we set found, and continue the search.
>> + * Once the entire cache branch has been walked return its max
>> + * depth.
>> + */
>> +static int acpi_pptt_walk_cache(struct acpi_table_header *table_hdr,
>> +				int local_level,
>> +				struct acpi_subtable_header *res,
>> +				struct acpi_pptt_cache **found,
>> +				int level, int type)
>> +{
>> +	struct acpi_pptt_cache *cache;
>> +
>> +	if (res->type != ACPI_PPTT_TYPE_CACHE)
>> +		return 0;
>> +
>> +	cache = (struct acpi_pptt_cache *) res;
>> +	while (cache) {
>> +		local_level++;
>> +
>> +		if ((local_level == level) &&
>> +		    (cache->flags & ACPI_PPTT_CACHE_TYPE_VALID) &&
>> +		    acpi_pptt_match_type(cache->attributes, type)) {
>> +			if ((*found != NULL) && (cache != *found))
> 
> Inner parens are not necessary (and above and in some other places too).
Sure.

> 
>> +				pr_err("Found duplicate cache level/type unable to determine uniqueness\n");
> 
> This is not an error, rather a statement that something is inconsistent in the
> ACPI table, so please consider using a different log level here.

pr_warn seems to be the other common choice in the acpi directory for 
table errors, so I will use that.


> 
> [cut]
> 
> I guess I could post similar comments for the other general ACPI patches in
> this series, so please address them in there too if applicable.
> 

Ok, I will look over them again.

Thanks,


> Thanks!
> 

^ permalink raw reply	[flat|nested] 136+ messages in thread

end of thread, other threads:[~2018-03-20 13:25 UTC | newest]

Thread overview: 136+ messages
-- links below jump to the message on this page --
2018-02-28 22:06 [PATCH v7 00/13] Support PPTT for ARM64 Jeremy Linton
2018-02-28 22:06 ` [PATCH v7 01/13] drivers: base: cacheinfo: move cache_setup_of_node() Jeremy Linton
2018-03-06 16:16   ` Sudeep Holla
2018-02-28 22:06 ` [PATCH v7 02/13] drivers: base: cacheinfo: setup DT cache properties early Jeremy Linton
2018-02-28 22:34   ` Palmer Dabbelt
2018-03-06 16:43   ` Sudeep Holla
2018-02-28 22:06 ` [PATCH v7 03/13] cacheinfo: rename of_node to fw_token Jeremy Linton
2018-03-06 16:45   ` Sudeep Holla
2018-02-28 22:06 ` [PATCH v7 04/13] arm64/acpi: Create arch specific cpu to acpi id helper Jeremy Linton
2018-03-06 17:13   ` Sudeep Holla
2018-02-28 22:06 ` [PATCH v7 05/13] ACPI/PPTT: Add Processor Properties Topology Table parsing Jeremy Linton
2018-03-06 17:39   ` Sudeep Holla
2018-03-08 16:39   ` Ard Biesheuvel
2018-03-08 19:52     ` Jeremy Linton
2018-03-19 10:46   ` Rafael J. Wysocki
2018-03-20 13:25     ` Jeremy Linton
2018-02-28 22:06 ` [PATCH v7 06/13] ACPI: Enable PPTT support on ARM64 Jeremy Linton
2018-03-06 16:55   ` Sudeep Holla
2018-02-28 22:06 ` [PATCH v7 07/13] drivers: base cacheinfo: Add support for ACPI based firmware tables Jeremy Linton
2018-03-06 17:50   ` Sudeep Holla
2018-03-08 17:20   ` Lorenzo Pieralisi
2018-02-28 22:06 ` [PATCH v7 08/13] arm64: " Jeremy Linton
2018-03-03 21:58   ` kbuild test robot
2018-03-06 17:23   ` Sudeep Holla
2018-02-28 22:06 ` [PATCH v7 09/13] ACPI/PPTT: Add topology parsing code Jeremy Linton
2018-02-28 22:06 ` [PATCH v7 10/13] arm64: topology: rename cluster_id Jeremy Linton
2018-03-05 12:24   ` Mark Brown
2018-02-28 22:06 ` [PATCH v7 11/13] arm64: topology: enable ACPI/PPTT based CPU topology Jeremy Linton
2018-02-28 22:06 ` [PATCH v7 12/13] ACPI: Add PPTT to injectable table list Jeremy Linton
2018-02-28 22:06 ` [PATCH v7 13/13] arm64: topology: divorce MC scheduling domain from core_siblings Jeremy Linton
2018-03-01 15:52   ` Morten Rasmussen
2018-02-27 20:18     ` Jeremy Linton
2018-03-06 16:07       ` Morten Rasmussen
2018-03-06 22:22         ` Jeremy Linton
2018-03-07 13:06           ` Morten Rasmussen
2018-03-07 16:19             ` Jeremy Linton
2018-03-14 13:05               ` Morten Rasmussen
2018-03-08 20:41             ` Brice Goglin
2018-03-14 12:43               ` Morten Rasmussen
2018-03-01 12:06 ` [PATCH v7 00/13] Support PPTT for ARM64 Sudeep Holla
2018-02-27 18:49   ` Jeremy Linton
2018-03-08 15:59     ` Ard Biesheuvel
2018-03-08 17:41       ` Jeremy Linton
2018-03-14  9:57 ` vkilari
