All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-25  2:55 ` Qing Wang
  0 siblings, 0 replies; 24+ messages in thread
From: Qing Wang @ 2022-04-25  2:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel
  Cc: vincent.guittot, peterz, dietmar.eggemann, Wang Qing

From: Wang Qing <wangqing@vivo.com>

cluster sched_domain configured by cpu_topology[].cluster_sibling, 
which is set by cluster_id, cluster_id can only get from ACPI.

If the system does not enable ACPI, cluster_id is always -1, even enable
SCHED_CLUSTER is invalid, this is misleading. 

So we add SCHED_CLUSTER's dependency on ACPI here.

Signed-off-by: Wang Qing <wangqing@vivo.com>
---
v2:
- just modify commit log
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 25b33feed800..edbe035cb0e3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1199,6 +1199,7 @@ config SCHED_MC
 
 config SCHED_CLUSTER
 	bool "Cluster scheduler support"
+	depends on ACPI
 	help
 	  Cluster scheduler support improves the CPU scheduler's decision
 	  making when dealing with machines that have clusters of CPUs.
-- 
2.27.0.windows.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-25  2:55 ` Qing Wang
  0 siblings, 0 replies; 24+ messages in thread
From: Qing Wang @ 2022-04-25  2:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel
  Cc: vincent.guittot, peterz, dietmar.eggemann, Wang Qing

From: Wang Qing <wangqing@vivo.com>

cluster sched_domain configured by cpu_topology[].cluster_sibling, 
which is set by cluster_id, cluster_id can only get from ACPI.

If the system does not enable ACPI, cluster_id is always -1, even enable
SCHED_CLUSTER is invalid, this is misleading. 

So we add SCHED_CLUSTER's dependency on ACPI here.

Signed-off-by: Wang Qing <wangqing@vivo.com>
---
v2:
- just modify commit log
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 25b33feed800..edbe035cb0e3 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1199,6 +1199,7 @@ config SCHED_MC
 
 config SCHED_CLUSTER
 	bool "Cluster scheduler support"
+	depends on ACPI
 	help
 	  Cluster scheduler support improves the CPU scheduler's decision
 	  making when dealing with machines that have clusters of CPUs.
-- 
2.27.0.windows.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-25  2:55 ` Qing Wang
@ 2022-04-25 10:06   ` Sudeep Holla
  -1 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-25 10:06 UTC (permalink / raw)
  To: Qing Wang
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, Sudeep Holla,
	linux-kernel, vincent.guittot, peterz, dietmar.eggemann

On Sun, Apr 24, 2022 at 07:55:01PM -0700, Qing Wang wrote:
> From: Wang Qing <wangqing@vivo.com>
> 
> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> which is set by cluster_id, cluster_id can only get from ACPI.
> 
> If the system does not enable ACPI, cluster_id is always -1, even enable
> SCHED_CLUSTER is invalid, this is misleading. 
> 
> So we add SCHED_CLUSTER's dependency on ACPI here.
>

Any reason why this can't be extended to support DT based systems via
cpu-map in the device tree. IMO we almost have everything w.r.t topology
in DT and no reason to deviate this feature between ACPI and DT.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-25 10:06   ` Sudeep Holla
  0 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-25 10:06 UTC (permalink / raw)
  To: Qing Wang
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, Sudeep Holla,
	linux-kernel, vincent.guittot, peterz, dietmar.eggemann

On Sun, Apr 24, 2022 at 07:55:01PM -0700, Qing Wang wrote:
> From: Wang Qing <wangqing@vivo.com>
> 
> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> which is set by cluster_id, cluster_id can only get from ACPI.
> 
> If the system does not enable ACPI, cluster_id is always -1, even enable
> SCHED_CLUSTER is invalid, this is misleading. 
> 
> So we add SCHED_CLUSTER's dependency on ACPI here.
>

Any reason why this can't be extended to support DT based systems via
cpu-map in the device tree. IMO we almost have everything w.r.t topology
in DT and no reason to deviate this feature between ACPI and DT.

-- 
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-25 10:06   ` Sudeep Holla
@ 2022-04-25 11:18     ` 王擎
  -1 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-25 11:18 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann


>> From: Wang Qing <wangqing@vivo.com>
>> 
>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>> which is set by cluster_id, cluster_id can only get from ACPI.
>> 
>> If the system does not enable ACPI, cluster_id is always -1, even enable
>> SCHED_CLUSTER is invalid, this is misleading. 
>> 
>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>
>
>Any reason why this can't be extended to support DT based systems via
>cpu-map in the device tree. IMO we almost have everything w.r.t topology
>in DT and no reason to deviate this feature between ACPI and DT.
>
That's the problem, we parse out "cluster" info according to the
description in cpu-map, but do assign it to package_id, which used to
configure the MC sched domain, not cluster sched domain.

That is to say, "cluster" in cpu-map is used to describe the package_id.
We can't get cluster_id from DT.

Thanks,
Qing

>-- 
>Regards,
>Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-25 11:18     ` 王擎
  0 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-25 11:18 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann


>> From: Wang Qing <wangqing@vivo.com>
>> 
>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>> which is set by cluster_id, cluster_id can only get from ACPI.
>> 
>> If the system does not enable ACPI, cluster_id is always -1, even enable
>> SCHED_CLUSTER is invalid, this is misleading. 
>> 
>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>
>
>Any reason why this can't be extended to support DT based systems via
>cpu-map in the device tree. IMO we almost have everything w.r.t topology
>in DT and no reason to deviate this feature between ACPI and DT.
>
That's the problem, we parse out "cluster" info according to the
description in cpu-map, but do assign it to package_id, which used to
configure the MC sched domain, not cluster sched domain.

That is to say, "cluster" in cpu-map is used to describe the package_id.
We can't get cluster_id from DT.

Thanks,
Qing

>-- 
>Regards,
>Sudeep
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-25 11:18     ` 王擎
@ 2022-04-25 16:59       ` Sudeep Holla
  -1 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-25 16:59 UTC (permalink / raw)
  To: 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann

On Mon, Apr 25, 2022 at 11:18:21AM +0000, 王擎 wrote:
> 
> >> From: Wang Qing <wangqing@vivo.com>
> >> 
> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> >> which is set by cluster_id, cluster_id can only get from ACPI.
> >> 
> >> If the system does not enable ACPI, cluster_id is always -1, even enable
> >> SCHED_CLUSTER is invalid, this is misleading. 
> >> 
> >> So we add SCHED_CLUSTER's dependency on ACPI here.
> >>
> >
> >Any reason why this can't be extended to support DT based systems via
> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
> >in DT and no reason to deviate this feature between ACPI and DT.
> >
> That's the problem, we parse out "cluster" info according to the
> description in cpu-map, but do assign it to package_id, which used to
> configure the MC sched domain, not cluster sched domain.
>

Right, we haven't updated the code after updating the bindings to match
ACPI sockets which are the physical package boundaries. Clusters are not
the physical boundaries and the current topology code is not 100% aligned
with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
support for sockets defining package boundaries")

> That is to say, "cluster" in cpu-map is used to describe the package_id.
> We can't get cluster_id from DT.
>

That is wrong, we have "socket" to describe the package_id now.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-25 16:59       ` Sudeep Holla
  0 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-25 16:59 UTC (permalink / raw)
  To: 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann

On Mon, Apr 25, 2022 at 11:18:21AM +0000, 王擎 wrote:
> 
> >> From: Wang Qing <wangqing@vivo.com>
> >> 
> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> >> which is set by cluster_id, cluster_id can only get from ACPI.
> >> 
> >> If the system does not enable ACPI, cluster_id is always -1, even enable
> >> SCHED_CLUSTER is invalid, this is misleading. 
> >> 
> >> So we add SCHED_CLUSTER's dependency on ACPI here.
> >>
> >
> >Any reason why this can't be extended to support DT based systems via
> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
> >in DT and no reason to deviate this feature between ACPI and DT.
> >
> That's the problem, we parse out "cluster" info according to the
> description in cpu-map, but do assign it to package_id, which used to
> configure the MC sched domain, not cluster sched domain.
>

Right, we haven't updated the code after updating the bindings to match
ACPI sockets which are the physical package boundaries. Clusters are not
the physical boundaries and the current topology code is not 100% aligned
with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
support for sockets defining package boundaries")

> That is to say, "cluster" in cpu-map is used to describe the package_id.
> We can't get cluster_id from DT.
>

That is wrong, we have "socket" to describe the package_id now.

--
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-25 16:59       ` Sudeep Holla
@ 2022-04-26  2:23         ` 王擎
  -1 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-26  2:23 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann


>> 
>> >> From: Wang Qing <wangqing@vivo.com>
>> >> 
>> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>> >> which is set by cluster_id, cluster_id can only get from ACPI.
>> >> 
>> >> If the system does not enable ACPI, cluster_id is always -1, even enable
>> >> SCHED_CLUSTER is invalid, this is misleading. 
>> >> 
>> >> So we add SCHED_CLUSTER's dependency on ACPI here.
>> >>
>> >
>> >Any reason why this can't be extended to support DT based systems via
>> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
>> >in DT and no reason to deviate this feature between ACPI and DT.
>> >
>> That's the problem, we parse out "cluster" info according to the
>> description in cpu-map, but do assign it to package_id, which used to
>> configure the MC sched domain, not cluster sched domain.
>>
>
>Right, we haven't updated the code after updating the bindings to match
>ACPI sockets which are the physical package boundaries. Clusters are not
>the physical boundaries and the current topology code is not 100% aligned
>with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>support for sockets defining package boundaries")

I see, but this commit is a long time ago, why hasn't it been used widely.
Maybe I can help about it if you need.

Thanks,
Qing

>
>> That is to say, "cluster" in cpu-map is used to describe the package_id.
>> We can't get cluster_id from DT.
>>
>
>That is wrong, we have "socket" to describe the package_id now.
>
>--
>Regards,
>Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-26  2:23         ` 王擎
  0 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-26  2:23 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann


>> 
>> >> From: Wang Qing <wangqing@vivo.com>
>> >> 
>> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>> >> which is set by cluster_id, cluster_id can only get from ACPI.
>> >> 
>> >> If the system does not enable ACPI, cluster_id is always -1, even enable
>> >> SCHED_CLUSTER is invalid, this is misleading. 
>> >> 
>> >> So we add SCHED_CLUSTER's dependency on ACPI here.
>> >>
>> >
>> >Any reason why this can't be extended to support DT based systems via
>> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
>> >in DT and no reason to deviate this feature between ACPI and DT.
>> >
>> That's the problem, we parse out "cluster" info according to the
>> description in cpu-map, but do assign it to package_id, which used to
>> configure the MC sched domain, not cluster sched domain.
>>
>
>Right, we haven't updated the code after updating the bindings to match
>ACPI sockets which are the physical package boundaries. Clusters are not
>the physical boundaries and the current topology code is not 100% aligned
>with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>support for sockets defining package boundaries")

I see, but this commit is a long time ago, why hasn't it been used widely.
Maybe I can help about it if you need.

Thanks,
Qing

>
>> That is to say, "cluster" in cpu-map is used to describe the package_id.
>> We can't get cluster_id from DT.
>>
>
>That is wrong, we have "socket" to describe the package_id now.
>
>--
>Regards,
>Sudeep
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-26  2:23         ` 王擎
@ 2022-04-26  6:40           ` Sudeep Holla
  -1 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-26  6:40 UTC (permalink / raw)
  To: 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann

On Tue, Apr 26, 2022 at 02:23:25AM +0000, 王擎 wrote:
> 
> >> 
> >> >> From: Wang Qing <wangqing@vivo.com>
> >> >> 
> >> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> >> >> which is set by cluster_id, cluster_id can only get from ACPI.
> >> >> 
> >> >> If the system does not enable ACPI, cluster_id is always -1, even enable
> >> >> SCHED_CLUSTER is invalid, this is misleading. 
> >> >> 
> >> >> So we add SCHED_CLUSTER's dependency on ACPI here.
> >> >>
> >> >
> >> >Any reason why this can't be extended to support DT based systems via
> >> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
> >> >in DT and no reason to deviate this feature between ACPI and DT.
> >> >
> >> That's the problem, we parse out "cluster" info according to the
> >> description in cpu-map, but do assign it to package_id, which used to
> >> configure the MC sched domain, not cluster sched domain.
> >>
> >
> >Right, we haven't updated the code after updating the bindings to match
> >ACPI sockets which are the physical package boundaries. Clusters are not
> >the physical boundaries and the current topology code is not 100% aligned
> >with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
> >support for sockets defining package boundaries")
>
> I see, but this commit is a long time ago, why hasn't it been used widely.
> Maybe I can help about it if you need.
>

I assume no one cared or had a requirement for the same.

--
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-26  6:40           ` Sudeep Holla
  0 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-26  6:40 UTC (permalink / raw)
  To: 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann

On Tue, Apr 26, 2022 at 02:23:25AM +0000, 王擎 wrote:
> 
> >> 
> >> >> From: Wang Qing <wangqing@vivo.com>
> >> >> 
> >> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> >> >> which is set by cluster_id, cluster_id can only get from ACPI.
> >> >> 
> >> >> If the system does not enable ACPI, cluster_id is always -1, even enable
> >> >> SCHED_CLUSTER is invalid, this is misleading. 
> >> >> 
> >> >> So we add SCHED_CLUSTER's dependency on ACPI here.
> >> >>
> >> >
> >> >Any reason why this can't be extended to support DT based systems via
> >> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
> >> >in DT and no reason to deviate this feature between ACPI and DT.
> >> >
> >> That's the problem, we parse out "cluster" info according to the
> >> description in cpu-map, but do assign it to package_id, which used to
> >> configure the MC sched domain, not cluster sched domain.
> >>
> >
> >Right, we haven't updated the code after updating the bindings to match
> >ACPI sockets which are the physical package boundaries. Clusters are not
> >the physical boundaries and the current topology code is not 100% aligned
> >with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
> >support for sockets defining package boundaries")
>
> I see, but this commit is a long time ago, why hasn't it been used widely.
> Maybe I can help about it if you need.
>

I assume no one cared or had a requirement for the same.

--
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-26  6:40           ` Sudeep Holla
@ 2022-04-26  6:52             ` 王擎
  -1 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-26  6:52 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann


>> 
>> >> 
>> >> >> From: Wang Qing <wangqing@vivo.com>
>> >> >> 
>> >> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>> >> >> which is set by cluster_id, cluster_id can only get from ACPI.
>> >> >> 
>> >> >> If the system does not enable ACPI, cluster_id is always -1, even enable
>> >> >> SCHED_CLUSTER is invalid, this is misleading. 
>> >> >> 
>> >> >> So we add SCHED_CLUSTER's dependency on ACPI here.
>> >> >>
>> >> >
>> >> >Any reason why this can't be extended to support DT based systems via
>> >> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
>> >> >in DT and no reason to deviate this feature between ACPI and DT.
>> >> >
>> >> That's the problem, we parse out "cluster" info according to the
>> >> description in cpu-map, but do assign it to package_id, which used to
>> >> configure the MC sched domain, not cluster sched domain.
>> >>
>> >
>> >Right, we haven't updated the code after updating the bindings to match
>> >ACPI sockets which are the physical package boundaries. Clusters are not
>> >the physical boundaries and the current topology code is not 100% aligned
>> >with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>> >support for sockets defining package boundaries")
>>
>> I see, but this commit is a long time ago, why hasn't it been used widely.
>> Maybe I can help about it if you need.
>>
>
>I assume no one cared or had a requirement for the same.

It took me a while to find the root cause why enabling SCHED_CLUSTER
didn't work.

We should add SCHED_CLUSTER's dependency before implementation.
Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
will have this problem.

Thanks,
Qing

>
>--
>Regards,
>Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-26  6:52             ` 王擎
  0 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-26  6:52 UTC (permalink / raw)
  To: Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann


>> 
>> >> 
>> >> >> From: Wang Qing <wangqing@vivo.com>
>> >> >> 
>> >> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>> >> >> which is set by cluster_id, cluster_id can only get from ACPI.
>> >> >> 
>> >> >> If the system does not enable ACPI, cluster_id is always -1, even enable
>> >> >> SCHED_CLUSTER is invalid, this is misleading. 
>> >> >> 
>> >> >> So we add SCHED_CLUSTER's dependency on ACPI here.
>> >> >>
>> >> >
>> >> >Any reason why this can't be extended to support DT based systems via
>> >> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
>> >> >in DT and no reason to deviate this feature between ACPI and DT.
>> >> >
>> >> That's the problem, we parse out "cluster" info according to the
>> >> description in cpu-map, but do assign it to package_id, which used to
>> >> configure the MC sched domain, not cluster sched domain.
>> >>
>> >
>> >Right, we haven't updated the code after updating the bindings to match
>> >ACPI sockets which are the physical package boundaries. Clusters are not
>> >the physical boundaries and the current topology code is not 100% aligned
>> >with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>> >support for sockets defining package boundaries")
>>
>> I see, but this commit is a long time ago, why hasn't it been used widely.
>> Maybe I can help about it if you need.
>>
>
>I assume no one cared or had a requirement for the same.

It took me a while to find the root cause why enabling SCHED_CLUSTER
didn't work.

We should add SCHED_CLUSTER's dependency before implementation.
Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
will have this problem.

Thanks,
Qing

>
>--
>Regards,
>Sudeep
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-26  6:52             ` 王擎
@ 2022-04-26 13:25               ` Sudeep Holla
  -1 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-26 13:25 UTC (permalink / raw)
  To: 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann

On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
> 
> >> 
> >> >> 
> >> >> >> From: Wang Qing <wangqing@vivo.com>
> >> >> >> 
> >> >> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> >> >> >> which is set by cluster_id, cluster_id can only get from ACPI.
> >> >> >> 
> >> >> >> If the system does not enable ACPI, cluster_id is always -1, even enable
> >> >> >> SCHED_CLUSTER is invalid, this is misleading. 
> >> >> >> 
> >> >> >> So we add SCHED_CLUSTER's dependency on ACPI here.
> >> >> >>
> >> >> >
> >> >> >Any reason why this can't be extended to support DT based systems via
> >> >> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
> >> >> >in DT and no reason to deviate this feature between ACPI and DT.
> >> >> >
> >> >> That's the problem, we parse out "cluster" info according to the
> >> >> description in cpu-map, but do assign it to package_id, which used to
> >> >> configure the MC sched domain, not cluster sched domain.
> >> >>
> >> >
> >> >Right, we haven't updated the code after updating the bindings to match
> >> >ACPI sockets which are the physical package boundaries. Clusters are not
> >> >the physical boundaries and the current topology code is not 100% aligned
> >> >with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
> >> >support for sockets defining package boundaries")
> >>
> >> I see, but this commit is a long time ago, why hasn't it been used widely.
> >> Maybe I can help about it if you need.
> >>
> >
> >I assume no one cared or had a requirement for the same.
> 
> It took me a while to find the root cause why enabling SCHED_CLUSTER
> didn't work.
> 
> We should add SCHED_CLUSTER's dependency before implementation.
> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
> will have this problem.
> 

I am fine with that or mark it broken for DT, but ideally I wouldn't
want to create unnecessary dependency on ACPI or DT when both supports
the feature.

-- 
Regards,
Sudeep

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-26 13:25               ` Sudeep Holla
  0 siblings, 0 replies; 24+ messages in thread
From: Sudeep Holla @ 2022-04-26 13:25 UTC (permalink / raw)
  To: 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz, dietmar.eggemann

On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
> 
> >> 
> >> >> 
> >> >> >> From: Wang Qing <wangqing@vivo.com>
> >> >> >> 
> >> >> >> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
> >> >> >> which is set by cluster_id, cluster_id can only get from ACPI.
> >> >> >> 
> >> >> >> If the system does not enable ACPI, cluster_id is always -1, even enable
> >> >> >> SCHED_CLUSTER is invalid, this is misleading. 
> >> >> >> 
> >> >> >> So we add SCHED_CLUSTER's dependency on ACPI here.
> >> >> >>
> >> >> >
> >> >> >Any reason why this can't be extended to support DT based systems via
> >> >> >cpu-map in the device tree. IMO we almost have everything w.r.t topology
> >> >> >in DT and no reason to deviate this feature between ACPI and DT.
> >> >> >
> >> >> That's the problem, we parse out "cluster" info according to the
> >> >> description in cpu-map, but do assign it to package_id, which used to
> >> >> configure the MC sched domain, not cluster sched domain.
> >> >>
> >> >
> >> >Right, we haven't updated the code after updating the bindings to match
> >> >ACPI sockets which are the physical package boundaries. Clusters are not
> >> >the physical boundaries and the current topology code is not 100% aligned
> >> >with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
> >> >support for sockets defining package boundaries")
> >>
> >> I see, but this commit is a long time ago, why hasn't it been used widely.
> >> Maybe I can help about it if you need.
> >>
> >
> >I assume no one cared or had a requirement for the same.
> 
> It took me a while to find the root cause why enabling SCHED_CLUSTER
> didn't work.
> 
> We should add SCHED_CLUSTER's dependency before implementation.
> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
> will have this problem.
> 

I am fine with that or mark it broken for DT, but ideally I wouldn't
want to create unnecessary dependency on ACPI or DT when both supports
the feature.

-- 
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-26 13:25               ` Sudeep Holla
@ 2022-04-26 19:14                 ` Dietmar Eggemann
  -1 siblings, 0 replies; 24+ messages in thread
From: Dietmar Eggemann @ 2022-04-26 19:14 UTC (permalink / raw)
  To: Sudeep Holla, 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz

On 26/04/2022 15:25, Sudeep Holla wrote:
> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>
>>>>>>>> From: Wang Qing <wangqing@vivo.com>
>>>>>>>>
>>>>>>>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>>>>>>>> which is set by cluster_id, cluster_id can only get from ACPI.
>>>>>>>>
>>>>>>>> If the system does not enable ACPI, cluster_id is always -1, even enable
>>>>>>>> SCHED_CLUSTER is invalid, this is misleading. 
>>>>>>>>
>>>>>>>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>>>>>>>
>>>>>>>
>>>>>>> Any reason why this can't be extended to support DT based systems via
>>>>>>> cpu-map in the device tree. IMO we almost have everything w.r.t topology
>>>>>>> in DT and no reason to deviate this feature between ACPI and DT.
>>>>>>>
>>>>>> That's the problem, we parse out "cluster" info according to the
>>>>>> description in cpu-map, but do assign it to package_id, which used to
>>>>>> configure the MC sched domain, not cluster sched domain.
>>>>>>
>>>>>
>>>>> Right, we haven't updated the code after updating the bindings to match
>>>>> ACPI sockets which are the physical package boundaries. Clusters are not
>>>>> the physical boundaries and the current topology code is not 100% aligned
>>>>> with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>>>>> support for sockets defining package boundaries")
>>>>
>>>> I see, but this commit is a long time ago, why hasn't it been used widely.
>>>> Maybe I can help about it if you need.
>>>>
>>>
>>> I assume no one cared or had a requirement for the same.
>>
>> It took me a while to find the root cause why enabling SCHED_CLUSTER
>> didn't work.
>>
>> We should add SCHED_CLUSTER's dependency before implementation.
>> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
>> will have this problem.
>>
> 
> I am fine with that or mark it broken for DT, but ideally I wouldn't
> want to create unnecessary dependency on ACPI or DT when both supports
> the feature.

IMHO trying to introduce SCHED_COMPLEX for DT next to the linux-wide
available SCHED_CLUSTER (used only for ACPI right now) is the wrong way.

_If_ asymmetric sub-clustering of CPUs underneath LLC (L3) makes any
sense on ARMv9 single DSU systems like:

      .---------------.
CPU   |0 1 2 3 4 5 6 7|
      +---------------+
uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
      +---------------+
  L2  |   |   | | | | |
      +---------------+
  L3  |<--         -->|
      +---------------+
      |<-- cluster -->|
      +---------------+
      |<--   DSU   -->|
      '---------------'

(and I'm not saying here it does!) then the existing SCHED_CLUSTER
should be used in DT as well. Since it provides exactly the same
functionality for the task scheduler no matter whether it's setup from
ACPI or DT.

parse_cluster() -> parse_core() should be changed to be able to decode
both id's (package_id and cluster_id) in this case.

DT's cpu_map would have to be changed to code 2 level setups.

For a system like the one above it should look like:

                cpu-map {
                        cluster0 {
                                foo0 {
                                        core0 {
                                                cpu = <&cpu0>;
                                        };
                                        core1 {
                                                cpu = <&cpu1>;
                                        };
                                };
                                foo2 {
                                        core2 {
                                                cpu = <&cpu2>;
                                        };
                                        core3 {
                                                cpu = <&cpu3>;
                                        };
                                };
                        };
                        cluster1 {
                                core0 {
                                        cpu = <&cpu4>;
                                };
                                core1 {
                                        cpu = <&cpu5>;
                                };
                                core2 {
                                        cpu = <&cpu6>;
                                };
                        };
                        cluster2 {
                                core0 {
                                        cpu = <&cpu7>;
                                };
                        };
                };

I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
L2-complexes on the LITTLES and I get:

# cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
CLS
MC
DIE

CLS
MC
DIE

CLS
MC
DIE

CLS
MC
DIE

MC
DIE

MC
DIE

MC
DIE

DIE

cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]

cpu0 0
domain0 03
domain1 0f
domain2 ff
cpu1 0
domain0 03
domain1 0f
domain2 ff
cpu2 0
domain0 0c
domain1 0f
domain2 ff
cpu3 0
domain0 0c
domain1 0f
domain2 ff
cpu4 0
domain0 70
domain1 ff
cpu5 0
domain0 70
domain1 ff
cpu6 0
domain0 70
domain1 ff
cpu7 0
domain0 ff

Like I mentioned earlier, I'm not sure if this additional complexity
makes sense on mobile systems running EAS (since only CFS load-balancing
on little CPUs would be affected).

But my hunch is that this setup is what you want for your system. If we
could agree on this one, that would already be some progress to see the
entire story here.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-26 19:14                 ` Dietmar Eggemann
  0 siblings, 0 replies; 24+ messages in thread
From: Dietmar Eggemann @ 2022-04-26 19:14 UTC (permalink / raw)
  To: Sudeep Holla, 王擎
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz

On 26/04/2022 15:25, Sudeep Holla wrote:
> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>
>>>>>>>> From: Wang Qing <wangqing@vivo.com>
>>>>>>>>
>>>>>>>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>>>>>>>> which is set by cluster_id, cluster_id can only get from ACPI.
>>>>>>>>
>>>>>>>> If the system does not enable ACPI, cluster_id is always -1, even enable
>>>>>>>> SCHED_CLUSTER is invalid, this is misleading. 
>>>>>>>>
>>>>>>>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>>>>>>>
>>>>>>>
>>>>>>> Any reason why this can't be extended to support DT based systems via
>>>>>>> cpu-map in the device tree. IMO we almost have everything w.r.t topology
>>>>>>> in DT and no reason to deviate this feature between ACPI and DT.
>>>>>>>
>>>>>> That's the problem, we parse out "cluster" info according to the
>>>>>> description in cpu-map, but do assign it to package_id, which used to
>>>>>> configure the MC sched domain, not cluster sched domain.
>>>>>>
>>>>>
>>>>> Right, we haven't updated the code after updating the bindings to match
>>>>> ACPI sockets which are the physical package boundaries. Clusters are not
>>>>> the physical boundaries and the current topology code is not 100% aligned
>>>>> with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>>>>> support for sockets defining package boundaries")
>>>>
>>>> I see, but this commit is a long time ago, why hasn't it been used widely.
>>>> Maybe I can help about it if you need.
>>>>
>>>
>>> I assume no one cared or had a requirement for the same.
>>
>> It took me a while to find the root cause why enabling SCHED_CLUSTER
>> didn't work.
>>
>> We should add SCHED_CLUSTER's dependency before implementation.
>> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
>> will have this problem.
>>
> 
> I am fine with that or mark it broken for DT, but ideally I wouldn't
> want to create unnecessary dependency on ACPI or DT when both supports
> the feature.

IMHO trying to introduce SCHED_COMPLEX for DT next to the linux-wide
available SCHED_CLUSTER (used only for ACPI right now) is the wrong way.

_If_ asymmetric sub-clustering of CPUs underneath LLC (L3) makes any
sense on ARMv9 single DSU systems like:

      .---------------.
CPU   |0 1 2 3 4 5 6 7|
      +---------------+
uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
      +---------------+
  L2  |   |   | | | | |
      +---------------+
  L3  |<--         -->|
      +---------------+
      |<-- cluster -->|
      +---------------+
      |<--   DSU   -->|
      '---------------'

(and I'm not saying here it does!) then the existing SCHED_CLUSTER
should be used in DT as well. Since it provides exactly the same
functionality for the task scheduler no matter whether it's setup from
ACPI or DT.

parse_cluster() -> parse_core() should be changed to be able to decode
both id's (package_id and cluster_id) in this case.

DT's cpu_map would have to be changed to code 2 level setups.

For a system like the one above it should look like:

                cpu-map {
                        cluster0 {
                                foo0 {
                                        core0 {
                                                cpu = <&cpu0>;
                                        };
                                        core1 {
                                                cpu = <&cpu1>;
                                        };
                                };
                                foo2 {
                                        core2 {
                                                cpu = <&cpu2>;
                                        };
                                        core3 {
                                                cpu = <&cpu3>;
                                        };
                                };
                        };
                        cluster1 {
                                core0 {
                                        cpu = <&cpu4>;
                                };
                                core1 {
                                        cpu = <&cpu5>;
                                };
                                core2 {
                                        cpu = <&cpu6>;
                                };
                        };
                        cluster2 {
                                core0 {
                                        cpu = <&cpu7>;
                                };
                        };
                };

I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
L2-complexes on the LITTLES and I get:

# cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
CLS
MC
DIE

CLS
MC
DIE

CLS
MC
DIE

CLS
MC
DIE

MC
DIE

MC
DIE

MC
DIE

DIE

cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]

cpu0 0
domain0 03
domain1 0f
domain2 ff
cpu1 0
domain0 03
domain1 0f
domain2 ff
cpu2 0
domain0 0c
domain1 0f
domain2 ff
cpu3 0
domain0 0c
domain1 0f
domain2 ff
cpu4 0
domain0 70
domain1 ff
cpu5 0
domain0 70
domain1 ff
cpu6 0
domain0 70
domain1 ff
cpu7 0
domain0 ff

Like I mentioned earlier, I'm not sure if this additional complexity
makes sense on mobile systems running EAS (since only CFS load-balancing
on little CPUs would be affected).

But my hunch is that this setup is what you want for your system. If we
could agree on this one, that would already be some progress to see the
entire story here.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-26 19:14                 ` Dietmar Eggemann
@ 2022-04-27  2:18                   ` 王擎
  -1 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-27  2:18 UTC (permalink / raw)
  To: Dietmar Eggemann, Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz


>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>
>>>>>>>>> From: Wang Qing <wangqing@vivo.com>
>>>>>>>>>
>>>>>>>>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>>>>>>>>> which is set by cluster_id, cluster_id can only get from ACPI.
>>>>>>>>>
>>>>>>>>> If the system does not enable ACPI, cluster_id is always -1, even enable
>>>>>>>>> SCHED_CLUSTER is invalid, this is misleading. 
>>>>>>>>>
>>>>>>>>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Any reason why this can't be extended to support DT based systems via
>>>>>>>> cpu-map in the device tree. IMO we almost have everything w.r.t topology
>>>>>>>> in DT and no reason to deviate this feature between ACPI and DT.
>>>>>>>>
>>>>>>> That's the problem, we parse out "cluster" info according to the
>>>>>>> description in cpu-map, but do assign it to package_id, which used to
>>>>>>> configure the MC sched domain, not cluster sched domain.
>>>>>>>
>>>>>>
>>>>>> Right, we haven't updated the code after updating the bindings to match
>>>>>> ACPI sockets which are the physical package boundaries. Clusters are not
>>>>>> the physical boundaries and the current topology code is not 100% aligned
>>>>>> with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>>>>>> support for sockets defining package boundaries")
>>>>>
>>>>> I see, but this commit is a long time ago, why hasn't it been used widely.
>>>>> Maybe I can help about it if you need.
>>>>>
>>>>
>>>> I assume no one cared or had a requirement for the same.
>>>
>>> It took me a while to find the root cause why enabling SCHED_CLUSTER
>>> didn't work.
>>>
>>> We should add SCHED_CLUSTER's dependency before implementation.
>>> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
>>> will have this problem.
>>>
>> 
>> I am fine with that or mark it broken for DT, but ideally I wouldn't
>> want to create unnecessary dependency on ACPI or DT when both supports
>> the feature.
>
>IMHO trying to introduce SCHED_COMPLEX for DT next to the linux-wide
>available SCHED_CLUSTER (used only for ACPI right now) is the wrong way.
>
>_If_ asymmetric sub-clustering of CPUs underneath LLC (L3) makes any
>sense on ARMv9 single DSU systems like:
>
>      .---------------.
>CPU   |0 1 2 3 4 5 6 7|
>      +---------------+
>uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>      +---------------+
>  L2  |   |   | | | | |
>      +---------------+
>  L3  |<--         -->|
>      +---------------+
>      |<-- cluster -->|
>      +---------------+
>      |<--   DSU   -->|
>      '---------------'
>
>(and I'm not saying here it does!) then the existing SCHED_CLUSTER
>should be used in DT as well. Since it provides exactly the same
>functionality for the task scheduler no matter whether it's setup from
>ACPI or DT.
>
>parse_cluster() -> parse_core() should be changed to be able to decode
>both id's (package_id and cluster_id) in this case.

Totally agree, but not implemented yet. Because now cluster_id is used
to describe the package/socket, the modification will involve all DTS.

>
>DT's cpu_map would have to be changed to code 2 level setups.
>
>For a system like the one above it should look like:
>
>                cpu-map {
>                        cluster0 {
>                                foo0 {
>                                        core0 {
>                                                cpu = <&cpu0>;
>                                        };
>                                        core1 {
>                                                cpu = <&cpu1>;
>                                        };
>                                };
>                                foo2 {
>                                        core2 {
>                                                cpu = <&cpu2>;
>                                        };
>                                        core3 {
>                                                cpu = <&cpu3>;
>                                        };
>                                };
>                        };
>                        cluster1 {
>                                core0 {
>                                        cpu = <&cpu4>;
>                                };
>                                core1 {
>                                        cpu = <&cpu5>;
>                                };
>                                core2 {
>                                        cpu = <&cpu6>;
>                                };
>                        };
>                        cluster2 {
>                                core0 {
>                                        cpu = <&cpu7>;
>                                };
>                        };
>                };
>
>I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>L2-complexes on the LITTLES and I get:

This system is exactly what I mentioned, but I have a question,
How did you parse out the cluster_id based on foo0/foo2?
Because if ACPI is not used, cluster_id is always -1.

What I want to do is to change the foo0/foo2 to complex0/complex2 here,
then parse it like parse_cluster() -> parse_complex() -> parse_core().

>
># cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>MC
>DIE
>
>MC
>DIE
>
>MC
>DIE
>
>DIE
>
>cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]
>
>cpu0 0
>domain0 03
>domain1 0f
>domain2 ff
>cpu1 0
>domain0 03
>domain1 0f
>domain2 ff
>cpu2 0
>domain0 0c
>domain1 0f
>domain2 ff
>cpu3 0
>domain0 0c
>domain1 0f
>domain2 ff
>cpu4 0
>domain0 70
>domain1 ff
>cpu5 0
>domain0 70
>domain1 ff
>cpu6 0
>domain0 70
>domain1 ff
>cpu7 0
>domain0 ff
>
>Like I mentioned earlier, I'm not sure if this additional complexity
>makes sense on mobile systems running EAS (since only CFS load-balancing
>on little CPUs would be affected).
>
>But my hunch is that this setup is what you want for your system. If we
>could agree on this one, that would already be some progress to see the
>entire story here.

Yes, that's what I want, but still a little confused, why we use MC to
describe "cluster" and use CLS describe "complex", can you show some details?

Thanks,
Qing

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-27  2:18                   ` 王擎
  0 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-27  2:18 UTC (permalink / raw)
  To: Dietmar Eggemann, Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz


>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>
>>>>>>>>> From: Wang Qing <wangqing@vivo.com>
>>>>>>>>>
>>>>>>>>> cluster sched_domain configured by cpu_topology[].cluster_sibling, 
>>>>>>>>> which is set by cluster_id, cluster_id can only get from ACPI.
>>>>>>>>>
>>>>>>>>> If the system does not enable ACPI, cluster_id is always -1, even enable
>>>>>>>>> SCHED_CLUSTER is invalid, this is misleading. 
>>>>>>>>>
>>>>>>>>> So we add SCHED_CLUSTER's dependency on ACPI here.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Any reason why this can't be extended to support DT based systems via
>>>>>>>> cpu-map in the device tree. IMO we almost have everything w.r.t topology
>>>>>>>> in DT and no reason to deviate this feature between ACPI and DT.
>>>>>>>>
>>>>>>> That's the problem, we parse out "cluster" info according to the
>>>>>>> description in cpu-map, but do assign it to package_id, which used to
>>>>>>> configure the MC sched domain, not cluster sched domain.
>>>>>>>
>>>>>>
>>>>>> Right, we haven't updated the code after updating the bindings to match
>>>>>> ACPI sockets which are the physical package boundaries. Clusters are not
>>>>>> the physical boundaries and the current topology code is not 100% aligned
>>>>>> with the bindings after Commit 849b384f92bc ("Documentation: DT: arm: add
>>>>>> support for sockets defining package boundaries")
>>>>>
>>>>> I see, but this commit is a long time ago, why hasn't it been used widely.
>>>>> Maybe I can help about it if you need.
>>>>>
>>>>
>>>> I assume no one cared or had a requirement for the same.
>>>
>>> It took me a while to find the root cause why enabling SCHED_CLUSTER
>>> didn't work.
>>>
>>> We should add SCHED_CLUSTER's dependency before implementation.
>>> Otherwise, everyone who doesn't have ACPI but use SCHED_CLUSTER 
>>> will have this problem.
>>>
>> 
>> I am fine with that or mark it broken for DT, but ideally I wouldn't
>> want to create unnecessary dependency on ACPI or DT when both supports
>> the feature.
>
>IMHO trying to introduce SCHED_COMPLEX for DT next to the linux-wide
>available SCHED_CLUSTER (used only for ACPI right now) is the wrong way.
>
>_If_ asymmetric sub-clustering of CPUs underneath LLC (L3) makes any
>sense on ARMv9 single DSU systems like:
>
>      .---------------.
>CPU   |0 1 2 3 4 5 6 7|
>      +---------------+
>uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>      +---------------+
>  L2  |   |   | | | | |
>      +---------------+
>  L3  |<--         -->|
>      +---------------+
>      |<-- cluster -->|
>      +---------------+
>      |<--   DSU   -->|
>      '---------------'
>
>(and I'm not saying here it does!) then the existing SCHED_CLUSTER
>should be used in DT as well. Since it provides exactly the same
>functionality for the task scheduler no matter whether it's setup from
>ACPI or DT.
>
>parse_cluster() -> parse_core() should be changed to be able to decode
>both id's (package_id and cluster_id) in this case.

Totally agree, but not implemented yet. Because now cluster_id is used
to describe the package/socket, the modification will involve all DTS.

>
>DT's cpu_map would have to be changed to code 2 level setups.
>
>For a system like the one above it should look like:
>
>                cpu-map {
>                        cluster0 {
>                                foo0 {
>                                        core0 {
>                                                cpu = <&cpu0>;
>                                        };
>                                        core1 {
>                                                cpu = <&cpu1>;
>                                        };
>                                };
>                                foo2 {
>                                        core2 {
>                                                cpu = <&cpu2>;
>                                        };
>                                        core3 {
>                                                cpu = <&cpu3>;
>                                        };
>                                };
>                        };
>                        cluster1 {
>                                core0 {
>                                        cpu = <&cpu4>;
>                                };
>                                core1 {
>                                        cpu = <&cpu5>;
>                                };
>                                core2 {
>                                        cpu = <&cpu6>;
>                                };
>                        };
>                        cluster2 {
>                                core0 {
>                                        cpu = <&cpu7>;
>                                };
>                        };
>                };
>
>I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>L2-complexes on the LITTLES and I get:

This system is exactly what I mentioned, but I have a question,
How did you parse out the cluster_id based on foo0/foo2?
Because if ACPI is not used, cluster_id is always -1.

What I want to do is to change the foo0/foo2 to complex0/complex2 here,
then parse it like parse_cluster() -> parse_complex() -> parse_core().

>
># cat /sys/kernel/debug/sched/domains/cpu*/domain*/name
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>CLS
>MC
>DIE
>
>MC
>DIE
>
>MC
>DIE
>
>MC
>DIE
>
>DIE
>
>cat /proc/schedstat | awk '{print $1 " " $2 }' | grep ^[cd]
>
>cpu0 0
>domain0 03
>domain1 0f
>domain2 ff
>cpu1 0
>domain0 03
>domain1 0f
>domain2 ff
>cpu2 0
>domain0 0c
>domain1 0f
>domain2 ff
>cpu3 0
>domain0 0c
>domain1 0f
>domain2 ff
>cpu4 0
>domain0 70
>domain1 ff
>cpu5 0
>domain0 70
>domain1 ff
>cpu6 0
>domain0 70
>domain1 ff
>cpu7 0
>domain0 ff
>
>Like I mentioned earlier, I'm not sure if this additional complexity
>makes sense on mobile systems running EAS (since only CFS load-balancing
>on little CPUs would be affected).
>
>But my hunch is that this setup is what you want for your system. If we
>could agree on this one, that would already be some progress to see the
>entire story here.

Yes, that's what I want, but still a little confused, why we use MC to
describe "cluster" and use CLS describe "complex", can you show some details?

Thanks,
Qing
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-27  2:18                   ` 王擎
@ 2022-04-27 15:47                     ` Dietmar Eggemann
  -1 siblings, 0 replies; 24+ messages in thread
From: Dietmar Eggemann @ 2022-04-27 15:47 UTC (permalink / raw)
  To: 王擎, Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz

On 27/04/2022 04:18, 王擎 wrote:
> 
>>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>>
>>>>>>>>>> From: Wang Qing <wangqing@vivo.com>

[...]

>>      .---------------.
>> CPU   |0 1 2 3 4 5 6 7|
>>      +---------------+
>> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>>      +---------------+
>>  L2  |   |   | | | | |
>>      +---------------+
>>  L3  |<--         -->|
>>      +---------------+
>>      |<-- cluster -->|
>>      +---------------+
>>      |<--   DSU   -->|
>>      '---------------'
>>
>> (and I'm not saying here it does!) then the existing SCHED_CLUSTER
>> should be used in DT as well. Since it provides exactly the same
>> functionality for the task scheduler no matter whether it's setup from
>> ACPI or DT.
>>
>> parse_cluster() -> parse_core() should be changed to be able to decode
>> both id's (package_id and cluster_id) in this case.
> 
> Totally agree, but not implemented yet. Because now cluster_id is used
> to describe the package/socket, the modification will involve all DTS.

You're talking about the fact that 1. level clusterX (1) in cpu_map is
used to set `cpu_topology[cpu].package_id`, right? That's the
information for the DIE layer.
The first level cluster[0,1,2] spawn all 8 CPUs and form 3 groups of
CPUS (0-3, 4-6, 7), the later sched groups of the DIE sched domain.
We don't have any socketN since it is a single socket system. Think
about a system with 2 DSUs where you would have socket[0,1].

Sub-chapter 4 in `Documentation/devicetree/bindings/cpu/cpu-topology.txt`:

cpu-map {
        socket0 {
                cluster0 { <--- (1)

Sub-chapter 3 says:

  - cluster node

    ... A system can contain several layers of clustering within a   
    single physical socket and cluster nodes can be contained in parent 
    cluster nodes.

    A cluster node's child nodes must be:
      one or more cluster nodes

This multi-level cluster thing hasn't been coded yet.

parse_cluster()

    ...
    /*
     * First check for child clusters; we currently ignore any
     * information about the nesting of clusters and present the
     * scheduler with a flat list of them.
     */
    ...

[...]

>> I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>> L2-complexes on the LITTLES and I get:
> 
> This system is exactly what I mentioned, but I have a question,
> How did you parse out the cluster_id based on foo0/foo2?
> Because if ACPI is not used, cluster_id is always -1.

I haven't put in the extension to decode a 2-level clusterX cpu_map in
parse_cluster() -> parse_core(). I just did a mock-up for illustration
purpose inside parse_core() for my H960 with a 4-3-1 cpu-map:

cpu-map
    cluster0
        core0
        core1
        core2
        core3
    cluster1
        core0
        core1
        core2
    cluster2
        core0

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636ebaac5..8af40f13fdb5 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -529,6 +529,11 @@ static int __init parse_core(struct device_node *core, int package_id,
 
                cpu_topology[cpu].package_id = package_id;
                cpu_topology[cpu].core_id = core_id;
+
+               /* mock-up CLS SD on 4-3-1 Armv9 DSU cluster w/ L2-complexes */
+               if (cpu <= 3)
+                       cpu_topology[cpu].cluster_id = cpu / 2;
+
        } else if (leaf && cpu != -ENODEV) {
                pr_err("%pOF: Can't get CPU for leaf core\n", core);
                return -EINVAL;

And on the scheduler-side I only had to enable CONFIG_SCHED_CLUSTER and
everything worked just fine, no need for any arm64-specific topo table
and alike.

IMHO, this is what you have to do. Make a 2 level cluster cpumap:

cpu-map
    cluster0
        cluster0
            core0
            core1
        cluster1
            core2
            core3
    cluster1
        core0
        core1
        core2
    cluster2
        core0

parse-able and set `cpu_topology[cpu].cluster_id` in parse_core().
 
> What I want to do is to change the foo0/foo2 to complex0/complex2 here,
> then parse it like parse_cluster() -> parse_complex() -> parse_core().

You should read `Documentation/devicetree/bindings/cpu/cpu-topology.txt`
and implement the multi-level cluster approach instead. Big advantage
would be that there won't be any DT related changes/extensions needed.

[...]

> Yes, that's what I want, but still a little confused, why we use MC to
> describe "cluster" and use CLS describe "complex", can you show some details?

The DT entity `cluster` has nothing to do with the task scheduler domain
name `SCHED_CLUSTER`. The name is actually meaningless and just there for
debug purposes.

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-27 15:47                     ` Dietmar Eggemann
  0 siblings, 0 replies; 24+ messages in thread
From: Dietmar Eggemann @ 2022-04-27 15:47 UTC (permalink / raw)
  To: 王擎, Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz

On 27/04/2022 04:18, 王擎 wrote:
> 
>>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>>
>>>>>>>>>> From: Wang Qing <wangqing@vivo.com>

[...]

>>      .---------------.
>> CPU   |0 1 2 3 4 5 6 7|
>>      +---------------+
>> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>>      +---------------+
>>  L2  |   |   | | | | |
>>      +---------------+
>>  L3  |<--         -->|
>>      +---------------+
>>      |<-- cluster -->|
>>      +---------------+
>>      |<--   DSU   -->|
>>      '---------------'
>>
>> (and I'm not saying here it does!) then the existing SCHED_CLUSTER
>> should be used in DT as well. Since it provides exactly the same
>> functionality for the task scheduler no matter whether it's setup from
>> ACPI or DT.
>>
>> parse_cluster() -> parse_core() should be changed to be able to decode
>> both id's (package_id and cluster_id) in this case.
> 
> Totally agree, but not implemented yet. Because now cluster_id is used
> to describe the package/socket, the modification will involve all DTS.

You're talking about the fact that 1. level clusterX (1) in cpu_map is
used to set `cpu_topology[cpu].package_id`, right? That's the
information for the DIE layer.
The first level cluster[0,1,2] spawn all 8 CPUs and form 3 groups of
CPUS (0-3, 4-6, 7), the later sched groups of the DIE sched domain.
We don't have any socketN since it is a single socket system. Think
about a system with 2 DSUs where you would have socket[0,1].

Sub-chapter 4 in `Documentation/devicetree/bindings/cpu/cpu-topology.txt`:

cpu-map {
        socket0 {
                cluster0 { <--- (1)

Sub-chapter 3 says:

  - cluster node

    ... A system can contain several layers of clustering within a   
    single physical socket and cluster nodes can be contained in parent 
    cluster nodes.

    A cluster node's child nodes must be:
      one or more cluster nodes

This multi-level cluster thing hasn't been coded yet.

parse_cluster()

    ...
    /*
     * First check for child clusters; we currently ignore any
     * information about the nesting of clusters and present the
     * scheduler with a flat list of them.
     */
    ...

[...]

>> I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>> L2-complexes on the LITTLES and I get:
> 
> This system is exactly what I mentioned, but I have a question,
> How did you parse out the cluster_id based on foo0/foo2?
> Because if ACPI is not used, cluster_id is always -1.

I haven't put in the extension to decode a 2-level clusterX cpu_map in
parse_cluster() -> parse_core(). I just did a mock-up for illustration
purpose inside parse_core() for my H960 with a 4-3-1 cpu-map:

cpu-map
    cluster0
        core0
        core1
        core2
        core3
    cluster1
        core0
        core1
        core2
    cluster2
        core0

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 1d6636ebaac5..8af40f13fdb5 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -529,6 +529,11 @@ static int __init parse_core(struct device_node *core, int package_id,
 
                cpu_topology[cpu].package_id = package_id;
                cpu_topology[cpu].core_id = core_id;
+
+               /* mock-up CLS SD on 4-3-1 Armv9 DSU cluster w/ L2-complexes */
+               if (cpu <= 3)
+                       cpu_topology[cpu].cluster_id = cpu / 2;
+
        } else if (leaf && cpu != -ENODEV) {
                pr_err("%pOF: Can't get CPU for leaf core\n", core);
                return -EINVAL;

And on the scheduler-side I only had to enable CONFIG_SCHED_CLUSTER and
everything worked just fine, no need for any arm64-specific topo table
and alike.

IMHO, this is what you have to do. Make a 2 level cluster cpumap:

cpu-map
    cluster0
        cluster0
            core0
            core1
        cluster1
            core2
            core3
    cluster1
        core0
        core1
        core2
    cluster2
        core0

parse-able and set `cpu_topology[cpu].cluster_id` in parse_core().
 
> What I want to do is to change the foo0/foo2 to complex0/complex2 here,
> then parse it like parse_cluster() -> parse_complex() -> parse_core().

You should read `Documentation/devicetree/bindings/cpu/cpu-topology.txt`
and implement the multi-level cluster approach instead. Big advantage
would be that there won't be any DT related changes/extensions needed.

[...]

> Yes, that's what I want, but still a little confused, why we use MC to
> describe "cluster" and use CLS describe "complex", can you show some details?

The DT entity `cluster` has nothing to do with the task scheduler domain
name `SCHED_CLUSTER`. The name is actually meaningless and just there for
debug purposes.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
  2022-04-27 15:47                     ` Dietmar Eggemann
@ 2022-04-28  0:04                       ` 王擎
  -1 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-28  0:04 UTC (permalink / raw)
  To: Dietmar Eggemann, Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz


>> 
>>>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>>>
>>>>>>>>>>> From: Wang Qing <wangqing@vivo.com>
>
>[...]
>
>>>      .---------------.
>>> CPU   |0 1 2 3 4 5 6 7|
>>>      +---------------+
>>> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>>>      +---------------+
>>>  L2  |   |   | | | | |
>>>      +---------------+
>>>  L3  |<--         -->|
>>>      +---------------+
>>>      |<-- cluster -->|
>>>      +---------------+
>>>      |<--   DSU   -->|
>>>      '---------------'
>>>
>>> (and I'm not saying here it does!) then the existing SCHED_CLUSTER
>>> should be used in DT as well. Since it provides exactly the same
>>> functionality for the task scheduler no matter whether it's setup from
>>> ACPI or DT.
>>>
>>> parse_cluster() -> parse_core() should be changed to be able to decode
>>> both id's (package_id and cluster_id) in this case.
>> 
>> Totally agree, but not implemented yet. Because now cluster_id is used
>> to describe the package/socket, the modification will involve all DTS.
>
>You're talking about the fact that 1. level clusterX (1) in cpu_map is
>used to set `cpu_topology[cpu].package_id`, right? That's the
>information for the DIE layer.
>The first level cluster[0,1,2] spawn all 8 CPUs and form 3 groups of
>CPUS (0-3, 4-6, 7), the later sched groups of the DIE sched domain.
>We don't have any socketN since it is a single socket system. Think
>about a system with 2 DSUs where you would have socket[0,1].
>
>Sub-chapter 4 in `Documentation/devicetree/bindings/cpu/cpu-topology.txt`:
>
>cpu-map {
>        socket0 {
>                cluster0 { <--- (1)
>
>Sub-chapter 3 says:
>
>  - cluster node
>
>    ... A system can contain several layers of clustering within a   
>    single physical socket and cluster nodes can be contained in parent 
>    cluster nodes.
>
>    A cluster node's child nodes must be:
>      one or more cluster nodes
>
>This multi-level cluster thing hasn't been coded yet.
>
>parse_cluster()
>
>    ...
>    /*
>     * First check for child clusters; we currently ignore any
>     * information about the nesting of clusters and present the
>     * scheduler with a flat list of them.
>     */
>    ...
>
>[...]
>
>>> I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>>> L2-complexes on the LITTLES and I get:
>> 
>> This system is exactly what I mentioned, but I have a question,
>> How did you parse out the cluster_id based on foo0/foo2?
>> Because if ACPI is not used, cluster_id is always -1.
>
>I haven't put in the extension to decode a 2-level clusterX cpu_map in
>parse_cluster() -> parse_core(). I just did a mock-up for illustration
>purpose inside parse_core() for my H960 with a 4-3-1 cpu-map:
>
>cpu-map
>    cluster0
>        core0
>        core1
>        core2
>        core3
>    cluster1
>        core0
>        core1
>        core2
>    cluster2
>        core0
>
>diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>index 1d6636ebaac5..8af40f13fdb5 100644
>--- a/drivers/base/arch_topology.c
>+++ b/drivers/base/arch_topology.c
>@@ -529,6 +529,11 @@ static int __init parse_core(struct device_node *core, int package_id,
> 
>                cpu_topology[cpu].package_id = package_id;
>                cpu_topology[cpu].core_id = core_id;
>+
>+               /* mock-up CLS SD on 4-3-1 Armv9 DSU cluster w/ L2-complexes */
>+               if (cpu <= 3)
>+                       cpu_topology[cpu].cluster_id = cpu / 2;
>+
>        } else if (leaf && cpu != -ENODEV) {
>                pr_err("%pOF: Can't get CPU for leaf core\n", core);
>                return -EINVAL;
>
>And on the scheduler-side I only had to enable CONFIG_SCHED_CLUSTER and
>everything worked just fine, no need for any arm64-specific topo table
>and alike.
>
>IMHO, this is what you have to do. Make a 2 level cluster cpumap:
>
>cpu-map
>    cluster0
>        cluster0
>            core0
>            core1
>        cluster1
>            core2
>            core3
>    cluster1
>        core0
>        core1
>        core2
>    cluster2
>        core0
>
>parse-able and set `cpu_topology[cpu].cluster_id` in parse_core().

Oh, there are many ways I can implement it in my own system, but my
purpose here is to modify the mainline code so that all Android
developers can use it.

> 
>> What I want to do is to change the foo0/foo2 to complex0/complex2 here,
>> then parse it like parse_cluster() -> parse_complex() -> parse_core().
>
>You should read `Documentation/devicetree/bindings/cpu/cpu-topology.txt`
>and implement the multi-level cluster approach instead. Big advantage
>would be that there won't be any DT related changes/extensions needed.
>

Thanks for the advice, I will consider about it.
Qing

>[...]
>
>> Yes, that's what I want, but still a little confused, why we use MC to
>> describe "cluster" and use CLS describe "complex", can you show some details?
>
>The DT entity `cluster` has nothing to do with the task scheduler domain
>name `SCHED_CLUSTER`. The name is actually meaningless and just there for
>debug purposes.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI
@ 2022-04-28  0:04                       ` 王擎
  0 siblings, 0 replies; 24+ messages in thread
From: 王擎 @ 2022-04-28  0:04 UTC (permalink / raw)
  To: Dietmar Eggemann, Sudeep Holla
  Cc: Catalin Marinas, Will Deacon, linux-arm-kernel, linux-kernel,
	vincent.guittot, peterz


>> 
>>>> On Tue, Apr 26, 2022 at 06:52:34AM +0000, 王擎 wrote:
>>>>>
>>>>>>>>>>> From: Wang Qing <wangqing@vivo.com>
>
>[...]
>
>>>      .---------------.
>>> CPU   |0 1 2 3 4 5 6 7|
>>>      +---------------+
>>> uarch |l l l l m m m b| (so called tri-gear: little, medium, big)
>>>      +---------------+
>>>  L2  |   |   | | | | |
>>>      +---------------+
>>>  L3  |<--         -->|
>>>      +---------------+
>>>      |<-- cluster -->|
>>>      +---------------+
>>>      |<--   DSU   -->|
>>>      '---------------'
>>>
>>> (and I'm not saying here it does!) then the existing SCHED_CLUSTER
>>> should be used in DT as well. Since it provides exactly the same
>>> functionality for the task scheduler no matter whether it's setup from
>>> ACPI or DT.
>>>
>>> parse_cluster() -> parse_core() should be changed to be able to decode
>>> both id's (package_id and cluster_id) in this case.
>> 
>> Totally agree, but not implemented yet. Because now cluster_id is used
>> to describe the package/socket, the modification will involve all DTS.
>
>You're talking about the fact that 1. level clusterX (1) in cpu_map is
>used to set `cpu_topology[cpu].package_id`, right? That's the
>information for the DIE layer.
>The first level cluster[0,1,2] spawn all 8 CPUs and form 3 groups of
>CPUS (0-3, 4-6, 7), the later sched groups of the DIE sched domain.
>We don't have any socketN since it is a single socket system. Think
>about a system with 2 DSUs where you would have socket[0,1].
>
>Sub-chapter 4 in `Documentation/devicetree/bindings/cpu/cpu-topology.txt`:
>
>cpu-map {
>        socket0 {
>                cluster0 { <--- (1)
>
>Sub-chapter 3 says:
>
>  - cluster node
>
>    ... A system can contain several layers of clustering within a   
>    single physical socket and cluster nodes can be contained in parent 
>    cluster nodes.
>
>    A cluster node's child nodes must be:
>      one or more cluster nodes
>
>This multi-level cluster thing hasn't been coded yet.
>
>parse_cluster()
>
>    ...
>    /*
>     * First check for child clusters; we currently ignore any
>     * information about the nesting of clusters and present the
>     * scheduler with a flat list of them.
>     */
>    ...
>
>[...]
>
>>> I pimped my Hikey 960 to look like one of those Armv9 4-3-1 systems with
>>> L2-complexes on the LITTLES and I get:
>> 
>> This system is exactly what I mentioned, but I have a question,
>> How did you parse out the cluster_id based on foo0/foo2?
>> Because if ACPI is not used, cluster_id is always -1.
>
>I haven't put in the extension to decode a 2-level clusterX cpu_map in
>parse_cluster() -> parse_core(). I just did a mock-up for illustration
>purpose inside parse_core() for my H960 with a 4-3-1 cpu-map:
>
>cpu-map
>    cluster0
>        core0
>        core1
>        core2
>        core3
>    cluster1
>        core0
>        core1
>        core2
>    cluster2
>        core0
>
>diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>index 1d6636ebaac5..8af40f13fdb5 100644
>--- a/drivers/base/arch_topology.c
>+++ b/drivers/base/arch_topology.c
>@@ -529,6 +529,11 @@ static int __init parse_core(struct device_node *core, int package_id,
> 
>                cpu_topology[cpu].package_id = package_id;
>                cpu_topology[cpu].core_id = core_id;
>+
>+               /* mock-up CLS SD on 4-3-1 Armv9 DSU cluster w/ L2-complexes */
>+               if (cpu <= 3)
>+                       cpu_topology[cpu].cluster_id = cpu / 2;
>+
>        } else if (leaf && cpu != -ENODEV) {
>                pr_err("%pOF: Can't get CPU for leaf core\n", core);
>                return -EINVAL;
>
>And on the scheduler-side I only had to enable CONFIG_SCHED_CLUSTER and
>everything worked just fine, no need for any arm64-specific topo table
>and alike.
>
>IMHO, this is what you have to do. Make a 2 level cluster cpumap:
>
>cpu-map
>    cluster0
>        cluster0
>            core0
>            core1
>        cluster1
>            core2
>            core3
>    cluster1
>        core0
>        core1
>        core2
>    cluster2
>        core0
>
>parse-able and set `cpu_topology[cpu].cluster_id` in parse_core().

Oh, there are many ways I can implement it in my own system, but my
purpose here is to modify the mainline code so that all Android
developers can use it.

> 
>> What I want to do is to change the foo0/foo2 to complex0/complex2 here,
>> then parse it like parse_cluster() -> parse_complex() -> parse_core().
>
>You should read `Documentation/devicetree/bindings/cpu/cpu-topology.txt`
>and implement the multi-level cluster approach instead. Big advantage
>would be that there won't be any DT related changes/extensions needed.
>

Thanks for the advice, I will consider about it.
Qing

>[...]
>
>> Yes, that's what I want, but still a little confused, why we use MC to
>> describe "cluster" and use CLS describe "complex", can you show some details?
>
>The DT entity `cluster` has nothing to do with the task scheduler domain
>name `SCHED_CLUSTER`. The name is actually meaningless and just there for
>debug purposes.
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-04-28  0:05 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-25  2:55 [PATCH V2] arm64: add SCHED_CLUSTER's dependency on ACPI Qing Wang
2022-04-25  2:55 ` Qing Wang
2022-04-25 10:06 ` Sudeep Holla
2022-04-25 10:06   ` Sudeep Holla
2022-04-25 11:18   ` 王擎
2022-04-25 11:18     ` 王擎
2022-04-25 16:59     ` Sudeep Holla
2022-04-25 16:59       ` Sudeep Holla
2022-04-26  2:23       ` 王擎
2022-04-26  2:23         ` 王擎
2022-04-26  6:40         ` Sudeep Holla
2022-04-26  6:40           ` Sudeep Holla
2022-04-26  6:52           ` 王擎
2022-04-26  6:52             ` 王擎
2022-04-26 13:25             ` Sudeep Holla
2022-04-26 13:25               ` Sudeep Holla
2022-04-26 19:14               ` Dietmar Eggemann
2022-04-26 19:14                 ` Dietmar Eggemann
2022-04-27  2:18                 ` 王擎
2022-04-27  2:18                   ` 王擎
2022-04-27 15:47                   ` Dietmar Eggemann
2022-04-27 15:47                     ` Dietmar Eggemann
2022-04-28  0:04                     ` 王擎
2022-04-28  0:04                       ` 王擎

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.